Ship It Weekly - DevOps, SRE, and Platform Engineering News

Teller's Tech - DevOps SRE Podcast

Details

Ship It Weekly is a short, practical recap of what actually matters in DevOps, SRE, and platform engineering.

Each episode, your host Brian Teller walks through the latest outages, releases, tools, and incident writeups, then translates them into “here’s what this means for your systems” instead of just reading headlines. Expect a couple of main stories with context, a quick hit of tools or releases worth bookmarking, and the occasional segment on on-call, burnout, or team culture.

This isn’t a certification prep show or a lab walkthrough. It’s aimed at people who are already working in the space and want to stay sharp without scrolling status pages and blogs all week. You’ll hear about things like cloud provider incidents, Kubernetes and platform trends, Terraform and infrastructure changes, and real postmortems that are actually worth your time.

Most episodes are 10–25 minutes, so you can catch up on the way to work or between meetings. Every now and then there will be a “special” focused on a big outage or a specific theme, but the default format is simple: what happened, why it matters, and what you might want to do about it in your own environment.

If you’re the person people DM when something is broken in prod, or you’re building the platform everyone else ships on top of, Ship It Weekly is meant to be in your rotation.

Recent Episodes

Ship It Interviews: The WHY Behind DevOps, Upskilling, and Agentic AI (with Maz Islam)
DEC 21, 2025
<p>This is a <strong>Ship It Weekly</strong> interview episode. The weekly news recaps are still weekly. These interviews drop in between when I find someone worth talking to and the convo feels useful.</p><p>In this episode I’m joined by Mazharul “Maz” Islam (DevOps with Maz). Maz is a UK-based DevOps Engineer who shares practical, real-world DevOps content on YouTube and LinkedIn. We talk about the stuff that actually matters when you’re building systems, running infrastructure, owning reliability, and living in on-call.</p><p>We hit three big things: the importance of understanding the WHY behind DevOps (not just the tools), how to upskill and keep up with the industry without burning out, and what the agentic AI era might look like for DevOps, SRE, and platform engineering teams. We also touch on MCPs and AI agents, and what “leveling up” looks like for companies that want to move faster without breaking everything.</p><p>If you’re into DevOps culture, SRE practices, platform engineering, CI/CD, infrastructure automation, and how teams should think about reliability and security as things keep changing, this one should land.</p><p><strong>Guest</strong> Mazharul Islam (DevOps with Maz) UK-based DevOps Engineer. 
Posts a lot of hands-on content around cloud, DevOps fundamentals, and leveling up as an engineer.</p><p><strong>Links (Maz)</strong> YouTube: <a target="_blank" rel="noopener noreferrer nofollow" href="https://m.youtube.com/@devopswithmaz">https://m.youtube.com/@devopswithmaz</a> LinkedIn: <a target="_blank" rel="noopener noreferrer nofollow" href="https://www.linkedin.com/in/mazharul419">https://www.linkedin.com/in/mazharul419</a></p><p><strong>Topics we covered</strong></p><ul><li>WHY behind DevOps, and why “tools” is the smallest part of it</li><li>DevOps fundamentals vs tool-chasing</li><li>Upskilling strategies for DevOps Engineers and SREs</li><li>How to keep learning cloud and automation without drowning</li><li>What strong teams measure and what “good” actually looks like (delivery, reliability, feedback loops)</li><li>Agentic AI, AI agents in operations, and the next era of DevOps</li><li>MCPs, automation guardrails, and safe ways to scale change</li><li>How companies can “level up” their engineering org without turning it into chaos</li></ul><p>We also discussed the previous episode of Ship It Weekly: <strong>GitHub Runner Pricing Pause, Terraform Cloud Limits, and AI in CI</strong></p><p><a target="_blank" rel="noopener noreferrer nofollow" href="https://www.tellerstech.com/ship-it-weekly/github-runner-pricing-pause-terraform-cloud-limits-and-ai-in-ci/">https://www.tellerstech.com/ship-it-weekly/github-runner-pricing-pause-terraform-cloud-limits-and-ai-in-ci/</a></p><p><strong>Book Maz recommended</strong> The DevOps Handbook: How to Create World-Class Agility, Reliability, and Security in Technology Organizations (Paperback, Oct 6, 2016) Gene Kim, Jez Humble, Patrick Debois, John Willis</p><p><strong>About Ship It Weekly (format)</strong> Ship It Weekly is for people running infrastructure and owning reliability. Most episodes are quick weekly news recaps for DevOps, SRE, and platform engineering. 
In between those weekly drops, I’ll publish interview episodes like this one.</p><p><strong>Subscribe / help the show</strong> If you want the weekly DevOps news recaps plus these interviews, hit follow or subscribe in your podcast app. And if you’re feeling generous, leave a rating or review and share this episode with a coworker (especially your on-call buddy). That stuff genuinely helps the show get discovered.</p>
30 MIN
GitHub Runner Pricing Pause, Terraform Cloud Limits, and AI in CI
DEC 20, 2025
<p>This week on <strong>Ship It Weekly</strong>, Brian looks at how the “platform tax” is showing up everywhere: pricing model shifts, CI dependencies, and new security boundaries thanks to AI agents.</p><p>We start with GitHub Actions. GitHub announced a new “cloud platform” charge for self-hosted runners in private/internal repos… then hit pause after backlash. Hosted runner price reductions for 2026 are still planned. We also got the perfect timing joke: a GitHub incident the same week.</p><p>Next up is HashiCorp. Legacy HCP Terraform (Terraform Cloud) Free is reaching end-of-life in 2026, with orgs moving to the newer Free tier capped at 500 managed resources. If you’re running real infrastructure, this is a good moment to audit what you’re actually managing and decide whether you’re cleaning up, paying, or planning a migration.</p><p>Then we talk PromptPwnd: why stuffing untrusted PR/issue text into AI agent prompts (inside CI) can turn into a supply chain/security problem. The short version: treat AI inputs like hostile user input, keep tokens/permissions minimal, and don’t let agents “run with scissors.”</p><p>We also cover the Home Depot report about long-lived access exposure as a reminder that secrets hygiene, blast radius, and detection still matter more than the shiny tools.</p><p>In the lightning round: CDKTF is sunset/archived, Bitbucket is cleaning up free unused workspaces, and SourceHut is proposing pricing changes. 
We wrap with a human note on “platform whiplash” and why a simple watchlist beats carrying all this stuff in your head.</p><p><strong>Links from this episode</strong></p><p>GitHub Actions pricing + pause <a target="_blank" rel="noopener noreferrer nofollow" href="https://runs-on.com/blog/github-self-hosted-runner-fee-2026/">https://runs-on.com/blog/github-self-hosted-runner-fee-2026/</a> <a target="_blank" rel="noopener noreferrer nofollow" href="https://x.com/github/status/2001372894882918548">https://x.com/github/status/2001372894882918548</a> <a target="_blank" rel="noopener noreferrer nofollow" href="https://www.githubstatus.com/incidents/x696x0g4t85l">https://www.githubstatus.com/incidents/x696x0g4t85l</a></p><p>HashiCorp / Terraform Cloud free plan changes <a target="_blank" rel="noopener noreferrer nofollow" href="https://github.com/hashicorp/terraform-cdk?tab=readme-ov-file#sunset-notice">https://github.com/hashicorp/terraform-cdk?tab=readme-ov-file#sunset-notice</a> <a target="_blank" rel="noopener noreferrer nofollow" href="https://www.reddit.com/r/Terraform/s/slYm77wzYr">https://www.reddit.com/r/Terraform/s/slYm77wzYr</a></p><p>PromptPwnd / AI agents in CI <a target="_blank" rel="noopener noreferrer nofollow" href="https://www.aikido.dev/blog/promptpwnd-github-actions-ai-agents">https://www.aikido.dev/blog/promptpwnd-github-actions-ai-agents</a></p><p>Home Depot access exposure report <a target="_blank" rel="noopener noreferrer nofollow" href="https://techcrunch.com/2025/12/12/home-depot-exposed-access-to-internal-systems-for-a-year-says-researcher/">https://techcrunch.com/2025/12/12/home-depot-exposed-access-to-internal-systems-for-a-year-says-researcher/</a></p><p>Bitbucket cleanup <a target="_blank" rel="noopener noreferrer nofollow" 
href="https://community.atlassian.com/forums/Bitbucket-articles/Bitbucket-cleanup-of-free-unused-workspaces-what-you-need-to/ba-p/3144063">https://community.atlassian.com/forums/Bitbucket-articles/Bitbucket-cleanup-of-free-unused-workspaces-what-you-need-to/ba-p/3144063</a></p><p>SourceHut pricing proposal <a target="_blank" rel="noopener noreferrer nofollow" href="https://sourcehut.org/blog/2025-12-01-proposed-pricing-changes/">https://sourcehut.org/blog/2025-12-01-proposed-pricing-changes/</a></p>
12 MIN
IBM Buys Confluent, React2Shell, and Netflix on Aurora
DEC 12, 2025
<p>In this episode of <strong>Ship It Weekly</strong>, Brian powers through a cold and digs into a very “infra grown-up” week in DevOps.</p><p>First up, IBM is buying Confluent for $11B. We talk about what that means if you’re on Confluent Cloud today, still running your own Kafka, or trying to choose between Confluent, MSK, and DIY. It’s part of a bigger pattern after IBM’s HashiCorp deal, and it has real implications for vendor concentration and “plan B” strategies.</p><p>Then we shift to React2Shell, a 10.0 RCE in React Server Components that’s already being exploited in the wild. Even if you never touch React, if you run platforms or Kubernetes for teams using Next.js or RSC, you’re on the hook for patching windows, WAF rules, and blast-radius thinking.</p><p>We also look at Netflix’s write-up on consolidating relational databases onto Aurora PostgreSQL, with big performance gains and cost savings. It’s a good excuse to step back and ask whether your own Postgres fleet still makes sense at the scale you’re at now.</p><p>In the lightning round, we hit OpenTofu 1.11’s new language features, practical Terraform “tips from the trenches,” Ghostty becoming a non-profit project, and two spec-driven dev tools (Spec Kit and OpenSpec) that show what sane AI-assisted development might look like.</p><p>For the human side, we close with “Your Brain on Incidents” and what high-stress outages actually do to people, plus a few concrete ideas for making on-call less brutal.</p><p>If you’re on a platform team, own SLOs, or you’re the person people ping when “something is wrong with prod,” this one should give you a mix of immediate to-dos and longer-term questions for your roadmap.</p><p><strong>Links:</strong></p><p>IBM + Confluent <a target="_blank" rel="noopener noreferrer nofollow" href="https://www.confluent.io/blog/ibm-to-acquire-confluent/">https://www.confluent.io/blog/ibm-to-acquire-confluent/</a> <a target="_blank" rel="noopener noreferrer nofollow" 
href="https://newsroom.ibm.com/2025-12-08-ibm-to-acquire-confluent-to-create-smart-data-platform-for-enterprise-generative-ai">https://newsroom.ibm.com/2025-12-08-ibm-to-acquire-confluent-to-create-smart-data-platform-for-enterprise-generative-ai</a></p><p>React2Shell (CVE-2025-55182) <a target="_blank" rel="noopener noreferrer nofollow" href="https://react.dev/blog/2025/12/03/critical-security-vulnerability-in-react-server-components">https://react.dev/blog/2025/12/03/critical-security-vulnerability-in-react-server-components</a></p><p>Netflix on Aurora PostgreSQL <a target="_blank" rel="noopener noreferrer nofollow" href="https://aws.amazon.com/blogs/database/netflix-consolidates-relational-database-infrastructure-on-amazon-aurora-achieving-up-to-75-improved-performance/">https://aws.amazon.com/blogs/database/netflix-consolidates-relational-database-infrastructure-on-amazon-aurora-achieving-up-to-75-improved-performance/</a></p><p>Tools &amp; tips <a target="_blank" rel="noopener noreferrer nofollow" href="https://opentofu.org/blog/opentofu-1-11-0/">https://opentofu.org/blog/opentofu-1-11-0/</a> <a target="_blank" rel="noopener noreferrer nofollow" href="https://rosesecurity.dev/2025/12/04/terraform-tips-and-tricks.html">https://rosesecurity.dev/2025/12/04/terraform-tips-and-tricks.html</a> <a target="_blank" rel="noopener noreferrer nofollow" href="https://mitchellh.com/writing/ghostty-non-profit">https://mitchellh.com/writing/ghostty-non-profit</a> <a target="_blank" rel="noopener noreferrer nofollow" href="https://github.com/github/spec-kit">https://github.com/github/spec-kit</a> <a target="_blank" rel="noopener noreferrer nofollow" href="https://github.com/Fission-AI/OpenSpec">https://github.com/Fission-AI/OpenSpec</a></p><p>Human side <a target="_blank" rel="noopener noreferrer nofollow" href="https://uptimelabs.io/your-brain-on-incidents/">https://uptimelabs.io/your-brain-on-incidents/</a></p>
16 MIN
AWS re:Invent for Platform Teams, GKE at 130k Nodes, and Killing Staging
DEC 4, 2025
<p>In this episode of <strong>Ship It Weekly</strong>, Brian looks at re:Invent through a platform/SRE lens and pulls out the updates that actually change how you design and run systems.</p><p>We talk about regional NAT Gateways and Route 53 Global Resolver on the networking side, ECS Express Mode and EKS Capabilities as new paved roads for app teams, S3 Vectors GA and 50 TB S3 objects for AI and data lakes, Aurora PostgreSQL dynamic data masking, CodeCommit’s return to full GA, and IAM Policy Autopilot for AI-assisted IAM policies. This was recorded mid–re:Invent, so consider it a “what matters so far” pass, not a full recap.</p><p>Outside AWS, we get into Google’s 130,000-node GKE cluster and what actually applies if you’re running normal-sized clusters, plus the “It’s time to kill staging” argument and what responsible testing in production looks like with feature flags, progressive delivery, and solid observability.</p><p>In the lightning round, we hit Zachary Loeber’s Terraform MCP server and terraform-ingest (letting AI tools speak your real Terraform modules), Runs-On’s EC2 instance rankings so you stop picking instance types by vibes, and Airbnb’s adaptive traffic management for their key-value store. 
We close with Nolan Lawson’s “The fate of small open source” and what it means when your platform quietly depends on one-maintainer libraries.</p><p><strong>Links from this episode:</strong></p><p><strong>AWS highlights:</strong></p><p><a target="_blank" rel="noopener noreferrer nofollow" href="https://aws.amazon.com/about-aws/whats-new/2025/11/aws-nat-gateway-regional-availability/">https://aws.amazon.com/about-aws/whats-new/2025/11/aws-nat-gateway-regional-availability</a></p><p><a target="_blank" rel="noopener noreferrer nofollow" href="https://aws.amazon.com/blogs/aws/introducing-amazon-route-53-global-resolver-for-secure-anycast-dns-resolution-preview/">https://aws.amazon.com/blogs/aws/introducing-amazon-route-53-global-resolver-for-secure-anycast-dns-resolution-preview</a></p><p><a target="_blank" rel="noopener noreferrer nofollow" href="https://aws.amazon.com/about-aws/whats-new/2025/11/announcing-amazon-ecs-express-mode/">https://aws.amazon.com/about-aws/whats-new/2025/11/announcing-amazon-ecs-express-mode</a></p><p><a target="_blank" rel="noopener noreferrer nofollow" href="https://aws.amazon.com/about-aws/whats-new/2025/12/amazon-s3-vectors-generally-available/">https://aws.amazon.com/about-aws/whats-new/2025/12/amazon-s3-vectors-generally-available/</a></p><p><strong>Other topics:</strong></p><p><a target="_blank" rel="noopener noreferrer nofollow" href="https://cloud.google.com/blog/products/containers-kubernetes/how-we-built-a-130000-node-gke-cluster">https://cloud.google.com/blog/products/containers-kubernetes/how-we-built-a-130000-node-gke-cluster</a></p><p><a target="_blank" rel="noopener noreferrer nofollow" href="https://thenewstack.io/its-time-to-kill-staging-the-case-for-testing-in-production/">https://thenewstack.io/its-time-to-kill-staging-the-case-for-testing-in-production/</a></p><p><a target="_blank" rel="noopener noreferrer nofollow" 
href="https://blog.zacharyloeber.com/article/terraform-custom-module-mcp-server/">https://blog.zacharyloeber.com/article/terraform-custom-module-mcp-server/</a></p><p><a target="_blank" rel="noopener noreferrer nofollow" href="https://go.runs-on.com/instances/ranking">https://go.runs-on.com/instances/ranking</a></p><p><a target="_blank" rel="noopener noreferrer nofollow" href="https://medium.com/airbnb-engineering/from-static-rate-limiting-to-adaptive-traffic-management-in-airbnbs-key-value-store-29362764e5c2">https://medium.com/airbnb-engineering/from-static-rate-limiting-to-adaptive-traffic-management-in-airbnbs-key-value-store-29362764e5c2</a></p><p><a target="_blank" rel="noopener noreferrer nofollow" href="https://nolanlawson.com/2025/11/16/the-fate-of-small-open-source/">https://nolanlawson.com/2025/11/16/the-fate-of-small-open-source/</a></p>
22 MIN
Kubernetes Config Reality Check, EKS Control Planes, and GitHub Guardrails
NOV 26, 2025
<p>In this episode of <strong>Ship It Weekly</strong>, Brian digs into what’s new for people actually running infra: Kubernetes config, EKS control planes and networking, and GitHub’s latest CI/CD and Copilot updates.</p><p>We start with Kubernetes’ new configuration good practices post and how to turn it into a checklist to clean up Helm/Kustomize and kill off “hotfix from my laptop” manifests.</p><p>Then we hit AWS: EKS Provisioned Control Plane to size control plane capacity for big or noisy clusters, plus new network observability so you can see who’s talking to what across clusters and AZs instead of guessing from node metrics.</p><p>On the GitHub side, Actions OIDC tokens now include a check_run_id for tighter access control, and Copilot adds instructions files and custom agents so you can encode platform and security expectations directly into reviews and workflows.</p><p>In the lightning round, we touch on Terrascan being archived, Microsoft’s write-up of a 15.72 Tbps Aisuru DDoS attack against Azure, and AWS flat-rate CloudFront plans that bundle CDN and security into more predictable pricing.</p><p>We close with Lorin Hochstein’s “Two thought experiments” and what it looks like to write incident reports as if an AI (and your future teammates) will rely on them to debug the next outage.</p><p>If you run Kubernetes in prod, this one should give you a few concrete ideas for your roadmap.</p><p><strong>Links from this episode</strong></p><p><a target="_blank" rel="noopener noreferrer nofollow" href="https://kubernetes.io/blog/2025/11/25/configuration-good-practices/">https://kubernetes.io/blog/2025/11/25/configuration-good-practices/</a></p><p><a target="_blank" rel="noopener noreferrer nofollow" href="https://aws.amazon.com/about-aws/whats-new/2025/11/amazon-eks-provisioned-control-plane/">https://aws.amazon.com/about-aws/whats-new/2025/11/amazon-eks-provisioned-control-plane/</a></p><p><a target="_blank" rel="noopener noreferrer nofollow" 
href="https://aws.amazon.com/blogs/aws/monitor-network-performance-and-traffic-across-your-eks-clusters-with-container-network-observability/">https://aws.amazon.com/blogs/aws/monitor-network-performance-and-traffic-across-your-eks-clusters-with-container-network-observability/</a></p><p><a target="_blank" rel="noopener noreferrer nofollow" href="https://github.blog/changelog/2025-11-13-github-actions-oidc-token-claims-now-include-check_run_id/">https://github.blog/changelog/2025-11-13-github-actions-oidc-token-claims-now-include-check_run_id/</a></p><p><a target="_blank" rel="noopener noreferrer nofollow" href="https://github.blog/ai-and-ml/unlocking-the-full-power-of-copilot-code-review-master-your-instructions-files/">https://github.blog/ai-and-ml/unlocking-the-full-power-of-copilot-code-review-master-your-instructions-files/</a></p><p><a target="_blank" rel="noopener noreferrer nofollow" href="https://docs.github.com/en/copilot/how-tos/use-copilot-agents/coding-agent/create-custom-agents">https://docs.github.com/en/copilot/how-tos/use-copilot-agents/coding-agent/create-custom-agents</a></p><p><strong>Lightning Round</strong></p><p><a target="_blank" rel="noopener noreferrer nofollow" href="https://github.com/tenable/terrascan">https://github.com/tenable/terrascan</a></p><p><a target="_blank" rel="noopener noreferrer nofollow" href="https://www.bleepingcomputer.com/news/microsoft/microsoft-aisuru-botnet-used-500-000-ips-in-15-tbps-azure-ddos-attack/">https://www.bleepingcomputer.com/news/microsoft/microsoft-aisuru-botnet-used-500-000-ips-in-15-tbps-azure-ddos-attack/</a></p><p><a target="_blank" rel="noopener noreferrer nofollow" href="https://aws.amazon.com/about-aws/whats-new/2025/11/aws-flat-rate-pricing-plans/">https://aws.amazon.com/about-aws/whats-new/2025/11/aws-flat-rate-pricing-plans/</a></p><p><a target="_blank" rel="noopener noreferrer nofollow" href="https://sreweekly.com/sre-weekly-issue-498/">https://sreweekly.com/sre-weekly-issue-498/</a> (Lorin's 
Article)</p>
16 MIN