Ship It Weekly - DevOps, SRE, Platform and Cloud Engineering News

Teller's Tech - DevOps, SRE and Cloud Podcast

Details

Ship It Weekly is a short, practical recap of what actually matters in DevOps, SRE, cloud infrastructure, and platform engineering.

Each episode, your host Brian Teller walks through the latest outages, releases, tools, and incident writeups, then translates them into “here’s what this means for your systems” instead of just reading headlines. Expect a couple of main stories with context, a quick hit of tools or releases worth bookmarking, and the occasional segment on on-call, burnout, or team culture.

This isn’t a certification prep show or a lab walkthrough. It’s aimed at people who are already working in the space and want to stay sharp without scrolling status pages, cloud updates, and blogs all week. You’ll hear about things like cloud provider incidents, Kubernetes and platform trends, Terraform and infrastructure changes, and real postmortems that are actually worth your time.

Most episodes are 10–25 minutes, so you can catch up on the way to work or between meetings. Every now and then there will be a “special” focused on a big outage or a specific theme, but the default format is simple: what happened, why it matters, and what you might want to do about it in your own environment.

If you’re the person people DM when something is broken in prod, or you’re building the cloud and platform everyone else ships on top of, Ship It Weekly is meant to be in your rotation.

Recent Episodes

Cursor Deletes PocketOS Prod DB, .de DNSSEC Outage, Bluesky Postmortem, Argo CD, and Copy Fail
MAY 8, 2026
<p>This episode of <strong>Ship It Weekly</strong> is about modern reliability getting squeezed from both directions. Old-school failures still hit hard, like broken DNSSEC, kernel privilege escalation bugs, and GitOps behavior changes. But newer automation layers add a second kind of risk, where AI agents, machine identity, and cloud control planes can do real damage fast when authority is too broad. Brian covers the Cursor and PocketOS production database wipe, the .de DNSSEC outage and Cloudflare’s response, Bluesky’s April outage postmortem, Argo CD v3.1.16 reaching end of life plus the v3.4.1 behavior change, Linux kernel CVE-2026-31431 under active exploitation, and why Google Cloud Agent Identity and AWS MCP Server GA both point to agents becoming first-class infrastructure actors.</p><p></p><p><strong>Sponsored by Guardsquare </strong><a target="_blank" rel="noopener noreferrer nofollow" href="https://hubs.ly/Q04fJgkJ0"><strong>https://hubs.ly/Q04fJgkJ0</strong></a></p><p></p><p><strong>Links</strong></p><p>Cursor / PocketOS production database wipe <a target="_blank" rel="noopener noreferrer nofollow" href="https://www.tellerstech.com/on-call-brief/2026-W19/">https://www.tellerstech.com/on-call-brief/2026-W19/</a></p><p>Cloudflare on the .de DNSSEC outage <a target="_blank" rel="noopener noreferrer nofollow" href="https://blog.cloudflare.com/de-tld-outage-dnssec/">https://blog.cloudflare.com/de-tld-outage-dnssec/</a></p><p>Bluesky April 2026 outage postmortem <a target="_blank" rel="noopener noreferrer nofollow" href="https://pckt.blog/b/jcalabro/april-2026-outage-post-mortem-219ebg2">https://pckt.blog/b/jcalabro/april-2026-outage-post-mortem-219ebg2</a></p><p>Argo CD releases: v3.1.16 final release and v3.4.1 behavior change <a target="_blank" rel="noopener noreferrer nofollow" href="https://github.com/argoproj/argo-cd/releases">https://github.com/argoproj/argo-cd/releases</a></p><p>Linux kernel CVE-2026-31431 <a target="_blank" rel="noopener noreferrer 
nofollow" href="https://nvd.nist.gov/vuln/detail/CVE-2026-31431">https://nvd.nist.gov/vuln/detail/CVE-2026-31431</a></p><p>AWS bulletin for CVE-2026-31431 <a target="_blank" rel="noopener noreferrer nofollow" href="https://aws.amazon.com/security/security-bulletins/rss/2026-026-aws/">https://aws.amazon.com/security/security-bulletins/rss/2026-026-aws/</a></p><p>Google Cloud Agent Identity <a target="_blank" rel="noopener noreferrer nofollow" href="https://cloud.google.com/blog/products/identity-security/whats-new-in-iam-security-governance-and-runtime-defense">https://cloud.google.com/blog/products/identity-security/whats-new-in-iam-security-governance-and-runtime-defense</a></p><p>AWS MCP Server is now generally available <a target="_blank" rel="noopener noreferrer nofollow" href="https://aws.amazon.com/blogs/aws/the-aws-mcp-server-is-now-generally-available/">https://aws.amazon.com/blogs/aws/the-aws-mcp-server-is-now-generally-available/</a></p><p>Cross-region disaster recovery for Amazon EKS using AWS Backup <a target="_blank" rel="noopener noreferrer nofollow" href="https://aws.amazon.com/blogs/containers/cross-region-disaster-recovery-for-amazon-eks-using-aws-backup/">https://aws.amazon.com/blogs/containers/cross-region-disaster-recovery-for-amazon-eks-using-aws-backup/</a></p><p>Google Ads new data retention policy starting June 1, 2026 <a target="_blank" rel="noopener noreferrer nofollow" href="https://ads-developers.googleblog.com/2026/05/new-data-retention-policy-for-google.html">https://ads-developers.googleblog.com/2026/05/new-data-retention-policy-for-google.html</a></p><p>This week’s On Call Brief <a target="_blank" rel="noopener noreferrer nofollow" href="https://www.tellerstech.com/on-call-brief/2026-W19/">https://www.tellerstech.com/on-call-brief/2026-W19/</a></p><p>More episodes and show notes <a target="_blank" rel="noopener noreferrer nofollow" href="https://shipitweekly.fm/">https://shipitweekly.fm/</a></p>
21 MIN
Ship It Conversations: Gareth Kersey on IaCConf 2026, AI, and Corey Quinn’s Terraform Keynote
MAY 5, 2026
<p>This is a guest conversation episode of <strong>Ship It Weekly</strong>, separate from the weekly news recaps.</p><p>This episode is not sponsored. I wanted to cover IaCConf because the theme lines up closely with what Ship It Weekly focuses on: infrastructure, platform engineering, DevOps, SRE, and how teams are adapting to AI-driven change.</p><p>In this Ship It: Conversations episode, I talk with Gareth Kersey about IaCConf 2026, a free virtual conference focused on infrastructure as code, platform engineering, DevOps, SRE, and infrastructure operations. The conference takes place on May 14, 2026.</p><p>The main theme is “keeping pace.” Not just keeping pace with new tools, but keeping pace with the speed of software delivery now that AI is changing how quickly application teams can write, ship, and change code.</p><p>We talk about what that means for the infrastructure teams underneath it all: the people responsible for Terraform, Kubernetes, GitOps, policies, secrets, cost, security, rollback paths, and making sure faster delivery does not turn into faster chaos.</p><p>Gareth walks through the IaCConf 2026 agenda, including Corey Quinn’s keynote, AI and Terraform sessions, platform engineering panels, Kubernetes and Argo CD talks, AI agents managing infrastructure as code, governance challenges, and the risk of 10x code velocity becoming 10x operational risk.</p><p>The bigger theme here is that AI is not just changing how code gets written. It is changing the pressure on the systems around delivery. 
Infrastructure as code, platform engineering, policy, and operational guardrails matter even more when the pace of change goes up.</p><p><strong>Highlights</strong></p><p>• What “keeping pace” means for infrastructure, DevOps, SRE, and platform teams</p><p>• Why faster application development can create more downstream operational pressure</p><p>• Corey Quinn’s keynote, “AI Speaks Terraform Like a Tourist”</p><p>• How AI-generated infrastructure changes create new governance and review challenges</p><p>• Why infrastructure as code still matters as AI agents and automation become more common</p><p>• Sessions covering Terraform, Kubernetes, Argo CD, GitOps, platform engineering, and AI-driven workflows</p><p>• The risk of 10x code velocity turning into 10x operational risk</p><p>• How platform teams can support faster developers without giving up safety or governance</p><p>• Why IaCConf includes panels, demos, technical talks, and practitioner stories instead of only tool-specific content</p><p>• How IaCConf has grown from its first event in 2025 into a broader infrastructure community</p><p>• Why the event is trying to stay community-focused instead of becoming just another vendor marketing conference</p><p>• The role of feedback, future spotlight events, in-person meetups, and possible community spaces around IaCConf</p><p>• Why registering still makes sense even if you cannot attend live, since sessions are available afterward</p><p><strong>IaCConf links</strong></p><p>• IaCConf 2026 registration page - <a target="_blank" rel="noopener noreferrer nofollow" href="https://www.iacconf.com/iacconf-2026">https://www.iacconf.com/iacconf-2026</a></p><p>• IaCConf LinkedIn page - <a target="_blank" rel="noopener noreferrer nofollow" href="https://www.linkedin.com/showcase/iac-conf/">https://www.linkedin.com/showcase/iac-conf/</a></p><p>• IaCConf: <a target="_blank" rel="noopener noreferrer nofollow" href="https://www.iacconf.com/">https://www.iacconf.com/</a></p><p>• 
IaCConf is supported by Spacelift: <a target="_blank" rel="noopener noreferrer nofollow" href="https://spacelift.com">https://spacelift.com</a></p><p><strong>Our links</strong></p><p>More episodes + show notes + links: <a target="_blank" rel="noopener noreferrer nofollow" href="https://shipitweekly.fm">https://shipitweekly.fm</a></p><p>On Call Brief: <a target="_blank" rel="noopener noreferrer nofollow" href="https://oncallbrief.com">https://oncallbrief.com</a></p>
31 MIN
GitHub RCE, AI Agent Prompt Injection, and the New Reality: Your Developer Toolchain Is Production Now
MAY 1, 2026
<p>This episode of <strong>Ship It Weekly</strong> is about the developer toolchain becoming part of production. Brian covers GitHub’s critical git push RCE, AI-assisted reverse engineering, prompt injection against AI agents in GitHub workflows, Elementary’s malicious CLI release, GitHub’s merge queue regression, Cal.com going closed source, and Copilot moving toward usage-based billing. Plus: MinIO’s repo archive, Ghostty leaving GitHub, Docker Hardened Images, and Azure DevOps security updates.</p><p><strong>Links</strong></p><p>GitHub git push RCE <a target="_blank" rel="noopener noreferrer nofollow" href="https://github.blog/security/securing-the-git-push-pipeline-responding-to-a-critical-remote-code-execution-vulnerability/">https://github.blog/security/securing-the-git-push-pipeline-responding-to-a-critical-remote-code-execution-vulnerability/</a></p><p>AI-assisted reverse engineering <a target="_blank" rel="noopener noreferrer nofollow" href="https://www.darkreading.com/application-security/reverse-engineering-ai-unearths-high-severity-github-bug">https://www.darkreading.com/application-security/reverse-engineering-ai-unearths-high-severity-github-bug</a></p><p>AI agents + GitHub Actions prompt injection <a target="_blank" rel="noopener noreferrer nofollow" href="https://www.theregister.com/2026/04/15/claude_gemini_copilot_agents_hijacked/">https://www.theregister.com/2026/04/15/claude_gemini_copilot_agents_hijacked/</a></p><p>Elementary malicious CLI release <a target="_blank" rel="noopener noreferrer nofollow" href="https://www.elementary-data.com/post/security-incident-report-malicious-release-of-elementary-oss-python-cli-v0-23-3">https://www.elementary-data.com/post/security-incident-report-malicious-release-of-elementary-oss-python-cli-v0-23-3</a></p><p>GitHub merge queue regression <a target="_blank" rel="noopener noreferrer nofollow" 
href="https://github.blog/news-insights/company-news/an-update-on-github-availability/">https://github.blog/news-insights/company-news/an-update-on-github-availability/</a></p><p><a target="_blank" rel="noopener noreferrer nofollow" href="http://Cal.com">Cal.com</a> going closed source <a target="_blank" rel="noopener noreferrer nofollow" href="https://cal.com/blog/cal-com-goes-closed-source-why">https://cal.com/blog/cal-com-goes-closed-source-why</a></p><p>GitHub Copilot billing <a target="_blank" rel="noopener noreferrer nofollow" href="https://github.blog/news-insights/company-news/github-copilot-is-moving-to-usage-based-billing/">https://github.blog/news-insights/company-news/github-copilot-is-moving-to-usage-based-billing/</a></p><p>MinIO archived repo <a target="_blank" rel="noopener noreferrer nofollow" href="https://github.com/minio/minio">https://github.com/minio/minio</a></p><p>Ghostty leaving GitHub <a target="_blank" rel="noopener noreferrer nofollow" href="https://mitchellh.com/writing/ghostty-leaving-github">https://mitchellh.com/writing/ghostty-leaving-github</a></p><p>Docker Hardened Images <a target="_blank" rel="noopener noreferrer nofollow" href="https://www.docker.com/blog/why-we-chose-the-harder-path-docker-hardened-images-one-year-later/">https://www.docker.com/blog/why-we-chose-the-harder-path-docker-hardened-images-one-year-later/</a></p><p>Azure DevOps security updates <a target="_blank" rel="noopener noreferrer nofollow" href="https://devblogs.microsoft.com/devops/one-click-security-scanning-and-org-wide-alert-triage-come-to-advanced-security/">https://devblogs.microsoft.com/devops/one-click-security-scanning-and-org-wide-alert-triage-come-to-advanced-security/</a></p><p>On Call Brief <a target="_blank" rel="noopener noreferrer nofollow" href="https://oncallbrief.com/">https://oncallbrief.com/</a></p><p>More episodes <a target="_blank" rel="noopener noreferrer nofollow" href="https://shipitweekly.fm/">https://shipitweekly.fm/</a></p>
25 MIN
Kubernetes 1.36, Gateway API v1.5, AWS Copilot End of Support, and Cloudflare Non-Human Identities
APR 24, 2026
<p>This episode of <strong>Ship It Weekly</strong> is about platforms getting sharper about defaults, ownership, and the old paths they are no longer willing to quietly carry forever. Brian covers Kubernetes 1.36 and why it feels more like a cleanup-and-maturity release than a flashy feature dump, Gateway API v1.5 moving more networking behavior into the stable path, AWS Copilot CLI reaching end of support and what that means for teams still sitting on the older “easy” ECS workflow, Airbnb’s alert-development overhaul and why noisy or weak alerts are often a workflow problem long before they become an on-call problem, and Cloudflare’s push to treat scripts, agents, and third-party tools like real identities with real blast radius. He also hits the latest Azure DevOps Server patches and Google’s OTLP metrics support for Cloud Monitoring.</p><p><strong>Links</strong></p><p>Kubernetes v1.36 release <a target="_blank" rel="noopener noreferrer nofollow" href="https://kubernetes.io/blog/2026/04/22/kubernetes-v1-36-release/">https://kubernetes.io/blog/2026/04/22/kubernetes-v1-36-release/</a></p><p>Gateway API v1.5 <a target="_blank" rel="noopener noreferrer nofollow" href="https://kubernetes.io/blog/2026/04/21/gateway-api-v1-5/">https://kubernetes.io/blog/2026/04/21/gateway-api-v1-5/</a></p><p>AWS Copilot CLI end of support <a target="_blank" rel="noopener noreferrer nofollow" href="https://aws.amazon.com/blogs/containers/announcing-the-end-of-support-for-the-aws-copilot-cli/">https://aws.amazon.com/blogs/containers/announcing-the-end-of-support-for-the-aws-copilot-cli/</a></p><p>Airbnb on alert development <a target="_blank" rel="noopener noreferrer nofollow" href="https://medium.com/airbnb-engineering/it-wasnt-a-culture-problem-upleveling-alert-development-at-airbnb-01e2290eb0f5">https://medium.com/airbnb-engineering/it-wasnt-a-culture-problem-upleveling-alert-development-at-airbnb-01e2290eb0f5</a></p><p>Cloudflare on non-human identities, OAuth visibility, and scoped 
permissions <a target="_blank" rel="noopener noreferrer nofollow" href="https://blog.cloudflare.com/improved-developer-security/">https://blog.cloudflare.com/improved-developer-security/</a></p><p>Azure DevOps Server April patches <a target="_blank" rel="noopener noreferrer nofollow" href="https://devblogs.microsoft.com/devops/april-patches-for-azure-devops-server/">https://devblogs.microsoft.com/devops/april-patches-for-azure-devops-server/</a></p><p>OTLP metrics for Google Cloud Monitoring <a target="_blank" rel="noopener noreferrer nofollow" href="https://cloud.google.com/blog/products/management-tools/otlp-opentelemetry-protocol-for-google-cloud-monitoring-metrics">https://cloud.google.com/blog/products/management-tools/otlp-opentelemetry-protocol-for-google-cloud-monitoring-metrics</a></p><p>Past episode where we talked about Cloudflare Mesh <a target="_blank" rel="noopener noreferrer nofollow" href="https://www.tellerstech.com/ship-it-weekly/aws-interconnect-ga-cloudflare-mesh-gitlab-19-eks-auto-mode-and-opentelemetry-config/">https://www.tellerstech.com/ship-it-weekly/aws-interconnect-ga-cloudflare-mesh-gitlab-19-eks-auto-mode-and-opentelemetry-config/</a></p><p>This week’s On Call Brief <a target="_blank" rel="noopener noreferrer nofollow" href="https://www.tellerstech.com/on-call-brief/2026-W16/">https://www.tellerstech.com/on-call-brief/2026-W16/</a></p><p>On Call Brief: <a target="_blank" rel="noopener noreferrer nofollow" href="https://oncallbrief.com/">https://oncallbrief.com/</a></p><p>More episodes and show notes <a target="_blank" rel="noopener noreferrer nofollow" href="https://shipitweekly.fm/">https://shipitweekly.fm/</a></p>
20 MIN
Ship It Conversations: Stephane Moser on Pipedrive’s Jenkins-to-GitHub Actions Migration, Argo CD, and CI/CD at Scale
APR 19, 2026
<p>This is a guest conversation episode of <strong>Ship It Weekly</strong>, separate from the weekly news recaps.</p><p>In this Ship It: Conversations episode, I talk with Stephane Moser about Pipedrive’s move from Jenkins to GitHub Actions, building self-hosted runners on Kubernetes, shifting deployments toward GitOps with Argo CD, and what it actually takes to roll out a big CI/CD change across a large engineering org.</p><p>We talk about why Jenkins had become painful, from Groovy friction to noisy-neighbor problems on shared VMs, why GitHub Actions fit better, how reusable workflows and custom actions helped, why Argo CD beat out Flux for their use case, and how they had to build better observability and internal deployment visibility around GitHub as they scaled.</p><p>The bigger theme here is that this was not just a tooling swap. It was a product and platform migration. Isolation, repeatability, self-service, rollout strategy, and observability mattered just as much as the actual CI/CD tools.</p><p><strong>Highlights</strong></p><p>• Why Jenkins stopped working well for them: Groovy friction, shared VM contention, and poor predictability </p><p>• Replacing CodeShip pull request validation first as the low-blast-radius starting point </p><p>• Using Actions Runner Controller on Kubernetes with EKS and Karpenter for self-hosted runners </p><p>• Why reusable workflows and custom actions helped cut repetition across hundreds of services </p><p>• Choosing Argo CD over Flux, Argo Workflows, Tekton, and even a short Spinnaker attempt </p><p>• Moving from push-based deploys toward GitOps for better isolation and safer credentials handling </p><p>• Building internal observability because GitHub’s workflow visibility was not enough at their scale </p><p>• Dogfooding first, then rolling migration out in batches until teams could self-serve the move </p><p>• What broke when the new system actually worked too well: bot-driven deploy volume, queueing, and fairness </p><p>• 
The mobile side of the story: Mac minis, unstable runners, GitHub-hosted runners, and a very different migration path </p><p>• How AI sped up parts of the mobile migration and troubleshooting, without making the migration trivial </p><p>• Stephane’s advice for big CI/CD shifts: start small, reduce blast radius, and use your own platform first</p><p><strong>Stephane’s links</strong></p><p>• LinkedIn: <a target="_blank" rel="noopener noreferrer nofollow" href="https://www.linkedin.com/in/moserss/">https://www.linkedin.com/in/moserss/</a> </p><p>• Talk video: <a target="_blank" rel="noopener noreferrer nofollow" href="https://www.youtube.com/watch?v=VrE1dh-1zEY">https://www.youtube.com/watch?v=VrE1dh-1zEY</a> </p><p>• Blog post Part 1: <a target="_blank" rel="noopener noreferrer nofollow" href="https://medium.com/pipedrive-engineering/so-long-jenkins-hello-github-actions-pipedrives-big-ci-cd-switch-03be29c75f63">https://medium.com/pipedrive-engineering/so-long-jenkins-hello-github-actions-pipedrives-big-ci-cd-switch-03be29c75f63</a> </p><p>• Blog post Part 2: <a target="_blank" rel="noopener noreferrer nofollow" href="https://medium.com/pipedrive-engineering/all-aboard-the-github-actions-express-pipedrives-big-ci-cd-switch-part-2-fcacf834afd2">https://medium.com/pipedrive-engineering/all-aboard-the-github-actions-express-pipedrives-big-ci-cd-switch-part-2-fcacf834afd2</a> </p><p>• GitHub: <a target="_blank" rel="noopener noreferrer nofollow" href="https://github.com/moser-ss">https://github.com/moser-ss</a></p><p><strong>Our links</strong></p><p>More episodes + show notes + links: <a target="_blank" rel="noopener noreferrer nofollow" href="https://shipitweekly.fm">https://shipitweekly.fm</a></p><p>On Call Brief: <a target="_blank" rel="noopener noreferrer nofollow" href="https://oncallbrief.com">https://oncallbrief.com</a></p>
51 MIN