Ship It Weekly - DevOps, SRE, Platform and Cloud Engineering News

Teller's Tech - DevOps, SRE and Cloud Podcast

Details

Ship It Weekly is a short, practical recap of what actually matters in DevOps, SRE, cloud infrastructure, and platform engineering.

Each episode, your host Brian Teller walks through the latest outages, releases, tools, and incident writeups, then translates them into “here’s what this means for your systems” instead of just reading headlines. Expect a couple of main stories with context, a quick hit of tools or releases worth bookmarking, and the occasional segment on on-call, burnout, or team culture.

This isn’t a certification prep show or a lab walkthrough. It’s aimed at people who are already working in the space and want to stay sharp without scrolling status pages, cloud updates, and blogs all week. You’ll hear about things like cloud provider incidents, Kubernetes and platform trends, Terraform and infrastructure changes, and real postmortems that are actually worth your time.

Most episodes are 15–30 minutes, so you can catch up on the way to work or between meetings. Every now and then there will be a “special” focused on a big outage or a specific theme, but the default format is simple: what happened, why it matters, and what you might want to do about it in your own environment.

If you’re the person people DM when something is broken in prod, or you’re building the cloud and platform everyone else ships on top of, Ship It Weekly is meant to be in your rotation.

Recent Episodes

Coinbase Outage, Meta AI Account Recovery, AWS AgentCore Code Injection, Apigee Tenant Isolation, and the Glue That Breaks Production
JUN 12, 2026
<p>This episode of <strong>Ship It Weekly</strong> is about the hidden glue holding production together.</p><p>Brian covers Coinbase’s May 7 outage postmortem, where an AWS us-east-1 cooling failure exposed the difference between being “multi-AZ” on paper and actually being able to recover when stateful, low-latency systems are tied to a failed zone.</p><p>Then he looks at Meta’s AI-assisted Instagram support issue and why account recovery is identity infrastructure, not just customer support. If AI can influence password resets, email changes, MFA resets, or account ownership flows, that workflow needs to be treated like a production control plane.</p><p>The episode also covers AWS AgentCore CLI CVE-2026-11393, where collaborator metadata could break out into generated Python code during agent import, and an Apigee cross-tenant issue from Google’s Apigee security bulletins that shows why tenant isolation has to be tested beyond the obvious happy path.</p><p><strong>Links</strong></p><p>Coinbase May 7 outage postmortem <a target="_blank" rel="noopener noreferrer nofollow" href="https://www.coinbase.com/blog/a-postmortem-of-our-may-7-2026-outage">https://www.coinbase.com/blog/a-postmortem-of-our-may-7-2026-outage</a></p><p>Meta AI support / Instagram account recovery reporting <a target="_blank" rel="noopener noreferrer nofollow" href="https://www.theverge.com/tech/945658/meta-ai-support-chatbot-exploit-instagram-accounts">https://www.theverge.com/tech/945658/meta-ai-support-chatbot-exploit-instagram-accounts</a></p><p>AWS AgentCore CLI CVE-2026-11393 <a target="_blank" rel="noopener noreferrer nofollow" href="https://aws.amazon.com/security/security-bulletins/2026-040-aws/">https://aws.amazon.com/security/security-bulletins/2026-040-aws/</a></p><p>AgentCore CLI GitHub advisory <a target="_blank" rel="noopener noreferrer nofollow" 
href="https://github.com/aws/agentcore-cli/security/advisories/GHSA-m4x6-gwgp-4pm7">https://github.com/aws/agentcore-cli/security/advisories/GHSA-m4x6-gwgp-4pm7</a></p><p>Google Apigee security bulletins <a target="_blank" rel="noopener noreferrer nofollow" href="https://docs.cloud.google.com/apigee/docs/security-bulletins/security-bulletins">https://docs.cloud.google.com/apigee/docs/security-bulletins/security-bulletins</a></p><p>Cloudflare real-time threat intel WAF rules <a target="_blank" rel="noopener noreferrer nofollow" href="https://blog.cloudflare.com/realtime-threat-intel-waf-rules/">https://blog.cloudflare.com/realtime-threat-intel-waf-rules/</a></p><p>AWS Lambda tenant isolation with event source mappings <a target="_blank" rel="noopener noreferrer nofollow" href="https://aws.amazon.com/blogs/compute/integrating-event-source-mappings-with-aws-lambda-tenant-isolation-mode/">https://aws.amazon.com/blogs/compute/integrating-event-source-mappings-with-aws-lambda-tenant-isolation-mode/</a></p><p>Amazon OpenSearch Serverless next generation <a target="_blank" rel="noopener noreferrer nofollow" href="https://aws.amazon.com/about-aws/whats-new/2026/05/amazon-opensearch-serverless-next-generation-generally-available/">https://aws.amazon.com/about-aws/whats-new/2026/05/amazon-opensearch-serverless-next-generation-generally-available/</a></p><p>GitHub Enterprise Managed Users IP allow list coverage <a target="_blank" rel="noopener noreferrer nofollow" href="https://github.blog/changelog/2026-06-08-ip-allow-list-coverage-for-emu-namespaces-in-general-availability/">https://github.blog/changelog/2026-06-08-ip-allow-list-coverage-for-emu-namespaces-in-general-availability/</a></p><p>This week’s On Call Brief <a target="_blank" rel="noopener noreferrer nofollow" href="https://www.tellerstech.com/on-call-brief-news/2026-W24/">https://www.tellerstech.com/on-call-brief-news/2026-W24/</a></p><p>More episodes and show notes <a target="_blank" rel="noopener noreferrer 
nofollow" href="https://shipitweekly.fm/">https://shipitweekly.fm/</a></p>
23 MIN
Kiro CLI Approval Bypass, Amazon Braket Pickle Risk, AWS Org Logging, KEDA Upgrades, and Automation’s Hidden Boundaries
JUN 5, 2026
<p>This episode of <strong>Ship It Weekly</strong> is about automation’s hidden boundaries. Brian covers Kiro CLI CVE-2026-9255, where piped stdin could act like user approval, Amazon Braket SDK CVE-2026-9291 and the very normal Python pickle risk hiding inside quantum job results, AWS Organizations finally emitting CloudTrail events when accounts join or leave an org, and KEDA updates that remind us autoscaling upgrades are production behavior changes.</p><p>The bigger thread this week is that automation does not remove boundaries. It moves them. Approval paths, trusted data, account membership, scaling signals, platform access, and AI-generated output all need clear ownership and visibility.</p><p>Brian also covers Kubernetes Dashboard being archived with Headlamp as the path forward, Google Cloud Remote MCP Server for AlloyDB, Apache Kafka 4.3.0, and Atlassian’s AI-native SDLC productivity claims.</p><p></p><p><strong>Sponsored by @Scale: Systems &amp; Reliability, happening June 25 at the Meydenbauer Center in Bellevue, Washington. 
Register at </strong><a target="_blank" rel="noopener noreferrer nofollow" href="https://bit.ly/4xd2FdG"><strong>https://bit.ly/4xd2FdG</strong></a></p><p></p><p><strong>Links</strong></p><p>Kiro CLI CVE-2026-9255 <a target="_blank" rel="noopener noreferrer nofollow" href="https://aws.amazon.com/security/security-bulletins/2026-035-aws/">https://aws.amazon.com/security/security-bulletins/2026-035-aws/</a></p><p>Amazon Braket SDK CVE-2026-9291 <a target="_blank" rel="noopener noreferrer nofollow" href="https://aws.amazon.com/security/security-bulletins/2026-036-aws/">https://aws.amazon.com/security/security-bulletins/2026-036-aws/</a></p><p>AWS Organizations CloudTrail account events <a target="_blank" rel="noopener noreferrer nofollow" href="https://aws.amazon.com/about-aws/whats-new/2026/05/aws-organizations-cloudtrail/">https://aws.amazon.com/about-aws/whats-new/2026/05/aws-organizations-cloudtrail/</a></p><p>KEDA v2.20.0 release <a target="_blank" rel="noopener noreferrer nofollow" href="https://github.com/kedacore/keda/releases/tag/v2.20.0">https://github.com/kedacore/keda/releases/tag/v2.20.0</a></p><p>KEDA v2.19.0 release <a target="_blank" rel="noopener noreferrer nofollow" href="https://github.com/kedacore/keda/releases/tag/v2.19.0">https://github.com/kedacore/keda/releases/tag/v2.19.0</a></p><p>Kubernetes Dashboard archived / Headlamp path forward <a target="_blank" rel="noopener noreferrer nofollow" href="https://kubernetes.io/blog/2026/06/04/dashboard-archived-what-now/">https://kubernetes.io/blog/2026/06/04/dashboard-archived-what-now/</a></p><p>Google Cloud Remote MCP Server for AlloyDB <a target="_blank" rel="noopener noreferrer nofollow" href="https://cloud.google.com/blog/products/databases/alloydb-remote-mcp-server-now-ga">https://cloud.google.com/blog/products/databases/alloydb-remote-mcp-server-now-ga</a></p><p>Apache Kafka 4.3.0 <a target="_blank" rel="noopener noreferrer nofollow" 
href="https://www.confluent.io/blog/apache-kafka-4-3-release-announcement/">https://www.confluent.io/blog/apache-kafka-4-3-release-announcement/</a></p><p>Atlassian AI-native SDLC productivity claims <a target="_blank" rel="noopener noreferrer nofollow" href="https://www.atlassian.com/blog/software-teams/ai-native-sdlc">https://www.atlassian.com/blog/software-teams/ai-native-sdlc</a></p><p>This week’s On Call Brief <a target="_blank" rel="noopener noreferrer nofollow" href="https://www.tellerstech.com/on-call-brief/2026-W23/">https://www.tellerstech.com/on-call-brief/2026-W23/</a></p><p>More episodes and show notes <a target="_blank" rel="noopener noreferrer nofollow" href="https://shipitweekly.fm/">https://shipitweekly.fm/</a></p>
20 MIN
GitHub Supply Chain Attacks, Railway’s GCP Outage, Discord’s Voice Failure, AWS Retry Changes, and Trusted Tool Risk
MAY 29, 2026
<p>This episode of <strong>Ship It Weekly</strong> is about trusted tools becoming production dependencies. Brian covers a rough GitHub supply chain week, including the compromised Nx Console VS Code extension tied to exposed GitHub internal repositories and the Megalodon campaign abusing GitHub Actions workflows across thousands of public repos.</p><p>The bigger thread this week is that the tools around production are increasingly part of production. Brian also covers Railway’s GCP account suspension outage, Discord’s voice outage during a Kubernetes migration, AWS changing SDK retry behavior, CVE-2026-9133 in the RabbitMQ AWS plugin, and a Reddit story about stolen AWS keys turning into a $14,000 Bedrock bill.</p><p>Brian also touches on OpenTelemetry graduating from the CNCF, Claude Code security risk, GitLab Secrets Manager, Google Cloud AI spend caps, and a Redshift Python driver RCE.</p><p></p><p><strong>Full source list and extra links are available on this episode’s page at </strong><a target="_blank" rel="noopener noreferrer nofollow" href="http://shipitweekly.fm"><strong>shipitweekly.fm</strong></a><strong>.</strong></p><p></p><p><strong>Links</strong></p><p>Nx Console compromise <a target="_blank" rel="noopener noreferrer nofollow" href="https://www.stepsecurity.io/blog/nx-console-vs-code-extension-compromised">https://www.stepsecurity.io/blog/nx-console-vs-code-extension-compromised</a></p><p>Megalodon GitHub Actions attack <a target="_blank" rel="noopener noreferrer nofollow" href="https://www.stepsecurity.io/blog/megalodon-mass-github-actions-secret-exfiltration-across-5-500-public-repositories">https://www.stepsecurity.io/blog/megalodon-mass-github-actions-secret-exfiltration-across-5-500-public-repositories</a></p><p>Railway GCP outage <a target="_blank" rel="noopener noreferrer nofollow" 
href="https://blog.railway.com/p/incident-report-may-19-2026-gcp-account-outage">https://blog.railway.com/p/incident-report-may-19-2026-gcp-account-outage</a></p><p>Discord voice outage <a target="_blank" rel="noopener noreferrer nofollow" href="https://discord.com/blog/behind-the-scenes-of-the-3-25-26-voice-outage">https://discord.com/blog/behind-the-scenes-of-the-3-25-26-voice-outage</a></p><p>AWS SDK retry changes <a target="_blank" rel="noopener noreferrer nofollow" href="https://aws.amazon.com/blogs/developer/announcing-updated-retry-behavior-for-aws-sdks-and-tools/">https://aws.amazon.com/blogs/developer/announcing-updated-retry-behavior-for-aws-sdks-and-tools/</a></p><p>RabbitMQ AWS plugin CVE-2026-9133 <a target="_blank" rel="noopener noreferrer nofollow" href="https://aws.amazon.com/security/security-bulletins/2026-034-aws/">https://aws.amazon.com/security/security-bulletins/2026-034-aws/</a></p><p>AWS Bedrock cost spike Reddit thread <a target="_blank" rel="noopener noreferrer nofollow" href="https://www.reddit.com/r/aws/comments/1tm3ydo/aws_bedrock_cost_spike_14000_usd/">https://www.reddit.com/r/aws/comments/1tm3ydo/aws_bedrock_cost_spike_14000_usd/</a></p><p>This week’s On Call Brief <a target="_blank" rel="noopener noreferrer nofollow" href="https://www.tellerstech.com/on-call-brief/2026-W22/">https://www.tellerstech.com/on-call-brief/2026-W22/</a></p><p>More episodes and show notes <a target="_blank" rel="noopener noreferrer nofollow" href="https://shipitweekly.fm/">https://shipitweekly.fm/</a></p>
23 MIN
Ship It Conversations: Jake Warner on Cycle.io, Bare Metal’s Comeback, and Why Private Cloud Is Getting Interesting Again
MAY 26, 2026
<p>This is a guest conversation episode of <strong>Ship It Weekly</strong>, separate from the weekly news recaps.</p><p>In this Ship It Conversations episode, I talk with Jake Warner, founder and CEO of Cycle.io, about private cloud, bare metal, Kubernetes fatigue, and why some teams are rethinking how much infrastructure complexity they actually want to carry.</p><p>We talk about why bare metal and private cloud are getting interesting again, especially around cost, performance, data sovereignty, compliance, and platform ownership. Jake explains how Cycle approaches infrastructure as a pool of resources, why he thinks in terms of “environments as code” instead of traditional infrastructure as code, and how teams can run containers and VMs together across bare metal, cloud, and hybrid environments.</p><p>The bigger theme here is that this is not really a “cloud versus bare metal” conversation. It is about choosing the right level of abstraction. Sometimes Kubernetes is the right answer. Sometimes managed cloud services make sense. 
And sometimes teams just need a more opinionated platform that lets developers ship without requiring a large DevOps army to keep everything running.</p><p><strong>Highlights</strong></p><p>• Why some teams are moving back toward private cloud and bare metal</p><p>• The role of cost, data sovereignty, compliance, and performance in infrastructure decisions</p><p>• Why bare metal does not have to mean going back to old-school racking and stacking pain</p><p>• How Cycle turns raw compute into a private cloud-style resource pool</p><p>• Why Jake thinks about “environments as code” instead of only infrastructure as code</p><p>• What “no DevOps army required” means in practice for engineering-heavy teams</p><p>• Why some companies need VMs and containers running together on the same platform</p><p>• Where Kubernetes still makes sense, especially for highly customized infrastructure needs</p><p>• Why opinionated platforms can be valuable when teams want fewer knobs and better defaults</p><p>• Active-active thinking, failover risk, and why application-level replication often matters more than platform-level storage magic</p><p>• Why bandwidth, performance density, and predictable pricing can make bare metal attractive again</p><p>• The weird continued gravity of AWS us-east-1, even for teams trying to move workloads elsewhere</p><p>• How AI workloads, GPUs, and hype cycles fit into the private cloud and platform conversation</p><p>• Jake’s advice for modernizing hybrid or on-prem infrastructure: containerize first, then look hard at your dependencies</p><p><strong>Jake’s links</strong></p><p>• <a target="_blank" rel="noopener noreferrer nofollow" href="http://Cycle.io">Cycle.io</a>: <a target="_blank" rel="noopener noreferrer nofollow" href="https://cycle.io/">https://cycle.io/</a></p><p>• Cycle Slack community: <a target="_blank" rel="noopener noreferrer nofollow" href="https://slack.cycle.io/">https://slack.cycle.io/</a></p><p>• Jake Warner on LinkedIn: <a 
target="_blank" rel="noopener noreferrer nofollow" href="https://www.linkedin.com/in/jakewarner/">https://www.linkedin.com/in/jakewarner/</a></p><p><strong>Our links</strong></p><p>More episodes + show notes + links: <a target="_blank" rel="noopener noreferrer nofollow" href="https://shipitweekly.fm">https://shipitweekly.fm</a></p><p>On Call Brief: <a target="_blank" rel="noopener noreferrer nofollow" href="https://oncallbrief.com">https://oncallbrief.com</a></p>
36 MIN
CISA’s GitHub Leak, AI Root Cause Analysis, Copilot Agents, Claude Code in CI/CD, and Kubernetes Seccomp Risk
MAY 22, 2026
<p>This episode of <strong>Ship It Weekly</strong> is about secrets, agents, risky defaults, and follow-up work that never gets done. Brian covers the CISA contractor GitHub leak involving AWS keys, internal docs, Terraform, Kubernetes, Argo CD, and CI/CD context, plus AWS DevOps Agent doing automated RCA across Datadog, Elasticsearch, CloudTrail, and EKS.</p><p>Brian also covers MS Copilot Studio computer-using agents, Claude Code in Bitbucket Agentic Pipelines, CVE-2026-46333 and Kubernetes seccomp defaults, GitHub OIDC for Dependabot, Java pods getting OOMKilled, LLM-generated SQL that can be wrong but still run, and why postmortem action items die without ownership.</p><p></p><p><strong>Sponsored by Guardsquare </strong><a target="_blank" rel="noopener noreferrer nofollow" href="https://hubs.ly/Q04fJgkJ0"><strong>https://hubs.ly/Q04fJgkJ0</strong></a></p><p></p><p><strong>Links</strong></p><p>CISA GitHub leak <a target="_blank" rel="noopener noreferrer nofollow" href="https://blog.gitguardian.com/how-we-got-a-cisa-github-leak-taken-down-in-26-hours/">https://blog.gitguardian.com/how-we-got-a-cisa-github-leak-taken-down-in-26-hours/</a></p><p>AWS DevOps Agent RCA <a target="_blank" rel="noopener noreferrer nofollow" href="https://aws.amazon.com/blogs/devops/automate-root-cause-analysis-across-datadog-and-elasticsearch-with-aws-devops-agent/">https://aws.amazon.com/blogs/devops/automate-root-cause-analysis-across-datadog-and-elasticsearch-with-aws-devops-agent/</a></p><p>Microsoft Copilot Studio computer-using agents <a target="_blank" rel="noopener noreferrer nofollow" href="https://techcommunity.microsoft.com/blog/copilot-studio-blog/computer-using-agents-in-microsoft-copilot-studio-are-now-generally-available/4519427">https://techcommunity.microsoft.com/blog/copilot-studio-blog/computer-using-agents-in-microsoft-copilot-studio-are-now-generally-available/4519427</a></p><p>Atlassian Agentic Pipelines with Claude Code <a target="_blank" rel="noopener noreferrer 
nofollow" href="https://support.atlassian.com/bitbucket-cloud/docs/agentic-pipelines/">https://support.atlassian.com/bitbucket-cloud/docs/agentic-pipelines/</a></p><p>CVE-2026-46333 <a target="_blank" rel="noopener noreferrer nofollow" href="https://nvd.nist.gov/vuln/detail/CVE-2026-46333">https://nvd.nist.gov/vuln/detail/CVE-2026-46333</a></p><p>Kubernetes seccomp <a target="_blank" rel="noopener noreferrer nofollow" href="https://kubernetes.io/docs/reference/node/seccomp/">https://kubernetes.io/docs/reference/node/seccomp/</a></p><p>GitHub OIDC for Dependabot and code scanning <a target="_blank" rel="noopener noreferrer nofollow" href="https://github.blog/changelog/2026-05-19-expanded-oidc-support-for-dependabot-and-code-scanning/">https://github.blog/changelog/2026-05-19-expanded-oidc-support-for-dependabot-and-code-scanning/</a></p><p>Java pods OOMKilled in Kubernetes <a target="_blank" rel="noopener noreferrer nofollow" href="https://dzone.com/articles/java-pod-oomkill-kubernetes">https://dzone.com/articles/java-pod-oomkill-kubernetes</a></p><p>LLM-generated SQL risks <a target="_blank" rel="noopener noreferrer nofollow" href="https://readyset.io/blog/why-llms-write-incorrect-sql-and-what-that-means-for-your-database">https://readyset.io/blog/why-llms-write-incorrect-sql-and-what-that-means-for-your-database</a></p><p>Postmortem action items <a target="_blank" rel="noopener noreferrer nofollow" href="https://incident.io/blog/why-do-post-mortem-action-items-fail-how-to-make-incident-follow-ups-actually-get-done">https://incident.io/blog/why-do-post-mortem-action-items-fail-how-to-make-incident-follow-ups-actually-get-done</a></p><p>On Call Brief <a target="_blank" rel="noopener noreferrer nofollow" href="https://www.tellerstech.com/on-call-brief/2026-W21/">https://www.tellerstech.com/on-call-brief/2026-W21/</a></p><p>More episodes + show notes <a target="_blank" rel="noopener noreferrer nofollow" href="https://shipitweekly.fm/">https://shipitweekly.fm/</a></p>
22 MIN