Ship It Weekly - DevOps, SRE, and Platform Engineering News

Teller's Tech - DevOps SRE Podcast

Details

Ship It Weekly is a short, practical recap of what actually matters in DevOps, SRE, and platform engineering.

Each episode, your host Brian Teller walks through the latest outages, releases, tools, and incident writeups, then translates them into “here’s what this means for your systems” instead of just reading headlines. Expect a couple of main stories with context, a quick hit of tools or releases worth bookmarking, and the occasional segment on on-call, burnout, or team culture.

This isn’t a certification prep show or a lab walkthrough. It’s aimed at people who are already working in the space and want to stay sharp without scrolling status pages and blogs all week. You’ll hear about things like cloud provider incidents, Kubernetes and platform trends, Terraform and infrastructure changes, and real postmortems that are actually worth your time.

Most episodes are 10–25 minutes, so you can catch up on the way to work or between meetings. Every now and then there will be a “special” focused on a big outage or a specific theme, but the default format is simple: what happened, why it matters, and what you might want to do about it in your own environment.

If you’re the person people DM when something is broken in prod, or you’re building the platform everyone else ships on top of, Ship It Weekly is meant to be in your rotation.

Recent Episodes

When guardrails break prod: GitHub “Too Many Requests” from legacy defenses, Kubernetes nodes/proxy GET RCE, HCP Vault resilience in an AWS regional outage, and PCI DSS scope creep
FEB 13, 2026
<p>This week on <strong>Ship It Weekly</strong>, Brian hits four stories where the guardrails become the incident.</p><p>GitHub had “Too Many Requests” caused by legacy abuse protections that outlived their moment. Takeaway: controls need owners, visibility, and a retirement plan.</p><p>Kubernetes has a nasty edge case where nodes/proxy GET can turn into command execution via WebSocket behavior. If you’ve ever handed out “telemetry” RBAC broadly, go audit it.</p><p>HashiCorp shared how HCP Vault handled a real AWS regional disruption: control plane wobbled, Dedicated data planes kept serving. Control plane vs data plane separation paying off.</p><p>AWS expanded its PCI DSS compliance package with more services and the Asia Pacific (Taipei) region. Scope changes don’t break prod today, but they turn into evidence churn later if you don’t standardize proof.</p><p>Human story: “reasonable assurance” turning into busywork.</p><p><strong>Links</strong></p><p>GitHub: When protections outlive their purpose (legacy defenses + lifecycle)</p><p><a target="_blank" rel="noopener noreferrer nofollow" href="https://github.blog/engineering/infrastructure/when-protections-outlive-their-purpose-a-lesson-on-managing-defense-systems-at-scale/">https://github.blog/engineering/infrastructure/when-protections-outlive-their-purpose-a-lesson-on-managing-defense-systems-at-scale/</a></p><p>Kubernetes nodes/proxy GET → RCE (analysis)</p><p><a target="_blank" rel="noopener noreferrer nofollow" href="https://grahamhelton.com/blog/nodes-proxy-rce">https://grahamhelton.com/blog/nodes-proxy-rce</a></p><p>OpenFaaS guidance / mitigation notes</p><p><a target="_blank" rel="noopener noreferrer nofollow" href="https://www.openfaas.com/blog/kubernetes-node-proxy-rce/">https://www.openfaas.com/blog/kubernetes-node-proxy-rce/</a></p><p>HCP Vault resilience during real AWS regional outages</p><p><a target="_blank" rel="noopener noreferrer nofollow" 
href="https://www.hashicorp.com/blog/how-resilient-is-hcp-vault-during-real-aws-regional-outages">https://www.hashicorp.com/blog/how-resilient-is-hcp-vault-during-real-aws-regional-outages</a></p><p>AWS: Fall 2025 PCI DSS compliance package update</p><p><a target="_blank" rel="noopener noreferrer nofollow" href="https://aws.amazon.com/blogs/security/fall-2025-pci-dss-compliance-package-available-now/">https://aws.amazon.com/blogs/security/fall-2025-pci-dss-compliance-package-available-now/</a></p><p>GitHub Actions: self-hosted runner minimum version enforcement extended</p><p><a target="_blank" rel="noopener noreferrer nofollow" href="https://github.blog/changelog/2026-02-05-github-actions-self-hosted-runner-minimum-version-enforcement-extended/">https://github.blog/changelog/2026-02-05-github-actions-self-hosted-runner-minimum-version-enforcement-extended/</a></p><p>Headlamp in 2025: Project Highlights (SIG UI)</p><p><a target="_blank" rel="noopener noreferrer nofollow" href="https://kubernetes.io/blog/2026/01/22/headlamp-in-2025-project-highlights/">https://kubernetes.io/blog/2026/01/22/headlamp-in-2025-project-highlights/</a></p><p>AWS Network Firewall Active Threat Defense (MadPot)</p><p><a target="_blank" rel="noopener noreferrer nofollow" href="https://aws.amazon.com/blogs/security/real-time-malware-defense-leveraging-aws-network-firewall-active-threat-defense/">https://aws.amazon.com/blogs/security/real-time-malware-defense-leveraging-aws-network-firewall-active-threat-defense/</a></p><p>Reasonable assurance turning into busywork (r/sre)</p><p><a target="_blank" rel="noopener noreferrer nofollow" href="https://www.reddit.com/r/sre/comments/1qvwbgf/at_what_point_does_reasonable_assurance_turn_into/">https://www.reddit.com/r/sre/comments/1qvwbgf/at_what_point_does_reasonable_assurance_turn_into/</a></p><p>More episodes + details: <a target="_blank" rel="noopener noreferrer nofollow" href="https://shipitweekly.fm">https://shipitweekly.fm</a></p>
15 MIN
Azure VM Control Plane Outage, GitHub Agent HQ (Claude + Codex), Claude Opus 4.6, Gemini CLI, MCP
FEB 6, 2026
<p>This week on <strong>Ship It Weekly</strong>, Brian hits four “control plane + trust boundary” stories where the glue layer becomes the incident.</p><p>Azure had a platform incident that impacted VM management operations across multiple regions. Your app can be up, but ops is degraded.</p><p>GitHub is pushing Agent HQ (Claude + Codex in the repo/CI flow), and Actions added a case() function so workflow logic is less brittle.</p><p>MCP is becoming platform plumbing: Miro launched an MCP server and Kong launched an MCP Registry.</p><p><strong>Links</strong></p><p>Azure status incident (VM service management issues) <a target="_blank" rel="noopener noreferrer nofollow" href="https://azure.status.microsoft/en-us/status/history/?trackingId=FNJ8-VQZ">https://azure.status.microsoft/en-us/status/history/?trackingId=FNJ8-VQZ</a></p><p>GitHub Agent HQ: Claude + Codex <a target="_blank" rel="noopener noreferrer nofollow" href="https://github.blog/news-insights/company-news/pick-your-agent-use-claude-and-codex-on-agent-hq/">https://github.blog/news-insights/company-news/pick-your-agent-use-claude-and-codex-on-agent-hq/</a></p><p>GitHub Actions update (case() function) <a target="_blank" rel="noopener noreferrer nofollow" href="https://github.blog/changelog/2026-01-29-github-actions-smarter-editing-clearer-debugging-and-a-new-case-function/">https://github.blog/changelog/2026-01-29-github-actions-smarter-editing-clearer-debugging-and-a-new-case-function/</a></p><p>Claude Opus 4.6 <a target="_blank" rel="noopener noreferrer nofollow" href="https://www.anthropic.com/news/claude-opus-4-6">https://www.anthropic.com/news/claude-opus-4-6</a></p><p>How Google SREs use Gemini CLI <a target="_blank" rel="noopener noreferrer nofollow" 
href="https://cloud.google.com/blog/topics/developers-practitioners/how-google-sres-use-gemini-cli-to-solve-real-world-outages">https://cloud.google.com/blog/topics/developers-practitioners/how-google-sres-use-gemini-cli-to-solve-real-world-outages</a></p><p>Miro MCP server announcement <a target="_blank" rel="noopener noreferrer nofollow" href="https://www.businesswire.com/news/home/20260202411670/en/Miro-Launches-MCP-Server-to-Connect-Visual-Collaboration-With-AI-Coding-Tools">https://www.businesswire.com/news/home/20260202411670/en/Miro-Launches-MCP-Server-to-Connect-Visual-Collaboration-With-AI-Coding-Tools</a></p><p>Kong MCP Registry announcement <a target="_blank" rel="noopener noreferrer nofollow" href="https://konghq.com/company/press-room/press-release/kong-introduces-mcp-registry">https://konghq.com/company/press-room/press-release/kong-introduces-mcp-registry</a></p><p>GitHub Actions hosted runners incident thread <a target="_blank" rel="noopener noreferrer nofollow" href="https://github.com/orgs/community/discussions/186184">https://github.com/orgs/community/discussions/186184</a></p><p>DockerDash / Ask Gordon research <a target="_blank" rel="noopener noreferrer nofollow" href="https://noma.security/blog/dockerdash-two-attack-paths-one-ai-supply-chain-crisis/">https://noma.security/blog/dockerdash-two-attack-paths-one-ai-supply-chain-crisis/</a></p><p>Terraform 1.15 alpha <a target="_blank" rel="noopener noreferrer nofollow" href="https://github.com/hashicorp/terraform/releases/tag/v1.15.0-alpha20260204">https://github.com/hashicorp/terraform/releases/tag/v1.15.0-alpha20260204</a></p><p>Wiz Moltbook write-up <a target="_blank" rel="noopener noreferrer nofollow" href="https://www.wiz.io/blog/exposed-moltbook-database-reveals-millions-of-api-keys">https://www.wiz.io/blog/exposed-moltbook-database-reveals-millions-of-api-keys</a></p><p>Chainguard “EmeritOSS” <a target="_blank" rel="noopener noreferrer nofollow" 
href="https://www.chainguard.dev/unchained/introducing-chainguard-emeritoss">https://www.chainguard.dev/unchained/introducing-chainguard-emeritoss</a></p><p>More episodes + details: <a target="_blank" rel="noopener noreferrer nofollow" href="https://shipitweekly.fm">https://shipitweekly.fm</a></p>
20 MIN
CodeBreach in AWS CodeBuild, Bazel TLS Certificate Expiry Breaks Builds, Helm Charts Reliability Audit, and New n8n Sandbox Escape RCE
JAN 30, 2026
<p>This week on <strong>Ship It Weekly</strong>, Brian looks at four “glue failures” that can turn into real outages and real security risk.</p><p>We start with CodeBreach: AWS disclosed a CodeBuild webhook filter misconfig in a small set of AWS-managed repos. The takeaway is simple: CI trigger logic is part of your security boundary now.</p><p>Next is the Bazel TLS cert expiry incident. Cert failures are a binary cliff, and “auto renew” is only one link in the chain.</p><p>Third is Helm chart reliability. Prequel reviewed 105 charts and found a lot of demo-friendly defaults that don’t hold up under real load, rollouts, or node drains.</p><p>Fourth is n8n. Two new high-severity flaws disclosed by JFrog. “Authenticated” still matters because workflow authoring is basically code execution, and these tools sit next to your secrets.</p><p>Lightning round: Fence, HashiCorp agent-skills, marimo, and a cautionary agent-loop story.</p><p><strong>Links</strong></p><p>AWS CodeBreach bulletin <a target="_blank" rel="noopener noreferrer nofollow" href="https://aws.amazon.com/security/security-bulletins/2026-002-AWS/">https://aws.amazon.com/security/security-bulletins/2026-002-AWS/</a> </p><p>Wiz research <a target="_blank" rel="noopener noreferrer nofollow" href="https://www.wiz.io/blog/wiz-research-codebreach-vulnerability-aws-codebuild">https://www.wiz.io/blog/wiz-research-codebreach-vulnerability-aws-codebuild</a> </p><p>Bazel postmortem <a target="_blank" rel="noopener noreferrer nofollow" href="https://blog.bazel.build/2026/01/16/ssl-cert-expiry.html">https://blog.bazel.build/2026/01/16/ssl-cert-expiry.html</a> </p><p>Helm report <a target="_blank" rel="noopener noreferrer nofollow" href="https://www.prequel.dev/blog-post/the-real-state-of-helm-chart-reliability-2025-hidden-risks-in-100-open-source-charts">https://www.prequel.dev/blog-post/the-real-state-of-helm-chart-reliability-2025-hidden-risks-in-100-open-source-charts</a> </p><p>n8n coverage <a target="_blank" 
rel="noopener noreferrer nofollow" href="https://thehackernews.com/2026/01/two-high-severity-n8n-flaws-allow.html">https://thehackernews.com/2026/01/two-high-severity-n8n-flaws-allow.html</a> </p><p>Fence <a target="_blank" rel="noopener noreferrer nofollow" href="https://github.com/Use-Tusk/fence">https://github.com/Use-Tusk/fence</a> </p><p>agent-skills <a target="_blank" rel="noopener noreferrer nofollow" href="https://github.com/hashicorp/agent-skills">https://github.com/hashicorp/agent-skills</a> </p><p>marimo <a target="_blank" rel="noopener noreferrer nofollow" href="https://marimo.io/">https://marimo.io/</a> </p><p>Agent loop story <a target="_blank" rel="noopener noreferrer nofollow" href="https://www.theregister.com/2026/01/27/ralph_wiggum_claude_loops/">https://www.theregister.com/2026/01/27/ralph_wiggum_claude_loops/</a> </p><p>Related n8n episodes: </p><p><a target="_blank" rel="noopener noreferrer nofollow" href="https://www.tellerstech.com/ship-it-weekly/n8n-critical-cve-cve-2026-21858-aws-gpu-capacity-blocks-price-hike-netflix-temporal/">https://www.tellerstech.com/ship-it-weekly/n8n-critical-cve-cve-2026-21858-aws-gpu-capacity-blocks-price-hike-netflix-temporal/</a> </p><p><a target="_blank" rel="noopener noreferrer nofollow" href="https://www.tellerstech.com/ship-it-weekly/n8n-auth-rce-cve-2026-21877-github-artifact-permissions-and-aws-devops-agent-lessons/">https://www.tellerstech.com/ship-it-weekly/n8n-auth-rce-cve-2026-21877-github-artifact-permissions-and-aws-devops-agent-lessons/</a></p><p>More episodes + details: <a target="_blank" rel="noopener noreferrer nofollow" href="https://shipitweekly.fm">https://shipitweekly.fm</a></p>
18 MIN
Ship It Conversations: AI Automation for SMBs: What to Automate (And What Not To) (with Austin Reed)
JAN 27, 2026
<p>This is a guest conversation episode of <strong>Ship It Weekly</strong> (separate from the weekly news recaps).</p><p>In this Ship It: Conversations episode I talk with Austin Reed from <a target="_blank" rel="noopener noreferrer nofollow" href="http://horizon.dev">horizon.dev</a> about AI and automation for small and mid-sized businesses, and what actually works once you leave the demo world.</p><p>We get into the most common automation wins he sees (sales and customer service), why a lot of projects fail due to communication and unclear specs more than the tech, and the trap of thinking “AI makes it cheap.” Austin shares how they push teams toward quick wins first, then iterate with prototypes so you don’t spend $10k automating a thing that never even happens.</p><p>We also talk guardrails: when “human-in-the-loop” makes sense, what he avoids automating (finance-heavy logic, HIPAA/medical, government), and why the goal is usually leverage, not replacing people. On the dev side, we nerd out a bit on the tooling they’re using day to day: GPT and Claude, Cursor, PR review help, CI/CD workflows, and why knowing how to architect and validate output matters way more than people think.</p><p>If you’re a DevOps/SRE type helping the business “do AI,” or you’re just tired of automation hype that ignores real constraints like credentials, scope creep, and operational risk, this one is very much about the practical middle ground.</p><p><strong>Links from the episode:</strong></p><p>Austin on LinkedIn: <a target="_blank" rel="noopener noreferrer nofollow" href="https://www.linkedin.com/in/automationsexpert/">https://www.linkedin.com/in/automationsexpert/</a></p><p>horizon.dev: <a target="_blank" rel="noopener noreferrer nofollow" href="http://horizon.dev">horizon.dev</a></p><p>YouTube: <a target="_blank" rel="noopener noreferrer nofollow" href="https://www.youtube.com/@horizonsoftwaredev">https://www.youtube.com/@horizonsoftwaredev</a></p><p>Skool: <a target="_blank" 
rel="noopener noreferrer nofollow" href="https://www.skool.com/automation-masters">https://www.skool.com/automation-masters</a></p><p>If you found this useful, share it with the person on your team who keeps saying “we should automate that” but hasn’t dealt with the messy parts yet.</p><p>More information on our website: <a target="_blank" rel="noopener noreferrer nofollow" href="https://shipitweekly.fm">https://shipitweekly.fm</a></p>
24 MIN
curl Shuts Down Bug Bounties Due to AI Slop, AWS RDS Blue/Green Cuts Switchover Downtime to ~5 Seconds, and Amazon ECR Adds Cross-Repository Layer Sharing
JAN 24, 2026
<p>This week on <strong>Ship It Weekly</strong>, Brian looks at three different versions of the same problem: systems are getting faster, but human attention is still the bottleneck.</p><p>We start with curl shutting down their bug bounty program after getting flooded with low-quality “AI slop” reports. It’s not a “security vs maintainers” story, it’s an incentives and signal-to-noise story. When the cost to generate reports goes to zero, you basically DoS the people doing triage.</p><p>Next, AWS improved RDS Blue/Green Deployments to cut writer switchover downtime to typically ~5 seconds or less (single-region). That’s a big deal, but “fast switchover” doesn’t automatically mean “safe upgrade.” Your connection pooling, retries, and app behavior still decide whether it’s a blip or a cascade.</p><p>Third, Amazon ECR added cross-repository layer sharing. Sounds small, but if you’ve got a lot of repos and you’re constantly rebuilding/pushing the same base layers, this can reduce storage duplication and speed up pushes in real fleets.</p><p>Lightning round covers a practical Kubernetes clientcmd write-up, a solid “robust Helm charts” post, a traceroute-on-steroids style tool, and Docker Kanvas as another signal that vendors are trying to make “local-to-cloud” workflows feel less painful.</p><p>We wrap with Honeycomb’s interim report on their extended EU outage, and the part that always hits hardest in long incidents: managing engineer energy and coordination over multiple days is a first-class reliability concern.</p><p><strong>Links from this episode</strong></p><p>curl bug bounties shutdown <a target="_blank" rel="noopener noreferrer nofollow" href="https://github.com/curl/curl/pull/20312">https://github.com/curl/curl/pull/20312</a></p><p>RDS Blue/Green faster switchover <a target="_blank" rel="noopener noreferrer nofollow" 
href="https://aws.amazon.com/about-aws/whats-new/2026/01/amazon-rds-blue-green-deployments-reduces-downtime/">https://aws.amazon.com/about-aws/whats-new/2026/01/amazon-rds-blue-green-deployments-reduces-downtime/</a></p><p>ECR cross-repo layer sharing <a target="_blank" rel="noopener noreferrer nofollow" href="https://aws.amazon.com/about-aws/whats-new/2026/01/amazon-ecr-cross-repository-layer-sharing/">https://aws.amazon.com/about-aws/whats-new/2026/01/amazon-ecr-cross-repository-layer-sharing/</a></p><p>Kubernetes clientcmd apiserver access <a target="_blank" rel="noopener noreferrer nofollow" href="https://kubernetes.io/blog/2026/01/19/clientcmd-apiserver-access/">https://kubernetes.io/blog/2026/01/19/clientcmd-apiserver-access/</a></p><p>Building robust Helm charts <a target="_blank" rel="noopener noreferrer nofollow" href="https://www.willmunn.xyz/devops/helm/kubernetes/2026/01/17/building-robust-helm-charts.html">https://www.willmunn.xyz/devops/helm/kubernetes/2026/01/17/building-robust-helm-charts.html</a></p><p>ttl tool <a target="_blank" rel="noopener noreferrer nofollow" href="https://github.com/lance0/ttl">https://github.com/lance0/ttl</a></p><p>Docker Kanvas (InfoQ) <a target="_blank" rel="noopener noreferrer nofollow" href="https://www.infoq.com/news/2026/01/docker-kanvas-cloud-deployment/">https://www.infoq.com/news/2026/01/docker-kanvas-cloud-deployment/</a></p><p>Honeycomb EU interim report <a target="_blank" rel="noopener noreferrer nofollow" href="https://status.honeycomb.io/incidents/pjzh0mtqw3vt">https://status.honeycomb.io/incidents/pjzh0mtqw3vt</a></p><p>SRE Weekly issue #504 <a target="_blank" rel="noopener noreferrer nofollow" href="https://sreweekly.com/sre-weekly-issue-504/">https://sreweekly.com/sre-weekly-issue-504/</a></p><p>More episodes + details: <a target="_blank" rel="noopener noreferrer nofollow" href="https://shipitweekly.fm">https://shipitweekly.fm</a></p>
15 MIN