
Vanishing Gradients

Hugo Bowne-Anderson

Episode 70: 1,400 Production AI Deployments

FEB 12, 2026 · 69 MIN

Description

<p><em>There’s a company who spent almost $50,000 because an agent went into an infinite loop and they forgot about it for a month.</em></p><p><em>It had no failures and I guess no one was monitoring these costs. It’s nice that people do write about that in the database as well. After it happened, they said: watch out for infinite loops. Watch out for cascading tool failures. Watch out for silent failures where the agent reports it has succeeded when it didn’t!</em></p><p><strong>We Discuss:</strong></p><p>* Why the most successful teams are <strong>ripping out and rebuilding their agent systems every few weeks</strong> as models improve, and why over-engineering now creates technical debt you can’t afford later;</p><p>* The <strong>$50,000 infinite loop disaster</strong> and why “silent failures” are the biggest risk in production: agents confidently report success while spiraling into expensive mistakes;</p><p>* How <strong>ELIOS built emergency voice agents</strong> with sub-400ms response times by aggressively throwing away context every few seconds, and why these extreme patterns are becoming standard practice;</p><p>* Why <strong>DoorDash uses a three-tier agent architecture</strong> (manager, progress tracker, and specialists) with a persistent workspace that lets agents collaborate across hours or days;</p><p>* Why simple <strong>text files and markdown</strong> are emerging as the best “continual learning” layer: human-readable memory that persists across sessions without fine-tuning models;</p><p>* The <strong>100-to-1 problem</strong>: for every useful output, tool-calling agents generate 100 tokens of noise, and the three tactics (reduce, offload, isolate) teams use to manage it;</p><p>* Why companies are <strong>choosing Gemini Flash for document processing and Opus for long reasoning chains</strong>, and how to match models to your actual usage patterns;</p><p>* The debate over <strong>vector databases versus simple grep and cat</strong>, and why 
giving agents standard command-line tools often beats complex APIs;</p><p>* What <strong>“re-architect” as a job title</strong> reveals about the shift from 70% scaffolding / 30% model to 90% model / 10% scaffolding, and why knowing when to rip things out may be the most important skill today.</p><p>You can also find the full episode on <a target="_blank" href="https://open.spotify.com/show/3yuz89gqAhcMcdy3SZPe4X?si=AKl2jvIARD2Liw1bBH2Nng&#38;nd=1&#38;dlsi=8dfe7221896c4fc3">Spotify</a>, <a target="_blank" href="https://podcasts.apple.com/us/podcast/vanishing-gradients/id1610318868">Apple Podcasts</a>, and <a target="_blank" href="https://www.youtube.com/live/uf80BfD70Lw?si=RtkR2C5aYqBea2Us">YouTube</a>.</p><p><a target="_blank" href="https://notebooklm.google.com/notebook/ceef53be-ffe8-47d5-8850-07335c434100">You can also interact directly with the transcript here in NotebookLM</a>. If you do, let us know what you find in the comments!</p><p>👉 <strong><em>Want to learn more about Building AI-Powered Software? Check out our </em></strong><a target="_blank" href="https://maven.com/hugo-stefan/building-ai-apps-ds-and-swe-from-first-principles?promoCode=ss-rav"><strong><em>Building AI Applications course</em></strong></a>. It’s a live cohort with hands-on exercises and office hours. <strong>Our final cohort starts March 10, 2026</strong>. Here is a <a target="_blank" href="https://maven.com/hugo-stefan/building-ai-apps-ds-and-swe-from-first-principles?promoCode=vgch">25% discount code</a> for readers. 
👈</p><p>Show Notes Links</p><p>* <a target="_blank" href="https://www.linkedin.com/in/strickvl/">Alex Strick van Linschoten on LinkedIn</a></p><p>* <a target="_blank" href="https://x.com/strickvl">Alex Strick van Linschoten on Twitter/X</a></p><p>* <a target="_blank" href="https://www.zenml.io/llmops-database">LLMOps Database</a></p><p>* <a target="_blank" href="https://huggingface.co/datasets/zenml/llmops-database">LLMOps Database Dataset on Hugging Face</a></p><p>* <a target="_blank" href="https://huggingface.co/spaces/hugobowne/llmops-database-mcp">Hugo’s MCP Server for LLMOps Database</a></p><p>* <a target="_blank" href="https://www.zenml.io/blog/what-1200-production-deployments-reveal-about-llmops-in-2025">Alex’s Blog: What 1,200+ Production Deployments Reveal About LLMOps in 2025</a></p><p>* <a target="_blank" href="https://hugobowne.substack.com/p/practical-lessons-from-750-real-world">Previous Episode: Practical Lessons from 750 Real-World LLM Deployments</a></p><p>* <a target="_blank" href="https://hugobowne.substack.com/p/episode-43-tales-from-400-llm-deployments-f60">Previous Episode: Tales from 400 LLM Deployments</a></p><p>* <a target="_blank" href="https://research.trychroma.com/context-rot">Context Rot Research by Chroma</a></p><p>* <a target="_blank" href="https://hugobowne.substack.com/p/ai-agent-harness-3-principles-for">Hugo’s Post: AI Agent Harness - 3 Principles for Context Engineering</a></p><p>* <a target="_blank" href="https://hugobowne.substack.com/p/the-rise-of-agentic-search">Hugo’s Post: The Rise of Agentic Search</a></p><p>* <a target="_blank" href="https://high-signal.delphina.ai/episode/the-post-coding-era-what-happens-when-ai-writes-the-system">Episode with Nick Moy: The Post-Coding Era</a></p><p>* <a target="_blank" href="https://gist.github.com/hugobowne/959419146f1a8276c78511e801b85e40">Hugo’s Personal Podcast Prep Skill Gist</a></p><p>* <a target="_blank" 
href="https://platform.claude.com/docs/en/agents-and-tools/tool-use/tool-search-tool">Claude Tool Search Documentation</a></p><p>* <a target="_blank" href="https://github.com/steveyegge/gastown">Gastown on GitHub (Steve Yegge)</a></p><p>* <a target="_blank" href="https://steve-yegge.medium.com/welcome-to-gas-town-4f25ee16dd04">Welcome to Gastown by Steve Yegge</a></p><p>* <a target="_blank" href="https://www.zenml.io">ZenML - Open Source MLOps &amp; LLMOps Framework</a></p><p>* <a target="_blank" href="https://luma.com/calendar/cal-8ImWFDQ3IEIxNWk">Upcoming Events on Luma</a></p><p>* <a target="_blank" href="https://www.youtube.com/@vanishinggradients">Vanishing Gradients on YouTube</a></p><p>* <a target="_blank" href="https://www.youtube.com/live/uf80BfD70Lw?si=RtkR2C5aYqBea2Us">Watch the podcast livestream on YouTube</a></p><p>* <a target="_blank" href="https://maven.com/hugo-stefan/building-ai-apps-ds-and-swe-from-first-principles?promoCode=vgfs">Join the final cohort of our Building AI Applications course in March 2026 (25% off for listeners)</a></p> <br/><br/>Get full access to Vanishing Gradients at <a href="https://hugobowne.substack.com/subscribe?utm_medium=podcast&#38;utm_campaign=CTA_4">hugobowne.substack.com/subscribe</a>