In this episode, I’m talking with Vincent Warmerdam about treating LLMs as just another API in your Python app, with clear boundaries, small focused endpoints, and good monitoring. We’ll dig into patterns for wrapping these calls, caching and inspecting responses, and deciding where an LLM API actually earns its keep in your architecture.<br/>
<br/>
<strong>Episode sponsors</strong><br/>
<br/>
<a href='https://talkpython.fm/seer'>Seer: AI Debugging, Code TALKPYTHON</a><br>
<a href='https://talkpython.fm/nordstellar'>NordStellar</a><br>
<a href='https://talkpython.fm/training'>Talk Python Courses</a><br/>
<br/>
<h2 class="links-heading mb-4">Links from the show</h2>
<div><strong>Vincent on X</strong>: <a href="https://x.com/fishnets88?featured_on=talkpython" target="_blank" >@fishnets88</a><br/>
<strong>Vincent on Mastodon</strong>: <a href="https://fosstodon.org/@koaning" target="_blank" >@koaning</a><br/>
<br/>
<strong>LLM Building Blocks for Python Course</strong>: <a href="https://training.talkpython.fm/courses/llm-building-blocks-for-python" target="_blank" >training.talkpython.fm</a><br/>
<strong>Top Talk Python Episodes of 2024</strong>: <a href="https://talkpython.fm/blog/posts/top-talk-python-podcast-episodes-of-2024/" target="_blank" >talkpython.fm</a><br/>
<strong>LLM Usage - Datasette</strong>: <a href="https://llm.datasette.io/en/stable/usage.html?featured_on=talkpython" target="_blank" >llm.datasette.io</a><br/>
<strong>DiskCache - Disk Backed Cache (Documentation)</strong>: <a href="https://grantjenks.com/docs/diskcache?featured_on=talkpython" target="_blank" >grantjenks.com</a><br/>
<strong>smartfunc - Turn docstrings into LLM-functions</strong>: <a href="https://github.com/koaning/smartfunc?featured_on=talkpython" target="_blank" >github.com</a><br/>
<strong>Ollama</strong>: <a href="https://ollama.com?featured_on=talkpython" target="_blank" >ollama.com</a><br/>
<strong>LM Studio - Local AI</strong>: <a href="https://lmstudio.ai?featured_on=talkpython" target="_blank" >lmstudio.ai</a><br/>
<strong>marimo - A Next-Generation Python Notebook</strong>: <a href="https://marimo.io?featured_on=talkpython" target="_blank" >marimo.io</a><br/>
<strong>Pydantic</strong>: <a href="https://pydantic.dev?featured_on=talkpython" target="_blank" >pydantic.dev</a><br/>
<strong>Instructor - Complex Schemas &amp; Validation (Python)</strong>: <a href="https://python.useinstructor.com/#complex-schemas-validation" target="_blank" >python.useinstructor.com</a><br/>
<strong>Diving into PydanticAI with marimo</strong>: <a href="https://www.youtube.com/watch?v=ujQjqqBka-8" target="_blank" >youtube.com</a><br/>
<strong>Cline - AI Coding Agent</strong>: <a href="https://cline.bot?featured_on=talkpython" target="_blank" >cline.bot</a><br/>
<strong>OpenRouter - The Unified Interface For LLMs</strong>: <a href="https://openrouter.ai?featured_on=talkpython" target="_blank" >openrouter.ai</a><br/>
<strong>Leafcloud</strong>: <a href="https://leaf.cloud?featured_on=talkpython" target="_blank" >leaf.cloud</a><br/>
<strong>OpenAI looks for its "Google Chrome" moment with new Atlas web browser</strong>: <a href="https://arstechnica.com/ai/2025/10/openais-new-atlas-web-browser-wants-to-let-you-chat-with-a-page/?featured_on=talkpython" target="_blank" >arstechnica.com</a><br/>
<br/>
<strong>Watch this episode on YouTube</strong>: <a href="https://www.youtube.com/watch?v=t-ReN9jS9sQ" target="_blank" >youtube.com</a><br/>
<strong>Episode #528 deep-dive</strong>: <a href="https://talkpython.fm/episodes/show/528/python-apps-with-llm-building-blocks#takeaways-anchor" target="_blank" >talkpython.fm/528</a><br/>
<strong>Episode transcripts</strong>: <a href="https://talkpython.fm/episodes/transcript/528/python-apps-with-llm-building-blocks" target="_blank" >talkpython.fm</a><br/>
<br/>
<strong>Theme Song: Developer Rap</strong><br/>
<strong>🥁 Served in a Flask 🎸</strong>: <a href="https://talkpython.fm/flasksong" target="_blank" >talkpython.fm/flasksong</a><br/>
<br/>
<strong>---== Don't be a stranger ==---</strong><br/>
<strong>YouTube</strong>: <a href="https://talkpython.fm/youtube" target="_blank" ><i class="fa-brands fa-youtube"></i> youtube.com/@talkpython</a><br/>
<br/>
<strong>Bluesky</strong>: <a href="https://bsky.app/profile/talkpython.fm" target="_blank" >@talkpython.fm</a><br/>
<strong>Mastodon</strong>: <a href="https://fosstodon.org/web/@talkpython" target="_blank" ><i class="fa-brands fa-mastodon"></i> @talkpython@fosstodon.org</a><br/>
<strong>X.com</strong>: <a href="https://x.com/talkpython" target="_blank" ><i class="fa-brands fa-twitter"></i> @talkpython</a><br/>
<br/>
<strong>Michael on Bluesky</strong>: <a href="https://bsky.app/profile/mkennedy.codes?featured_on=talkpython" target="_blank" >@mkennedy.codes</a><br/>
<strong>Michael on Mastodon</strong>: <a href="https://fosstodon.org/web/@mkennedy" target="_blank" ><i class="fa-brands fa-mastodon"></i> @mkennedy@fosstodon.org</a><br/>
<strong>Michael on X.com</strong>: <a href="https://x.com/mkennedy?featured_on=talkpython" target="_blank" ><i class="fa-brands fa-twitter"></i> @mkennedy</a><br/></div>