AI Engineering Podcast

Tobias Macey

Details

This show is your guidebook to building scalable and maintainable AI systems. You will learn how to architect AI applications, how to apply AI to your work, and what to consider when building or customizing new models. It covers everything you need to know to deliver real impact and value with machine learning and artificial intelligence.

Recent Episodes

Applying AI To The Construction Industry At Buildots
JUN 14, 2025
Summary<br />In this episode of the AI Engineering Podcast Ori Silberberg, VP of Engineering at Buildots, talks about transforming the construction industry with AI. Ori shares how Buildots uses computer vision and AI to optimize construction projects by providing real-time feedback, reducing delays, and improving efficiency. Learn about the complexities of digitizing the construction industry, the technical architecture of Buildots, and how its AI-driven solutions create a digital twin of construction sites. Ori emphasizes the importance of explainability and actionable insights in AI decision-making, highlighting the potential of generative AI to further enhance the construction process from planning to execution.<br /><br /><br />Announcements<br /><ul><li>Hello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systems</li><li>Your host is Tobias Macey and today I'm interviewing Ori Silberberg about applications of AI for optimizing building construction</li></ul>Interview<br /><ul><li>Introduction</li><li>How did you get involved in machine learning?</li><li>Can you describe what Buildots is and the story behind it?</li><li>What types of construction projects are you focused on? (e.g. residential, commercial, industrial, etc.)</li><li>What are the main types of inefficiencies that typically occur on those types of job sites?<ul><li>What are the manual and technical processes that the industry has typically relied on to address those sources of waste and delay?</li></ul></li><li>In many ways the construction industry is as old as civilization. 
What are the main ways that the information age has transformed construction?<ul><li>What are the elements of the construction industry that make it resistant to digital transformation?</li></ul></li><li>Can you describe how you are applying AI to this complex and messy problem?</li><li>What are the types of data that you are able to collect?<ul><li>How are you automating that data collection so that construction crews don't have to add extra work or distractions to their day?</li></ul></li><li>For construction crews that are using Buildots, can you talk through how it integrates into the overall process from site planning to project completion?</li><li>Can you describe the technical architecture of the Buildots platform?</li><li>Given the safety critical nature of construction, how does that influence the way that you think about the types of AI models that you use and where to apply them?</li><li>What are the most interesting, innovative, or unexpected ways that you have seen Buildots used?</li><li>What are the most interesting, unexpected, or challenging lessons that you have learned while working on Buildots?</li><li>What do you have planned for the future of AI usage at Buildots?</li></ul>Contact Info<br /><ul><li><a href="https://www.linkedin.com/in/ori-silberberg/" target="_blank">LinkedIn</a></li></ul>Parting Question<br /><ul><li>From your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?</li></ul>Closing Announcements<br /><ul><li>Thank you for listening! Don't forget to check out our other shows. The <a href="https://www.dataengineeringpodcast.com" target="_blank">Data Engineering Podcast</a> covers the latest on modern data management. 
<a href="https://www.pythonpodcast.com" target="_blank">Podcast.__init__</a> covers the Python language, its community, and the innovative ways it is being used.</li><li>Visit the <a href="https://www.aiengineeringpodcast.com" target="_blank">site</a> to subscribe to the show, sign up for the mailing list, and read the show notes.</li><li>If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.</li><li>To help other people find the show please leave a review on <a href="https://podcasts.apple.com/us/podcast/the-machine-learning-podcast/id1626358243" target="_blank">iTunes</a> and tell your friends and co-workers.</li></ul>Links<br /><ul><li><a href="https://buildots.com/" target="_blank">Buildots</a></li><li><a href="https://en.wikipedia.org/wiki/Computer-aided_design" target="_blank">CAD == Computer Aided Design</a></li><li><a href="https://en.wikipedia.org/wiki/Computer_vision" target="_blank">Computer Vision</a></li><li><a href="https://en.wikipedia.org/wiki/Lidar" target="_blank">LIDAR</a></li><li><a href="https://en.wikipedia.org/wiki/General_contractor" target="_blank">GC == General Contractor</a></li><li><a href="https://kubernetes.io/" target="_blank">Kubernetes</a></li></ul>The intro and outro music is from <a href="https://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/Tales_Of_A_Dead_Fish/Hitmans_Lovesong/" target="_blank">Hitman's Lovesong feat. Paola Graziano</a> by <a href="http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/" target="_blank">The Freak Fandango Orchestra</a>/<a href="https://creativecommons.org/licenses/by-sa/3.0/" target="_blank">CC BY-SA 3.0</a>
49 MIN
The Future of AI Systems: Open Models and Infrastructure Challenges
JUN 1, 2025
Summary<br />In this episode of the AI Engineering Podcast Jamie de Guerre, founding SVP of product at Together AI, explores the role of open models in the AI economy. As a veteran of the AI industry, including his time leading product marketing for AI and machine learning at Apple, Jamie shares insights on the challenges and opportunities of operating open models at speed and scale. He delves into the importance of open source in AI, the evolution of the open model ecosystem, and how Together AI's AI acceleration cloud is contributing to this movement with a focus on performance and efficiency.<br /><br />Announcements<br /><ul><li>Hello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systems</li><li>Your host is Tobias Macey and today I'm interviewing Jamie de Guerre about the role of open models in the AI economy and how to operate them at speed and at scale</li></ul>Interview<br /><ul><li>Introduction</li><li>How did you get involved in machine learning?</li><li>Can you describe what Together AI is and the story behind it?<ul><li>What are the key goals of the company?</li></ul></li><li>The initial rounds of open models were largely driven by massive tech companies. How would you characterize the current state of the ecosystem that is driving the creation and evolution of open models?</li><li>There was also a lot of argument about what "open source" and "open" mean in the context of ML/AI models, and the different variations of licenses being attached to them (e.g. the Meta license for Llama models). 
What is the current state of the language used and understanding of the restrictions/freedoms afforded?</li><li>What are the phases of organizational/technical evolution from initial use of open models through fine-tuning, to custom model development?</li><li>Can you outline the technical challenges companies face when trying to train or run inference on large open models themselves?<ul><li>What factors should a company consider when deciding whether to fine-tune an existing open model versus attempting to train a specialized one from scratch?</li></ul></li><li>While Transformers dominate the LLM landscape, there's ongoing research into alternative architectures. Are you seeing significant interest or adoption of non-Transformer architectures for specific use cases?&nbsp;<ul><li>When might those other architectures be a better choice?</li></ul></li><li>While open models offer tremendous advantages like transparency, control, and cost-effectiveness, are there scenarios where relying solely on them might be disadvantageous?<ul><li>When might proprietary models or a hybrid approach still be the better choice for a specific problem?</li></ul></li><li>Building and scaling AI infrastructure is notoriously complex. What are the most significant technical or strategic challenges you've encountered at Together AI while enabling scalable access to open models for your users?</li><li>What are the most interesting, innovative, or unexpected ways that you have seen open models/the TogetherAI platform used?</li><li>What are the most interesting, unexpected, or challenging lessons that you have learned while working on powering AI model training and inference?</li><li>Where do you see the open model space heading in the next 1-2 years? 
Any specific trends or breakthroughs you anticipate?</li></ul>Contact Info<br /><ul><li><a href="https://www.linkedin.com/in/jamiedeguerre/" target="_blank">LinkedIn</a></li></ul>Parting Question<br /><ul><li>From your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?</li></ul>Closing Announcements<br /><ul><li>Thank you for listening! Don't forget to check out our other shows. The <a href="https://www.dataengineeringpodcast.com" target="_blank">Data Engineering Podcast</a> covers the latest on modern data management. <a href="https://www.pythonpodcast.com" target="_blank">Podcast.__init__</a> covers the Python language, its community, and the innovative ways it is being used.</li><li>Visit the <a href="https://www.aiengineeringpodcast.com" target="_blank">site</a> to subscribe to the show, sign up for the mailing list, and read the show notes.</li><li>If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.</li><li>To help other people find the show please leave a review on <a href="https://podcasts.apple.com/us/podcast/the-machine-learning-podcast/id1626358243" target="_blank">iTunes</a> and tell your friends and co-workers.</li></ul>Links<br /><ul><li><a href="https://www.together.ai/" target="_blank">Together AI</a></li><li><a href="https://www.datacamp.com/tutorial/fine-tuning-large-language-models" target="_blank">Fine Tuning</a></li><li><a href="https://towardsdatascience.com/how-llms-work-pre-training-to-post-training-neural-networks-hallucinations-and-inference/" target="_blank">Post-Training</a></li><li><a href="https://www.salesforceairesearch.com/" target="_blank">Salesforce Research</a></li><li><a href="https://mistral.ai/" target="_blank">Mistral</a></li><li><a href="https://www.salesforce.com/agentforce/" target="_blank">Agentforce</a></li><li><a href="https://www.llama.com/" target="_blank">Llama Models</a></li><li><a 
href="https://en.wikipedia.org/wiki/Reinforcement_learning_from_human_feedback" target="_blank">RLHF == Reinforcement Learning from Human Feedback</a></li><li><a href="https://www.theainavigator.com/blog/what-is-reinforcement-learning-with-verifiable-rewards-rlvr" target="_blank">RLVR == Reinforcement Learning from Verifiable Rewards</a></li><li><a href="https://huggingface.co/blog/Kseniase/testtimecompute" target="_blank">Test Time Compute</a></li><li><a href="https://huggingface.co/" target="_blank">HuggingFace</a></li><li><a href="https://en.wikipedia.org/wiki/Retrieval-augmented_generation" target="_blank">RAG == Retrieval Augmented Generation</a><ul><li><a href="https://www.aiengineeringpodcast.com/retrieval-augmented-generation-implementation-episode-34" target="_blank">Podcast Episode</a></li></ul></li><li><a href="https://deepmind.google/models/gemma/" target="_blank">Google Gemma</a></li><li><a href="https://www.llama.com/models/llama-4/" target="_blank">Llama 4 Maverick</a></li><li><a href="https://en.wikipedia.org/wiki/Prompt_engineering" target="_blank">Prompt Engineering</a></li><li><a href="https://github.com/vllm-project/vllm" target="_blank">vLLM</a></li><li><a href="https://docs.sglang.ai/" target="_blank">SGLang</a></li><li><a href="https://hazyresearch.stanford.edu/" target="_blank">Hazy Research</a> lab</li><li><a href="https://huggingface.co/blog/lbourdois/get-on-the-ssm-train" target="_blank">State Space Models</a></li><li><a href="https://hazyresearch.stanford.edu/blog/2023-03-07-hyena" target="_blank">Hyena Model</a></li><li><a href="https://en.wikipedia.org/wiki/Mamba_(deep_learning_architecture)" target="_blank">Mamba Architecture</a></li><li><a href="https://en.wikipedia.org/wiki/Diffusion_model" target="_blank">Diffusion Model Architecture</a></li><li><a href="https://en.wikipedia.org/wiki/Stable_Diffusion" target="_blank">Stable Diffusion</a></li><li><a href="https://bfl.ai/models/flux-kontext" target="_blank">Black Forest Labs Flux 
Model</a></li><li><a href="https://bfl.ai/models/flux-kontext" target="_blank">Nvidia Blackwell</a></li><li><a href="https://pytorch.org/" target="_blank">PyTorch</a></li><li><a href="https://www.rust-lang.org/" target="_blank">Rust</a></li><li><a href="https://huggingface.co/deepseek-ai/DeepSeek-R1" target="_blank">Deepseek R1</a></li><li><a href="https://huggingface.co/docs/hub/en/gguf" target="_blank">GGUF</a></li><li><a href="https://pikalabsai.org/" target="_blank">Pika Text To Video</a></li></ul>The intro and outro music is from <a href="https://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/Tales_Of_A_Dead_Fish/Hitmans_Lovesong/" target="_blank">Hitman's Lovesong feat. Paola Graziano</a> by <a href="http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/" target="_blank">The Freak Fandango Orchestra</a>/<a href="https://creativecommons.org/licenses/by-sa/3.0/" target="_blank">CC BY-SA 3.0</a>
51 MIN
The Rise of Agentic AI: Transforming Business Operations
MAY 21, 2025
Summary<br />In this episode of the AI Engineering Podcast, host Tobias Macey sits down with Ben Wilde, Head of Innovation at Georgian, to explore the transformative impact of agentic AI on business operations and the SaaS industry. From his early days working with vintage AI systems to his current focus on product strategy and innovation in AI, Ben shares his expertise on what he calls the "continuum" of agentic AI - from simple function calls to complex autonomous systems. Join them as they discuss the challenges and opportunities of integrating agentic AI into business systems, including organizational alignment, technical competence, and the need for standardization. They also dive into emerging protocols and the evolving landscape of AI-driven products and services, including usage-based pricing models and advancements in AI infrastructure and reliability.<br /><br />Announcements<br /><ul><li>Hello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systems</li><li>Your host is Tobias Macey and today I'm interviewing Ben Wilde about the impact of agentic AI on business operations and SaaS as we know it</li></ul>Interview<br /><ul><li>Introduction</li><li>How did you get involved in machine learning?</li><li>Can you start by sharing your definition of what constitutes "agentic AI"?</li><li>There have been several generations of automation for business and product use cases. In your estimation, what are the substantive differences between agentic AI and e.g. 
RPA (Robotic Process Automation)?<ul><li>How do the inherent risks and operational overhead impact the calculus of whether and where to apply agentic capabilities?</li></ul></li><li>For teams that are aiming for agentic capabilities, what are the stepping stones along that path?</li><li>Beyond the technical capacity, there are numerous elements of organizational alignment that are required to make full use of the capabilities of agentic processes. What are some of the strategic investments that are necessary to get the whole business pointed in the same direction for adopting and benefitting from AI agents?</li><li>The most recent splash in the space of agentic AI is the introduction of the Model Context Protocol, and various responses to it. What do you see as the near and medium term impact of this effort on the ecosystem of AI agents and their architecture?</li><li>Software products have gone through several major evolutions since the days of CD-ROMs in the 90s. The current era has largely been oriented around the model of subscription-based software delivered via browser or mobile-based UIs over the internet. How does the pending age of AI agents upend that model?</li><li>What are the most interesting, innovative, or unexpected ways that you have seen agentic AI used for business and product capabilities?</li><li>What are the most interesting, unexpected, or challenging lessons that you have learned while working with businesses adopting agentic AI capabilities?</li><li>When is agentic AI the wrong choice?</li><li>What are the ongoing developments in agentic capabilities that you are monitoring?</li></ul>Contact Info<br /><ul><li>Email</li><li><a href="https://www.linkedin.com/in/benrwilde/?originalSubdomain=ca" target="_blank">LinkedIn</a></li></ul>Parting Question<br /><ul><li>From your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?</li></ul>Closing Announcements<br /><ul><li>Thank you for listening! 
Don't forget to check out our other shows. The <a href="https://www.dataengineeringpodcast.com" target="_blank">Data Engineering Podcast</a> covers the latest on modern data management. <a href="https://www.pythonpodcast.com" target="_blank">Podcast.__init__</a> covers the Python language, its community, and the innovative ways it is being used.</li><li>Visit the <a href="https://www.aiengineeringpodcast.com" target="_blank">site</a> to subscribe to the show, sign up for the mailing list, and read the show notes.</li><li>If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.</li><li>To help other people find the show please leave a review on <a href="https://podcasts.apple.com/us/podcast/the-machine-learning-podcast/id1626358243" target="_blank">iTunes</a> and tell your friends and co-workers.</li></ul>Links<br /><ul><li><a href="https://georgian.io/" target="_blank">Georgian</a></li><li><a href="https://georgian.io/agentic-platforms-and-applications/" target="_blank">Agentic Platforms And Applications</a></li><li><a href="https://en.wikipedia.org/wiki/Differential_privacy" target="_blank">Differential Privacy</a></li><li><a href="https://en.wikipedia.org/wiki/Agentic_AI" target="_blank">Agentic AI</a></li><li><a href="https://en.wikipedia.org/wiki/Language_model" target="_blank">Language Model</a></li><li><a href="https://en.wikipedia.org/wiki/Reasoning_language_model" target="_blank">Reasoning Model</a></li><li><a href="https://en.wikipedia.org/wiki/Robotic_process_automation" target="_blank">Robotic Process Automation</a></li><li><a href="https://ofac.treasury.gov/" target="_blank">OFAC</a></li><li><a href="https://openai.com/index/introducing-deep-research/" target="_blank">OpenAI Deep Research</a></li><li><a href="https://modelcontextprotocol.io/introduction" target="_blank">Model Context Protocol</a></li><li><a href="https://georgian.io/agentic-ai-adoption-insights-from-600-executives/" 
target="_blank">Georgian AI Adoption Survey</a></li><li><a href="https://developers.googleblog.com/en/a2a-a-new-era-of-agent-interoperability/" target="_blank">Google Agent to Agent Protocol</a></li><li><a href="https://graphql.org/" target="_blank">GraphQL</a></li><li><a href="https://en.wikipedia.org/wiki/Tensor_Processing_Unit" target="_blank">TPU == Tensor Processing Unit</a></li><li><a href="https://en.wikipedia.org/wiki/Chris_Lattner" target="_blank">Chris Lattner</a></li><li><a href="https://en.wikipedia.org/wiki/CUDA" target="_blank">CUDA</a></li><li><a href="https://en.wikipedia.org/wiki/Neuro-symbolic_AI" target="_blank">NeuroSymbolic AI</a></li><li><a href="https://en.wikipedia.org/wiki/Prolog" target="_blank">Prolog</a></li></ul>The intro and outro music is from <a href="https://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/Tales_Of_A_Dead_Fish/Hitmans_Lovesong/" target="_blank">Hitman's Lovesong feat. Paola Graziano</a> by <a href="http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/" target="_blank">The Freak Fandango Orchestra</a>/<a href="https://creativecommons.org/licenses/by-sa/3.0/" target="_blank">CC BY-SA 3.0</a>
61 MIN
Protecting AI Systems: Understanding Vulnerabilities and Attack Surfaces
MAY 3, 2025
Summary<br />In this episode of the AI Engineering Podcast Kasimir Schulz, Director of Security Research at HiddenLayer, talks about the complexities and security challenges in AI and machine learning models. Kasimir explains the concept of shadow genes and shadow logic, which involve identifying common subgraphs within neural networks to understand model ancestry and potential vulnerabilities, and emphasizes the importance of understanding the attack surface in AI integrations, scanning models for security threats, and evolving awareness in AI security practices to mitigate risks in deploying AI systems.<br /><br /><br />Announcements<br /><ul><li>Hello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systems</li><li>Your host is Tobias Macey and today I'm interviewing Kasimir Schulz about the relationships between the various models on the market and how that information helps with selecting and protecting models for your applications</li></ul>Interview<br /><ul><li>Introduction</li><li>How did you get involved in machine learning?</li><li>Can you start by outlining the current state of the threat landscape for ML and AI systems?</li><li>What are the main areas of overlap in risk profiles between prediction/classification and generative models? (primarily from an attack surface/methodology perspective)<ul><li>What are the significant points of divergence?</li></ul></li><li>What are some of the categories of potential damages that can be created through the deployment of compromised models?</li><li>How does the landscape of foundation models introduce new challenges around supply chain security for organizations building with AI?</li><li>You recently published your findings on the potential to inject subgraphs into model architectures that are invisible during normal operation of the model. Along with that you wrote about the subgraphs that are shared between different classes of models. 
What are the key learnings that you would like to highlight from that research?<ul><li>What action items can organizations and engineering teams take in light of that information?</li></ul></li><li>Platforms like HuggingFace offer numerous variations of popular models with variations around quantization, various levels of finetuning, model distillation, etc. That is obviously a benefit to knowledge sharing and ease of access, but how does that exacerbate the potential threat in the face of backdoored models?</li><li>Beyond explicit backdoors in model architectures, there are numerous attack vectors to generative models in the form of prompt injection, "jailbreaking" of system prompts, etc. How does the knowledge of model ancestry help with identifying and mitigating risks from that class of threat?<ul><li>A common response to that threat is the introduction of model guardrails with pre- and post-filtering of prompts and responses. How can that approach help to address the potential threat of backdoored models as well?</li></ul></li><li>For a malicious actor that develops one of these attacks, what is the vector for introducing the compromised model into an organization?<ul><li>Once that model is in use, what are the possible means by which the malicious actor can detect its presence for purposes of exploitation?</li></ul></li><li>What are the most interesting, innovative, or unexpected ways that you have seen the information about model ancestry used?</li><li>What are the most interesting, unexpected, or challenging lessons that you have learned while working on ShadowLogic/ShadowGenes?</li><li>What are some of the other means by which the operation of ML and AI systems introduce attack vectors to organizations running them?</li></ul>Contact Info<br /><ul><li><a href="https://www.linkedin.com/in/kasimir-schulz/" target="_blank">LinkedIn</a></li></ul>Parting Question<br /><ul><li>From your perspective, what are the biggest gaps in tooling, technology, or training 
for AI systems today?</li></ul>Closing Announcements<br /><ul><li>Thank you for listening! Don't forget to check out our other shows. The <a href="https://www.dataengineeringpodcast.com" target="_blank">Data Engineering Podcast</a> covers the latest on modern data management. <a href="https://www.pythonpodcast.com" target="_blank">Podcast.__init__</a> covers the Python language, its community, and the innovative ways it is being used.</li><li>Visit the <a href="https://www.aiengineeringpodcast.com" target="_blank">site</a> to subscribe to the show, sign up for the mailing list, and read the show notes.</li><li>If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.</li><li>To help other people find the show please leave a review on <a href="https://podcasts.apple.com/us/podcast/the-machine-learning-podcast/id1626358243" target="_blank">iTunes</a> and tell your friends and co-workers.</li></ul>Links<br /><ul><li><a href="https://hiddenlayer.com/" target="_blank">HiddenLayer</a></li><li><a href="https://en.wikipedia.org/wiki/Zero-day_vulnerability" target="_blank">Zero-Day Vulnerability</a></li><li><a href="https://hiddenlayer.com/innovation-hub/mcp-model-context-pitfalls-in-an-agentic-world/" target="_blank">MCP Blog Post</a></li><li><a href="https://docs.python.org/3/library/pickle.html" target="_blank">Python Pickle Object Serialization</a></li><li><a href="https://huggingface.co/docs/safetensors/en/index" target="_blank">SafeTensors</a></li><li><a href="https://en.wikipedia.org/wiki/DeepSeek" target="_blank">Deepseek</a></li><li><a href="https://huggingface.co/docs/transformers/en/index" target="_blank">Huggingface Transformers</a></li><li><a href="https://arxiv.org/pdf/2406.11880" target="_blank">KROP == Knowledge Return Oriented Prompting</a></li><li><a href="https://xkcd.com/327" target="_blank">XKCD "Little Bobby Tables"</a></li><li><a 
href="https://owasp.org/www-project-top-10-for-large-language-model-applications/" target="_blank">OWASP Top 10 For LLMs</a></li><li><a href="https://www.cve.org/Media/News/item/news/2024/10/15/New-CVE-Artificial-Intelligence-Working-Group" target="_blank">CVE AI Systems Working Group</a></li><li><a href="https://kaushiksp.medium.com/refusal-vector-ablation-in-llms-35aa646ff4a9" target="_blank">Refusal Vector Ablation</a></li><li><a href="https://en.wikipedia.org/wiki/Foundation_model" target="_blank">Foundation Model</a></li><li><a href="https://hiddenlayer.com/innovation-hub/shadowlogic/" target="_blank">ShadowLogic</a></li><li><a href="https://hiddenlayer.com/innovation-hub/shadowgenes-uncovering-model-genealogy/" target="_blank">ShadowGenes</a></li><li><a href="https://en.wikipedia.org/wiki/Bytecode" target="_blank">Bytecode</a></li><li><a href="https://en.wikipedia.org/wiki/Residual_neural_network" target="_blank">ResNet == Residual Neural Network</a></li><li><a href="https://en.wikipedia.org/wiki/You_Only_Look_Once" target="_blank">YOLO == You Only Look Once</a></li><li><a href="https://netron.app/" target="_blank">Netron</a></li><li><a href="https://en.wikipedia.org/wiki/BERT_(language_model)" target="_blank">BERT</a></li><li><a href="https://huggingface.co/docs/transformers/en/model_doc/roberta" target="_blank">RoBERTa</a></li><li><a href="https://www.shodan.io/" target="_blank">Shodan</a></li><li><a href="https://en.wikipedia.org/wiki/Capture_the_flag_(cybersecurity)" target="_blank">CTF == Capture The Flag</a></li><li><a href="https://aws.amazon.com/blogs/aws/amazon-titan-image-generator-v2-is-now-available-in-amazon-bedrock/" target="_blank">Titan Bedrock Image Generator</a></li></ul>The intro and outro music is from <a href="https://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/Tales_Of_A_Dead_Fish/Hitmans_Lovesong/" target="_blank">Hitman's Lovesong feat. 
Paola Graziano</a> by <a href="http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/" target="_blank">The Freak Fandango Orchestra</a>/<a href="https://creativecommons.org/licenses/by-sa/3.0/" target="_blank">CC BY-SA 3.0</a>
51 MIN
Understanding The Operational And Organizational Challenges Of Agentic AI
APR 21, 2025
Summary<br />In this episode of the AI Engineering Podcast Julian LaNeve, CTO of Astronomer, talks about transitioning from simple LLM applications to more complex agentic AI systems. Julian shares insights into the challenges and considerations of this evolution, emphasizing the importance of starting with simpler applications to build operational knowledge and intuition. He discusses the parallels between microservices and agentic AI, highlighting the need for careful orchestration and observability to manage complexity and ensure reliability, and explores the technical requirements for deploying AI systems, including data infrastructure, orchestration tools like Apache Airflow, and understanding the probabilistic nature of AI models.<br /><br /><br />Announcements<br /><ul><li>Hello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systems</li><li>Seamless data integration into AI applications often falls short, leading many to adopt RAG methods, which come with high costs, complexity, and limited scalability. Cognee offers a better solution with its open-source semantic memory engine that automates data ingestion and storage, creating dynamic knowledge graphs from your data. Cognee enables AI agents to understand the meaning of your data, resulting in accurate responses at a lower cost. Take full control of your data in LLM apps without unnecessary overhead. Visit <a href="https://www.aiengineeringpodcast.com/cognee" target="_blank">aiengineeringpodcast.com/cognee</a> to learn more and elevate your AI apps and agents.</li><li>Your host is Tobias Macey and today I'm interviewing Julian LaNeve about how to avoid putting the cart before the horse with AI applications. 
When do you move from "simple" LLM apps to agentic AI and what's the path to get there?</li></ul>Interview<br /><ul><li>Introduction</li><li>How did you get involved in machine learning?</li><li>How do you technically distinguish "agentic AI" (e.g., involving planning, tool use, memory) from "simpler LLM workflows" (e.g., stateless transformations, RAG)? What are the key differences in operational complexity and potential failure modes?</li><li>What specific technical challenges (e.g., state management, observability, non-determinism, prompt fragility, cost explosion) are often underestimated when teams jump directly into building stateful, autonomous agents?</li><li>What are the pre-requisites from a data and infrastructure perspective before going to production with agentic applications?<ul><li>How does that differ from the chat-based systems that companies might be experimenting with?</li></ul></li><li>Technically, where do you most often see ambitious agent projects break down during development or early deployment?</li><li>Beyond generic data quality, what specific data engineering practices become critical when building reliable LLM applications? (e.g., Designing data pipelines for efficient RAG chunking/embedding, versioning prompts alongside data, caching strategies for LLM calls, managing vector database ETL).</li><li>From an implementation complexity standpoint, what characterizes tasks well-suited for initial LLM workflow adoption versus those genuinely requiring agentic capabilities?<ul><li>Can you share examples (anonymized if necessary) highlighting how organizations successfully engineered these simpler LLM workflows? What specific technical designs, tooling choices, or MLOps practices were key to their reliability and scalability?</li></ul></li><li>What are some hard-won technical or operational lessons from deploying and scaling LLM workflows in production environments? 
Any surprising performance bottlenecks, cost issues, or monitoring challenges engineers should anticipate?</li><li>What technical maturity signals (e.g., robust CI/CD for ML, established monitoring/alerting for pipelines, automated evaluation frameworks, cost tracking mechanisms) suggest an engineering team might be ready to tackle the challenges of building and operating agentic systems?</li><li>How do the technical stack and engineering process need to evolve when moving from orchestrated LLM workflows towards more complex agents involving memory, planning, and dynamic tool use? What new components and failure modes must be engineered for?</li><li>How do you foresee orchestration platforms evolving to better serve the needs of AI engineers building LLM apps?</li><li>What are the most interesting, innovative, or unexpected ways that you have seen organizations build toward advanced AI use cases?</li><li>What are the most interesting, unexpected, or challenging lessons that you have learned while working on supporting AI services?</li><li>When is AI the wrong choice?</li><li>What is the single most critical piece of engineering advice you would give to fellow AI engineers who are tasked with integrating LLMs into production systems right now?</li></ul>Contact Info<br /><ul><li><a href="https://www.linkedin.com/in/julianlaneve" target="_blank">LinkedIn</a></li><li><a href="https://github.com/jlaneve" target="_blank">GitHub</a></li></ul>Parting Question<br /><ul><li>From your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?</li></ul>Links<br /><ul><li><a href="https://www.astronomer.io/" target="_blank">Astronomer</a></li><li><a href="https://airflow.apache.org/" target="_blank">Airflow</a></li><li><a href="https://www.anthropic.com/" target="_blank">Anthropic</a></li><li><a href="https://www.anthropic.com/engineering/building-effective-agents" target="_blank">Building Effective Agents</a> post from 
Anthropic</li><li><a href="https://www.astronomer.io/airflow/3-0/" target="_blank">Airflow 3.0</a></li><li><a href="https://microservices.io/" target="_blank">Microservices</a></li><li><a href="https://github.com/pydantic/pydantic-ai" target="_blank">Pydantic AI</a></li><li><a href="https://www.langchain.com/" target="_blank">LangChain</a></li><li><a href="https://www.llamaindex.ai/" target="_blank">LlamaIndex</a></li><li><a href="https://leehanchung.github.io/blogs/2024/08/11/llm-as-a-judge/" target="_blank">LLM As A Judge</a></li><li><a href="https://www.swebench.com/" target="_blank">SWE-bench</a></li><li><a href="https://www.cursor.com/" target="_blank">Cursor</a></li><li><a href="https://windsurf.com/editor" target="_blank">Windsurf</a></li><li><a href="https://opentelemetry.io/" target="_blank">OpenTelemetry</a></li><li><a href="https://en.wikipedia.org/wiki/Directed_acyclic_graph" target="_blank">DAG == Directed Acyclic Graph</a></li><li><a href="https://en.wikipedia.org/wiki/Halting_problem" target="_blank">Halting Problem</a></li><li><a href="https://arxiv.org/html/2410.15665v1" target="_blank">AI Long Term Memory</a></li></ul>The intro and outro music is from <a href="https://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/Tales_Of_A_Dead_Fish/Hitmans_Lovesong/" target="_blank">Hitman's Lovesong feat. Paola Graziano</a> by <a href="http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/" target="_blank">The Freak Fandango Orchestra</a>/<a href="https://creativecommons.org/licenses/by-sa/3.0/" target="_blank">CC BY-SA 3.0</a>
72 MIN