AI Engineering Podcast


Tobias Macey


Details

This show is your guidebook to building scalable and maintainable AI systems. You will learn how to architect AI applications, how to apply AI to your work, and what considerations are involved in building or customizing new models. Everything that you need to know to deliver real impact and value with machine learning and artificial intelligence.

Recent Episodes

The Power of Community in AI Development with Oumi
MAR 16, 2025
Summary<br />In this episode of the AI Engineering Podcast, Emmanouil (Manos) Koukoumidis, CEO of Oumi, talks about his vision for an open platform for building, evaluating, and deploying AI foundation models. Manos shares his journey from working on natural language AI services at Google Cloud to founding Oumi with a mission to advance open-source AI, emphasizing the importance of community collaboration and accessibility. He discusses the need for open-source models that are not constrained by proprietary APIs, highlights the role of Oumi in facilitating open collaboration, and touches on the complexities of model development, open data, and community-driven advancements in AI. He also explains how Oumi can be used throughout the entire lifecycle of AI model development, post-training, and deployment.<br /><br /><br />Announcements<br /><ul><li>Hello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systems</li><li>Your host is Tobias Macey and today I'm interviewing Manos Koukoumidis about Oumi, an all-in-one production-ready open platform to build, evaluate, and deploy AI models</li></ul>Interview<br /><ul><li>Introduction</li><li>How did you get involved in machine learning?</li><li>Can you describe what Oumi is and the story behind it?</li><li>There are numerous projects, both full suites and point solutions, focused on every aspect of "AI" development. What is the unique value that Oumi provides in this ecosystem?</li><li>You have stated the desire for Oumi to become the Linux of AI development. That is an ambitious goal and one that Linux itself didn't start with. What do you see as the biggest challenges that need addressing to reach a critical mass of adoption?</li><li>In the vein of "open source" AI, the most notable project that I'm aware of that fits the proper definition is the OLMO models from AI2. 
What lessons have you learned from their efforts that influence the ways that you think about your work on Oumi?</li><li>On the community building front, HuggingFace has been the main player. What do you see as the benefits and shortcomings of that platform in the context of your vision for open and collaborative AI?</li><li>Can you describe the overall design and architecture of Oumi?<ul><li>How did you approach the selection process for the different components that you are building on top of?</li><li>What are the extension points that you have incorporated to allow for customization/evolution?</li></ul></li><li>Some of the biggest barriers to entry for building foundation models are the cost and availability of hardware used for training, and the ability to collect and curate the data needed. How does Oumi help with addressing those challenges?</li><li>For someone who wants to build or contribute to an open source model, what does that process look like?<ul><li>How do you envision the community building/collaboration process?</li></ul></li><li>Your overall goal is to build a foundation for the growth and well-being of truly open AI. How are you thinking about the sustainability of the project and the funding needed to grow and support the community?</li><li>What are the most interesting, innovative, or unexpected ways that you have seen Oumi used?</li><li>What are the most interesting, unexpected, or challenging lessons that you have learned while working on Oumi?</li><li>When is Oumi the wrong choice?</li><li>What do you have planned for the future of Oumi?</li></ul>Contact Info<br /><ul><li><a href="https://www.linkedin.com/in/koukoumidis/" target="_blank">LinkedIn</a></li></ul>Parting Question<br /><ul><li>From your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?</li></ul>Closing Announcements<br /><ul><li>Thank you for listening! Don't forget to check out our other shows. 
The <a href="https://www.dataengineeringpodcast.com" target="_blank">Data Engineering Podcast</a> covers the latest on modern data management. <a href="https://www.pythonpodcast.com" target="_blank">Podcast.__init__</a> covers the Python language, its community, and the innovative ways it is being used.</li><li>Visit the <a href="https://www.aiengineeringpodcast.com" target="_blank">site</a> to subscribe to the show, sign up for the mailing list, and read the show notes.</li><li>If you've learned something or tried out a project from the show then tell us about it! Email hosts@aiengineeringpodcast.com with your story.</li><li>To help other people find the show please leave a review on <a href="https://podcasts.apple.com/us/podcast/the-machine-learning-podcast/id1626358243" target="_blank">iTunes</a> and tell your friends and co-workers.</li></ul>Links<br /><ul><li><a href="https://oumi.ai/" target="_blank">Oumi</a></li><li><a href="https://cloud.google.com/vertex-ai/generative-ai/docs/deprecations/palm" target="_blank">Cloud PaLM</a></li><li><a href="https://deepmind.google/technologies/gemini/" target="_blank">Google Gemini</a></li><li><a href="https://deepmind.google/" target="_blank">DeepMind</a></li><li><a href="https://en.wikipedia.org/wiki/Long_short-term_memory" target="_blank">LSTM == Long Short-Term Memory</a></li><li><a href="https://en.wikipedia.org/wiki/Transformer_(deep_learning_architecture)" target="_blank">Transformers</a></li><li><a href="https://openai.com/index/chatgpt/" target="_blank">ChatGPT</a></li><li><a href="https://en.wikipedia.org/wiki/Partial_differential_equation" target="_blank">Partial Differential Equation</a></li><li><a href="https://allenai.org/olmo" target="_blank">OLMO</a></li><li><a href="https://opensource.org/ai" target="_blank">OSI AI definition</a></li><li><a href="https://mlflow.org/" target="_blank">MLflow</a></li><li><a href="https://metaflow.org/" target="_blank">Metaflow</a></li><li><a 
href="https://docs.skypilot.co/en/latest/docs/index.html" target="_blank">SkyPilot</a></li><li><a href="https://www.llama.com/" target="_blank">Llama</a></li><li><a href="https://en.wikipedia.org/wiki/Retrieval-augmented_generation" target="_blank">RAG</a><ul><li><a href="https://www.aiengineeringpodcast.com/retrieval-augmented-generation-implementation-episode-34" target="_blank">Podcast Episode</a></li></ul></li><li><a href="https://en.wikipedia.org/wiki/Synthetic_data" target="_blank">Synthetic Data</a><ul><li><a href="https://www.aiengineeringpodcast.com/gretel-syntehtic-data-for-ai-episode-46" target="_blank">Podcast Episode</a></li></ul></li><li><a href="https://www.evidentlyai.com/llm-guide/llm-as-a-judge" target="_blank">LLM As Judge</a></li><li><a href="https://github.com/sgl-project/sglang" target="_blank">SGLang</a></li><li><a href="https://github.com/vllm-project/vllm" target="_blank">vLLM</a></li><li><a href="https://gorilla.cs.berkeley.edu/leaderboard.html" target="_blank">Function Calling Leaderboard</a></li><li><a href="https://en.wikipedia.org/wiki/DeepSeek" target="_blank">Deepseek</a></li></ul>The intro and outro music is from <a href="https://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/Tales_Of_A_Dead_Fish/Hitmans_Lovesong/" target="_blank">Hitman's Lovesong feat. Paola Graziano</a> by <a href="http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/" target="_blank">The Freak Fandango Orchestra</a>/<a href="https://creativecommons.org/licenses/by-sa/3.0/" target="_blank">CC BY-SA 3.0</a>
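The show notes above describe evaluation as one stage of the Oumi lifecycle and link to the LLM-as-judge pattern. As a generic, hedged illustration of that idea (this is not Oumi's actual API; every name here is invented for illustration), an evaluation loop scores each model answer with a judge function and averages the results:

```python
# Toy LLM-as-judge evaluation loop. In practice judge_fn would call a
# judge model; here a trivial stand-in gives full credit when the
# reference answer appears verbatim in the model output.

def evaluate(model_fn, judge_fn, eval_set):
    """Return the mean judge score over (question, reference) pairs."""
    scores = [judge_fn(q, ref, model_fn(q)) for q, ref in eval_set]
    return sum(scores) / len(scores)

def exact_judge(question, reference, answer):
    """Stand-in judge: 1.0 if the reference appears in the answer."""
    return 1.0 if reference.lower() in answer.lower() else 0.0
```

Swapping `exact_judge` for a call to a stronger model is what turns this sketch into the LLM-as-judge setup referenced in the links above.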
56 MIN
Arch Gateway: Add AI To Your Apps Without Custom Development
FEB 26, 2025
Summary<br />In this episode of the AI Engineering Podcast, Adil Hafeez talks about the Arch project, a gateway designed to simplify the integration of AI agents into business systems. He discusses how the gateway uses Rust and Envoy to provide a unified interface for handling prompts and integrating large language models (LLMs), allowing developers to focus on core business logic rather than AI complexities. The conversation also touches on the target audience, challenges, and future directions for the project, including plans to develop a leading planning LLM and enhance agent interoperability.<br /><br /><br />Announcements<br /><ul><li>Hello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systems</li><li>Your host is Tobias Macey and today I'm interviewing Adil Hafeez about the Arch project, a gateway for your AI agents</li></ul>Interview<br /><ul><li>Introduction</li><li>How did you get involved in machine learning?</li><li>Can you describe what Arch is and the story behind it?</li><li>How do you think about the target audience for Arch and the types of problems/projects that they are responsible for?</li><li>The general category of LLM gateways is largely oriented toward abstracting the specific model provider being called. What are the areas of overlap and differentiation in Arch?</li><li>Many of the features in Arch are also available in AI frameworks (e.g. LangChain, LlamaIndex, etc.), such as request routing, guardrails, and tool calling. 
How do you think about the architectural tradeoffs of having that functionality in a gateway service?</li><li>What is the workflow for someone building an application with Arch?</li><li>Can you describe the architecture and components of the Arch gateway?</li><li>With the pace of change in the AI/LLM ecosystem, how have you designed the Arch project to allow for rapid evolution and extensibility?</li><li>What are the most interesting, innovative, or unexpected ways that you have seen Arch used?</li><li>What are the most interesting, unexpected, or challenging lessons that you have learned while working on Arch?</li><li>When is Arch the wrong choice?</li><li>What do you have planned for the future of Arch?</li></ul>Contact Info<br /><ul><li><a href="https://www.linkedin.com/in/adilhafeez/" target="_blank">LinkedIn</a></li><li><a href="https://github.com/adilhafeez" target="_blank">GitHub</a></li></ul>Parting Question<br /><ul><li>From your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?</li></ul>Closing Announcements<br /><ul><li>Thank you for listening! Don't forget to check out our other shows. The <a href="https://www.dataengineeringpodcast.com" target="_blank">Data Engineering Podcast</a> covers the latest on modern data management. <a href="https://www.pythonpodcast.com" target="_blank">Podcast.__init__</a> covers the Python language, its community, and the innovative ways it is being used.</li><li>Visit the <a href="https://www.aiengineeringpodcast.com" target="_blank">site</a> to subscribe to the show, sign up for the mailing list, and read the show notes.</li><li>If you've learned something or tried out a project from the show then tell us about it! 
Email hosts@aiengineeringpodcast.com with your story.</li><li>To help other people find the show please leave a review on <a href="https://podcasts.apple.com/us/podcast/the-machine-learning-podcast/id1626358243" target="_blank">iTunes</a> and tell your friends and co-workers.</li></ul>Links<br /><ul><li><a href="https://archgw.com/" target="_blank">Arch Gateway</a></li><li><a href="https://en.wikipedia.org/wiki/Gradient_boosting" target="_blank">Gradient Boosting</a></li><li><a href="https://www.envoyproxy.io/" target="_blank">Envoy</a></li><li><a href="https://portkey.ai/blog/what-is-an-llm-gateway" target="_blank">LLM Gateway</a></li><li><a href="https://huggingface.co/" target="_blank">Huggingface</a></li><li><a href="https://huggingface.co/katanemo" target="_blank">Katanemo Models</a></li><li><a href="https://github.com/QwenLM/Qwen2.5" target="_blank">Qwen2.5</a></li><li><a href="https://doc.rust-lang.org/clippy/" target="_blank">Rust Clippy</a></li></ul>The intro and outro music is from <a href="https://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/Tales_Of_A_Dead_Fish/Hitmans_Lovesong/" target="_blank">Hitman's Lovesong feat. Paola Graziano</a> by <a href="http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/" target="_blank">The Freak Fandango Orchestra</a>/<a href="https://creativecommons.org/licenses/by-sa/3.0/" target="_blank">CC BY-SA 3.0</a>
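For readers new to the request-routing feature discussed in this episode, here is a toy Python sketch of what a gateway does in front of multiple models: inspect the prompt, choose a backend, and emit a normalized request. The routing rules and model names are invented for illustration and are not Arch's actual configuration:

```python
# Toy prompt router. A real gateway like Arch uses trained models and
# declarative config for this; the keyword rules below only show the shape.

ROUTES = [
    ("weather", "function-calling-model"),  # tool-use intents
    ("summarize", "fast-cheap-model"),      # lightweight tasks
]
DEFAULT_MODEL = "general-purpose-model"

def route_prompt(prompt: str) -> dict:
    """Return a normalized request targeting the chosen backend model."""
    lowered = prompt.lower()
    model = next((m for key, m in ROUTES if key in lowered), DEFAULT_MODEL)
    return {"model": model, "messages": [{"role": "user", "content": prompt}]}
```

The payoff of the pattern is that application code only ever builds one request shape, while the gateway decides which provider or model actually serves it.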
31 MIN
The Role Of Synthetic Data In Building Better AI Applications
FEB 16, 2025
Summary<br />In this episode of the AI Engineering Podcast, Ali Golshan, co-founder and CEO of Gretel.ai, talks about the transformative role of synthetic data in AI systems. Ali explains how synthetic data can be purpose-built for AI use cases, emphasizing privacy, quality, and structural stability. He highlights the shift from traditional methods to using language models, which offer enhanced capabilities in understanding data's deep structure and generating high-quality datasets. The conversation explores the challenges and techniques of integrating synthetic data into AI systems, particularly in production environments, and concludes with insights into the future of synthetic data, including its application in various industries, the importance of privacy regulations, and the ongoing evolution of AI systems.<br /><br /><br />Announcements<br /><ul><li>Hello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systems</li><li>Seamless data integration into AI applications often falls short, leading many to adopt RAG methods, which come with high costs, complexity, and limited scalability. Cognee offers a better solution with its open-source semantic memory engine that automates data ingestion and storage, creating dynamic knowledge graphs from your data. Cognee enables AI agents to understand the meaning of your data, resulting in accurate responses at a lower cost. Take full control of your data in LLM apps without unnecessary overhead. 
Visit <a href="https://www.aiengineeringpodcast.com/cognee" target="_blank">aiengineeringpodcast.com/cognee</a> to learn more and elevate your AI apps and agents.</li><li>Your host is Tobias Macey and today I'm interviewing Ali Golshan about the role of synthetic data in building, scaling, and improving AI systems</li></ul>Interview<br /><ul><li>Introduction</li><li>How did you get involved in machine learning?</li><li>Can you start by summarizing what you mean by synthetic data in the context of this conversation?</li><li>How have the capabilities around the generation and integration of synthetic data changed across the pre- and post-LLM timelines?</li><li>What are the motivating factors that would lead a team or organization to invest in synthetic data generation capacity?</li><li>What are the main methods used for generation of synthetic data sets?<ul><li>How does that differ across open-source and commercial offerings?</li></ul></li><li>From a surface level it seems like synthetic data generation is a straightforward exercise that can be owned by an engineering team. What are the main "gotchas" that crop up as you move along the adoption curve?<ul><li>What are the scaling characteristics of synthetic data generation as you go from prototype to production scale?</li></ul></li><li>What are the domains/data types that are inappropriate for synthetic use cases (e.g. scientific or educational content)?</li><li>How do you manage the appropriate distribution of values in the generation process?</li><li>Beyond just producing large volumes of semi-random data (structured or otherwise), what are the other processes involved in the workflow of synthetic data and its integration into the different systems that consume it?</li><li>What are the most interesting, innovative, or unexpected ways that you have seen synthetic data generation used?</li><li>What are the most interesting, unexpected, or challenging lessons that you have learned while working on synthetic data generation?</li><li>When is synthetic data the wrong choice?</li><li>What do you have planned for the future of synthetic data capabilities at Gretel?</li></ul>Contact Info<br /><ul><li><a href="https://www.linkedin.com/in/ali-golshan/" target="_blank">LinkedIn</a></li></ul>Parting Question<br /><ul><li>From your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?</li></ul>Closing Announcements<br /><ul><li>Thank you for listening! Don't forget to check out our other shows. The <a href="https://www.dataengineeringpodcast.com" target="_blank">Data Engineering Podcast</a> covers the latest on modern data management. <a href="https://www.pythonpodcast.com" target="_blank">Podcast.__init__</a> covers the Python language, its community, and the innovative ways it is being used.</li><li>Visit the <a href="https://www.aiengineeringpodcast.com" target="_blank">site</a> to subscribe to the show, sign up for the mailing list, and read the show notes.</li><li>If you've learned something or tried out a project from the show then tell us about it! 
Email hosts@aiengineeringpodcast.com with your story.</li><li>To help other people find the show please leave a review on <a href="https://podcasts.apple.com/us/podcast/the-machine-learning-podcast/id1626358243" target="_blank">iTunes</a> and tell your friends and co-workers.</li></ul>Links<br /><ul><li><a href="https://gretel.ai/" target="_blank">Gretel</a></li><li><a href="https://hadoop.apache.org/" target="_blank">Hadoop</a></li><li><a href="https://en.wikipedia.org/wiki/Long_short-term_memory" target="_blank">LSTM == Long Short-Term Memory</a></li><li><a href="https://en.wikipedia.org/wiki/Generative_adversarial_network" target="_blank">GAN == Generative Adversarial Network</a></li><li><a href="https://www.microsoft.com/en-us/research/publication/textbooks-are-all-you-need/" target="_blank">Textbooks are all you need</a> MSFT paper</li><li><a href="https://www.illumina.com/" target="_blank">Illumina</a></li></ul>The intro and outro music is from <a href="https://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/Tales_Of_A_Dead_Fish/Hitmans_Lovesong/" target="_blank">Hitman's Lovesong feat. Paola Graziano</a> by <a href="http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/" target="_blank">The Freak Fandango Orchestra</a>/<a href="https://creativecommons.org/licenses/by-sa/3.0/" target="_blank">CC BY-SA 3.0</a>
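One of the interview topics above, managing the distribution of values during generation, can be illustrated with a minimal, self-contained sketch. The field names and proportions below are invented for illustration and have nothing to do with Gretel's actual API:

```python
# Minimal sketch: draw synthetic categorical values so that their
# empirical distribution tracks a target distribution, then measure it.
import random

def sample_synthetic(target_dist: dict, n: int, seed: int = 0) -> list:
    """Draw n category values matching the target proportions."""
    rng = random.Random(seed)                 # seeded for reproducibility
    values, weights = zip(*target_dist.items())
    return rng.choices(values, weights=weights, k=n)

def empirical_dist(samples: list) -> dict:
    """Compute the observed proportion of each value."""
    return {v: samples.count(v) / len(samples) for v in set(samples)}
```

Real synthetic-data systems go much further (conditional structure, privacy guarantees, drift checks), but comparing `empirical_dist` against the target is the basic hygiene step the conversation alludes to.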
54 MIN
Optimize Your AI Applications Automatically With The TensorZero LLM Gateway
JAN 22, 2025
Summary<br />In this episode of the AI Engineering Podcast, Viraj Mehta, CTO and co-founder of TensorZero, talks about the use of LLM gateways for managing interactions between client-side applications and various AI models. He highlights the benefits of using such a gateway, including standardized communication, credential management, and potential features like request-response caching and audit logging. The conversation also explores TensorZero's architecture and functionality in optimizing AI applications by managing structured data inputs and outputs, as well as the challenges and opportunities in automating prompt generation and maintaining interaction history for optimization purposes.<br /><br /><br />Announcements<br /><ul><li>Hello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systems</li><li>Seamless data integration into AI applications often falls short, leading many to adopt RAG methods, which come with high costs, complexity, and limited scalability. Cognee offers a better solution with its open-source semantic memory engine that automates data ingestion and storage, creating dynamic knowledge graphs from your data. Cognee enables AI agents to understand the meaning of your data, resulting in accurate responses at a lower cost. Take full control of your data in LLM apps without unnecessary overhead. 
Visit <a href="https://www.aiengineeringpodcast.com/cognee" target="_blank">aiengineeringpodcast.com/cognee</a> to learn more and elevate your AI apps and agents.</li><li>Your host is Tobias Macey and today I'm interviewing Viraj Mehta about the purpose of an LLM gateway and his work on TensorZero</li></ul>Interview<br /><ul><li>Introduction</li><li>How did you get involved in machine learning?</li><li>What is an LLM gateway?<ul><li>What purpose does it serve in an AI application architecture?</li></ul></li><li>What are some of the different features and capabilities that an LLM gateway might be expected to provide?</li><li>Can you describe what TensorZero is and the story behind it?<ul><li>What are the core problems that you are trying to address with TensorZero and for whom?</li></ul></li><li>One of the core features that you are offering is management of interaction history. How does this compare to the "memory" functionality offered by e.g. LangChain, Cognee, Mem0, etc.?</li><li>How does the presence of TensorZero in an application architecture change the ways that an AI engineer might approach the logic and control flows in a chat-based or agent-oriented project?</li><li>Can you describe the workflow of building with TensorZero and some specific examples of how it feeds back into the performance/behavior of an LLM?</li><li>What are some of the ways in which the addition of TensorZero or another LLM gateway might have a negative effect on the design or operation of an AI application?</li><li>What are the most interesting, innovative, or unexpected ways that you have seen TensorZero used?</li><li>What are the most interesting, unexpected, or challenging lessons that you have learned while working on TensorZero?</li><li>When is TensorZero the wrong choice?</li><li>What do you have planned for the future of TensorZero?</li></ul>Contact Info<br /><ul><li><a href="https://www.linkedin.com/in/virajrmehta/" target="_blank">LinkedIn</a></li></ul>Parting Question<br 
/><ul><li>From your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?</li></ul>Closing Announcements<br /><ul><li>Thank you for listening! Don't forget to check out our other shows. The <a href="https://www.dataengineeringpodcast.com" target="_blank">Data Engineering Podcast</a> covers the latest on modern data management. <a href="https://www.pythonpodcast.com" target="_blank">Podcast.__init__</a> covers the Python language, its community, and the innovative ways it is being used.</li><li>Visit the <a href="https://www.aiengineeringpodcast.com" target="_blank">site</a> to subscribe to the show, sign up for the mailing list, and read the show notes.</li><li>If you've learned something or tried out a project from the show then tell us about it! Email hosts@aiengineeringpodcast.com with your story.</li><li>To help other people find the show please leave a review on <a href="https://podcasts.apple.com/us/podcast/the-machine-learning-podcast/id1626358243" target="_blank">iTunes</a> and tell your friends and co-workers.</li></ul>Links<br /><ul><li><a href="https://www.tensorzero.com/" target="_blank">TensorZero</a></li><li><a href="https://www.tensorops.ai/post/llm-gateways-in-production-centralized-access-security-and-monitoring" target="_blank">LLM Gateway</a></li><li><a href="https://www.litellm.ai/" target="_blank">LiteLLM</a></li><li><a href="https://openai.com/" target="_blank">OpenAI</a></li><li><a href="https://cloud.google.com/vertex-ai" target="_blank">Google Vertex</a></li><li><a href="https://www.anthropic.com/" target="_blank">Anthropic</a></li><li><a href="https://en.wikipedia.org/wiki/Reinforcement_learning" target="_blank">Reinforcement Learning</a></li><li><a href="https://en.wikipedia.org/wiki/Tokamak" target="_blank">Tokamak Reactor</a></li><li><a 
href="https://scholar.google.com/citations?view_op=view_citation&amp;hl=en&amp;user=4pHjHBkAAAAJ&amp;sortby=pubdate&amp;citation_for_view=4pHjHBkAAAAJ:M3ejUd6NZC8C" target="_blank">Viraj RLHF Paper</a></li><li><a href="https://arxiv.org/abs/1502.06362" target="_blank">Contextual Dueling Bandits</a></li><li><a href="https://www.superannotate.com/blog/direct-preference-optimization-dpo" target="_blank">Direct Preference Optimization</a></li><li><a href="https://en.wikipedia.org/wiki/Partially_observable_Markov_decision_process" target="_blank">Partially Observable Markov Decision Process</a></li><li><a href="https://dspy.ai/" target="_blank">DSPy</a></li><li><a href="https://pytorch.org/" target="_blank">PyTorch</a></li><li><a href="https://www.cognee.ai" target="_blank">Cognee</a></li><li><a href="https://github.com/mem0ai/mem0" target="_blank">Mem0</a></li><li><a href="https://www.langchain.com/langgraph" target="_blank">LangGraph</a></li><li><a href="https://en.wikipedia.org/wiki/Douglas_Hofstadter" target="_blank">Douglas Hofstadter</a></li><li><a href="https://github.com/openai/gym" target="_blank">OpenAI Gym</a></li><li><a href="https://en.wikipedia.org/wiki/OpenAI_o1" target="_blank">OpenAI o1</a></li><li><a href="https://en.wikipedia.org/wiki/OpenAI_o3" target="_blank">OpenAI o3</a></li><li><a href="https://www.promptingguide.ai/techniques/cot" target="_blank">Chain Of Thought</a></li></ul>The intro and outro music is from <a href="https://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/Tales_Of_A_Dead_Fish/Hitmans_Lovesong/" target="_blank">Hitman's Lovesong feat. Paola Graziano</a> by <a href="http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/" target="_blank">The Freak Fandango Orchestra</a>/<a href="https://creativecommons.org/licenses/by-sa/3.0/" target="_blank">CC BY-SA 3.0</a>
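Two gateway capabilities named in the summary above, request-response caching and audit logging, can be sketched in a few lines. The `call_model` callable below is a stand-in for a real provider call, and nothing here reflects TensorZero's actual interface:

```python
# Hedged sketch of a gateway wrapper: cache identical requests and
# record every interaction in an audit log. Illustrative only.
import time

class GatewaySketch:
    def __init__(self, call_model):
        self._call = call_model   # stand-in for the real provider call
        self._cache = {}
        self.audit_log = []

    def complete(self, model: str, prompt: str) -> str:
        """Serve from cache when possible; always log the interaction."""
        key = (model, prompt)
        hit = key in self._cache
        if not hit:
            self._cache[key] = self._call(model, prompt)
        self.audit_log.append({"model": model, "prompt": prompt,
                               "cache_hit": hit, "ts": time.time()})
        return self._cache[key]
```

Because every request flows through one choke point, the same wrapper is also where interaction history for later optimization (the episode's main theme) would naturally accumulate.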
63 MIN
Harnessing The Engine Of AI
DEC 16, 2024
Summary<br />In this episode of the AI Engineering Podcast, Ron Green, co-founder and CTO of KungFu AI, talks about the evolving landscape of AI systems and the challenges of harnessing generative AI engines. Ron shares his insights on the limitations of large language models (LLMs) as standalone solutions and emphasizes the need for human oversight, multi-agent systems, and robust data management to support AI initiatives. He discusses the potential of domain-specific AI solutions, RAG approaches, and mixture of experts to enhance AI capabilities while addressing risks. The conversation also explores the evolving AI ecosystem, including tooling and frameworks, strategic planning, and the importance of interpretability and control in AI systems. Ron expresses optimism about the future of AI, predicting significant advancements in the next 20 years and the integration of AI capabilities into everyday software applications.<br /><br /><br />Announcements<br /><ul><li>Hello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systems</li><li>Seamless data integration into AI applications often falls short, leading many to adopt RAG methods, which come with high costs, complexity, and limited scalability. Cognee offers a better solution with its open-source semantic memory engine that automates data ingestion and storage, creating dynamic knowledge graphs from your data. Cognee enables AI agents to understand the meaning of your data, resulting in accurate responses at a lower cost. Take full control of your data in LLM apps without unnecessary overhead. 
Visit <a href="https://www.aiengineeringpodcast.com/cognee" target="_blank">aiengineeringpodcast.com/cognee</a> to learn more and elevate your AI apps and agents.&nbsp;</li><li>Your host is Tobias Macey and today I'm interviewing Ron Green about the wheels that we need for harnessing the power of the generative AI engine</li></ul>Interview<br /><ul><li>Introduction</li><li>How did you get involved in machine learning?</li><li>Can you describe what you see as the main shortcomings of LLMs as a stand-alone solution (to anything)?</li><li>The most established vehicle for harnessing LLM capabilities is the RAG pattern. What are the main limitations of that as a "product" solution?</li><li>The idea of multi-agent or mixture-of-experts systems is a more sophisticated approach that is gaining some attention. What do you see as the pro/con conversation around that pattern?</li><li>Beyond the system patterns that are being developed there is also a rapidly shifting ecosystem of frameworks, tools, and point solutions that plugin to various points of the AI lifecycle. How does that volatility hinder the adoption of generative AI in different contexts?<ul><li>In addition to the tooling, the models themselves are rapidly changing. How much does that influence the ways that organizations are thinking about whether and when to test the waters of AI?</li></ul></li><li>Continuing on the metaphor of LLMs and engines and the need for vehicles, where are we on the timeline in relation to the model T Ford?<ul><li>What are the vehicle categories that we still need to design and develop? (e.g. sedans, mini-vans, freight trucks, etc.)</li></ul></li><li>The current transformer architecture is starting to reach scaling limits that lead to diminishing returns. 
Given your perspective as an industry veteran, what are your thoughts on the future trajectory of AI model architectures?<ul><li>What is the ongoing role of regression style ML in the landscape of generative AI?</li></ul></li><li>What are the most interesting, innovative, or unexpected ways that you have seen LLMs used to power a "vehicle"?</li><li>What are the most interesting, unexpected, or challenging lessons that you have learned while working in this phase of AI?</li><li>When is generative AI/LLMs the wrong choice?</li></ul>Contact Info<br /><ul><li><a href="https://www.linkedin.com/in/rongreen/" target="_blank">LinkedIn</a></li></ul>Parting Question<br /><ul><li>From your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?</li></ul>Closing Announcements<br /><ul><li>Thank you for listening! Don't forget to check out our other shows. The <a href="https://www.dataengineeringpodcast.com" target="_blank">Data Engineering Podcast</a> covers the latest on modern data management. <a href="https://www.pythonpodcast.com" target="_blank">Podcast.__init__</a> covers the Python language, its community, and the innovative ways it is being used.</li><li>Visit the <a href="https://www.aiengineeringpodcast.com" target="_blank">site</a> to subscribe to the show, sign up for the mailing list, and read the show notes.</li><li>If you've learned something or tried out a project from the show then tell us about it! 
Email hosts@aiengineeringpodcast.com with your story.</li><li>To help other people find the show please leave a review on <a href="https://podcasts.apple.com/us/podcast/the-machine-learning-podcast/id1626358243" target="_blank">iTunes</a> and tell your friends and co-workers.</li></ul>Links<br /><ul><li><a href="https://www.kungfu.ai/" target="_blank">Kungfu.ai</a></li><li><a href="https://www.llama.com/" target="_blank">Llama</a> open generative AI models</li><li><a href="https://openai.com/index/chatgpt/" target="_blank">ChatGPT</a></li><li><a href="https://github.com/features/copilot" target="_blank">Copilot</a></li><li><a href="https://www.cursor.com/" target="_blank">Cursor</a></li><li><a href="https://en.wikipedia.org/wiki/Retrieval-augmented_generation" target="_blank">RAG == Retrieval Augmented Generation</a><ul><li><a href="https://www.aiengineeringpodcast.com/retrieval-augmented-generation-implementation-episode-34" target="_blank">Podcast Episode</a></li></ul></li><li><a href="https://huggingface.co/blog/moe" target="_blank">Mixture of Experts</a></li><li><a href="https://en.wikipedia.org/wiki/Deep_learning" target="_blank">Deep Learning</a></li><li><a href="https://en.wikipedia.org/wiki/Random_forest" target="_blank">Random Forest</a></li><li><a href="https://en.wikipedia.org/wiki/Supervised_learning" target="_blank">Supervised Learning</a></li><li><a href="https://en.wikipedia.org/wiki/Active_learning_(machine_learning)" target="_blank">Active Learning</a></li><li><a href="https://yann.lecun.com/" target="_blank">Yann LeCun</a></li><li><a href="https://en.wikipedia.org/wiki/Reinforcement_learning_from_human_feedback" target="_blank">RLHF == Reinforcement Learning from Human Feedback</a></li><li><a href="https://en.wikipedia.org/wiki/Ford_Model_T" target="_blank">Model T Ford</a></li><li><a href="https://github.com/state-spaces/mamba" target="_blank">Mamba selective state space</a></li><li><a 
href="https://news.mit.edu/2021/machine-learning-adapts-0128" target="_blank">Liquid Network</a></li><li><a href="https://www.promptingguide.ai/techniques/cot" target="_blank">Chain of thought</a></li><li><a href="https://openai.com/o1/" target="_blank">OpenAI o1</a></li><li><a href="https://en.wikipedia.org/wiki/Marvin_Minsky" target="_blank">Marvin Minsky</a></li><li><a href="https://en.wikipedia.org/wiki/Von_Neumann_architecture" target="_blank">Von Neumann Architecture</a></li><li><a href="https://arxiv.org/abs/1706.03762" target="_blank">Attention Is All You Need</a></li><li><a href="https://en.wikipedia.org/wiki/Multilayer_perceptron" target="_blank">Multilayer Perceptron</a></li><li><a href="https://builtin.com/data-science/dot-product-matrix" target="_blank">Dot Product</a></li><li><a href="https://en.wikipedia.org/wiki/Diffusion_model" target="_blank">Diffusion Model</a></li><li><a href="https://en.wikipedia.org/wiki/Gaussian_noise" target="_blank">Gaussian Noise</a></li><li><a href="https://blog.google/technology/ai/google-deepmind-isomorphic-alphafold-3-ai-model/" target="_blank">AlphaFold 3</a></li><li><a href="https://www.anthropic.com/" target="_blank">Anthropic</a></li><li><a href="https://paperswithcode.com/method/sparse-autoencoder" target="_blank">Sparse Autoencoder</a></li></ul>The intro and outro music is from <a href="https://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/Tales_Of_A_Dead_Fish/Hitmans_Lovesong/" target="_blank">Hitman's Lovesong feat. Paola Graziano</a> by <a href="http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/" target="_blank">The Freak Fandango Orchestra</a>/<a href="https://creativecommons.org/licenses/by-sa/3.0/" target="_blank">CC BY-SA 3.0</a>
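The RAG pattern this episode calls the most established "vehicle" for LLMs reduces to a simple shape: retrieve relevant context, then splice it into the prompt. The toy sketch below uses word overlap in place of the embedding-based retrieval real systems use, purely to show that shape:

```python
# Toy RAG: keyword-overlap retrieval followed by prompt assembly.
# Real pipelines substitute embedding similarity and chunked corpora.

def retrieve(query: str, docs: list[str]) -> str:
    """Pick the document sharing the most words with the query."""
    q = set(query.lower().split())
    return max(docs, key=lambda d: len(q & set(d.lower().split())))

def build_prompt(query: str, docs: list[str]) -> str:
    """Splice the retrieved context into the prompt sent to the model."""
    context = retrieve(query, docs)
    return f"Answer using this context:\n{context}\n\nQuestion: {query}"
```

The limitations Ron discusses (retrieval misses, context that contradicts the model's priors, scaling the corpus) all live inside these two small functions once they are made production-grade.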
55 MIN