AI Engineering Podcast

Tobias Macey

Details

This show is your guidebook to building scalable and maintainable AI systems. You will learn how to architect AI applications, apply AI to your work, and weigh the considerations involved in building or customizing new models: everything you need to know to deliver real impact and value with machine learning and artificial intelligence.

Recent Episodes

Harnessing The Engine Of AI
DEC 16, 2024
Summary
In this episode of the AI Engineering Podcast, Ron Green, co-founder and CTO of KungFu AI, talks about the evolving landscape of AI systems and the challenges of harnessing generative AI engines. Ron shares his insights on the limitations of large language models (LLMs) as standalone solutions and emphasizes the need for human oversight, multi-agent systems, and robust data management to support AI initiatives. He discusses the potential of domain-specific AI solutions, RAG approaches, and mixture of experts to enhance AI capabilities while addressing risks. The conversation also explores the evolving AI ecosystem, including tooling and frameworks, strategic planning, and the importance of interpretability and control in AI systems. Ron expresses optimism about the future of AI, predicting significant advancements in the next 20 years and the integration of AI capabilities into everyday software applications.

Announcements
- Hello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systems
- Seamless data integration into AI applications often falls short, leading many to adopt RAG methods, which come with high costs, complexity, and limited scalability. Cognee offers a better solution with its open-source semantic memory engine that automates data ingestion and storage, creating dynamic knowledge graphs from your data. Cognee enables AI agents to understand the meaning of your data, resulting in accurate responses at a lower cost. Take full control of your data in LLM apps without unnecessary overhead. Visit https://www.aiengineeringpodcast.com/cognee to learn more and elevate your AI apps and agents.
- Your host is Tobias Macey and today I'm interviewing Ron Green about the wheels that we need for harnessing the power of the generative AI engine

Interview
- Introduction
- How did you get involved in machine learning?
- Can you describe what you see as the main shortcomings of LLMs as a stand-alone solution (to anything)?
- The most established vehicle for harnessing LLM capabilities is the RAG pattern. What are the main limitations of that as a "product" solution?
- The idea of multi-agent or mixture-of-experts systems is a more sophisticated approach that is gaining some attention. What do you see as the pro/con conversation around that pattern?
- Beyond the system patterns that are being developed, there is also a rapidly shifting ecosystem of frameworks, tools, and point solutions that plug in to various points of the AI lifecycle. How does that volatility hinder the adoption of generative AI in different contexts?
  - In addition to the tooling, the models themselves are rapidly changing. How much does that influence the ways that organizations are thinking about whether and when to test the waters of AI?
- Continuing the metaphor of LLMs as engines and the need for vehicles, where are we on the timeline in relation to the Model T Ford?
  - What are the vehicle categories that we still need to design and develop? (e.g. sedans, mini-vans, freight trucks, etc.)
- The current transformer architecture is starting to reach scaling limits that lead to diminishing returns. Given your perspective as an industry veteran, what are your thoughts on the future trajectory of AI model architectures?
  - What is the ongoing role of regression-style ML in the landscape of generative AI?
- What are the most interesting, innovative, or unexpected ways that you have seen LLMs used to power a "vehicle"?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working in this phase of AI?
- When is generative AI/LLMs the wrong choice?

Contact Info
- LinkedIn: https://www.linkedin.com/in/rongreen/

Parting Question
- From your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?

Closing Announcements
- Thank you for listening! Don't forget to check out our other shows. The Data Engineering Podcast (https://www.dataengineeringpodcast.com) covers the latest on modern data management. Podcast.__init__ (https://www.pythonpodcast.com) covers the Python language, its community, and the innovative ways it is being used.
- Visit the site (https://www.aiengineeringpodcast.com) to subscribe to the show, sign up for the mailing list, and read the show notes.
- If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.
- To help other people find the show please leave a review on iTunes (https://podcasts.apple.com/us/podcast/the-machine-learning-podcast/id1626358243) and tell your friends and co-workers.

Links
- Kungfu.ai: https://www.kungfu.ai/
- Llama open generative AI models: https://www.llama.com/
- ChatGPT: https://openai.com/index/chatgpt/
- Copilot: https://github.com/features/copilot
- Cursor: https://www.cursor.com/
- RAG == Retrieval Augmented Generation: https://en.wikipedia.org/wiki/Retrieval-augmented_generation
  - Podcast Episode: https://www.aiengineeringpodcast.com/retrieval-augmented-generation-implementation-episode-34
- Mixture of Experts: https://huggingface.co/blog/moe
- Deep Learning: https://en.wikipedia.org/wiki/Deep_learning
- Random Forest: https://en.wikipedia.org/wiki/Random_forest
- Supervised Learning: https://en.wikipedia.org/wiki/Supervised_learning
- Active Learning: https://en.wikipedia.org/wiki/Active_learning_(machine_learning)
- Yann LeCun: https://yann.lecun.com/
- RLHF == Reinforcement Learning from Human Feedback: https://en.wikipedia.org/wiki/Reinforcement_learning_from_human_feedback
- Model T Ford: https://en.wikipedia.org/wiki/Ford_Model_T
- Mamba selective state space: https://github.com/state-spaces/mamba
- Liquid Network: https://news.mit.edu/2021/machine-learning-adapts-0128
- Chain of thought: https://www.promptingguide.ai/techniques/cot
- OpenAI o1: https://openai.com/o1/
- Marvin Minsky: https://en.wikipedia.org/wiki/Marvin_Minsky
- Von Neumann Architecture: https://en.wikipedia.org/wiki/Von_Neumann_architecture
- Attention Is All You Need: https://arxiv.org/abs/1706.03762
- Multilayer Perceptron: https://en.wikipedia.org/wiki/Multilayer_perceptron
- Dot Product: https://builtin.com/data-science/dot-product-matrix
- Diffusion Model: https://en.wikipedia.org/wiki/Diffusion_model
- Gaussian Noise: https://en.wikipedia.org/wiki/Gaussian_noise
- AlphaFold 3: https://blog.google/technology/ai/google-deepmind-isomorphic-alphafold-3-ai-model/
- Anthropic: https://www.anthropic.com/
- Sparse Autoencoder: https://paperswithcode.com/method/sparse-autoencoder

The intro and outro music is from Hitman's Lovesong feat. Paola Graziano (https://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/Tales_Of_A_Dead_Fish/Hitmans_Lovesong/) by The Freak Fandango Orchestra (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/), licensed under CC BY-SA 3.0 (https://creativecommons.org/licenses/by-sa/3.0/).
55 MIN
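As a companion to the RAG discussion in the episode above, here is a minimal, illustrative sketch of the retrieval-and-prompt-assembly step at the heart of that pattern. The document list, toy bag-of-words embedding, and prompt template are assumptions made for demonstration only; a production system would use a learned embedding model, a vector store, and an actual LLM call in place of these toy pieces.

```python
# A minimal sketch of the retrieval step in a RAG pipeline. Illustrative only:
# the documents, the bag-of-words "embedding", and the prompt template are
# assumptions for this example, not anything prescribed in the episode.
from collections import Counter
import math

DOCS = [
    "Mixture-of-experts routes each token to a small subset of specialist subnetworks.",
    "Retrieval augmented generation grounds model answers in documents fetched at query time.",
    "Reinforcement learning from human feedback aligns model outputs with human preferences.",
]

def embed(text: str) -> Counter:
    """Toy embedding: lowercase bag-of-words counts (a real system uses a learned model)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, k: int = 2) -> list[str]:
    """Rank the corpus by similarity to the query and keep the top k documents."""
    q = embed(query)
    ranked = sorted(DOCS, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str) -> str:
    """Stuff the retrieved context into the prompt that would be sent to an LLM."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query))
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}"

if __name__ == "__main__":
    print(build_prompt("How does retrieval augmented generation reduce hallucinations?"))
```

Even this toy version makes the episode's framing visible: the LLM "engine" can only answer as well as the context that the surrounding "vehicle" manages to retrieve and assemble for it.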
The Complex World of Generative AI Governance
DEC 1, 2024
Summary
In this episode of the AI Engineering Podcast, Jim Olsen, CTO of ModelOp, talks about the governance of generative AI models and applications. Jim shares his extensive experience in software engineering and machine learning, highlighting the importance of governance in high-risk applications like healthcare. He explains that governance is more about the use cases of AI models than the models themselves, emphasizing the need for proper inventory and monitoring to ensure compliance and mitigate risks. The conversation covers the challenges organizations face in implementing AI governance policies, the importance of technical controls for data governance, and the need for ongoing monitoring and baselines to detect issues like PII disclosure and model drift. Jim also discusses the balance between innovation and regulation, particularly with evolving regulations like those in the EU, and provides valuable perspectives on the current state of AI governance and the need for robust model lifecycle management.

Announcements
- Hello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systems
- Your host is Tobias Macey and today I'm interviewing Jim Olsen about governance of your generative AI models and applications

Interview
- Introduction
- How did you get involved in machine learning?
- Can you describe what governance means in the context of generative AI models? (e.g. governing the models, their applications, their outputs, etc.)
- Governance is typically a hybrid endeavor of technical and organizational policy creation and enforcement. From the organizational perspective, what are some of the difficulties that teams are facing in understanding what those policies need to encompass?
  - How much familiarity with the capabilities and limitations of the models is necessary to engage productively with policy debates?
- The regulatory landscape around AI is still very nascent. Can you give an overview of the current state of legal burden related to AI?
  - What are some of the regulations that you consider necessary but as-of-yet absent?
- Data governance as a practice typically relates to controls over who can access what information and how it can be used. The controls for those policies are generally available in the data warehouse, business intelligence, etc. What are the different dimensions of technical controls that are needed in the application of generative AI systems?
  - How many of the controls that are present for governance of analytical systems are applicable to the generative AI arena?
- What are the elements of risk that change when considering internal vs. consumer-facing applications of generative AI?
  - How do the modalities of the AI models impact the types of risk that are involved? (e.g. language vs. vision vs. audio)
- What are some of the technical aspects of the AI tools ecosystem that are in greatest need of investment to ease the burden of risk and validation of model use?
- What are the most interesting, innovative, or unexpected ways that you have seen AI governance implemented?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on AI governance?
- What are the technical, social, and organizational trends of AI risk and governance that you are monitoring?

Contact Info
- LinkedIn: https://www.linkedin.com/in/jimolsen/

Parting Question
- From your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?

Closing Announcements
- Thank you for listening! Don't forget to check out our other shows. The Data Engineering Podcast (https://www.dataengineeringpodcast.com) covers the latest on modern data management. Podcast.__init__ (https://www.pythonpodcast.com) covers the Python language, its community, and the innovative ways it is being used.
- Visit the site (https://www.aiengineeringpodcast.com) to subscribe to the show, sign up for the mailing list, and read the show notes.
- If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.
- To help other people find the show please leave a review on iTunes (https://podcasts.apple.com/us/podcast/the-machine-learning-podcast/id1626358243) and tell your friends and co-workers.

Links
- ModelOp: https://www.modelop.com/
- Foundation Models: https://en.wikipedia.org/wiki/Foundation_model
- GDPR: https://en.wikipedia.org/wiki/General_Data_Protection_Regulation
- EU AI Regulation: https://www.europarl.europa.eu/topics/en/article/20230601STO93804/eu-ai-act-first-regulation-on-artificial-intelligence
- Llama 2: https://www.llama.com/llama2/
- AWS Bedrock: https://aws.amazon.com/bedrock/
- Shadow IT: https://en.wikipedia.org/wiki/Shadow_IT
- RAG == Retrieval Augmented Generation: https://en.wikipedia.org/wiki/Retrieval-augmented_generation
  - Podcast Episode: https://www.aiengineeringpodcast.com/retrieval-augmented-generation-implementation-episode-34
- NVIDIA NeMo: https://github.com/NVIDIA/NeMo
- LangChain: https://www.langchain.com/
- Shapley Values: https://shap.readthedocs.io/en/latest/example_notebooks/overviews/An%20introduction%20to%20explainable%20AI%20with%20Shapley%20values.html
- Gibberish Detection: https://llm-guard.com/output_scanners/gibberish/

The intro and outro music is from Hitman's Lovesong feat. Paola Graziano (https://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/Tales_Of_A_Dead_Fish/Hitmans_Lovesong/) by The Freak Fandango Orchestra (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/), licensed under CC BY-SA 3.0 (https://creativecommons.org/licenses/by-sa/3.0/).
54 MIN
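One of the technical controls the governance conversation above calls for is monitoring model outputs for issues such as PII disclosure. The sketch below illustrates that idea with a post-generation scanner that redacts obvious patterns and records an audit event per use case. The regexes, redaction policy, and in-memory log are assumptions for illustration only; real deployments rely on purpose-built scanners and a proper observability pipeline rather than anything this simple.

```python
# An illustrative sketch of one governance control: scan generated output for PII
# before it is returned, and record the event for monitoring. Patterns, policy,
# and the in-memory audit log are assumptions for demonstration purposes.
import re
from datetime import datetime, timezone

PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}

audit_log: list[dict] = []  # stand-in for a real monitoring/observability sink

def scan_output(model_output: str, use_case: str) -> str:
    """Redact PII from a model response and log what was found, keyed by use case."""
    findings = {name: pat.findall(model_output) for name, pat in PII_PATTERNS.items()}
    findings = {name: hits for name, hits in findings.items() if hits}

    redacted = model_output
    for name, pat in PII_PATTERNS.items():
        redacted = pat.sub(f"[REDACTED {name.upper()}]", redacted)

    audit_log.append({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "use_case": use_case,                     # governance is about the use case
        "pii_types_detected": sorted(findings),
        "flagged": bool(findings),
    })
    return redacted

if __name__ == "__main__":
    raw = "You can reach the claimant at [email protected] or 555-867-5309."
    print(scan_output(raw, use_case="internal-claims-assistant"))
    print(audit_log[-1])
```

Keying every event to a registered use case mirrors the episode's point that governance tracks how a model is used, not just which model is deployed.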
Building Semantic Memory for AI With Cognee
NOV 25, 2024
Summary
In this episode of the AI Engineering Podcast, Vasilije Markovic talks about enhancing Large Language Models (LLMs) with memory to improve their accuracy. He discusses the concept of memory in LLMs, which involves managing context windows to enhance reasoning without the high costs of traditional training methods. He explains the challenges of forgetting in LLMs due to context window limitations and introduces the idea of hierarchical memory, where immediate retrieval and long-term information storage are balanced to improve application performance. Vasilije also shares his work on Cognee, a tool he's developing to manage semantic memory in AI systems, and discusses its potential applications beyond its core use case. He emphasizes the importance of combining cognitive science principles with data engineering to push the boundaries of AI capabilities and shares his vision for the future of AI systems, highlighting the role of personalization and the ongoing development of Cognee to support evolving AI architectures.

Announcements
- Hello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systems
- Your host is Tobias Macey and today I'm interviewing Vasilije Markovic about adding memory to LLMs to improve their accuracy

Interview
- Introduction
- How did you get involved in machine learning?
- Can you describe what "memory" is in the context of LLM systems?
- What are the symptoms of "forgetting" that manifest when interacting with LLMs?
  - How do these issues manifest between single-turn vs. multi-turn interactions?
- How does the lack of hierarchical and evolving memory limit the capabilities of LLM systems?
- What are the technical/architectural requirements to add memory to an LLM system/application?
- How does Cognee help to address the shortcomings of current LLM/RAG architectures?
- Can you describe how Cognee is implemented?
  - Recognizing that it has only existed for a short time, how have the design and scope of Cognee evolved since you first started working on it?
- What are the data structures that are most useful for managing the memory structures?
- For someone who wants to incorporate Cognee into their LLM architecture, what is involved in integrating it into their applications?
  - How does it change the way that you think about the overall requirements for an LLM application?
- For systems that interact with multiple LLMs, how does Cognee manage context across those systems? (e.g. different agents for different use cases)
- There are other systems that are being built to manage user personalization in LLM applications. How do the goals of Cognee relate to those use cases? (e.g. Mem0 - https://github.com/mem0ai/mem0)
- What are the unknowns that you are still navigating with Cognee?
- What are the most interesting, innovative, or unexpected ways that you have seen Cognee used?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on Cognee?
- When is Cognee the wrong choice?
- What do you have planned for the future of Cognee?

Contact Info
- LinkedIn: https://www.linkedin.com/in/vasilije-markovic-13302471/

Parting Question
- From your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?

Closing Announcements
- Thank you for listening! Don't forget to check out our other shows. The Data Engineering Podcast (https://www.dataengineeringpodcast.com) covers the latest on modern data management. Podcast.__init__ (https://www.pythonpodcast.com) covers the Python language, its community, and the innovative ways it is being used.
- Visit the site (https://www.aiengineeringpodcast.com) to subscribe to the show, sign up for the mailing list, and read the show notes.
- If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.
- To help other people find the show please leave a review on iTunes (https://podcasts.apple.com/us/podcast/the-machine-learning-podcast/id1626358243) and tell your friends and co-workers.

Links
- Cognee: https://www.cognee.ai/
- Montenegro: https://en.wikipedia.org/wiki/Montenegro
- Catastrophic Forgetting: https://en.wikipedia.org/wiki/Catastrophic_interference
- Multi-Turn Interaction: https://poly.ai/blog/multi-turn-conversations-what-are-they-and-why-do-they-matter-for-your-customers/
- RAG == Retrieval Augmented Generation: https://en.wikipedia.org/wiki/Retrieval-augmented_generation
  - Podcast Episode: https://www.aiengineeringpodcast.com/retrieval-augmented-generation-implementation-episode-34
- GraphRAG: https://neo4j.com/blog/graphrag-manifesto/
  - Podcast Episode: https://www.aiengineeringpodcast.com/graphrag-knowledge-graph-semantic-retrieval-episode-37
- Long-term memory: https://en.wikipedia.org/wiki/Long-term_memory
- Short-term memory: https://en.wikipedia.org/wiki/Short-term_memory
- Langchain: https://www.langchain.com/
- LlamaIndex: https://www.llamaindex.ai/
- Haystack: https://haystack.deepset.ai/
- dlt: https://dlthub.com/
  - Data Engineering Podcast Episode: https://www.dataengineeringpodcast.com/dlt-pure-python-data-integration-episode-441
- Pinecone: https://www.pinecone.io/
  - Podcast Episode: https://www.dataengineeringpodcast.com/pinecone-vector-database-similarity-search-episode-189/
- Agentic RAG: https://weaviate.io/blog/what-is-agentic-rag
- Airflow: https://airflow.apache.org/
- DAG == Directed Acyclic Graph: https://en.wikipedia.org/wiki/Directed_acyclic_graph
- FalkorDB: https://github.com/FalkorDB/falkordb
- Neo4J: https://neo4j.com/
- Pydantic: https://pydantic.dev/
- AWS ECS: https://aws.amazon.com/ecs/
- AWS SNS: https://aws.amazon.com/sns/
- AWS SQS: https://aws.amazon.com/sqs/
- AWS Lambda: https://aws.amazon.com/lambda/
- LLM As Judge: https://www.evidentlyai.com/llm-guide/llm-as-a-judge
- Mem0: https://www.mem0.ai/
- QDrant: https://qdrant.tech/
- LanceDB: https://lancedb.com/
- DuckDB: https://duckdb.org/

The intro and outro music is from Hitman's Lovesong feat. Paola Graziano (https://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/Tales_Of_A_Dead_Fish/Hitmans_Lovesong/) by The Freak Fandango Orchestra (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/), licensed under CC BY-SA 3.0 (https://creativecommons.org/licenses/by-sa/3.0/).
55 MIN
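To make the hierarchical-memory idea from this episode concrete, here is a small illustrative sketch that keeps a bounded short-term window and demotes older turns to a long-term store that is searched when context is assembled. This is not Cognee's API: the class, method names, and keyword-overlap recall are hypothetical stand-ins for the knowledge-graph and vector retrieval a real semantic memory engine would provide.

```python
# A toy model of hierarchical memory: a bounded short-term window plus a searchable
# long-term store. All names here are hypothetical; this does NOT use Cognee's API.
import re
from collections import deque
from dataclasses import dataclass, field

def _tokens(text: str) -> set[str]:
    """Crude tokenizer used only for the toy keyword-overlap recall below."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))

@dataclass
class HierarchicalMemory:
    window_size: int = 6                                  # turns kept "in context"
    short_term: deque = field(default_factory=deque)      # recent conversation window
    long_term: list[str] = field(default_factory=list)    # everything evicted from the window

    def add_turn(self, turn: str) -> None:
        """Append a turn; turns pushed out of the window are demoted to long-term storage."""
        self.short_term.append(turn)
        while len(self.short_term) > self.window_size:
            self.long_term.append(self.short_term.popleft())

    def recall(self, query: str, k: int = 3) -> list[str]:
        """Return up to k long-term items sharing the most words with the query."""
        q = _tokens(query)
        scored = sorted(self.long_term, key=lambda t: len(q & _tokens(t)), reverse=True)
        return [t for t in scored[:k] if q & _tokens(t)]

    def build_context(self, query: str) -> str:
        """Combine recalled long-term facts with the recent window for the next LLM call."""
        recalled = "\n".join(self.recall(query)) or "(nothing recalled)"
        recent = "\n".join(self.short_term)
        return f"Relevant history:\n{recalled}\n\nRecent turns:\n{recent}\n\nUser: {query}"

if __name__ == "__main__":
    mem = HierarchicalMemory(window_size=2)
    for turn in [
        "User prefers replies in French.",
        "User is migrating a reporting database to Postgres.",
        "User asked about connection pooling.",
        "User asked about index bloat.",
    ]:
        mem.add_turn(turn)
    print(mem.build_context("Which language should replies be in?"))
```

The point of the toy is the shape of the problem the episode describes: the context window forces eviction, so something outside the model has to decide what gets stored long term and what gets recalled on each turn.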
The Impact of Generative AI on Software Development
NOV 22, 2024
Summary
In this episode of the AI Engineering Podcast, Tanner Burson, VP of Engineering at Prismatic, talks about the evolving impact of generative AI on software developers. Tanner shares his insights from engineering leadership and data engineering initiatives, discussing how AI is blurring the lines of developer roles and the strategic value of AI in software development. He explores the current landscape of AI tools, such as GitHub's Copilot, and their influence on productivity and workflow, while also touching on the challenges and opportunities presented by AI in code generation, review, and tooling. Tanner emphasizes the need for human oversight to maintain code quality and security, and offers his thoughts on the future of AI in development, the importance of balancing innovation with practicality, and the evolving role of engineers in an AI-driven landscape.

Announcements
- Hello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systems
- Your host is Tobias Macey and today I'm interviewing Tanner Burson about the impact of generative AI on software developers

Interview
- Introduction
- How did you get involved in machine learning?
- Can you describe what types of roles and work you consider encompassed by the term "developers" for the purpose of this conversation?
- How does your work at Prismatic give you visibility and insight into the effects of AI on developers and their work?
- There have been many competing narratives about AI and how much of the software development process it is capable of encompassing. What is your top-level view on what the long-term impact on the job prospects of software developers will be as a result of generative AI?
- There are many obvious examples of utilities powered by generative AI that are focused on software development. What do you see as the categories or specific tools that are most impactful to the development cycle?
- In what ways do you find familiarity with/understanding of LLM internals useful when applying them to development processes?
- As an engineering leader, how are you evaluating and guiding your team on the use of AI powered tools?
  - What are some of the risks that you are guarding against as a result of AI in the development process?
- What are the most interesting, innovative, or unexpected ways that you have seen AI used in the development process?
- What are the most interesting, unexpected, or challenging lessons that you have learned while using AI for software development?
- When is AI the wrong choice for a developer?
- What are your projections for the near to medium term impact on the developer experience as a result of generative AI?

Contact Info
- LinkedIn: https://www.linkedin.com/in/tannerburson/

Parting Question
- From your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?

Closing Announcements
- Thank you for listening! Don't forget to check out our other shows. The Data Engineering Podcast (https://www.dataengineeringpodcast.com) covers the latest on modern data management. Podcast.__init__ (https://www.pythonpodcast.com) covers the Python language, its community, and the innovative ways it is being used.
- Visit the site (https://www.aiengineeringpodcast.com) to subscribe to the show, sign up for the mailing list, and read the show notes.
- If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.
- To help other people find the show please leave a review on iTunes (https://podcasts.apple.com/us/podcast/the-machine-learning-podcast/id1626358243) and tell your friends and co-workers.

Links
- Prismatic: https://prismatic.io/
- Google AI Development announcement: https://arstechnica.com/ai/2024/10/google-ceo-says-over-25-of-new-google-code-is-generated-by-ai/
- Tabnine: https://www.tabnine.com/
  - Podcast Episode: https://www.aiengineeringpodcast.com/tabnine-generative-ai-developer-assistant-episode-24
- GitHub Copilot: https://github.com/features/copilot
- Plandex: https://github.com/plandex-ai/plandex
- OpenAI API: https://platform.openai.com/docs/overview
- Amazon Q: https://aws.amazon.com/q/
- Ollama: https://ollama.com/
- Huggingface Transformers: https://huggingface.co/docs/transformers/en/index
- Anthropic: https://www.anthropic.com/
- Langchain: https://www.langchain.com/
- Llamaindex: https://www.llamaindex.ai/
- Haystack: https://haystack.deepset.ai/
- Llama 3.2: https://ai.meta.com/blog/llama-3-2-connect-2024-vision-edge-mobile-devices/
- Qwen2.5-Coder: https://github.com/QwenLM/Qwen2.5-Coder

The intro and outro music is from Hitman's Lovesong feat. Paola Graziano (https://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/Tales_Of_A_Dead_Fish/Hitmans_Lovesong/) by The Freak Fandango Orchestra (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/), licensed under CC BY-SA 3.0 (https://creativecommons.org/licenses/by-sa/3.0/).
52 MIN
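One concrete workflow touched on in this episode is using an LLM to assist, but not replace, human code review. The sketch below drafts advisory review comments for a diff using the OpenAI Python client; the prompt, model name, and severity labels are assumptions chosen for illustration, not a recommendation from the episode, and the human reviewer remains the decision maker, in line with the episode's emphasis on oversight.

```python
# A minimal sketch of LLM-assisted code review: the model drafts comments, a human
# decides. The prompt wording, model name, and [blocking]/[nit] labels are assumptions.
# Requires the `openai` package and an API key in the OPENAI_API_KEY environment variable.
from openai import OpenAI

REVIEW_PROMPT = (
    "You are assisting a human code reviewer. For the diff below, list potential bugs, "
    "security issues, and readability concerns as bullet points, each marked "
    "[blocking] or [nit]. Do not approve or reject; a human will decide."
)

def draft_review(diff: str, model: str = "gpt-4o-mini") -> str:
    """Return draft review comments for a unified diff. Output is advisory only."""
    client = OpenAI()
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": REVIEW_PROMPT},
            {"role": "user", "content": diff},
        ],
        temperature=0.2,  # keep suggestions focused rather than creative
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    sample_diff = """\
--- a/billing.py
+++ b/billing.py
@@ -1,3 +1,3 @@
-def total(items):
-    return sum(i.price for i in items)
+def total(items, discount):
+    return sum(i.price for i in items) - discount
"""
    print(draft_review(sample_diff))
```

The same shape works with a locally hosted model (for example via Ollama, linked above) when code cannot leave the organization; only the client configuration changes, not the review workflow.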
ML Infrastructure Without The Ops: Simplifying The ML Developer Experience With Runhouse
NOV 11, 2024
Summary
Machine learning workflows have long been complex and difficult to operationalize. They are often characterized by a period of research, resulting in an artifact that gets passed to another engineer or team to prepare for running in production. The MLOps category of tools has tried to build a new set of utilities to reduce that friction, but has instead introduced a new barrier at the team and organizational level. Donny Greenberg took the lessons that he learned on the PyTorch team at Meta and created Runhouse. In this episode he explains how, by reducing the number of opinions in the framework, he has also reduced the complexity of moving from development to production for ML systems.

Announcements
- Hello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systems
- Your host is Tobias Macey and today I'm interviewing Donny Greenberg about Runhouse and the current state of ML infrastructure

Interview
- Introduction
- How did you get involved in machine learning?
- What are the core elements of infrastructure for ML and AI?
  - How has that changed over the past ~5 years?
  - For the past few years the MLOps and data engineering stacks were built and managed separately. How does the current generation of tools and product requirements influence the present and future approach to those domains?
- There are numerous projects that aim to bridge the complexity gap in running Python and ML code from your laptop up to distributed compute on clouds (e.g. Ray, Metaflow, Dask, Modin, etc.). How do you view the decision process for teams trying to understand which tool(s) to use for managing their ML/AI developer experience?
- Can you describe what Runhouse is and the story behind it?
  - What are the core problems that you are working to solve?
  - What are the main personas that you are focusing on? (e.g. data scientists, DevOps, data engineers, etc.)
  - How does Runhouse factor into collaboration across skill sets and teams?
- Can you describe how Runhouse is implemented?
  - How has the focus on developer experience informed the way that you think about the features and interfaces that you include in Runhouse?
- How do you think about the role of Runhouse in the integration with the AI/ML and data ecosystem?
- What does the workflow look like for someone building with Runhouse?
- What is involved in managing the coordination of compute and data locality to reduce networking costs and latencies?
- What are the most interesting, innovative, or unexpected ways that you have seen Runhouse used?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on Runhouse?
- When is Runhouse the wrong choice?
- What do you have planned for the future of Runhouse?
- What is your vision for the future of infrastructure and developer experience in ML/AI?

Contact Info
- LinkedIn: https://www.linkedin.com/in/greenbergdon/

Parting Question
- From your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?

Closing Announcements
- Thank you for listening! Don't forget to check out our other shows. The Data Engineering Podcast (https://www.dataengineeringpodcast.com) covers the latest on modern data management. Podcast.__init__ (https://www.pythonpodcast.com) covers the Python language, its community, and the innovative ways it is being used.
- Visit the site (https://www.aiengineeringpodcast.com) to subscribe to the show, sign up for the mailing list, and read the show notes.
- If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.
- To help other people find the show please leave a review on iTunes (https://podcasts.apple.com/us/podcast/the-machine-learning-podcast/id1626358243) and tell your friends and co-workers.

Links
- Runhouse: https://www.run.house/
  - GitHub: https://github.com/run-house/runhouse
- PyTorch: https://pytorch.org/
  - Podcast.__init__ Episode: https://www.pythonpodcast.com/pytorch-deep-learning-epsiode-202
- Kubernetes: https://kubernetes.io/
- Bin Packing: https://en.wikipedia.org/wiki/Bin_packing_problem
- Linear Regression: https://en.wikipedia.org/wiki/Linear_regression
- Gradient Boosted Decision Tree: https://developers.google.com/machine-learning/decision-forests/intro-to-gbdt
- Deep Learning: https://en.wikipedia.org/wiki/Deep_learning
- Transformer Architecture: https://en.wikipedia.org/wiki/Transformer_(deep_learning_architecture)
- Slurm: https://slurm.schedmd.com/documentation.html
- Sagemaker: https://aws.amazon.com/sagemaker/
- Vertex AI: https://cloud.google.com/vertex-ai?hl=en
- Metaflow: https://metaflow.org/
  - Podcast.__init__ Episode: https://www.pythonpodcast.com/metaflow-machine-learning-operations-episode-274
- MLflow: https://mlflow.org/
- Dask: https://www.dask.org/
  - Data Engineering Podcast Episode: https://www.dataengineeringpodcast.com/episode-2-dask-with-matthew-rocklin
- Ray: https://www.ray.io/
  - Podcast.__init__ Episode: https://www.pythonpodcast.com/ray-distributed-computing-episode-258
- Spark: https://spark.apache.org/
- Databricks: https://www.databricks.com/
- Snowflake: https://www.snowflake.com/en/
- ArgoCD: https://argo-cd.readthedocs.io/en/stable/
- PyTorch Distributed: https://pytorch.org/tutorials/beginner/dist_overview.html
- Horovod: https://horovod.ai/
- Llama.cpp: https://github.com/ggerganov/llama.cpp
- Prefect: https://www.prefect.io/
  - Data Engineering Podcast Episode: https://www.dataengineeringpodcast.com/prefect-workflow-engine-episode-86
- Airflow: https://airflow.apache.org/
- OOM == Out of Memory: https://en.wikipedia.org/wiki/Out_of_memory
- Weights and Biases: https://wandb.ai/site/
- KNative: https://knative.dev/docs/
- BERT language model: https://en.wikipedia.org/wiki/BERT_(language_model)

The intro and outro music is from Hitman's Lovesong feat. Paola Graziano (https://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/Tales_Of_A_Dead_Fish/Hitmans_Lovesong/) by The Freak Fandango Orchestra (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/), licensed under CC BY-SA 3.0 (https://creativecommons.org/licenses/by-sa/3.0/).
76 MIN
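The core idea this episode describes, writing ML code once and treating the compute target as configuration rather than a rewrite, can be illustrated with a small hypothetical sketch. The `run_on` helper and its process-pool "cluster" backend below are stand-ins invented for this example and are not Runhouse's interface; a real system would ship the function to GPU machines or a scheduler such as Kubernetes or Slurm.

```python
# An illustrative sketch of "same code, different compute": the training function is
# written once and the execution target is a configuration detail. The `run_on` helper
# and the process-pool "cluster" backend are hypothetical, NOT Runhouse's API.
from concurrent.futures import ProcessPoolExecutor
from typing import Callable

def train(dataset: str, epochs: int) -> dict:
    """Placeholder training routine; imagine a PyTorch training loop here."""
    loss = 1.0 / (epochs + 1)  # fake metric so the example produces output
    return {"dataset": dataset, "epochs": epochs, "final_loss": round(loss, 3)}

def run_on(target: str, fn: Callable, /, *args, **kwargs):
    """Dispatch `fn` to the named compute target; only the target string changes
    between local development and 'production' in this toy example."""
    if target == "local":
        return fn(*args, **kwargs)
    if target == "cluster":
        # Stand-in for remote execution: serialize the call and run it elsewhere.
        with ProcessPoolExecutor(max_workers=1) as pool:
            return pool.submit(fn, *args, **kwargs).result()
    raise ValueError(f"unknown compute target: {target}")

if __name__ == "__main__":
    print(run_on("local", train, "reviews.parquet", epochs=3))
    print(run_on("cluster", train, "reviews.parquet", epochs=30))
```

The sketch captures the friction the episode is about: when the research artifact and the production artifact are the same function, the handoff between teams becomes a change of target rather than a rewrite.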