AI Engineering Podcast

Tobias Macey

Details

This show is your guidebook to building scalable and maintainable AI systems. You will learn how to architect AI applications, how to apply AI to your work, and what considerations are involved in building or customizing new models: everything you need to know to deliver real impact and value with machine learning and artificial intelligence.

Recent Episodes

ML Infrastructure Without The Ops: Simplifying The ML Developer Experience With Runhouse
NOV 11, 2024
Summary<br />Machine learning workflows have long been complex and difficult to operationalize. They are often characterized by a period of research, resulting in an artifact that gets passed to another engineer or team to prepare for running in production. The MLOps category of tools has tried to build a new set of utilities to reduce that friction, but has instead introduced a new barrier at the team and organizational level. Donny Greenberg took the lessons that he learned on the PyTorch team at Meta and created Runhouse. In this episode he explains how, by reducing the number of opinions in the framework, he has also reduced the complexity of moving from development to production for ML systems.<br /><br /><br />Announcements<br /><ul><li>Hello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systems</li><li>Your host is Tobias Macey and today I'm interviewing Donny Greenberg about Runhouse and the current state of ML infrastructure</li></ul>Interview<br /><ul><li>Introduction</li><li>How did you get involved in machine learning?</li><li>What are the core elements of infrastructure for ML and AI?<ul><li>How has that changed over the past ~5 years?</li><li>For the past few years the MLOps and data engineering stacks were built and managed separately. How does the current generation of tools and product requirements influence the present and future approach to those domains?</li></ul></li><li>There are numerous projects that aim to bridge the complexity gap in running Python and ML code from your laptop up to distributed compute on clouds (e.g. Ray, Metaflow, Dask, Modin, etc.). How do you view the decision process for teams trying to understand which tool(s) to use for managing their ML/AI developer experience?</li><li>Can you describe what Runhouse is and the story behind it?<ul><li>What are the core problems that you are working to solve?</li><li>What are the main personas that you are focusing on? (e.g. data scientists, DevOps, data engineers, etc.)</li><li>How does Runhouse factor into collaboration across skill sets and teams?</li></ul></li><li>Can you describe how Runhouse is implemented?<ul><li>How has the focus on developer experience informed the way that you think about the features and interfaces that you include in Runhouse?</li></ul></li><li>How do you think about the role of Runhouse in the integration with the AI/ML and data ecosystem?</li><li>What does the workflow look like for someone building with Runhouse?</li><li>What is involved in managing the coordination of compute and data locality to reduce networking costs and latencies?</li><li>What are the most interesting, innovative, or unexpected ways that you have seen Runhouse used?</li><li>What are the most interesting, unexpected, or challenging lessons that you have learned while working on Runhouse?</li><li>When is Runhouse the wrong choice?</li><li>What do you have planned for the future of Runhouse?</li><li>What is your vision for the future of infrastructure and developer experience in ML/AI?</li></ul>Contact Info<br /><ul><li><a href="https://www.linkedin.com/in/greenbergdon/" target="_blank">LinkedIn</a></li></ul>Parting Question<br /><ul><li>From your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?</li></ul>Closing Announcements<br /><ul><li>Thank you for listening! Don't forget to check out our other shows.
The <a href="https://www.dataengineeringpodcast.com" target="_blank">Data Engineering Podcast</a> covers the latest on modern data management. <a href="https://www.pythonpodcast.com" target="_blank">Podcast.__init__</a> covers the Python language, its community, and the innovative ways it is being used.</li><li>Visit the <a href="https://www.aiengineeringpodcast.com" target="_blank">site</a> to subscribe to the show, sign up for the mailing list, and read the show notes.</li><li>If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.</li><li>To help other people find the show please leave a review on <a href="https://podcasts.apple.com/us/podcast/the-machine-learning-podcast/id1626358243" target="_blank">iTunes</a> and tell your friends and co-workers.</li></ul>Links<br /><ul><li><a href="https://www.run.house/" target="_blank">Runhouse</a><ul><li><a href="https://github.com/run-house/runhouse" target="_blank">GitHub</a></li></ul></li><li><a href="https://pytorch.org/" target="_blank">PyTorch</a><ul><li><a href="https://www.pythonpodcast.com/pytorch-deep-learning-epsiode-202" target="_blank">Podcast.__init__ Episode</a></li></ul></li><li><a href="https://kubernetes.io/" target="_blank">Kubernetes</a></li><li><a href="https://en.wikipedia.org/wiki/Bin_packing_problem" target="_blank">Bin Packing</a></li><li><a href="https://en.wikipedia.org/wiki/Linear_regression" target="_blank">Linear Regression</a></li><li><a href="https://developers.google.com/machine-learning/decision-forests/intro-to-gbdt" target="_blank">Gradient Boosted Decision Tree</a></li><li><a href="https://en.wikipedia.org/wiki/Deep_learning" target="_blank">Deep Learning</a></li><li><a href="https://en.wikipedia.org/wiki/Transformer_(deep_learning_architecture" target="_blank">Transformer Architecture</a>)</li><li><a href="https://slurm.schedmd.com/documentation.html" target="_blank">Slurm</a></li><li><a href="https://aws.amazon.com/sagemaker/" target="_blank">Sagemaker</a></li><li><a href="https://cloud.google.com/vertex-ai?hl=en" target="_blank">Vertex AI</a></li><li><a href="https://metaflow.org/" target="_blank">Metaflow</a><ul><li><a href="https://www.pythonpodcast.com/metaflow-machine-learning-operations-episode-274" target="_blank">Podcast.__init__ Episode</a></li></ul></li><li><a href="https://mlflow.org/" target="_blank">MLFlow</a></li><li><a href="https://www.dask.org/" target="_blank">Dask</a><ul><li><a href="https://www.dataengineeringpodcast.com/episode-2-dask-with-matthew-rocklin" target="_blank">Data Engineering Podcast Episode</a></li></ul></li><li><a href="https://www.ray.io/" target="_blank">Ray</a><ul><li><a href="https://www.pythonpodcast.com/ray-distributed-computing-episode-258" target="_blank">Podcast.__init__ Episode</a></li></ul></li><li><a href="https://spark.apache.org/" target="_blank">Spark</a></li><li><a href="https://www.databricks.com/" target="_blank">Databricks</a></li><li><a href="https://www.snowflake.com/en/" target="_blank">Snowflake</a></li><li><a href="https://argo-cd.readthedocs.io/en/stable/" target="_blank">ArgoCD</a></li><li><a href="https://pytorch.org/tutorials/beginner/dist_overview.html" target="_blank">PyTorch Distributed</a></li><li><a href="https://horovod.ai/" target="_blank">Horovod</a></li><li><a href="https://github.com/ggerganov/llama.cpp" target="_blank">Llama.cpp</a></li><li><a href="https://www.prefect.io/" target="_blank">Prefect</a><ul><li><a 
href="https://www.dataengineeringpodcast.com/prefect-workflow-engine-episode-86" target="_blank">Data Engineering Podcast Episode</a></li></ul></li><li><a href="https://airflow.apache.org/" target="_blank">Airflow</a></li><li><a href="https://en.wikipedia.org/wiki/Out_of_memory" target="_blank">OOM == Out of Memory</a></li><li><a href="https://wandb.ai/site/" target="_blank">Weights and Biases</a></li><li><a href="https://knative.dev/docs/" target="_blank">KNative</a></li><li><a href="https://en.wikipedia.org/wiki/BERT_(language_model" target="_blank">BERT</a> language model</li></ul>The intro and outro music is from <a href="https://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/Tales_Of_A_Dead_Fish/Hitmans_Lovesong/" target="_blank">Hitman's Lovesong feat. Paola Graziano</a> by <a href="http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/" target="_blank">The Freak Fandango Orchestra</a>/<a href="https://creativecommons.org/licenses/by-sa/3.0/" target="_blank">CC BY-SA 3.0</a>
76 MIN
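For readers who want to see the development-to-production pattern discussed in this episode in code, the following is a minimal sketch modeled on Runhouse's published examples. The cluster settings and the function being dispatched are illustrative assumptions, not material from the episode; consult the Runhouse documentation for the current API.

```python
# A sketch of the "write locally, run on remote compute" pattern that Runhouse
# targets. Cluster name, instance type, and the function body are illustrative;
# verify the exact API against the current Runhouse docs before use.
import runhouse as rh


def preprocess(num_rows: int) -> int:
    # Ordinary Python; heavy dependencies only need to exist on the remote machine.
    import numpy as np
    data = np.random.rand(num_rows, 16)
    return data.shape[0]


if __name__ == "__main__":
    # Provision (or reuse) an on-demand cloud instance and send the function to it.
    cluster = rh.cluster(name="rh-demo", instance_type="CPU:4+", provider="aws").up_if_not()
    remote_preprocess = rh.function(preprocess).to(cluster)
    print(remote_preprocess(1_000))  # Executes on the cluster; result returns locally.
```

The same function object can later be sent to a different cluster without rewriting it, which is the kind of development-to-production friction the episode focuses on.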
Building AI Systems on Postgres: An Inside Look at pgai Vectorizer
NOV 11, 2024
Summary<br />With the growth of vector data as a core element of any AI application comes the need to keep those vectors up to date. When you go beyond prototypes and into production you will need a way to continue experimenting with new embedding models, chunking strategies, etc. You will also need a way to keep the embeddings up to date as your data changes. The team at Timescale created the pgai Vectorizer toolchain to let you manage that work in your Postgres database. In this episode Avthar Sewrathan explains how it works and how you can start using it today.<br /><br /><br />Announcements<br /><ul><li>Hello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systems</li><li>Your host is Tobias Macey and today I'm interviewing Avthar Sewrathan about the pgai extension for Postgres and how to run your AI workflows in your database</li></ul>Interview<br /><ul><li>Introduction</li><li>How did you get involved in machine learning?</li><li>Can you describe what pgai Vectorizer is and the story behind it?</li><li>What are the benefits of using the database engine to execute AI workflows?<ul><li>What types of operations does pgai Vectorizer enable?</li><li>What are some common generative AI patterns that can't be done with pgai?</li></ul></li><li>AI applications require a large and complex set of dependencies. How does that work with pgai Vectorizer and the Python runtime in Postgres?<ul><li>What are some of the other challenges or system pressures that are introduced by running these AI workflows in the database context?</li></ul></li><li>Can you describe how the pgai extension is implemented?</li><li>With the rapid pace of change in the AI ecosystem, how has that informed the set of features that make sense in pgai Vectorizer and won't require rebuilding in 6 months?</li><li>Can you describe the workflow of using pgai Vectorizer to build and maintain a set of embeddings in their database?<ul><li>How can pgai Vectorizer help with the situation of migrating to a new embedding model and having to reindex all of the content?</li></ul></li><li>How do you think about the developer experience for people who are working with pgai Vectorizer, as compared to using e.g. LangChain, LlamaIndex, etc.?</li><li>What are the most interesting, innovative, or unexpected ways that you have seen pgai Vectorizer used?</li><li>What are the most interesting, unexpected, or challenging lessons that you have learned while working on pgai Vectorizer?</li><li>When is pgai Vectorizer the wrong choice?</li><li>What do you have planned for the future of pgai Vectorizer?</li></ul>Contact Info<br /><ul><li><a href="https://www.linkedin.com/in/avthars/" target="_blank">LinkedIn</a></li><li><a href="https://avthar.com/" target="_blank">Website</a></li></ul>Parting Question<br /><ul><li>From your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?</li></ul>Closing Announcements<br /><ul><li>Thank you for listening! Don't forget to check out our other shows. The <a href="https://www.dataengineeringpodcast.com" target="_blank">Data Engineering Podcast</a> covers the latest on modern data management. 
<a href="https://www.pythonpodcast.com" target="_blank">Podcast.__init__</a> covers the Python language, its community, and the innovative ways it is being used.</li><li>Visit the <a href="https://www.aiengineeringpodcast.com" target="_blank">site</a> to subscribe to the show, sign up for the mailing list, and read the show notes.</li><li>If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.</li><li>To help other people find the show please leave a review on <a href="https://podcasts.apple.com/us/podcast/the-machine-learning-podcast/id1626358243" target="_blank">iTunes</a> and tell your friends and co-workers.</li></ul>Links<br /><ul><li><a href="https://www.timescale.com/" target="_blank">Timescale</a></li><li><a href="https://github.com/timescale/pgai" target="_blank">pgai</a></li><li><a href="https://en.wikipedia.org/wiki/Transformer_(deep_learning_architecture" target="_blank">Transformer</a> architecture for deep learning</li><li><a href="https://en.wikipedia.org/wiki/Neural_network_(machine_learning" target="_blank">Neural Networks</a></li><li><a href="https://github.com/pgvector/pgvector" target="_blank">pgvector</a></li><li><a href="https://github.com/timescale/pgvectorscale" target="_blank">pgvectorscale</a></li><li><a href="https://modal.com/docs" target="_blank">Modal</a></li><li><a href="https://en.wikipedia.org/wiki/Retrieval-augmented_generation" target="_blank">RAG == Retrieval Augmented Generation</a></li><li><a href="https://en.wikipedia.org/wiki/Semantic_search" target="_blank">Semantic Search</a></li><li><a href="https://ollama.com/" target="_blank">Ollama</a></li><li><a href="https://neo4j.com/blog/graphrag-manifesto/" target="_blank">GraphRAG</a></li><li><a href="https://github.com/bitnine-oss/agensgraph" target="_blank">agensgraph</a></li><li><a href="https://www.langchain.com/" target="_blank">LangChain</a></li><li><a href="https://www.llamaindex.ai/" target="_blank">LlamaIndex</a></li><li><a href="https://haystack.deepset.ai/" target="_blank">Haystack</a></li><li><a href="https://skyzh.github.io/write-you-a-vector-db/cpp-05-ivfflat.html" target="_blank">IVFFlat</a></li><li><a href="https://skyzh.github.io/write-you-a-vector-db/cpp-06-02-hnsw.html" target="_blank">HNSW</a></li><li><a href="https://proceedings.neurips.cc/paper_files/paper/2019/file/09853c7fb1d3f8ee67a61b6bf4a7f8e6-Paper.pdf" target="_blank">DiskANN</a></li><li><a href="https://docs.replit.com/replitai/agent" target="_blank">Repl.it Agent</a></li><li><a href="https://en.wikipedia.org/wiki/Okapi_BM25" target="_blank">BM25</a></li><li><a href="https://www.postgresql.org/docs/current/datatype-textsearch.html#DATATYPE-TSVECTOR" target="_blank">TSVector</a></li><li><a href="https://www.paradedb.com/" target="_blank">ParadeDB</a></li></ul>The intro and outro music is from <a href="https://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/Tales_Of_A_Dead_Fish/Hitmans_Lovesong/" target="_blank">Hitman's Lovesong feat. Paola Graziano</a> by <a href="http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/" target="_blank">The Freak Fandango Orchestra</a>/<a href="https://creativecommons.org/licenses/by-sa/3.0/" target="_blank">CC BY-SA 3.0</a>
53 MIN
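To make the vectorizer workflow discussed above concrete, here is a small Python sketch of declaring a vectorizer over a source table and then querying the resulting embeddings. The table name, embedding model, chunking configuration, and the shapes of the `ai.create_vectorizer` and `ai.openai_embed` calls follow pgai's published examples but should be treated as assumptions; check the pgai documentation for current syntax.

```python
# Sketch of using pgai Vectorizer from Python. The SQL mirrors the project's
# published examples; table names, model choice, and argument names are
# assumptions and may differ from the current pgai release.
import psycopg2

conn = psycopg2.connect("postgresql://postgres:password@localhost:5432/postgres")
with conn, conn.cursor() as cur:
    # Declare a vectorizer once; pgai keeps the embeddings in sync as rows change.
    cur.execute("""
        SELECT ai.create_vectorizer(
            'blog'::regclass,
            destination => 'blog_embeddings',
            embedding   => ai.embedding_openai('text-embedding-3-small', 768),
            chunking    => ai.chunking_recursive_character_text_splitter('content')
        );
    """)
    # Semantic search against the continuously maintained embeddings view.
    cur.execute("""
        SELECT chunk
        FROM blog_embeddings
        ORDER BY embedding <=> ai.openai_embed('text-embedding-3-small', %s)
        LIMIT 5;
    """, ("how do I tune autovacuum?",))
    for (chunk,) in cur.fetchall():
        print(chunk)
```

Because the vectorizer is declared in the database, switching embedding models becomes closer to a configuration change than a bespoke re-indexing pipeline, which is the migration scenario raised in the interview questions above.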
Running Generative AI Models In Production
OCT 28, 2024
Summary<br />In this episode Philip Kiely from Baseten talks about the intricacies of running open models in production. Philip shares his journey into AI and ML engineering, highlighting the importance of understanding product-level requirements and selecting the right model for deployment. The conversation covers the operational aspects of deploying AI models, including model evaluation, compound AI, and model serving frameworks such as TensorFlow Serving and AWS SageMaker. Philip also discusses the challenges of model quantization, rapid model evolution, and monitoring and observability in AI systems, offering valuable insights into the future trends in AI, including local inference and the competition between open source and proprietary models.<br /><br /><br />Announcements<br /><ul><li>Hello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systems</li><li>Your host is Tobias Macey and today I'm interviewing Philip Kiely about running open models in production</li></ul>Interview<br /><ul><li>Introduction</li><li>How did you get involved in machine learning?</li><li>Can you start by giving an overview of the major decisions to be made when planning the deployment of a generative AI model?</li><li>How does the model selected in the beginning of the process influence the downstream choices?</li><li>In terms of application architecture, the major patterns that I've seen are RAG, fine-tuning, multi-agent, or large model. What are the most common methods that you see? (and any that I failed to mention)<ul><li>How has the rapid succession of model generations impacted the ways that teams think about their overall application? (capabilities, features, architecture, etc.)</li></ul></li><li>In terms of model serving, I know that Baseten created Truss. What are some of the other notable options that teams are building with?<ul><li>What is the role of the serving framework in the context of the application?</li></ul></li><li>There are also a large number of inference engines that have been released. What are the major players in that arena?<ul><li>What are the features and capabilities that they are each basing their competitive advantage on?</li></ul></li><li>For someone who is new to AI Engineering, what are some heuristics that you would recommend when choosing an inference engine?</li><li>Once a model (or set of models) is in production and serving traffic it's necessary to have visibility into how it is performing. What are the key metrics that are necessary to monitor for generative AI systems?<ul><li>In the event that one (or more) metrics are trending negatively, what are the levers that teams can pull to improve them?</li></ul></li><li>When running models constructed with e.g. linear regression or deep learning there was a common issue with "concept drift".
How does that manifest in the context of large language models, particularly when coupled with performance optimization?</li><li>What are the most interesting, innovative, or unexpected ways that you have seen teams manage the serving of open gen AI models?</li><li>What are the most interesting, unexpected, or challenging lessons that you have learned while working with generative AI model serving?</li><li>When is Baseten the wrong choice?</li><li>What are the future trends and technology investments that you are focused on in the space of AI model serving?</li></ul>Contact Info<br /><ul><li><a href="https://www.linkedin.com/in/philipkiely/" target="_blank">LinkedIn</a></li><li><a href="https://x.com/philip_kiely" target="_blank">Twitter</a></li></ul>Parting Question<br /><ul><li>From your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?</li></ul>Closing Announcements<br /><ul><li>Thank you for listening! Don't forget to check out our other shows. The <a href="https://www.dataengineeringpodcast.com" target="_blank">Data Engineering Podcast</a> covers the latest on modern data management. <a href="https://www.pythonpodcast.com" target="_blank">Podcast.__init__</a> covers the Python language, its community, and the innovative ways it is being used.</li><li>Visit the <a href="https://www.aiengineeringpodcast.com" target="_blank">site</a> to subscribe to the show, sign up for the mailing list, and read the show notes.</li><li>If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.</li><li>To help other people find the show please leave a review on <a href="https://podcasts.apple.com/us/podcast/the-machine-learning-podcast/id1626358243" target="_blank">iTunes</a> and tell your friends and co-workers.</li></ul>Links<br /><ul><li><a href="https://www.baseten.co/" target="_blank">Baseten</a><ul><li><a href="https://www.aiengineeringpodcast.com/wrap-your-model-in-a-full-stack-application-in-an-afternoon-with-baseten" target="_blank">Podcast Episode</a></li></ul></li><li><a href="https://en.wikipedia.org/wiki/Copyleft" target="_blank">Copyleft</a></li><li><a href="https://www.llama.com/" target="_blank">Llama Models</a></li><li><a href="https://www.nomic.ai/blog/posts/nomic-embed-text-v1" target="_blank">Nomic</a></li><li><a href="https://allenai.org/olmo" target="_blank">Olmo</a></li><li><a href="https://allenai.org/" target="_blank">Allen Institute for AI</a></li><li><a href="https://www.baseten.co/library/playground-v2-aesthetic/" target="_blank">Playground 2</a></li><li><a href="https://calmfund.com/thesis#:~:text=The%20Essential%20Ingredient%3A%20The%20Peace%20Dividend%20of%20the%20SaaS%20Wars&amp;text=A%20peace%20dividend%20refers%20to,put%20it%20to%20better%20uses." 
target="_blank">The Peace Dividend Of The SaaS Wars</a></li><li><a href="https://vercel.com/" target="_blank">Vercel</a></li><li><a href="https://www.netlify.com/" target="_blank">Netlify</a></li><li><a href="https://en.wikipedia.org/wiki/Retrieval-augmented_generation" target="_blank">RAG == Retrieval Augmented Generation</a><ul><li><a href="https://www.aiengineeringpodcast.com/retrieval-augmented-generation-implementation-episode-34" target="_blank">Podcast Episode</a></li></ul></li><li><a href="https://www.baseten.co/blog/compound-ai-systems-explained/" target="_blank">Compound AI</a></li><li><a href="https://www.langchain.com/" target="_blank">Langchain</a></li><li><a href="https://github.com/dottxt-ai/outlines" target="_blank">Outlines</a> Structured output for AI systems</li><li><a href="https://docs.baseten.co/deploy/overview" target="_blank">Truss</a></li><li><a href="https://docs.baseten.co/chains/overview" target="_blank">Chains</a></li><li><a href="https://www.llamaindex.ai/" target="_blank">Llamaindex</a></li><li><a href="https://www.ray.io/" target="_blank">Ray</a></li><li><a href="https://mlflow.org/" target="_blank">MLFlow</a></li><li><a href="https://github.com/replicate/cog" target="_blank">Cog</a> (Replicate) containers for ML</li><li><a href="https://www.bentoml.com/" target="_blank">BentoML</a></li><li><a href="https://www.djangoproject.com/" target="_blank">Django</a></li><li><a href="https://wsgi.readthedocs.io/en/latest/what.html" target="_blank">WSGI</a></li><li><a href="https://uwsgi-docs.readthedocs.io/en/latest/" target="_blank">uWSGI</a></li><li><a href="https://gunicorn.org/" target="_blank">Gunicorn</a></li><li><a href="https://zapier.com/" target="_blank">Zapier</a></li><li><a href="https://github.com/vllm-project/vllm" target="_blank">vLLM</a></li><li><a href="https://github.com/NVIDIA/TensorRT-LLM" target="_blank">TensorRT-LLM</a></li><li><a href="https://developer.nvidia.com/tensorrt" target="_blank">TensorRT</a></li><li><a href="https://www.baseten.co/blog/introduction-to-quantizing-ml-models/" target="_blank">Quantization</a></li><li><a href="https://arxiv.org/abs/2106.09685" target="_blank">LoRA</a> Low Rank Adaptation of Large Language Models</li><li><a href="https://en.wikipedia.org/wiki/Decision_tree_pruning" target="_blank">Pruning</a></li><li><a href="https://en.wikipedia.org/wiki/Knowledge_distillation" target="_blank">Distillation</a></li><li><a href="https://grafana.com/" target="_blank">Grafana</a></li><li><a href="https://pytorch.org/blog/hitchhikers-guide-speculative-decoding/" target="_blank">Speculative Decoding</a></li><li><a href="https://groq.com/" target="_blank">Groq</a></li><li><a href="https://www.runpod.io/" target="_blank">Runpod</a></li><li><a href="https://lambdalabs.com/" target="_blank">Lambda Labs</a></li></ul>The intro and outro music is from <a href="https://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/Tales_Of_A_Dead_Fish/Hitmans_Lovesong/" target="_blank">Hitman's Lovesong feat. Paola Graziano</a> by <a href="http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/" target="_blank">The Freak Fandango Orchestra</a>/<a href="https://creativecommons.org/licenses/by-sa/3.0/" target="_blank">CC BY-SA 3.0</a>
57 MIN
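As a companion to the discussion of inference engines, here is a brief sketch of serving an open model with vLLM, one of the engines listed in the links. The model name and sampling parameters are illustrative choices; the offline-inference API shown follows vLLM's documented usage and may change between releases.

```python
# Minimal offline-inference sketch with vLLM. The model name and sampling
# parameters are illustrative and assume the weights are accessible locally
# or from the Hugging Face Hub in your environment.
from vllm import LLM, SamplingParams

llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")  # any supported open model
params = SamplingParams(temperature=0.7, max_tokens=128)

prompts = ["Summarize the tradeoffs of quantizing an LLM for production."]
for output in llm.generate(prompts, params):
    print(output.outputs[0].text)
```

Comparable entry points exist for the other engines mentioned in the links; the competitive differences discussed in the episode tend to show up in areas such as throughput, batching, and quantization support.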
Enhancing AI Retrieval with Knowledge Graphs: A Deep Dive into GraphRAG
SEP 10, 2024
Summary<br />In this episode of the AI Engineering podcast, Philip Rathle, CTO of Neo4J, talks about the intersection of knowledge graphs and AI retrieval systems, specifically Retrieval Augmented Generation (RAG). He delves into GraphRAG, a novel approach that combines knowledge graphs with vector-based similarity search to enhance generative AI models. Philip explains how GraphRAG works by integrating a graph database for structured data storage, providing more accurate and explainable AI responses, and addressing limitations of traditional retrieval systems. The conversation covers technical aspects such as data modeling, entity extraction, and ontology use cases, as well as the infrastructure and workflow required to support GraphRAG, setting the stage for innovative applications across various industries.<br /><br />Announcements<br /><ul><li>Hello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systems</li><li>Your host is Tobias Macey and today I'm interviewing Philip Rathle about the application of knowledge graphs in AI retrieval systems</li></ul>Interview<br /><ul><li>Introduction</li><li>How did you get involved in machine learning?</li><li>Can you describe what GraphRAG is?<ul><li>What are the capabilities that graph structures offer beyond vector/similarity-based retrieval methods of prompting?</li></ul></li><li>What are some examples of the ways that semantic limitations of nearest-neighbor vector retrieval fail to provide relevant results?</li><li>What are the technical requirements to implement graph-augmented retrieval?<ul><li>What are the concrete ways in which the embedding and retrieval steps of a typical RAG pipeline need to be modified to account for the addition of the graph?</li></ul></li><li>Many tutorials for building vector-based knowledge repositories skip over considerations around data modeling. For building a graph-based knowledge repository there obviously needs to be a bit more work put in. What are the key design choices that need to be made for implementing the graph for an AI application?<ul><li>How does the selection of the ontology/taxonomy impact the performance and capabilities of the resulting application?</li></ul></li><li>Building a fully functional knowledge graph can be a significant undertaking on its own. How can LLMs and AI models help with the construction and maintenance of that knowledge repository?<ul><li>What are some of the validation methods that should be brought to bear to ensure that the resulting graph properly represents the knowledge domain that you are trying to model?</li></ul></li><li>Vector embedding and retrieval are a core building block for a majority of AI application frameworks. 
How much support do you see for GraphRAG in the ecosystem?<ul><li>For the case where someone is using a framework that does not explicitly implement GraphRAG techniques, what are some of the implementation strategies that you have seen be most effective for adding that functionality?</li></ul></li><li>What are some of the ways that the combination of vector search and knowledge graphs is useful independent of their combination with language models?</li><li>What are the most interesting, innovative, or unexpected ways that you have seen GraphRAG used?</li><li>What are the most interesting, unexpected, or challenging lessons that you have learned while working on GraphRAG applications?</li><li>When is GraphRAG the wrong choice?</li><li>What are the opportunities for improvement in the design and implementation of graph-based retrieval systems?</li></ul>Contact Info<br /><ul><li><a href="https://www.linkedin.com/in/prathle/" target="_blank">LinkedIn</a></li></ul>Parting Question<br /><ul><li>From your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?</li></ul>Closing Announcements<br /><ul><li>Thank you for listening! Don't forget to check out our other shows. The <a href="https://www.dataengineeringpodcast.com" target="_blank">Data Engineering Podcast</a> covers the latest on modern data management. <a href="https://www.pythonpodcast.com" target="_blank">Podcast.__init__</a> covers the Python language, its community, and the innovative ways it is being used.</li><li>Visit the <a href="https://www.aiengineeringpodcast.com" target="_blank">site</a> to subscribe to the show, sign up for the mailing list, and read the show notes.</li><li>If you've learned something or tried out a project from the show then tell us about it!
Email [email protected] with your story.</li><li>To help other people find the show please leave a review on <a href="https://podcasts.apple.com/us/podcast/the-machine-learning-podcast/id1626358243" target="_blank">iTunes</a> and tell your friends and co-workers.</li></ul>Links<br /><ul><li><a href="https://neo4j.com/" target="_blank">Neo4J</a></li><li><a href="https://neo4j.com/blog/graphrag-manifesto/" target="_blank">GraphRAG Manifesto</a></li><li><a href="https://github.blog/ai-and-ml/generative-ai/what-is-retrieval-augmented-generation-and-what-does-it-do-for-generative-ai/" target="_blank">RAG == Retrieval Augmented Generation</a><ul><li><a href="https://www.aiengineeringpodcast.com/retrieval-augmented-generation-implementation-episode-34" target="_blank">Podcast Episode</a></li></ul></li><li><a href="https://en.wikipedia.org/wiki/Very_large_database" target="_blank">VLDB == Very Large DataBases</a></li><li><a href="https://en.wikipedia.org/wiki/Knowledge_graph" target="_blank">Knowledge Graph</a></li><li><a href="https://en.wikipedia.org/wiki/Nearest_neighbor_search" target="_blank">Nearest Neighbor Search</a></li><li><a href="https://en.wikipedia.org/wiki/PageRank" target="_blank">PageRank</a></li><li><a href="https://blog.google/products/search/introducing-knowledge-graph-things-not/" target="_blank">Things Not Strings</a>) Google Knowledge Graph Paper</li><li><a href="https://github.com/pgvector/pgvector" target="_blank">pgvector</a></li><li><a href="https://www.pinecone.io/" target="_blank">Pinecone</a><ul><li><a href="https://www.dataengineeringpodcast.com/pinecone-vector-database-similarity-search-episode-189/" target="_blank">Data Engineering Podcast Episode</a></li></ul></li><li><a href="https://neo4j.com/docs/getting-started/data-modeling/relational-to-graph-modeling/" target="_blank">Tables To Labels</a></li><li><a href="https://en.wikipedia.org/wiki/Natural_language_processing" target="_blank">NLP == Natural Language Processing</a></li><li><a href="https://graph.build/resources/ontology" target="_blank">Ontology</a></li><li><a href="https://www.langchain.com/" target="_blank">LangChain</a></li><li><a href="https://www.llamaindex.ai/" target="_blank">LlamaIndex</a></li><li><a href="https://en.wikipedia.org/wiki/Reinforcement_learning_from_human_feedback" target="_blank">RLHF == Reinforcement Learning with Human Feedback</a></li><li><a href="https://senzing.com/" target="_blank">Senzing</a></li><li><a href="https://neo4j.com/labs/genai-ecosystem/neoconverse/" target="_blank">NeoConverse</a></li><li><a href="https://en.wikipedia.org/wiki/Cypher_(query_language" target="_blank">Cypher</a> query language</li><li><a href="https://www.gqlstandards.org/" target="_blank">GQL</a> query standard</li><li><a href="https://aws.amazon.com/bedrock/" target="_blank">AWS Bedrock</a></li><li><a href="https://cloud.google.com/vertex-ai" target="_blank">Vertex AI</a></li><li><a href="https://www.sequoiacap.com/podcast/training-data-sebastian-siemiatkowski/" target="_blank">Sequoia Training Data - Klarna episode</a></li><li><a href="https://en.wikipedia.org/wiki/Ouroboros" target="_blank">Ouroboros</a></li></ul>The intro and outro music is from <a href="https://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/Tales_Of_A_Dead_Fish/Hitmans_Lovesong/" target="_blank">Hitman's Lovesong feat. 
Paola Graziano</a> by <a href="http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/" target="_blank">The Freak Fandango Orchestra</a>/<a href="https://creativecommons.org/licenses/by-sa/3.0/" target="_blank">CC BY-SA 3.0</a>
59 MIN
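To illustrate what graph-augmented retrieval adds beyond plain vector search, here is a sketch using the Neo4j Python driver: a vector index lookup finds candidate chunks, then a graph traversal pulls in connected entities for additional context. The index name, node labels, relationship types, and `get_embedding` helper are hypothetical placeholders; the vector index procedure follows Neo4j 5.x documentation and should be verified against your deployment.

```python
# GraphRAG-style retrieval sketch: vector similarity finds entry points,
# then a graph hop gathers connected context. Index, label, and relationship
# names plus get_embedding() are hypothetical placeholders for your schema.
from neo4j import GraphDatabase


def get_embedding(text: str) -> list[float]:
    raise NotImplementedError  # call your embedding model of choice here


QUERY = """
CALL db.index.vector.queryNodes('chunk_embeddings', 5, $embedding)
YIELD node AS chunk, score
MATCH (chunk)-[:MENTIONS]->(entity)-[:RELATED_TO]-(neighbor)
RETURN chunk.text AS text, score,
       collect(DISTINCT neighbor.name) AS graph_context
"""

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
with driver.session() as session:
    results = session.run(QUERY, embedding=get_embedding("Who supplies part X?"))
    for record in results:
        print(record["score"], record["text"])
        print("  related entities:", record["graph_context"])
driver.close()
```

A plain vector search would stop after the first clause; the extra MATCH is where the knowledge graph contributes relationships that nearest-neighbor retrieval alone cannot surface.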
Harnessing Generative AI for Effective Digital Advertising Campaigns
SEP 2, 2024
Summary<br />In this episode of the AI Engineering podcast Praveen Gujar, Director of Product at LinkedIn, talks about the applications of generative AI in digital advertising. He highlights the key areas of digital advertising, including audience targeting, content creation, and ROI measurement, and delves into how generative AI is revolutionizing these aspects. Praveen shares successful case studies of generative AI in digital advertising, including campaigns by Heinz, the Barbie movie, and Maggi, and discusses the potential pitfalls and risks associated with AI-powered tools. He concludes with insights into the future of generative AI in digital advertising, highlighting the importance of cultural transformation and the synergy between human creativity and AI.<br />Announcements<br /><ul><li>Hello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systems</li><li>Your host is Tobias Macey and today I'm interviewing Praveen Gujar about the applications of generative AI in digital advertising</li></ul>Interview<br /><ul><li>Introduction</li><li>How did you get involved in machine learning?</li><li>Can you start by defining "digital advertising" for the scope of this conversation?<ul><li>What are the key elements/characteristics/goals of digital advertising?</li></ul></li><li>In the world before generative AI, what did a typical end-to-end advertising campaign workflow look like?<ul><li>What are the stages of that workflow where generative AI is proving to be most useful?<ul><li>How do the current limitations of generative AI (e.g. hallucinations, non-determinism) impact the ways in which they can be used?</li></ul></li></ul></li><li>What are the technological and organizational systems that need to be implemented to effectively apply generative AI in public-facing applications that are so closely tied to brand/company image?<ul><li>What are the elements of user education/expectation setting that are necessary when working with marketing/advertising personnel to help avoid damage to the brands?</li></ul></li><li>What are some examples of applications for generative AI in digital advertising that have gone well?<ul><li>Any that have gone wrong?</li></ul></li><li>What are the most interesting, innovative, or unexpected ways that you have seen generative AI used in digital advertising?</li><li>What are the most interesting, unexpected, or challenging lessons that you have learned while working on digital advertising applications of generative AI?</li><li>When is generative AI the wrong choice?</li><li>What are your future predictions for the use of generative AI in digital advertising?</li></ul>Contact Info<br /><ul><li><a href="https://www.praveengujar.com/" target="_blank">Website</a></li><li><a href="https://www.linkedin.com/in/praveengujar/" target="_blank">LinkedIn</a></li></ul>Parting Question<br /><ul><li>From your perspective, what is the biggest barrier to adoption of machine learning today?</li></ul>Closing Announcements<br /><ul><li>Thank you for listening! Don't forget to check out our other shows. The <a href="https://www.dataengineeringpodcast.com" target="_blank">Data Engineering Podcast</a> covers the latest on modern data management.
<a href="https://www.pythonpodcast.com" target="_blank">Podcast.__init__</a> covers the Python language, its community, and the innovative ways it is being used.</li><li>Visit the <a href="https://www.aiengineeringpodcast.com" target="_blank">site</a> to subscribe to the show, sign up for the mailing list, and read the show notes.</li><li>If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.</li><li>To help other people find the show please leave a review on <a href="https://podcasts.apple.com/us/podcast/the-machine-learning-podcast/id1626358243" target="_blank">iTunes</a> and tell your friends and co-workers.</li></ul>Links<br /><ul><li><a href="https://generativeai.net/" target="_blank">Generative AI</a></li><li><a href="https://en.wikipedia.org/wiki/Large_language_model" target="_blank">LLM == Large Language Model</a></li><li><a href="https://openai.com/index/dall-e/" target="_blank">Dall-E</a>)</li><li><a href="https://en.wikipedia.org/wiki/Reinforcement_learning_from_human_feedback" target="_blank">RLHF == Reinforcement Learning fHuman Feedback</a></li></ul>The intro and outro music is from <a href="https://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/Tales_Of_A_Dead_Fish/Hitmans_Lovesong/" target="_blank">Hitman's Lovesong feat. Paola Graziano</a> by <a href="http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/" target="_blank">The Freak Fandango Orchestra</a>/<a href="https://creativecommons.org/licenses/by-sa/3.0/" target="_blank">CC BY-SA 3.0</a>
41 MIN