90: Using AI at Work to Create an AI Quality Assurance System with Hernan Lardiez

FEB 9, 2026 · 52 MIN
Using AI at Work: AI in the Workplace & Generative AI for Business Leaders

Description

Chris Daigle sits down with Hernan Lardiez, COO of RagMetrics, to break down AI evaluations (evals) and why monitoring matters when you put GenAI into production, especially in regulated or high-risk environments.

Hernan explains what “good evals” actually look like without getting lost in technical weeds: building test datasets, measuring accuracy and consistency, and then continuously re-testing so you can catch drift before it becomes a business problem.

They compare the “spreadsheet + spot check” approach to automated eval pipelines that can run fast, repeatable tests at scale.

The conversation also covers a practical way to think about pre-production testing vs. in-production monitoring, why token usage and cost should be part of evaluation, and how small RAG tuning decisions (like Top-K chunks) can improve accuracy while cutting token consumption.

If you’re leading AI adoption and you want confidence, not guesswork, this episode will help you build the control points and guardrails to scale GenAI safely.

🔎 Find Out More About Hernan Lardiez
Hernan Lardiez on LinkedIn - https://www.linkedin.com/in/hlardiez/
RagMetrics - https://ragmetrics.ai/

🛠 AI Tools and Resources Mentioned
RagMetrics - https://ragmetrics.ai
The AI Exchange (Rachel Woods) - https://www.theaiexchange.com/
Chief AI Officer - https://www.chiefaiofficer.com/

📌 Chapters
00:00 Why regulated industries can’t “hope” with AI
02:04 What model evaluations (evals) actually are
05:08 The two audiences: business owner vs builders
08:52 Pre-production testing vs in-production monitoring
14:23 Why “monitoring is required” to reduce risk
16:14 Manual spreadsheet grading vs automated evals
18:01 Building test datasets + injecting through the pipeline
31:21 Measuring accuracy AND token consumption (cost)
34:01 Continuous evals to catch drift over time
42:11 RAG tuning: Top-K chunks, accuracy vs noise, token savings
49:21 Evals as “low-cost insurance” for production AI
50:27 Closing advice: control points + IT boundaries

In this clip from the Using AI at Work podcast, we explore the challenges of AI implementation, particularly for organizations in regulated markets. The discussion highlights the critical role of effective risk management in navigating potential outcomes.

We identify key stakeholders, like the business owner and the development team, who are crucial for understanding AI requirements and ensuring compliance. This session emphasizes the importance of strategic AI leadership and how an AI business can integrate these considerations for successful operations management.