Women in AI Research (WiAIR)
Details

Women in AI Research (WiAIR) is a podcast dedicated to celebrating the remarkable contributions of female AI researchers from around the globe. Our mission is to challenge the prevailing perception that AI research is predominantly male-driven, and our goal is to empower early-career researchers, especially women, to pursue their passion for AI and make an impact in this rapidly growing field. You will learn from women at different career stages, stay updated on the latest research and advancements, and hear powerful stories of overcoming obstacles and breaking stereotypes.

Recent Episodes

EACL 2026: LLMs Can Hear… But Can They Reason? A New Benchmark for Audio Intelligence
APR 13, 2026
What does it actually mean for a model to understand audio?

Paper: https://arxiv.org/abs/2601.19673

In this episode, I talk with Iwona Christop, a PhD student at Adam Mickiewicz University, about her recent EACL paper introducing ART (Audio Reasoning Tasks), a new benchmark designed to evaluate whether multimodal LLMs can truly reason over audio, not just transcribe or classify it.

Most existing benchmarks test audio skills in isolation (like ASR or classification). But real-world intelligence requires something deeper: combining signals, comparing sounds, tracking context, and making decisions.

This work takes a different approach:
- No text-only shortcuts: tasks can't be solved via transcription alone
- Reasoning-first design: models must combine multiple audio cues
- No expert knowledge required: anyone can verify correctness

We also dive into the diverse task design, including:
- Audio arithmetic (counting and comparing sounds)
- Cross-recording speaker & language identification
- Sound-based reasoning (e.g., inferring properties from audio)
- Speech feature comparison (accents, variations)
- Multimodal reasoning across text and sound

The dataset includes 9 tasks, 9,000 samples, and 30+ hours of audio, all generated in a scalable way using templates and TTS.

👉 If you care about multimodal reasoning, evaluation, or the limits of current LLM capabilities, this conversation is for you.

Iwona Christop: https://www.linkedin.com/in/iwona-christop/

👍 Like & subscribe for more deep dives into cutting-edge AI research
🔔 New episodes from EACL 2026 coming soon

#WiAIR #EACL2026
18 MIN
EACL 2026: Reasoning Can Hurt LLM Safety?! Rethinking Accuracy in AI Systems
APR 10, 2026
In this episode of #WiAIRpodcast, we dive into a subtle but critical question: does adding reasoning actually make LLMs safer and more reliable?

Paper: https://arxiv.org/abs/2510.21049

Atoosa Chegini (University of Maryland, Apple) presents Reasoning's Razor (EACL 2026), where she and her collaborators examine how reasoning impacts high-stakes binary classification tasks, including safety filtering and hallucination detection.

Their findings highlight an important nuance: while reasoning can improve overall accuracy, it may degrade performance at low false positive rates -- exactly where real-world systems need to operate.

This conversation covers:
- Why accuracy is a misleading metric for safety-critical LLM applications
- The importance of evaluating models at fixed false positive rates (FPR)
- How two models with identical accuracy can behave completely differently in deployment
- The impact of "think-on" (with reasoning) vs. "think-off" (no reasoning) settings
- Practical implications for RLHF, SFT, and post-training pipelines

If you're working on LLM evaluation & reliability, AI safety or hallucination detection, or production deployment of language models, this discussion offers a perspective that is both technically grounded and immediately actionable.

Atoosa Chegini:
https://www.linkedin.com/in/atoosa-chegini-6713741a3/
https://scholar.google.com/citations?user=5nY9tagAAAAJ&hl=en&oi=ao

👍 Like & subscribe for more deep dives into cutting-edge AI research
🔔 New episodes from EACL 2026 coming soon
21 MIN
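The fixed-FPR point from the Reasoning's Razor episode is easy to see with a toy example. The sketch below is illustrative only (it is not code from the paper, and the helpers `tpr_at_fpr` and `accuracy` are names I made up): two toy classifiers have identical accuracy at a 0.5 threshold, yet achieve very different recall once the operating threshold is constrained to a strict false positive rate.

```python
# Illustrative sketch: why accuracy can hide differences at low
# false positive rates (FPR). Label 1 = "unsafe" content; each score
# is the model's estimated probability that the input is unsafe.

def accuracy(scores, labels, threshold=0.5):
    """Fraction of examples classified correctly at a fixed threshold."""
    preds = [int(s >= threshold) for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

def tpr_at_fpr(scores, labels, max_fpr):
    """Highest true positive rate achievable while keeping the false
    positive rate at or below `max_fpr`, sweeping score thresholds."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    best_tpr = 0.0
    # Candidate thresholds: every observed score (predict 1 if score >= t).
    for t in sorted(set(scores)):
        fpr = sum(s >= t for s in negatives) / len(negatives)
        tpr = sum(s >= t for s in positives) / len(positives)
        if fpr <= max_fpr:
            best_tpr = max(best_tpr, tpr)
    return best_tpr

if __name__ == "__main__":
    labels = [1, 1, 1, 1, 0, 0, 0, 0]
    # Model A misses one positive but separates the classes cleanly.
    model_a = [0.9, 0.8, 0.8, 0.4, 0.3, 0.2, 0.2, 0.1]
    # Model B makes one error too, but a negative scores as high as
    # a positive, so a zero-FPR threshold must sacrifice recall.
    model_b = [0.9, 0.8, 0.8, 0.7, 0.7, 0.2, 0.2, 0.1]
    print(accuracy(model_a, labels), accuracy(model_b, labels))  # 0.875 0.875
    print(tpr_at_fpr(model_a, labels, max_fpr=0.0))  # 1.0
    print(tpr_at_fpr(model_b, labels, max_fpr=0.0))  # 0.75
```

Both models score 87.5% accuracy, but at zero false positives Model A still catches every unsafe example while Model B catches only three of four, which is the kind of deployment-relevant gap the episode argues plain accuracy cannot reveal.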