Podcast for Zvi's blog, Don't Worry About the Vase Podcast
Claude Opus 4.8: The System Card
MAY 29, 2026 · 55 MIN
Description
<p>The Don’t Worry About the Vase Podcast is a listener-supported podcast. To receive new posts and support the cost of creation, consider becoming a free or paid subscriber.</p><p>* 00:00 - Introduction</p><p>* 01:15 - Table of Contents</p><p>* 02:32 - Here We Go Again: Executive Summary</p><p>* 03:58 - Introduction (1)</p><p>* 04:03 - RSP Evaluations (2)</p><p>* 05:11 - Move That Goalpost</p><p>* 07:11 - The Failures Are News</p><p>* 09:21 - Alignment Risk Slowly Rises</p><p>* 10:52 - New Risk Pathways Just Dropped</p><p>* 13:28 - Cyber (3)</p><p>* 14:27 - Harmful Requests (4.1)</p><p>* 16:46 - We Need To Talk (4.2 and 4.3)</p><p>* 19:56 - Overcoming Bias (4.4)</p><p>* 21:59 - Agentic Safety (5)</p><p>* 24:38 - Prompt Injection (5.2)</p><p>* 31:08 - Alignment (6)</p><p>* 32:23 - Looking For Problems</p><p>* 33:54 - Who Watches The Training (6.2.2)</p><p>* 38:02 - Automated Behavioral Audit</p><p>* 38:39 - The Model Is Smarter Than The Eval (6.2.3.2)</p><p>* 40:48 - You Should See The Other Guy</p><p>* 43:12 - UK AISI Testing (6.2.4)</p><p>* 43:32 - In Vendbench (6.2.5)</p><p>* 46:10 - Honesty (6.3.3 to 6.3.6)</p><p>* 49:00 - Chain of Thought (CoT) Monitorability (6.5)</p><p>* 51:46 - What’s In The Box? (6.6)</p><p>* 54:01 - That’s All For Now</p><p><a target="_blank" href="https://open.substack.com/pub/thezvi/p/claude-opus-48-is-honestly-better?r=67y1h&utm_campaign=post-expanded-share&utm_medium=web">https://open.substack.com/pub/thezvi/p/claude-opus-48-is-honestly-better?r=67y1h&utm_campaign=post-expanded-share&utm_medium=web</a></p>