
Dwarkesh Podcast

Dwarkesh Patel

Jeff Dean & Noam Shazeer – 25 years at Google: from PageRank to AGI

FEB 12, 2025 · 134 MIN

Description

<p>This week I welcome on the show two of the most important technologists ever, in any field.</p><p>Jeff Dean is Google's Chief Scientist, and over 25 years at the company he has worked on many of the most transformative systems in modern computing: from MapReduce, Bigtable, TensorFlow, and AlphaChip to Gemini.</p><p>Noam Shazeer invented or co-invented all the main architectures and techniques that are used for modern LLMs: from the Transformer itself, to Mixture of Experts, to Mesh TensorFlow, to Gemini and many other things.</p><p>We talk about their 25 years at Google, going from PageRank to MapReduce to the Transformer to MoEs to AlphaChip – and maybe soon to ASI.</p><p>My favorite part was Jeff's vision for Pathways, Google’s grand plan for a mutually reinforcing loop of hardware and algorithmic design and for going past autoregression. That culminates in us imagining *all* of Google-the-company going through one huge MoE model.</p><p>And Noam just bites every bullet: 100x world GDP soon; let’s get a million automated researchers running in the Google datacenter; living to see the year 3000.</p><p>Watch on <a target="_blank" href="https://youtu.be/v0gjI__RyCY">YouTube</a>; listen on <a target="_blank" href="https://podcasts.apple.com/us/podcast/jeff-dean-noam-shazeer-25-years-at-google-from-pagerank/id1516093381?i=1000691556147">Apple Podcasts</a> or <a target="_blank" href="https://open.spotify.com/episode/4atx1POpKIL8WGvdVfdnbb?si=XYxo6SIyRi2qmZ1ZGfl6vw">Spotify</a>.</p><p>Sponsors</p><p>Scale partners with major AI labs like Meta, Google DeepMind, and OpenAI. Through Scale’s Data Foundry, labs get access to high-quality data to fuel post-training, including advanced reasoning capabilities. 
If you’re an AI researcher or engineer, learn how Scale’s Data Foundry and research lab, SEAL, can help you go beyond the current frontier at <a target="_blank" href="https://scale.com/dwarkesh">scale.com/dwarkesh</a>.</p><p>Curious how Jane Street teaches their new traders? They use Figgie, a rapid-fire card game that simulates the most exciting parts of markets and trading. It’s become so popular that Jane Street hosts an inter-office Figgie championship every year. Download from the app store or play on your desktop at <a target="_blank" href="https://www.figgie.com/">figgie.com</a>.</p><p>Meter wants to radically improve the digital world we take for granted. They’re developing a foundation model that automates network management end-to-end. To do this, they just announced a long-term partnership with Microsoft for tens of thousands of GPUs, and they’re recruiting a world-class AI research team. To learn more, go to <a target="_blank" href="https://meter.com/dwarkesh">meter.com/dwarkesh</a>.</p><p>To sponsor a future episode, visit <a target="_blank" href="https://www.dwarkeshpatel.com/p/advertise">dwarkeshpatel.com/p/advertise</a>.</p><p>Timestamps</p><p>00:00:00 - Intro</p><p>00:02:44 - Joining Google in 1999</p><p>00:05:36 - Future of Moore's Law</p><p>00:10:21 - Future TPUs</p><p>00:13:13 - Jeff’s undergrad thesis: parallel backprop</p><p>00:15:10 - LLMs in 2007</p><p>00:23:07 - “Holy s**t” moments</p><p>00:29:46 - AI fulfills Google’s original mission</p><p>00:34:19 - Doing Search in-context</p><p>00:38:32 - The internal coding model</p><p>00:39:49 - What will 2027 models do?</p><p>00:46:00 - A new architecture every day?</p><p>00:49:21 - Automated chip design and intelligence explosion</p><p>00:57:31 - Future of inference scaling</p><p>01:03:56 - Already doing multi-datacenter runs</p><p>01:22:33 - Debugging at scale</p><p>01:26:05 - Fast takeoff and superalignment</p><p>01:34:40 - A million evil Jeff Deans</p><p>01:38:16 - Fun times at 
Google</p><p>01:41:50 - World compute demand in 2030</p><p>01:48:21 - Getting back to modularity</p><p>01:59:13 - Keeping a giga-MoE in-memory</p><p>02:04:09 - All of Google in one model</p><p>02:12:43 - What’s missing from distillation</p><p>02:18:03 - Open research, pros and cons</p><p>02:24:54 - Going the distance</p> <br/><br/>Get full access to Dwarkesh Podcast at <a href="https://www.dwarkesh.com/subscribe?utm_medium=podcast&#38;utm_campaign=CTA_4">www.dwarkesh.com/subscribe</a>