<p>Episode 517 starts with a light chat about AI avatars and new text‑to‑speech deepfakes before diving into LLM “thinking” modes: what baked‑in planning actually does, why it multiplies token costs, and when it helps or hurts. James and Frank give concrete dev advice: try low‑thinking settings, use big models for creative planning and smaller ones to execute, lean on harnesses and system prompts, and keep in mind that quantized local models often do better without thinking.</p>

<h3>Follow Us</h3>

<ul>
<li>Frank: <a href="http://twitter.com/praeclarum" target="_blank" rel="nofollow noopener">Twitter</a>,  <a href="http://praeclarum.org" target="_blank" rel="nofollow noopener">Blog</a>, <a href="http://github.com/praeclarum" target="_blank" rel="nofollow noopener">GitHub</a></li>
<li>James: <a href="http://twitter.com/jamesmontemagno" target="_blank" rel="nofollow noopener">Twitter</a>,  <a href="https://montemagno.com" target="_blank" rel="nofollow noopener">Blog</a>, <a href="http://github.com/jamesmontemagno" target="_blank" rel="nofollow noopener">GitHub</a></li>
<li>Merge Conflict: <a href="http://twitter.com/mergeconflictfm" target="_blank" rel="nofollow noopener">Twitter</a>,  <a href="https://www.facebook.com/mergeconflictfm" target="_blank" rel="nofollow noopener">Facebook</a>, <a href="http://mergeconflict.fm" target="_blank" rel="nofollow noopener">Website</a>, <a href="https://www.mergeconflict.fm/discord" target="_blank" rel="nofollow noopener">Chat on Discord</a></li>
<li>Music: Amethyst Seer - Citrine by <a href="https://soundcloud.com/adventureface" target="_blank" rel="nofollow noopener">Adventureface</a></li>
</ul>

<p>⭐⭐ <a href="https://itunes.apple.com/us/podcast/merge-conflict/id1133064277?mt=2&amp;ls=1" rel="nofollow noopener">Review Us</a> ⭐⭐</p>

<p>Machine transcription available on <a href="http://mergeconflict.fm" rel="nofollow noopener">http://mergeconflict.fm</a></p><p><a rel="payment" href="https://www.patreon.com/mergeconflictfm">Support Merge Conflict</a></p>
      



Merge Conflict

517: Plan First, Think Less: Save Tokens, Improve Code

