AI Daily

Daily insights on the latest news, innovations, and tools in the world of AI.

HierVST Voice Cloning | NVIDIA Perfusion | Meta's AudioCraft

AUG 3, 2023 | 11 MIN

Description

<p>Welcome to AI Daily! Join hosts Farb, Ethan, and Conner as they explore three groundbreaking AI stories. First up, HierVST Voice Cloning: zero-shot voice cloning with impressive accuracy from just one audio clip. Next, NVIDIA Perfusion: a small, powerful text-to-image personalization model that uses key locking to maintain consistency. Lastly, Meta's AudioCraft: music generation, audio generation, and codecs fused into one open-source codebase that produces high-fidelity outputs.</p>
<p>Quick Points</p>
<p>1️⃣ HierVST Voice Cloning</p>
<p>* Zero-shot voice cloning system achieves accurate outputs with just one audio clip.</p>
<p>* Uses hierarchical models to handle both long-term and short-term aspects of generation.</p>
<p>* May struggle with longer clips and need further fine-tuning.</p>
<p>2️⃣ NVIDIA Perfusion</p>
<p>* Text-to-image personalization model with key locking for subject consistency.</p>
<p>* Only 100 kilobytes, trains in four minutes, and outperforms other models.</p>
<p>* Open-source codebase, but may need improvements for human subjects.</p>
<p>3️⃣ Meta’s AudioCraft</p>
<p>* Audio generation, music generation, and codecs combined into one open-source codebase.</p>
<p>* High-fidelity outputs, 30-second sound clips, and efficient audio compression.</p>
<p>* Meta is making strides in audio AI and, impressively, opens it up to the community for research use.</p>
<p>🔗 Episode Links</p>
<p>* <a target="_blank" href="https://twitter.com/dreamingtulpa/status/1686649903525584896?s=42&#38;t=sTbeB89T07xhM3ob89LHRg">HierVST Voice Cloning</a></p>
<p>* <a target="_blank" href="https://research.nvidia.com/labs/par/Perfusion/">NVIDIA Perfusion</a></p>
<p>* <a target="_blank" href="https://ai.meta.com/blog/audiocraft-musicgen-audiogen-encodec-generative-ai-audio/">Meta's AudioCraft</a></p>
<p>* <a target="_blank" href="https://twitter.com/nostalgebraist/status/1686576041803096065?s=42&#38;t=ziEc9CMi8q_PlJ34DMJVkA">ChatGPT String Tweet</a></p>
<p>* <a target="_blank" href="https://techcrunch.com/2023/08/01/generative-ai-services-pulled-from-apple-app-store-in-china-ahead-of-new-regulations/?utm_source=bensbites&#38;utm_medium=newsletter&#38;utm_campaign=apple-removes-ai-apps-from-china&#38;guccounter=2">Apple App Store/China Story</a></p>
<p>Connect With Us:</p>
<p><a target="_blank" href="https://www.threads.net/@aidailypod">Follow</a> us on <strong>Threads</strong></p>
<p><a target="_blank" href="https://www.aidailypod.com/">Subscribe</a> to our <strong>Substack</strong></p>
<p>Follow us on <strong>Twitter:</strong></p>
<p>* <a target="_blank" href="https://twitter.com/aidailypod">AI Daily</a></p>
<p>* <a target="_blank" href="https://twitter.com/farbood">Farb</a></p>
<p>* <a target="_blank" href="https://twitter.com/ejaldrich?s=20">Ethan</a></p>
<p>* <a target="_blank" href="https://twitter.com/semicognitive?s=20">Conner</a></p>
<br/><br/>This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit <a href="https://www.aidailypod.com?utm_medium=podcast&#38;utm_campaign=CTA_1">www.aidailypod.com</a>