
Audio Branding

Jodi Krangle

Uncanny Audio: Is AI-generated Content Music to Our Ears?

MAR 12, 2025 · 11 MIN

Description

Artificial intelligence has come a long way over just the past few years. It can hold conversations and manage social media, it can create art and edit videos, and it can even write blogs (though not this one). Every aspect of our lives has been touched by AI in one way or another, and that’s particularly true for sound. While many podcasters, including some of my guests, now use AI tools for research and sound editing, AI is also front and center in audio itself, from cloning voices to writing its own songs. Royalty-free music is already starting to give way to copyright-free AI music, and a variety of powerful audio content generation tools are scheduled for release later this year.

But can computers replace human composers? Will listeners be able to tell the difference? And how did we get from vinyl records to virtual music? It may seem hard to believe, but the very first song written by a computer is older than cassette tapes. The Illiac Suite, or “String Quartet No. 4,” as it’s officially named, was created in 1955, using pioneering techniques still found in AI today.

The ILLIAC I (ill-ee-ack one) was one of the world’s first computers. It was built in 1952 at the University of Illinois, and it filled an entire room. The ILLIAC I weighed five tons and used over two thousand vacuum tubes, some of which had to be replaced each night. Two of the university’s researchers, composer Lejaren Hiller and mathematician Leonard Isaacson, programmed the ILLIAC to compose a string quartet using what’s called “stochastic music,” music that’s written using probability calculations and mathematical sequences – in this case, Markov chains, in which each new note is chosen based on the notes that came before it – instead of human inspiration.

One of the researchers who helped build the ILLIAC I was Saburo Muroga, who later built the MUSASINO-1 in Japan. And, as it happens, another breakthrough in computer-generated music would emerge from Japan some fifty years after the Illiac Suite’s release.

Synthetic voices were the next step in creating digital music, and in 1961 the IBM 7094 became the first computer to sing a song, “Daisy Bell.” Another computer voice that could sing was Perfect Paul, one of the voice settings on 1983’s text-to-speech DECtalk device. This is the speech synthesizer Professor Stephen Hawking used in his later years, and it was based on the voice of MIT researcher Dennis Klatt. The next decade brought us Auto-Tune, which can digitally modulate singing voices in real time and has become, for better or worse, a staple of pop music.

These developments all came together in 2004 as “Vocaloids,” synthesized voices that can talk and sing with perfect pitch. The most famous of them by far is Crypton Future Media’s Hatsune Miku, a second-generation Vocaloid who debuted in 2007. While there have been four more generations and many more voices since then, Miku is the one who captured the public’s eyes and ears. Arguably the world’s first virtual celebrity, she’s opened for Lady Gaga, put in a holographic appearance at the 2024 Coachella festival, and just wrapped up her latest ‘Miku Expo’ world tour last December.

In some ways, Miku and the Vocaloids that followed marked a turning point in synthetic voices. Older synthesizers like Perfect Paul and Microsoft Sam couldn’t be mistaken for an ordinary person, but Vocaloids come closer than anything before – so close, in fact, that some music critics have said they fall into a sort of audio uncanny valley. They sound almost, but not quite, human.

Now it’s the year 2025, and AI has taken the stage: it’s talking, singing, composing, and even creating whole new kinds of sound. Both OpenAI’s Jukebox and Google’s MusicLM can turn text prompts into music, and Nvidia’s upcoming Fugatto software is described as a sonic “Swiss Army knife” for creating sounds that have never existed, like a screaming saxophone or a trumpet that meows. Another new song-generation service by Musical AI and Beatoven.ai that’s set to release later this year promises to share revenue with its three million musical sources even as it composes custom audio tracks for enterprise clients. And, just like before, some critics worry that all this AI-driven music is bound to fall into the uncanny valley, the gap where it’s more disturbing than impressive.

Patten, an experimental musician from London, released Mirage FM, a twenty-one-track album created with text-to-audio AI, in 2023. Is the resulting sound intriguing, eerie, or maybe even a bit of both?

A series of 2019 studies by audio companies Veritonic, Amper Music, and Tidio found that listeners often don’t trust themselves to recognize machine-generated music. Participants would, more often than not, simply guess that the most complicated track in any given list of songs must be the one written by a computer.

A 2023 study by the University of York, however, found that listeners do prefer human-created music to its AI counterpart, and that deep learning didn’t make much of a difference in their preferences. Old-fashioned computer compositions, the sort the ILLIAC I might have written, scored about the same as the latest models, and none of them did as well as music written by a person. Even when listeners don’t believe they can tell the difference, there’s a genuine emotional element in human music that’s still lacking in AI sound. We might not be consciously aware of it, but we do sense it.

There’s always the worry that AI could replace human artists, that it might become better than us or simply crowd us out of the market. But it also has the power to unlock our imagination and empower creators, whether it’s a young songwriter putting a Vocaloid on their album or a writer using AI to transform their creative visions into melodies. And human music isn’t going anywhere just yet. As Patten puts it, “Making music that feels like something—people find that quite difficult to do. There’s no formula for a piece of music that people find touching.” Likewise, our human voice is the sum of our lived experiences – and computers just don’t have that. Not yet, at least.

Connect with the Audio Branding Podcast:

Book your project with Voice Overs and Vocals https://voiceoversandvocals.com

Tweet with me on Twitter - https://twitter.com/JodiKrangle

Watch the Audio Branding Podcast on YouTube - https://www.youtube.com/c/JodiKrangleVO

Connect with me on LinkedIn - https://www.linkedin.com/in/jodikrangle/

Leave the Audio Branding Podcast a written review at https://lovethepodcast.com/audiobranding

or leave a spoken review at https://voiceoversandvocals.com/talktome/ (Thank you!)

Share your passion effectively with these Tips for Sounding Your Best as a Podcast Guest!

https://voiceoversandvocals.com/tips-for-sounding-your-best-as-a-podcast-guest/

Get my Top Five Tips for Implementing an Intentional Audio Strategy

https://voiceoversandvocals.com/audio-branding-strategy/



This podcast uses the following third-party services for analysis:

OP3 - https://op3.dev/privacy