Bob Pittman, CEO of iHeartMedia, recently explained the enduring appeal of audio during a CNBC interview in terms of the role each format plays: broadcast radio keeps the listener company, while a podcast is like a conversation. “Radio is really about the companionship. We’re your friend riding with you in the empty seat in the car; we’re your friend chatting with you while you’re cooking, brushing your teeth, doing some work that you want to get your mind off of, or you’re walking or whatever,” said Pittman. “Podcasting is not some separate bolt-on business, but it’s really an extension of what we do on radio.” Now a new study is lending credence to that audio-as-a-companion idea, even when it comes to voice assistants.
The Toronto-based voiceover marketplace Voices.com collaborated with Voicebot.ai and Pulse Labs on a voice user experience study, which found a strong preference for human voices over synthetic ones and higher information recall with human voices. Consumers across genders and age groups not only expressed a 71.6% higher preference for human voices, but were also much more likely to correctly remember a call-to-action embedded in audible content when it was delivered by a human voice rather than a synthetic one. And in a finding with potential implications for advertising, the data shows information recall more than doubled, from 14.3% to 32.5%, when human voices were used rather than synthetic ones.
"We all suspected that voice assistant users preferred human over synthetic voices, but there were no empirical studies attempting to quantify the difference. Now, we have data instead of conjecture," says Bret Kinsella, founder and CEO of Voicebot.ai. "These findings should be particularly interesting to marketers building voice apps today."
The study also found that voice assistant users show a small, but measurable, preference for female human voices over male ones. And female synthetic voices are preferred over male synthetic voices by 12.5%. Regardless of gender, the data shows voice assistant users also prefer shorter dialogue.
“Smart marketers know that people prefer listening to people that sound like them. With this new research, it’s further evidence that the voice, a human quality full of emotion, is not easily replicated,” says Voices.com co-founder and CEO David Ciccarelli. “While there’s a time and place for synthetic voices to provide navigational prompts or brief instructions, communicating important messages with the intent to inform, educate, and inspire audiences should be left strictly to professional voice actors.”
Voice assistants are not the only place where people are interacting with synthetic voices. Descript, the audio tech company used by podcasters to create and edit their shows, struck a deal in September to buy Lyrebird, a Canadian company that uses artificial intelligence technology to let podcasters generate a realistic-sounding text-to-speech audio clone of themselves based on a brief audio clip that’s used as a template. CEO Andrew Mason said the company is now putting that technology to use in its newly released podcast production studio software. It includes a beta version of what Descript is calling Overdub, which allows podcast hosts and producers to edit fixes into a podcast in their own voice, simply by converting typed text into audio. “Making corrections is as simple as typing them,” Mason explained in a blog post.