We’ve come a long way from the completely robotic computer-generated voices of a few decades ago. Siri and Alexa have changed that, sounding more human, but mispronunciations and a certain metered tone still tell you that the voice is computer-generated, not real.
Newer artificial intelligence technology, however, is more advanced. Not only can it now clone our voices, it can do so very quickly, in mere seconds.
According to the Chinese tech company Baidu, it takes only 3.7 seconds of audio to clone your voice. That’s both thrilling and frightening at the same time.
This is a marked improvement in just a year. Back then Baidu created Deep Voice, a voice cloning tool that could duplicate your voice using 30 minutes of audio. Now it needs only about 1/500 of that audio, if my quick math is correct.
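For anyone who wants to check that quick math, here is a minimal sketch in Python. It assumes only the two figures quoted above (30 minutes and 3.7 seconds); the variable names are made up for illustration. The ratio works out to roughly 1/486, or about 1/500 in round numbers.

```python
# Figures as quoted in the article; they are reported values, not measured here.
OLD_AUDIO_SECONDS = 30 * 60  # 30 minutes of audio for the earlier Deep Voice
NEW_AUDIO_SECONDS = 3.7      # 3.7 seconds of audio reported by Baidu

# How many times less audio the newer approach needs.
reduction_factor = OLD_AUDIO_SECONDS / NEW_AUDIO_SECONDS

print(f"The newer approach needs about 1/{reduction_factor:.0f} of the audio")
```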
Google is working on voice technology as well. They released Tacotron 2, a text-to-speech tool that uses a deep neural network together with WaveNet, the speech generation method behind the voice of Google Assistant. It’s said to be so authentic-sounding that it’s hard to tell the difference between a human voice and an AI-generated voice. Alexa and Siri could stand a few improvements in this area as well.
They’ve even found a way to get AI to pronounce words correctly. Mispronunciations are often good for a few laughs. My father’s phone system announces his caller ID out loud for him, and he gets a kick out of it telling him he has a call from “A-nah-nee-muss” instead of pronouncing Anonymous correctly.
This technology has led to Google Assistant offering celebrity cameos, such as John Legend’s voice. There’s also a sample of the author Jordan Peterson talking that was used to make realistic audio of him rapping an Eminem song.
This could lead to all sorts of improved tech products, not just voice assistants. But the sad reality is that while this technology could be used for good, it will be used for evil as well once nefarious people get their hands on it.
Someone could call you on the phone and keep you talking for a few moments, then take that audio and duplicate your voice, using it to empty bank accounts and commit other kinds of fraud. We also have to realize there will be kids cutting classes at school and using this to duplicate their mom’s voice.
Good vs. Evil
However, we can’t shut out advances in technology for fear that it will introduce evil as well. We just have to realize that evil is a possibility and protect ourselves from it from the get-go. Regardless of possible evil intentions, the advances in this technology in just a year are still amazing.
Do you find this AI voice technology that can clone your voice exciting, or does it just make you worry about the evil that will follow it? Let us know your thoughts in the comments below.