Supertone Nuvo: Voice Synthesis AI Model That Captures Emotions

KEY POINTS

Supertone received the CES 2022 Innovation Award
The award specifically recognizes Nuvo, the company's latest voice synthesis AI
The Supertone Nuvo is not yet available to the general public

Over the years, machines have learned to turn text into speech, and humans have gotten used to communicating with GPS devices and smart speakers. These days, the quality of synthesized speech has improved dramatically. Supertone, an artificial intelligence (AI) audio technology startup, received the CES 2022 Innovation Award for Nuvo -- its newest voice synthesis AI model that captures emotion.

Superb AI audio tech

Supertone was founded with one goal: to provide creators with a more innovative voice content production environment. The company's voice synthesis technology not only speaks and sings but also captures emotions.

As a result, the company can produce voices that are hardly distinguishable from human ones. Using its proprietary Controllable Voice Conversion (CVC) technology, Supertone developed the project dubbed Nuvo.

CVC (Controllable Voice Conversion) is technology that can convert one's voice into any target voice. Supertone Official Website

This particular voice synthesis AI can convert one's voice into any other voice in real time. With such technological progress, everyone will soon be able to clone their own voice.

Simple process and applications

For those unfamiliar with this kind of innovation, Supertone offers a straightforward process. As for applications, users have a wide range of options for what they can do with their voice.

The Supertone Nuvo voice synthesis AI can convert the user's voice into any of these -- a man, an old person, a woman or even a child. With multiple voices, users can play multiple characters.

Users can also turn their voice into Donald Duck's or Angelina Jolie's. They can be whoever they want to be -- it's their choice.

SVS (Singing Voice Synthesis) is the world's first commercialized singing voice synthesis technology. Supertone Official Website

Furthermore, the Supertone Nuvo lets users customize the kind of voice they want to have. Users can choose from pre-selected base voices and then adjust the design to customize the voice further.

Unfortunately, at the moment, the Supertone voice synthesis technology is available to authorized partners only. The AI audio technology startup also refrains from working on projects related to religious, political and economic figures.

As for how long a Supertone voice synthesis project takes, that depends on the project's complexity.

Final verdict

Speech synthesis has improved massively in recent years, thanks to advances in machine learning. Earlier, the most realistic synthetic voices were created by recording audio of a human voice actor and cutting their speech into component sounds.

After that, the cut segments are spliced back together to form new words and sentences. The entire process is tedious, but all of this becomes simpler with the use of Supertone Nuvo.
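To make the older cut-and-splice approach concrete, here is a minimal Python sketch. It is a toy illustration only, not Supertone's method: the function name `splice_units`, the unit labels and the sample values are all hypothetical, and real systems work on recorded audio rather than short lists of numbers.

```python
from typing import Dict, List, Sequence


def splice_units(labels: Sequence[str],
                 inventory: Dict[str, List[float]],
                 crossfade: int = 0) -> List[float]:
    """Join pre-recorded sound units in the given order.

    `inventory` maps a unit label to its audio samples (hypothetical data).
    `crossfade` is how many samples to blend at each join to soften the seam.
    """
    output: List[float] = []
    for label in labels:
        unit = inventory[label]  # raises KeyError if the unit was never recorded
        overlap = min(crossfade, len(output), len(unit))
        # Linearly blend the tail of what we have with the head of the next unit.
        for i in range(overlap):
            weight = (i + 1) / (overlap + 1)
            idx = len(output) - overlap + i
            output[idx] = (1 - weight) * output[idx] + weight * unit[i]
        # Append whatever part of the unit was not used in the blend.
        output.extend(unit[overlap:])
    return output


# Hypothetical inventory: two "recorded" units with made-up sample values.
inventory = {"he": [1.0, 1.0, 1.0], "llo": [0.0, 0.0, 0.0]}
plain = splice_units(["he", "llo"], inventory)
smoothed = splice_units(["he", "llo"], inventory, crossfade=1)
```

Even in this toy form, every new word needs its units to exist in the inventory already, which hints at why the recording and cutting work is so tedious.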

Now, however, neural networks can be trained easily, even on unsorted data of the target voice, to produce raw audio of that person's speech. The result is a realistic voice, and the process is easier and a lot faster.

The ease of use as well as the kind of technology and applications of the Supertone Nuvo make it a product deserving of our Best of CES 2022 award.
