Is it a human talking on camera or an AI clone? A surprising innovation from a unicorn startup backed by Nvidia makes it almost impossible to tell the difference.
AI startup Synthesia, which hit unicorn status at a billion-dollar valuation last year, released new technology Thursday called Expressive Avatars, the world's first digital AI clones capable of producing human facial expressions and the correct tone of voice from written prompts.
The technology starts with an AI avatar that can be customized to reflect real faces.
Photo: Synthesia
The AI makes a digital copy of a person based on footage recorded through their webcam or in a certified studio. It can also clone the person's voice to complete their digital likeness.
Those wary of creating an AI avatar that takes on their face and voice can instead choose one of more than 160 preloaded AI avatars that Synthesia has in its database.
Related: 'This is a serious problem': MrBeast slams AI deepfakes
Once a user creates or selects an AI avatar, they only need to do one more thing: type what they want their digital self to say.
In a demonstration seen by CNBC, one user wrote “I'm happy. I'm sad. I'm frustrated.” and the AI-generated digital clone read the text. The avatar conveyed facial expressions and tone associated with happiness when saying the text “I'm happy” and changed its inflection appropriately when saying “I'm frustrated.” The tone matched the words.
With an AI clone and a written prompt, a free user can generate 36 minutes of personalized video in more than 120 languages every year. Paid plans start at $67 per month for up to 360 minutes of video per year, with unlimited minutes for businesses that choose an enterprise plan.
Synthesia is a startup that big companies are using behind the scenes. Zoom, Xerox, Microsoft and Reuters are all using Synthesia software. Synthesia CEO Victor Riparbelli told MIT Technology Review that 56% of the Fortune 100 were using the technology.
Synthesia markets the technology as a way to create expressive digital avatars for training and corporate presentations. For example, Zoom designers created sales training videos with Synthesia in 90% less time than it took to produce the videos with human beings.
Related: JPMorgan says AI cash flow software reduces human labor by almost 90%
“Zoom subject matter experts no longer need to record themselves, freeing up 15-20 hours each month to work on their current job,” Synthesia's website reads.
However, the ability to create creepy “deepfakes,” or AI that clones and manipulates the voice, likeness, or other aspects of a human being without their permission, could lead to misuse.
Last month, Tennessee became the first US state to pass legislation protecting music industry professionals from deepfakes.