You might be speaking in English, but to your colleague in Paris tuning into the Microsoft Teams meeting, you'll look like you're speaking in French.
Microsoft is currently testing a new Interpreter AI feature that clones your voice and converts it into another language in real time. The result is a voice that sounds “just like you in another language,” according to the company. The translation feature will be previewed early next year in up to nine languages, including Italian, German, Japanese, Korean, Portuguese, French, English, Mandarin Chinese, and Spanish. Only accounts with a Microsoft 365 Copilot license will be able to access Interpreter, according to The Washington Post.
Microsoft's AI business is booming. CEO Satya Nadella said in an earnings call last month that Microsoft's AI division “is on track to surpass a $10 billion annual revenue milestone next quarter” and become “the fastest business in our history to reach this milestone.”
Microsoft Interpreter in action
In one video demo, Interpreter translates from Spanish to English in real time during a team meeting, changing what the listener hears while preserving the characteristics of the speaker's voice.
In another demo, Interpreter does the same thing from English to Korean.
Here's how Microsoft Teams' translator feature works to make it sound like you're speaking a foreign language during calls https://t.co/92al0jkG9u pic.twitter.com/B9zMLdFlBd
– Tom Warren (@tomwarren) November 19, 2024
Microsoft reassures users that it will not store their biometric information and will allow voice simulation only with their consent.
Pros and cons of voice cloning
Voice cloning technology is useful for more than just real-time translation. In July, AI startup ElevenLabs introduced an app featuring the cloned voices of Judy Garland, James Dean, Burt Reynolds, and Sir Laurence Olivier. Users can tap these voices to narrate any books, documents, or files they've uploaded.
However, the technology has a downside: it makes scams even more personal. An AI cloning scheme can copy someone's voice from as little as three seconds of audio, such as a video posted on social networks. After cloning the voice, scammers cold-call the victim's friends and family to ask for money.
Related: The growing threat of AI sounds like your loved one on the phone – but they're not
Some AI companies have held back from releasing sophisticated voice cloning technology because it could be used for the wrong purposes. In April, ChatGPT creator OpenAI unveiled Voice Engine, an AI voice generator that it said could realistically imitate someone's voice from 15 seconds of audio, but decided not to release it widely because of the “potential for synthetic voice abuse.”