Microsoft's VALL-E AI can clone your voice in just three seconds

By Matthew Griffin Intelligence and the Senses 21st January 2023

WHY THIS MATTERS IN BRIEF

Imagine all the times you use your voice. Now imagine what would happen if people could clone it – for good and bad …

Voice cloning tools aren’t new. They’ve been able to mimic people quite well for years, including Bill Gates’s voice, which was, oddly, the result of a Facebook experiment, and they’ve been used to create podcasts featuring the late Steve Jobs. Now, though, Microsoft has announced that it’s working on its own Artificial Intelligence (AI), called VALL-E, that can clone someone’s voice from just a 3 second audio clip. And that’s world beating fast.

VALL-E, which was trained with 60,000 hours of English speech, is capable of mimicking a voice in “Zero-shot scenarios,” meaning it can make a voice say words it has never heard that voice say before, according to a paper posted to arXiv, the preprint server hosted by Cornell University, in which the developers introduced the tool.

VALL-E uses Text-to-Speech technology to convert written words into spoken words in “high-quality personalized” speeches, according to the 16-page paper.

The researchers used recordings of more than 7,000 real speakers from LibriLight – an audiobook dataset made up of public-domain texts read by volunteers – to train the model. The tech giant released samples of how VALL-E would work, showcasing how the voice of a speaker is cloned.

The AI tool is not currently available for public use, and so far Microsoft hasn’t made clear what its intended purpose is. Adobe, which created a similar tool called VoCo a while ago, canned the project fearing it would unleash the equivalent of “Photoshop for voice content.” The researchers also said the results so far showed that VALL-E “significantly outperforms” the most advanced systems of its kind, “in terms of speech naturalness and speaker similarity.”

But they pointed out the lack of diversity of accents among speakers, and that some words in the synthesized speech were “unclear, missed, or duplicated.”

They also included an ethical warning about VALL-E and its risks, saying the tool could be misused, for example in “spoofing voice identification or impersonating a specific speaker.” The latter has already happened: a while ago a company transferred $243,000 after a cloned voice of its parent company’s chief executive “told” it to.

“To mitigate such risks, it is possible to build a detection model to discriminate whether an audio clip was synthesized by VALL-E,” the developers wrote in the paper. They didn’t give details of how this could be done.

They added that “if the model is generalized to unseen speakers in the real world, it should include a protocol to ensure that the speaker approves the use of their voice.”

Meanwhile, Microsoft announced on Monday that it will make OpenAI’s ChatGPT available through its own services, amid its reported interest in investing $10 billion in OpenAI, the company behind the AI writing tool.
