AI can now listen to individuals in a crowd

WHY THIS MATTERS

On the one hand, this new breakthrough will let our Connected Home devices hear us better, but on the other it gives organisations the ability to eavesdrop on everyone.

 

Devices like Amazon’s Echo and Google Home can normally deal with requests from a single person, but, like us, they often still struggle when lots of people are talking at once, say, at a party.

 


Now, though, that might be less of a problem thanks to a new Artificial Intelligence (AI) agent that can separate out the voices of multiple speakers in real time. It promises to give automatic speech recognition a big boost, and if it were ever combined with something like Google DeepMind’s AI lip reading technology, which was recently shown to be much more accurate than the best human lip readers, you’d never have to worry about an AI mishearing you again. And I’m almost certain that no one will ever think of using this technology to eavesdrop on you all…

The technology, which was developed by researchers at the Mitsubishi Electric Research Laboratory in Cambridge, Massachusetts, was demonstrated in public for the first time at this month’s Combined Exhibition of Advanced Technologies show in Tokyo.

It uses a machine learning technique the team calls “Deep Clustering” to identify unique features in each speaker’s voiceprint, then groups the features belonging to the same voice together, letting it disentangle the mixed audio and reconstruct what each person was saying.
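To make that idea concrete, here is a minimal, hypothetical sketch of how a deep-clustering-style separation pipeline works at inference time, loosely following the paper linked at the end of this article: a trained network maps every time-frequency bin of the mixture spectrogram to an embedding vector, the embeddings are clustered with k-means so bins dominated by the same talker land together, and each cluster becomes a mask applied to the mixture. The function names (`embed_tf_bins`, `separate`) and the random projection standing in for the trained network are illustrative placeholders, not Mitsubishi Electric’s actual code.

```python
# Minimal sketch of a deep-clustering-style separation pipeline at
# inference time (loosely after Hershey et al., arXiv:1508.04306).
# `embed_tf_bins` is a hypothetical stand-in for the trained network:
# the real system uses a trained recurrent network to map each
# time-frequency bin of the mixture spectrogram to an embedding.
import numpy as np
from scipy.signal import stft, istft
from sklearn.cluster import KMeans

def embed_tf_bins(log_mag: np.ndarray, dim: int = 40) -> np.ndarray:
    """Return one unit-length `dim`-D embedding per time-frequency bin.
    A random projection keeps the sketch runnable end to end; in the
    real system these embeddings come from a trained network."""
    T, F = log_mag.shape
    tt, ff = np.meshgrid(np.arange(T) / T, np.arange(F) / F, indexing="ij")
    feats = np.stack([log_mag, tt, ff], axis=-1).reshape(-1, 3)  # (T*F, 3)
    emb = feats @ np.random.default_rng(0).standard_normal((3, dim))
    return emb / (np.linalg.norm(emb, axis=1, keepdims=True) + 1e-8)

def separate(mixture: np.ndarray, sr: int, n_speakers: int) -> list:
    """Split a single-channel `mixture` into `n_speakers` waveforms."""
    _, _, Z = stft(mixture, fs=sr, nperseg=512)          # complex (F, T)
    log_mag = np.log1p(np.abs(Z)).T                      # (T, F) features
    emb = embed_tf_bins(log_mag)                         # (T*F, dim)
    # Cluster the bin embeddings: bins dominated by the same talker
    # should land in the same cluster.
    labels = KMeans(n_clusters=n_speakers, n_init=10,
                    random_state=0).fit_predict(emb)
    # Turn each cluster into a binary mask, apply it to the mixture
    # spectrogram, and invert back to a time-domain waveform.
    masks = [(labels == k).reshape(log_mag.shape).T for k in range(n_speakers)]
    return [istft(Z * m, fs=sr, nperseg=512)[1] for m in masks]
```

With a real trained embedding network in place of the random projection, a call like `separate(mixture, 16000, n_speakers=2)` would return one estimated waveform per talker.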

“It was trained using one hundred English speakers, but it can separate voices even if a speaker is Japanese,” says Niels Meinke, a spokesperson for Mitsubishi Electric.

 


Meinke says the system can separate and reconstruct the speech of five people speaking into a single microphone with up to 90 per cent accuracy; with ten speakers the accuracy dips, but is still up to 80 per cent. In both cases, though, these were speakers the system had never encountered before.

Conventional approaches to this problem, such as using two microphones to replicate the position of a listener’s ears, have only managed 51 per cent accuracy.

In overcoming the “cocktail party effect” that has dogged AI research for decades, the new technology could help smart assistants in homes and cars work better. It could also improve automatic speech transcription and, naturally, be used to help law enforcement agencies reconstruct recordings of conversations that would otherwise be incomprehensible.

 


In preliminary tests the system successfully separated the voices of up to five people at once.

“The system could be used to separate speech in a range of products including lifts, air-conditioning units and household products,” says Meinke, and he and his team are now looking to integrate the technology into products they expect to bring to market soon.

Their work is available at arxiv.org/abs/1508.04306.
