Meta's new algorithm is teaching AI how to multi-task

0 0

By Matthew Griffin Intelligence and the Senses 29th January 2022

WHY THIS MATTERS IN BRIEF

Humans multi-task, AI’s don’t – yet. This new algorithm will help us reach AGI faster …

Love the Exponential Future? Join our XPotential Community, future proof yourself with courses from XPotential University, read about exponential tech and trends, connect, watch a keynote, or browse my blog.

If you can recognize a dog by sight, then you can probably recognize a dog when it is described to you in words. But that’s not the case for today’s Artificial Intelligence (AI). While deep neural networks have become very good at identifying objects in photos and conversing in natural language they can’t do it at the same time, and being able to do this is crucial if we’re ever going to realise a time when AI bests humans in all cognitive tasks – the age of so called Artificial General Intelligence (AGI) which is estimated to arrive circa 2035.

Boston Dynamic's robot Spot the dog learns to dance

This is also a problem that the world’s largest tech companies like Google, Microsoft, and OpenAI are spending billions of dollars to solve and making some interesting inroads into by developing frameworks like DeepMind’s AGI Impala framework.

Learn about the Future of AI, by Futurist Keynote Matthew Griffin

As far as Meta are concerned part of the problem is that these models learn different skills using different techniques. This is a major obstacle for the development of more general-purpose AI, machines that can multi-task and adapt. It also means that advances in deep learning for one skill often do not transfer to others.

A team at Meta AI (previously Facebook AI Research) wants to change that. The researchers have now developed a single algorithm that can be used to train a neural network to recognize images, text, or speech. The algorithm, called Data2vec, not only unifies the learning process but performs at least as well as existing techniques in all three skills.

Fortune CEO's plan to abandon shareholder first model and be more future focused

“We hope it will change the way people think about doing this type of work,” says Michael Auli, a researcher at Meta AI.

The research builds on an approach known as self-supervised learning, in which neural networks learn to spot patterns in data sets by themselves, without being guided by labelled examples. This is how large language models like the stunning and promising GPT-3 learn from vast bodies of unlabelled text scraped from the internet, and it has driven many of the recent advances in deep learning.

Auli and his colleagues at Meta AI had been working on self-supervised learning for speech recognition. But when they looked at what other researchers were doing with self-supervised learning for images and text, they realized that they were all using different techniques to chase the same goals.

JPMorgan unleashes artificial intelligence to automate its legal work

Data2vec uses two neural networks – a student and a teacher. First, the teacher network is trained on images, text, or speech in the usual way, learning an internal representation of this data that allows it to predict what it is seeing when shown new examples. When it is shown a photo of a dog, it recognizes it as a dog.

The twist is that the student network is then trained to predict the internal representations of the teacher. In other words, it is trained not to guess that it is looking at a photo of a dog when shown a dog, but to guess what the teacher sees when shown that image.

Because the student does not try to guess the actual image or sentence but, rather, the teacher’s representation of that image or sentence, the algorithm does not need to be tailored to a particular type of input.

US defense contractors announce directed energy weapons are ready for prime time

Data2vec is part of a big trend in AI toward models that can learn to understand the world in more than one way. “It’s a clever idea,” says Ani Kembhavi at the Allen Institute for AI in Seattle, who works on vision and language. “It’s a promising advance when it comes to generalized systems for learning.”

An important caveat is that although the same learning algorithm can be used for different skills, it can only learn one skill at a time. Once it has learned to recognize images, it must start from scratch to learn to recognize speech. Giving an AI multiple skills at once is hard, but that’s something the Meta AI team wants to look at next.

The researchers were surprised to find that their approach actually performed better than existing techniques at recognizing images and speech, and performed as well as leading language models on text understanding.

ChatGPT passes another US medical exam and diagnosed 1 in 100,000 conditions in seconds

Mark Zuckerberg is already dreaming up potential metaverse applications:

“This will all eventually get built into AR glasses with an AI assistant,” he posted to Facebook’s blog. “It could help you cook dinner, noticing if you miss an ingredient, prompting you to turn down the heat, or more complex tasks.”

For Auli, the main takeaway is that researchers should step out of their silos.

“Hey, you don’t need to focus on one thing,” he says. “If you have a good idea, it might actually help across the board.”

Matthew Griffin / About Author

Matthew Griffin, multi-award winning Futurist and named Futurist of the Year 2024, has been described as a "Walking encyclopaedia of the future" by NASA and a futurist polymath. One of the world's most renowned futurists and strategic foresight experts Matthew is the 15 times author of the blockbuster "Codex of the Future" series, and is the Founder and Futurist in Chief of the 311 Institute, a global Futures and Deep Futures advisory firm working across the next 50 years, XPotential University, the world's first free futures and foresight university, and the World Futures Forum which works with the United Nations to solve the worlds greatest challenges. Matthew is an in demand international keynote, acclaimed university lecturer and mentor, and host of the hit Fanatical Futurist podcast.

A rare talent in his past Matthew helped build and run several multi-billion dollar business units for Atos, Dell-EMC, and IBM, and his ability to identify, track, and explain the impacts of hundreds of emerging technologies and trends on global business, culture, and society has earned him a powerful reputation and a roster of clients that include royal households, world leaders, G7, G20, and G77+ governments, and many of the world's most respected brands including ABB, Accenture, Adidas, AON, ARM, BCG, Centrica, Citi Group, Coca Cola, Dentons, Deloitte, Disney, Dow, EY, KPMG, Lego, Legal & General, LinkedIn, Microsoft, PepsiCo, Qualcomm, RWE, Samsung, T-Mobile, UBS, VISA, and many others. He was also the only futurist invited to talk at the UN COP28 held in Dubai alongside world leaders.

Regularly featured in the global media including the AP, BBC, Bloomberg, CNBC, Discovery, Forbes, Khaleej Times, Telegraph, TIME, ViacomCBS, WIRED, and the WSJ, Matthews mission is to help organisations create a fair and sustainable future whose benefits are shared by everyone irrespective of their ability, background, or circumstances.