MIT's newest chip could bring speech recognition to every device


By Matthew Griffin Computing 20th February 2017

WHY THIS MATTERS IN BRIEF

The way we use and interact with devices is changing. Increasingly, we’ll use our voices to control, interact with, and manage them.

The butt of jokes as little as 10 years ago, automatic speech recognition is now on the verge of becoming people’s chief means of interacting with the computers and devices around them. After all, did you really think, for example, that you were going to use a keyboard and mouse to control and interact with your smartwatch, or your self-driving car? Uh-uh.


In anticipation of the age of voice-controlled electronics, MIT researchers have built what many consider to be the world’s most efficient low-power chip specialised for automatic speech recognition.

Whereas a cell phone running speech recognition software might require about 1 watt of power, the new chip requires between 0.2 and 10 milliwatts, depending on the number of words it has to recognize, and that’s a game changer.
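To put those figures side by side, here is a rough back-of-the-envelope calculation. It takes the article's approximate 1 watt phone figure as the baseline and considers the chip alone; the function name is made up for illustration.

```python
# Chip-only power comparison using the figures quoted in the article.
PHONE_WATTS = 1.0               # approximate: software recognition on a phone
CHIP_MILLIWATTS = (0.2, 10.0)   # new chip, depending on vocabulary size

def saving_percent(chip_mw, baseline_w=PHONE_WATTS):
    """Percentage of power saved relative to the baseline."""
    return 100.0 * (1.0 - (chip_mw / 1000.0) / baseline_w)

for mw in CHIP_MILLIWATTS:
    print(f"{mw} mW chip: {saving_percent(mw):.2f}% less power than a {PHONE_WATTS} W phone")
```

On these numbers the chip by itself draws between 99 and 99.98 percent less power than the phone baseline. Savings for a whole device would be smaller, since the rest of the system still needs power.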

In the real world that’ll likely translate into power savings of between 90 and 99 percent, which could make voice control a reality for many relatively simple electronic devices. That includes power-constrained devices that have to harvest energy from their environments, or go months between battery charges. Many of these form the backbone of the Internet of Things (IoT), which could include everything from appliances all the way through to city infrastructure, and even cows. Yes. Cows – they’re sensor packed and internet connected too, you know.

Please try to keep up, we’re changing the world here. So next time you’re talking to your watch, kettle, or connected cow, think of this article, and of course the people at MIT.


“Speech input will become a natural interface for many wearable applications and intelligent devices,” says Anantha Chandrakasan, a professor of electrical engineering and computer science at MIT, whose group developed the new chip. “The miniaturization of these devices will require a different interface than touch or keyboard. It will be critical to embed the speech functionality locally to save system energy consumption compared to performing this operation in the cloud.”

“I don’t think that we really developed this technology for a particular application,” adds Michael Price, who led the design of the chip. “We’ve tried to put the infrastructure in place to provide better trade-offs to a system designer than they would have had with previous technology, whether it was software or hardware acceleration.”

Today, the best performing speech recognition systems are, like many other state-of-the-art artificial intelligence (AI) systems, based on neural networks – virtual networks of simple information processing systems modelled on the human brain. As a consequence most of the new chip’s architecture is concerned with making this speech recognition neural network as efficient as possible.

But even the most power efficient speech recognition system would quickly drain a device’s battery if it ran without interruption, so the chip also includes a simpler “voice activity detection” circuit that monitors ambient noise to determine whether it might be speech. If the answer is yes, the chip fires up the larger, more complex speech recognition circuit.
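The wake-up idea can be sketched in a few lines. This is only an illustration: the article does not say how the chip's detector decides, so the energy threshold, the function names and the stand-in recognizer below are all assumptions.

```python
def frame_energy(samples):
    """Mean squared amplitude of one audio frame."""
    return sum(s * s for s in samples) / len(samples)

def looks_like_speech(samples, threshold=0.01):
    """Hypothetical cheap detector: flag loud frames as possible speech."""
    return frame_energy(samples) > threshold

def process(frames, recognizer, threshold=0.01):
    """Run the expensive recognizer only on frames the detector flags."""
    results = []
    for frame in frames:
        if looks_like_speech(frame, threshold):
            results.append(recognizer(frame))  # wake the large circuit
        else:
            results.append(None)               # stay in the low-power state
    return results

quiet = [0.001] * 160        # made-up near-silent frame
loud = [0.5, -0.5] * 80      # made-up loud frame

def fake_recognizer(frame):
    """Stand-in for the real speech recognizer."""
    return "speech?"

print(process([quiet, loud, quiet], fake_recognizer))
# prints [None, 'speech?', None]
```

The recognizer runs once here, for the loud frame only, which is the whole point of keeping a cheap detector switched on in front of it.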


In fact, for experimental purposes, the researchers’ chip had three different voice activity detection circuits, with different degrees of complexity and, consequently, different power demands. Which circuit is most power efficient depends on the context, but in tests simulating a wide range of conditions, the most complex of the three circuits led to the greatest power savings for the system as a whole. Even though it consumed almost three times as much power as the simplest circuit, it generated far fewer false positives; the simpler circuits often chewed through their energy savings by spuriously activating the rest of the chip.
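A toy power model shows why the hungrier detector can still win. Every number below is invented for illustration; only the rough "three times the power, far fewer false positives" relationship comes from the article.

```python
def average_power(vad_mw, false_wake_fraction, speech_fraction, recognizer_mw):
    """Always-on detector power plus recognizer power while it is awake."""
    awake = min(1.0, speech_fraction + false_wake_fraction)
    return vad_mw + awake * recognizer_mw

RECOGNIZER_MW = 5.0      # assumed, inside the article's 0.2 to 10 mW range
SPEECH_FRACTION = 0.05   # assumed: someone is really speaking 5% of the time

# Assumed detectors: the complex one uses 3x the power but rarely misfires.
simple_mw = average_power(0.01, 0.30, SPEECH_FRACTION, RECOGNIZER_MW)
complex_mw = average_power(0.03, 0.02, SPEECH_FRACTION, RECOGNIZER_MW)

print(f"simple detector:  {simple_mw:.2f} mW on average")
print(f"complex detector: {complex_mw:.2f} mW on average")
```

Under these assumptions the extra power spent on the better detector is tiny next to the power it saves by not waking the recognizer needlessly. With different assumed rates the balance could tip the other way, which matches the article's point that the best circuit depends on context.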

A typical neural network consists of thousands of processing “nodes” capable of only simple computations but densely connected to each other. In the type of network commonly used for voice recognition, the nodes are arranged into layers. Voice data are fed into the bottom layer of the network, whose nodes process and pass them to the nodes of the next layer, whose nodes process and pass them to the next layer, and so on. The output of the top layer indicates the probability that the voice data represent a particular speech sound.
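A minimal sketch of such a layered network follows. The sigmoid and softmax functions are common choices rather than anything the article specifies, and the tiny network and its weights are made up.

```python
import math

def weighted_sums(inputs, weights, biases):
    """One value per node: weighted sum of the inputs plus a bias."""
    return [sum(w * x for w, x in zip(node_w, inputs)) + b
            for node_w, b in zip(weights, biases)]

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def network_forward(features, layers):
    """Feed features through the layers bottom to top; return probabilities."""
    activations = features
    for weights, biases in layers[:-1]:
        sums = weighted_sums(activations, weights, biases)
        activations = [sigmoid(v) for v in sums]
    top_weights, top_biases = layers[-1]
    return softmax(weighted_sums(activations, top_weights, top_biases))

# Made-up network: 3 input features -> 2 hidden nodes -> 2 speech sounds.
layers = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1]),
    ([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]),
]
probabilities = network_forward([0.2, 0.7, 0.1], layers)
print(probabilities)
```

Each entry in the printed list is the network's estimate that the input corresponds to one particular speech sound.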

A voice recognition network is too big to fit in a chip’s on board memory, which is a problem because going “off chip” for data is much more energy intensive than retrieving it from local stores. So the MIT researchers’ design concentrates on minimizing the amount of data that the chip has to retrieve from off chip memory.


A node in the middle of a neural network might receive data from a dozen other nodes and transmit data to another dozen. Each of those two dozen connections has an associated “weight,” a number that indicates how prominently data sent across it should factor into the receiving node’s computations. The first step in minimizing the new chip’s memory bandwidth is to compress the weights associated with each node. The data are decompressed only after they’re brought on-chip.
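The article does not say which compression scheme the chip uses. As one plausible illustration only, the sketch below stores each weight as a single byte (uniform 8-bit quantization) and expands it again after it has been fetched; the function names and weights are hypothetical.

```python
def compress(weights):
    """Assumed scheme: map each weight to one byte (0..255)."""
    lo, hi = min(weights), max(weights)
    step = (hi - lo) / 255 or 1.0      # avoid a zero step if all weights match
    codes = bytes(round((w - lo) / step) for w in weights)
    return codes, lo, step

def decompress(codes, lo, step):
    """Rebuild approximate weights once the bytes are on chip."""
    return [lo + c * step for c in codes]

weights = [-0.8, 0.1, 0.35, 0.9, -0.2]   # made-up weights
codes, lo, step = compress(weights)
restored = decompress(codes, lo, step)

print("bytes moved:", len(codes), "instead of", 4 * len(weights), "for 32-bit floats")
print("largest error:", max(abs(a - b) for a, b in zip(weights, restored)))
```

The trade is a small loss of precision in exchange for moving fewer bytes across the energy-hungry off-chip link.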

The chip also exploits the fact that, with speech recognition, wave upon wave of data must pass through the neural network. The incoming audio signal is split up into 10-millisecond increments, each of which must be evaluated separately. The MIT researchers’ chip brings in a single node of the neural network at a time, but it passes the data from 32 consecutive 10-millisecond increments through it.

If a node has a dozen outputs, then the 32 passes result in 384 output values, which the chip stores locally. Each of those must be coupled with 11 other values when fed to the next layer of nodes, and so on. So the chip ends up requiring a sizable on board memory circuit for its intermediate computations. But it fetches only one compressed node from off chip memory at a time, keeping its power requirements low.
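The fetch-once, reuse-32-times pattern described above can be sketched as follows. The weights, layer size and function names are invented, and the nodes here only compute weighted sums, which is a simplification.

```python
BATCH = 32  # consecutive 10-millisecond increments, per the article

def run_layer_batched(frames, fetch_node_weights, num_nodes):
    """Fetch each node's weights once, then reuse them for every frame."""
    outputs = [[0.0] * num_nodes for _ in frames]
    for node in range(num_nodes):
        weights = fetch_node_weights(node)       # one off-chip fetch per node
        for t, frame in enumerate(frames):       # reused across the batch
            outputs[t][node] = sum(w * x for w, x in zip(weights, frame))
    return outputs

fetch_log = []  # records every simulated off-chip fetch

def fetch_node_weights(node):
    """Stand-in for off-chip memory: made-up weights, a dozen inputs per node."""
    fetch_log.append(node)
    return [0.1 * (node + 1)] * 12

frames = [[1.0] * 12 for _ in range(BATCH)]
out = run_layer_batched(frames, fetch_node_weights, num_nodes=4)

print("off-chip fetches with batching:", len(fetch_log))           # 4
print("off-chip fetches frame by frame:", 4 * BATCH)                # 128
print("values stored on chip for a dozen outputs:", 12 * BATCH)     # 384
```

Processing one frame at a time would mean fetching every node again for each frame. Batching cuts the off-chip traffic by a factor of 32 in this sketch, at the cost of holding the intermediate results in local memory.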


The research was funded through the Qmulus Project, a joint venture between MIT and Quanta Computer, the OEM server manufacturer that supplies the majority of the hyperscale datacentre companies, like Facebook and Google, with their cloud server systems. The chip was prototyped by the Taiwan Semiconductor Manufacturing Company.
