WHY THIS MATTERS IN BRIEF
Artificial intelligence can only progress as fast as it can learn, and now that it can learn at almost human speed it’s going to be progressing much, much faster.
Make no mistake, artificial intelligence (AI) has humans in its perfect, heightened, machine vision sights. Deep learning machines already have superhuman skills when it comes to tasks such as seat-of-your-pants air-to-air combat, facial recognition, language translation, lip reading, running hedge funds and annihilating troops of hardened online gamers. As a consequence you’d be forgiven for thinking that in many fields humans are already outgunned.
But it’s not time to pack our bags and head home just yet because there’s one crucial area where we’re still the undisputed masters of the universe – we still learn faster than they do, and by a country mile.
When it comes to mastering classic video games, for example, even the best deep learning machines can still take over 200 hours of gameplay to reach the same skill levels that we can achieve in just two hours. As a consequence, it should come as no surprise that computer scientists would love to speed up how fast their AI pets learn.
Oh how we can still scoff at those stupid AIs. Ha ha ha.
However, last week Alexander Pritzel and his DeepMind team made a breakthrough that made me pause mid-scoff. It’s yet another breakthrough in what’s becoming an almost daily occurrence for the folks down at DeepMind, which is arguably one of the world’s most advanced AI companies. Last week, for example, they announced that they’d given their DeepMind AI a human memory, and it’s that advance, itself just the tip of the iceberg, that has now helped them achieve this one.
Pritzel and his team have just built a deep learning system that’s capable of assimilating new learning experiences and then acting on them – the result is a machine that learns ten times faster than it did previously, and which is now edging towards learning at almost human speed.
As a result it might be time to scoff less. According to the Law of Accelerating Returns, the same law that’s often used to predict the exponential rate of technological progress, we might soon see the day when these systems don’t just catch up with us but overtake us like a hypersonic jet racing a bumble bee.
First, a technology lesson; I know you like those. Deep learning uses neural networks built from layers, each of which looks for patterns in data. When a layer spots a pattern it sends this information on to the next layer, which looks for other patterns in the signal, and so on and so on. For example, in facial recognition, one layer might look for the edges in an image in order to try to identify the outline of the person’s face. The next layer then looks for circular patterns, such as the shapes that make up our eyes and mouths, and the next layer might look for a triangulation pattern that identifies the two eyes and mouth as a human face.
As always though the devil is in the detail. Did the deep learning system just identify the face of a human, or a gorilla? And so the pattern matching and the feedback systems, which learn by adjusting a variety of internal parameters, go on and on until they can identify the image correctly.
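To make that layer-by-layer idea concrete, here is a minimal sketch in Python. It is an illustration only, not anything DeepMind has published. It assumes a toy one-dimensional “image” of four pixels, and the hand-picked weights and the names `dense`, `relu`, `forward` and `LAYERS` are all made up for this example. The first layer looks for rising and falling edges, and the second fires only when it sees a rising edge at the start and a falling edge at the end, in other words a bump. A real network would learn its weights rather than have them typed in by hand.

```python
def relu(values):
    # Keep positive signals, silence negative ones.
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    # Each row of weights is one pattern detector over the inputs.
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def forward(inputs, layers):
    # Pass the signal through each layer in turn, keeping every
    # intermediate result so we can see what each layer spotted.
    trace = [list(inputs)]
    for weights, biases in layers:
        trace.append(relu(dense(trace[-1], weights, biases)))
    return trace


# Hypothetical, hand-picked weights (assumption: not learned).
EDGE_LAYER = (
    [
        [-1.0, 1.0, 0.0, 0.0],   # rising edge between pixels 0 and 1
        [0.0, -1.0, 1.0, 0.0],   # rising edge between pixels 1 and 2
        [0.0, 0.0, -1.0, 1.0],   # rising edge between pixels 2 and 3
        [1.0, -1.0, 0.0, 0.0],   # falling edge between pixels 0 and 1
        [0.0, 1.0, -1.0, 0.0],   # falling edge between pixels 1 and 2
        [0.0, 0.0, 1.0, -1.0],   # falling edge between pixels 2 and 3
    ],
    [0.0] * 6,
)
BUMP_LAYER = (
    # Fires only if the first rising edge AND the last falling edge are present.
    [[1.0, 0.0, 0.0, 0.0, 0.0, 1.0]],
    [-1.0],
)
LAYERS = [EDGE_LAYER, BUMP_LAYER]

bump_trace = forward([0.0, 1.0, 1.0, 0.0], LAYERS)
flat_trace = forward([1.0, 1.0, 1.0, 1.0], LAYERS)
```

Here `bump_trace[1]` holds what the edge layer saw and `bump_trace[2]` holds the final verdict, while `flat_trace` shows that a featureless image triggers neither layer.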
Traditionally these parameters must change slowly, since a big change in one layer can catastrophically affect learning in the subsequent layers. That’s why deep neural networks need so much training, and it’s also why it takes so long. But now Pritzel and his team have overcome this problem using a technique they catchily call “Neural Episodic Control.”
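Why do the parameters have to change slowly? A heavily simplified sketch helps, and note that it is a deliberate simplification: it assumes a single made-up parameter `w` and a made-up error of `w * w`, nothing like a real deep network with many interacting layers. The function name `descend` and its arguments are hypothetical too. Even in this tiny case, small steps creep steadily towards the answer while large steps overshoot and make things worse on every update.

```python
def descend(learning_rate, start=1.0, steps=20):
    """Run plain gradient descent on error(w) = w * w and return the final w."""
    w = start
    for _ in range(steps):
        gradient = 2.0 * w            # slope of w * w at the current w
        w -= learning_rate * gradient  # step against the slope
    return w


cautious = descend(learning_rate=0.1)  # each step scales w by 0.8
reckless = descend(learning_rate=1.1)  # each step scales w by -1.2
```

After twenty steps `cautious` has shrunk close to zero, which is the best possible value here, while `reckless` has swung further from zero than where it started. The price of the cautious setting is that it needs many steps, which is the slowness the article is describing.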
“Neural episodic control demonstrates dramatic improvements in the speed of learning for a wide range of environments,” he says, “critically, our agent is able to rapidly latch onto highly successful strategies as soon as they are experienced, instead of waiting for many steps of optimisation.”
The basic idea behind DeepMind’s approach is to copy the way humans and animals learn quickly. The general consensus is that humans can tackle situations in two different ways. If we’ve seen the situation before then our brains have already formed a “model” of it, which they use to work out how best to behave. This is something called the Jennifer Aniston Neuro Model (JANM), and no, I didn’t just make that up, how dare you. This learning takes place in our prefrontal cortex.
But when the situation isn’t familiar our brains have to fall back on another strategy. This is thought to involve a much simpler “test and remember” approach involving our hippocampus. So we try something and then remember the outcome as an episode. If the attempt is successful, we try it again and again, but if it isn’t then we try to avoid it in the future. This “episodic” approach works in the short term while our prefrontal brain learns, but over time it’s outperformed by the prefrontal cortex and its JANM based approach.
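The “test and remember” strategy is simple enough to sketch in a few lines of Python. This is a toy under stated assumptions, not a model of the hippocampus and not DeepMind’s code: the class name `EpisodicTrialAgent`, its methods, and the rule that a reward above zero counts as a success are all invented for illustration.

```python
class EpisodicTrialAgent:
    """Try things, remember how they turned out, repeat what worked."""

    def __init__(self, actions):
        self.actions = list(actions)
        self.outcomes = {}  # (situation, action) -> best reward seen so far

    def remember(self, situation, action, reward):
        key = (situation, action)
        self.outcomes[key] = max(reward, self.outcomes.get(key, reward))

    def choose(self, situation):
        tried = {
            a: self.outcomes[(situation, a)]
            for a in self.actions
            if (situation, a) in self.outcomes
        }
        # Repeat the best remembered action if it actually succeeded.
        if tried:
            best = max(tried, key=tried.get)
            if tried[best] > 0:
                return best
        # Otherwise avoid known failures and test something new.
        for action in self.actions:
            if action not in tried:
                return action
        # Everything has been tried and nothing worked: pick the least bad.
        return max(tried, key=tried.get)


agent = EpisodicTrialAgent(["push", "pull"])
first = agent.choose("door")          # nothing remembered yet
agent.remember("door", first, -1.0)   # pushing failed
second = agent.choose("door")         # so it avoids pushing
agent.remember("door", second, 1.0)   # pulling worked
third = agent.choose("door")          # and it sticks with pulling
```

Notice that one good experience is enough to change behaviour, with no slow parameter tuning involved. The weakness is that the agent only knows about situations it has seen exactly, which is one reason a learned model wins out in the long run.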
It’s this system that Pritzel and his team have used as their inspiration, and as a consequence their latest DeepMind agent uses two similar approaches. The first is a conventional deep learning approach that mimics the behaviour of the prefrontal cortex, and the second is more like the hippocampus: when it tries something new it remembers the outcome. Crucially, however, it doesn’t try to learn what to remember. Instead, it remembers everything.
“Our architecture doesn’t try to learn when to write episodes to memory, as this can be slow to learn and take a significant amount of time,” says Pritzel, “instead, we elect to write all experiences to the memory, and allow its memory to grow very large compared to existing memory architectures.”
The team then use a set of strategies to read from this large memory quickly, and the result is that the system can latch onto successful strategies much faster than conventional deep learning systems. They went on to demonstrate how well all this works by training their machine to play classic Atari video games such as Breakout, Pong and Space Invaders. The result was a system that vastly outperforms other deep learning approaches in the speed at which it learns.
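The article doesn’t spell out what those reading strategies are, so the following Python sketch fills the gap with an assumption: it looks up the few stored experiences closest to the current situation and averages their outcomes, giving closer ones more weight. The class `EpisodicMemory`, its parameters `k` and `eps`, and the use of a plain linear scan are all hypothetical choices for illustration, and a memory as large as the one described would need something much faster than scanning every entry.

```python
import math


class EpisodicMemory:
    """Write every experience; read by averaging the nearest ones."""

    def __init__(self, k=3, eps=1e-3):
        self.k = k          # how many neighbours to consult (assumption)
        self.eps = eps      # avoids division by zero on exact matches
        self.entries = {}   # action -> list of (state, outcome)

    def write(self, state, action, outcome):
        # Nothing decides what is worth keeping: everything is stored.
        self.entries.setdefault(action, []).append((tuple(state), outcome))

    def estimate(self, state, action):
        state = tuple(state)
        stored = self.entries.get(action, [])
        if not stored:
            return 0.0  # no experience of this action yet
        nearest = sorted(stored, key=lambda e: math.dist(e[0], state))[: self.k]
        weights = [
            1.0 / (math.dist(s, state) ** 2 + self.eps) for s, _ in nearest
        ]
        weighted = sum(w * outcome for w, (_, outcome) in zip(weights, nearest))
        return weighted / sum(weights)

    def best_action(self, state, actions):
        return max(actions, key=lambda a: self.estimate(state, a))


memory = EpisodicMemory(k=2)
memory.write((0.0, 0.0), "left", 1.0)    # one success is usable immediately
memory.write((1.0, 1.0), "left", 0.0)
memory.write((0.0, 0.0), "right", -1.0)
choice = memory.best_action((0.1, 0.1), ["left", "right"])
```

Because a single stored success dominates the estimate for nearby situations, the agent can act on it straight away, which is the “rapidly latch onto highly successful strategies” behaviour Pritzel describes.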
“Our experiments show that neural episodic control requires an order of magnitude fewer interactions with the environment,” he said, and that’s impressive work with significant potential.
The team say that an obvious extension of this work is to test their new approach on more complex 3D environments, and it’ll be interesting, albeit somewhat terrifying, to see the system crank up through the gears as it edges towards exponential improvement… stay tuned.