Researchers have found a way to successfully debug neural networks

By Matthew Griffin Intelligence and the Senses 8th November 2017

WHY THIS MATTERS IN BRIEF

If a bug in your self-driving car’s AI software drove you off a cliff, you’d probably wish it had been debugged first – and now it can be.

If you listen to the venture capitalists in Silicon Valley then you’ll hear the phrase “Software is eating the world” every time you sit down to have a coffee and every time you turn a corner, but the fact is that most software bugs won’t kill you. The worst that might happen is that your PowerPoint presentation blue screens and leaves you hanging in front of an audience, but you’ll brush it off. Self-driving cars, on the other hand, are another story. A bug in that software and you’ll find yourself having a completely different type of crash experience.

Today most of the Artificial Intelligence (AI) platforms that underpin self-driving cars are neural network black boxes, and even the experts, like those from Google and Elon Musk’s OpenAI, by their own admission don’t really understand how they learn or how they do the things they do, such as learning spontaneously.

Over the past year there have been a few attempts, from the likes of MIT and Nvidia, to get these black boxes to explain their decision making, and some progress is being made. Now a team of researchers from Columbia University, who also recently found a way to create more nimble robots, and Lehigh University has come up with another method to get these magical boxes of mystery to reveal their darkest secrets. Their bug hunting method, called DeepXplore, aims to expose any AI’s bad decision making, whether those AIs are deployed in online services or in autonomous vehicles.

The new method uses at least three neural networks, the basic architecture of deep learning algorithms, to act as “cross-referencing oracles” in checking each other’s accuracy.

Originally the team designed DeepXplore to solve an optimization problem where they looked to strike the best balance between two objectives – maximizing the number of neurons activated within neural networks, and triggering as many conflicting decisions as possible among different neural networks.

By assuming that the majority of neural networks will generally make the right decision, DeepXplore automatically retrains the neural network that made the lone dissenting decision to follow the example of the majority in a given scenario.
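The majority-vote idea can be illustrated with a minimal sketch. It is not DeepXplore's code: the function name `find_dissenters` and the example labels are hypothetical, and the retraining step itself is left out.

```python
from collections import Counter


def find_dissenters(predictions):
    """Return (majority_label, dissenting_indices) for one input.

    `predictions` holds one predicted label per model (hypothetical data).
    If no label is backed by a strict majority, return (None, []).
    """
    label, votes = Counter(predictions).most_common(1)[0]
    if votes * 2 <= len(predictions):
        return None, []  # no strict majority, so no trusted answer
    dissenters = [i for i, p in enumerate(predictions) if p != label]
    return label, dissenters


# Three hypothetical models look at the same image.
label, dissenters = find_dissenters(["car", "car", "face"])
print(label, dissenters)  # car [2]
```

Here the third model (index 2) is the lone dissenter, so under the assumption described above it would be the one retrained on the majority label "car".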

“This is a differential testing framework that can find thousands of errors in self-driving systems and in similar neural network systems,” says Yinzhi Cao, Assistant Professor of Computer Science at Lehigh University in Bethlehem, Pa.

Cao and his colleagues on the DeepXplore team recently won best paper after presenting their research at the 2017 Symposium on Operating Systems Principles (SOSP) held in Shanghai, China late last month, and their win may signal a growing recognition of the need for debugging tools in deep learning AI.

Typically, deep learning algorithms become better at certain tasks by filtering huge amounts of training data that humans have labelled with the correct answers, and that’s enabled such algorithms to achieve accuracies of well over 90 percent on certain test datasets that involve tasks such as identifying the correct human faces in Facebook photos or choosing the correct phrase in a Google translation between, say, Chinese and English. In these cases, it’s not the end of the world if a friend occasionally gets misidentified or if a certain esoteric phrase gets translated incorrectly.

But the consequences of mistakes rise sharply once tech companies begin using deep learning algorithms in applications such as controlling an armed military drone, or a two-ton machine moving at highway speeds. A wrong decision by a self-driving AI here could lead to the car crashing into a guard rail, colliding with another vehicle or, worse still, mowing down pedestrians and cyclists.

Bang you’re dead

In one example, DeepXplore compared two images of a hilltop road. One was normally exposed and the other was an identical but slightly darker copy, and DeepXplore discovered that in the darker case Nvidia’s DAVE-2 self-driving car software would have sent the car crashing into the guard rail. And possibly off a cliff. Remind me not to get in that car… AI debugging software is quickly becoming my favourite type of software.

Similarly, government regulators will want to know for sure that self-driving cars can meet certain safety standards, and random test datasets may not uncover all those rare “corner cases,” as they’re called, that could lead an algorithm to make a catastrophic mistake.

“I think this push toward secure and reliable AI kind of fits in nicely with explainable AI,” says Suman Jana, an Assistant Professor of Computer Science at Columbia University in New York City, “transparency, explanation and robustness all have to be improved a lot in machine learning systems before these systems can start working together with human beings or start running on roads.”

Jana and Cao come from a group of researchers who share backgrounds in software security and debugging. In their world, even software that is 99-percent error free could still be vulnerable if malicious hackers can exploit that one lone bug in the system, and that has made them far less tolerant of errors than many deep learning researchers who see mistakes as a natural part of the training process. It’s also made them fairly ideal candidates to figure out a new and more comprehensive approach for debugging deep learning.

Until now, debugging of the neural networks in self-driving cars has involved fairly tedious or random methods. One random testing approach involves human researchers manually creating test images and feeding those into the networks until they trigger a wrong decision. Meanwhile a second approach, called Adversarial Testing, can automatically create a sequence of test images by slightly tweaking one particular image until it trips up the neural network.

DeepXplore took a different approach by automatically creating test images most likely to cause three or more neural networks to make conflicting decisions. For example, DeepXplore might look for just the right amount of lighting in a given image that could lead two neural networks to identify a vehicle as a car while a third neural network identifies it as a face. A similar kind of misreading, arguably, led to the death of a Tesla driver in Florida last year, when the car’s Autopilot failed to distinguish a white semi-trailer from the bright sky behind it.
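As a rough illustration of hunting for a lighting level that splits the networks, here is a toy sketch. It swaps the optimization described in the article for a plain brute-force sweep over brightness levels, and everything in it is hypothetical: the "models" are simple brightness thresholds, not real neural networks, and the names `darken`, `make_toy_model` and `first_disagreement` are invented for this example.

```python
def darken(image, percent):
    """Scale every pixel (0..255) to `percent` of its value using integer math."""
    return [[p * percent // 100 for p in row] for row in image]


def mean_brightness(image):
    pixels = [p for row in image for p in row]
    return sum(pixels) / len(pixels)


def make_toy_model(cutoff):
    """Hypothetical stand-in for a network: steers by average brightness only."""
    def model(image):
        return "straight" if mean_brightness(image) >= cutoff else "turn_right"
    return model


def first_disagreement(models, image, percents):
    """Return the first brightness level at which the models' outputs differ."""
    for percent in percents:
        outputs = [model(darken(image, percent)) for model in models]
        if len(set(outputs)) > 1:
            return percent, outputs
    return None, None


models = [make_toy_model(cutoff) for cutoff in (60, 60, 90)]
image = [[120, 130], [110, 120]]  # tiny 2x2 "photo", made up for the example
percent, outputs = first_disagreement(models, image, [100, 90, 80, 70, 60, 50])
print(percent, outputs)  # 70 ['straight', 'straight', 'turn_right']
```

All three toy models agree on the original image and on the copies at 90 and 80 percent brightness, and the third one breaks ranks at 70 percent. That darker copy is the kind of input a differential tester would flag for closer inspection.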

At the same time, DeepXplore also aimed to maximize neuron coverage in its testing by activating the maximum number of neurons and different neural network pathways. Such neuron coverage is based on a similar concept in traditional software testing called code coverage, Cao explains.
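To make the code coverage analogy concrete, the sketch below counts a neuron as "covered" once any test input pushes its activation above a threshold. This definition, the threshold value and the activation numbers are assumptions made for illustration rather than details taken from the article.

```python
def neuron_coverage(activations_per_input, threshold=0.0):
    """Fraction of neurons driven above `threshold` by at least one test input.

    `activations_per_input` is a list with one entry per test input, each a
    list of activation values with one value per neuron (hypothetical data).
    """
    if not activations_per_input:
        return 0.0
    covered = [False] * len(activations_per_input[0])
    for activations in activations_per_input:
        for i, value in enumerate(activations):
            if value > threshold:
                covered[i] = True
    return sum(covered) / len(covered)


# Two made-up test inputs for a network with four neurons.
tests = [
    [0.9, 0.0, 0.0, 0.2],
    [0.0, 0.0, 0.7, 0.1],
]
print(neuron_coverage(tests, threshold=0.5))  # 0.5
```

Only the first and third neurons ever exceed 0.5, so these two inputs cover half the network. A test generator that aims for coverage would keep looking for inputs that light up the remaining two, much as a code coverage tool pushes testers towards branches that have never run.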

This process was able to activate 100 percent of network neurons, or about 30 percent more on average than either the random or adversarial testing methods previously used in deep learning algorithms.

Testing with 15 state-of-the-art neural networks across five different public datasets showed how DeepXplore could find thousands of previously undiscovered errors in a wide variety of deep learning applications. The test datasets included scenarios for self-driving car AI, automatic object recognition in online images, and automatic detection of malware masquerading as ordinary software.

While DeepXplore cannot yet guarantee that it has found every single possible bug in a system, nor is it ever likely to be able to make that claim, it’s a huge step forwards in an area that is increasingly crucial if we are ever going to realise the full potential of AI.
