DeepMind's AI now has human-like 3D vision

By Matthew Griffin Intelligence and the Senses 10th September 2018

WHY THIS MATTERS IN BRIEF

As we come to rely more on robots and automated systems in our daily lives, their ability to accurately perceive our world is crucial.

If I showed you a single picture of a room, it’s likely you’d be able to tell me about it right away: that there’s a table with a chair in front of it, that they’re probably about the same size, roughly how far apart they are, and roughly how far away the walls are. In short, you’d know enough to draw a rough map of the room. Machine vision systems don’t yet have this deeply human, intuitive understanding of “space,” but the latest research from DeepMind has now brought it within reach after a successful demonstration.

The team’s paper was published in the journal Science, and it details a system whereby a neural network, knowing practically nothing, can look at one or two static 2D images of a scene and reconstruct a reasonably accurate 3D representation of it.

How the system works

Most machine vision algorithms work via what’s called supervised learning, where they’re fed a huge amount of information that’s already been labelled with everything outlined and named.

The new system from DeepMind, though, has no such labelled data to draw on. It works entirely independently of any built-in ideas about how we see the world, such as how objects’ colours change toward their edges, how they get bigger and smaller as their distance changes, and so on.

It works, roughly speaking, like this. One half of the system is its “representation” part, which observes a given 3D scene from some angle and encodes it in a complex mathematical form called a vector. Then there’s the “generative” part, which, based only on the vectors created earlier, predicts what the scene would look like from a different angle.
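To make the two halves concrete, here is a toy sketch in Python. It is not DeepMind's implementation: the weights are random rather than trained, the sizes are made up, and combining several views by summing their vectors is an assumption made purely for illustration. All names (represent, generate, ENCODER_W, DECODER_W) are hypothetical. The only point is the data flow: pictures plus viewpoints go in, a single scene vector comes out, and the generative part sees nothing but that vector and the viewpoint it is asked about.

```python
import random

# Toy sizes, chosen only for illustration (hypothetical, not from the paper).
IMAGE_SIZE = 4    # an "image" is just 4 pixel intensities
VIEW_SIZE = 2     # a viewpoint is (position, angle)
VECTOR_SIZE = 3   # length of the scene vector


def make_weights(rows, cols, rng):
    """Random matrix standing in for trained network weights."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
ENCODER_W = make_weights(VECTOR_SIZE, IMAGE_SIZE + VIEW_SIZE, rng)
DECODER_W = make_weights(IMAGE_SIZE, VECTOR_SIZE + VIEW_SIZE, rng)


def represent(observations):
    """Representation part: encode each (image, viewpoint) pair as a
    vector, then sum the vectors into one scene vector (an assumption)."""
    scene = [0.0] * VECTOR_SIZE
    for image, viewpoint in observations:
        encoded = matvec(ENCODER_W, list(image) + list(viewpoint))
        scene = [s + e for s, e in zip(scene, encoded)]
    return scene


def generate(scene_vector, query_viewpoint):
    """Generative part: predict an image using only the scene vector
    and the viewpoint being asked about, never the original images."""
    return matvec(DECODER_W, list(scene_vector) + list(query_viewpoint))


# Two pictures of the same scene, each taken from a different viewpoint.
observations = [
    ([0.1, 0.9, 0.3, 0.5], [0.0, 0.0]),
    ([0.2, 0.8, 0.4, 0.4], [1.0, 0.5]),
]
scene_vector = represent(observations)
predicted = generate(scene_vector, [0.5, 0.25])  # a viewpoint never observed
print(len(scene_vector), len(predicted))
```

With untrained weights the predicted "image" is meaningless. In a real system, both halves would be trained together so that the prediction matches what a camera at the new viewpoint actually sees.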

Think of it like someone handing you a couple of pictures of a room, then asking you to draw what you’d see if you were standing in a specific spot in it. Again, this is simple enough for us, but computers have no natural ability to do it. Their sense of “sight,” if we can call it that, is extremely rudimentary, and of course machines lack imagination, which is something else DeepMind has given its AIs recently.

“It wasn’t at all clear that a neural network could ever learn to create [3D] images in such a precise and controlled manner,” said lead author of the paper, Ali Eslami, in a statement, “however we found that sufficiently deep neural networks can learn about perspective, occlusion and lighting, without the need for any human engineering. This was a super surprising finding.”

It also allows the system to accurately recreate a 3D object from a single viewpoint. This kind of ability is critical for drones, robots such as Amazon’s warehouse picking robots and perhaps tomorrow’s surgical robots, and self-driving cars, because they have to navigate the real world by sensing it and reacting to what they see. With limited information, such as when some important clue is temporarily hidden from view, they can freeze up or make illogical choices, which, if the AI is operating a self-driving car for example, could be lethal.

With a system like this, though, their robotic brains could make reasonable assumptions about, say, the layout of a room or objects on the side of the road, without having to ground-truth every inch.

“Although we need more data and faster hardware before we can deploy this new type of system in the real world,” Eslami said, “it takes us one step closer to understanding how we may build agents that learn by themselves.”
