Nvidia and MIT open source an AI that creates crazy good synthetic videos

By Matthew Griffin Intelligence and the Senses 26th August 2019

WHY THIS MATTERS IN BRIEF

Creating, and then converting, video content is crazy laborious, so companies are creating AIs that do the work for you, and they’re getting better fast.

Nvidia and MIT have announced that they’ve open sourced their stunning Video-to-Video Artificial Intelligence (AI) synthesis model. In short, they’ve just released a highly advanced AI that’s frighteningly good at creating synthetic content, in other words at converting real video into synthetic video, and it could be used not just to create new VR content but also to help create better fake content. And while I’m going to walk you through what it is and why it’s so interesting, frankly you might just want to watch the video, but put a cushion on the floor because you’re going to fall off your chair when you see what they’ve created with it.

Anyway, on to the article. By using a Generative Adversarial Network (GAN) the team were able to “generate high resolution, photorealistic and temporally (time) coherent results with various input formats,” including segmentation masks, sketches, and poses, and that’s a huge leap forwards in a field where huge leaps take place almost daily.

Take a look at the amazing results

Compared to Image-to-Image (I2I) translation and its close relative Text-to-Video (T2V) translation, which lets people type in text and then have an AI auto-generate the corresponding video, like the systems I’ve discussed before and which are amazing in themselves, there’s been a lot less research into making AIs that can perform Video-to-Video (V2V) translation and synthesis.

And why, you might ask, should anyone care about V2V? Well, for starters it would allow you to capture video of a city and instantly convert it into digital footage that you could then use to create a realistic Virtual Reality (VR) world, with the added perk being that you could then use another AI to modify that world on the fly in any way you like, as the video above demonstrates nicely by turning buildings in a city into trees. And so on…

One of the problems with V2V translation so far has been the low visual quality and incoherency of the video produced by existing image synthesis approaches. The team has been able to solve both to the point that their new AI can create 2K resolution videos that are up to 30 seconds in length, another set of breakthroughs.
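
To make “incoherency” a little more concrete, here is a minimal Python sketch of one simple way you could put a number on frame-to-frame flicker. It is purely illustrative: the function name `temporal_flicker`, the tiny greyscale frames, and the metric itself are my own assumptions, not the measure the team used in their paper.

```python
def temporal_flicker(frames):
    """Mean absolute per-pixel change between consecutive frames.

    `frames` is a list of 2D lists of greyscale pixel values.
    This is a hypothetical, simplified score, not the paper's metric.
    """
    if len(frames) < 2:
        return 0.0
    total = 0.0
    count = 0
    # Compare each frame with the one that follows it.
    for prev, cur in zip(frames, frames[1:]):
        for row_prev, row_cur in zip(prev, cur):
            for p, c in zip(row_prev, row_cur):
                total += abs(c - p)
                count += 1
    return total / count


# A clip where nothing changes between frames.
steady = [[[10, 10], [10, 10]] for _ in range(3)]

# A clip whose brightness jumps up and back down again.
flickery = [
    [[10, 10], [10, 10]],
    [[30, 30], [30, 30]],
    [[10, 10], [10, 10]],
]

print(temporal_flicker(steady))    # 0.0
print(temporal_flicker(flickery))  # 20.0
```

A score like this also goes up when a scene has genuine motion, so treat it as a rough picture of what flicker means and not as a real quality measure. The point is that a model which synthesises each frame on its own has nothing tying one frame to the next, which is the kind of incoherency the quoted “temporally coherent” results are meant to avoid.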

During their research the authors performed “extensive experimental validation on various datasets” and “the model showed better results than existing approaches from both quantitative and qualitative perspectives.” In addition, when they extended the method to multimodal video synthesis, the model produced new visual properties in the scene from identical input data, with both high resolution and coherency.

The team then went on to suggest that the model could be improved in the future by adding additional 3D cues such as depth maps to better synthesise turning cars; using object tracking to ensure an object maintains its colour and appearance throughout the video; and training with coarser semantic labels to solve issues in semantic manipulation.

The Video-to-Video Synthesis paper is on arXiv, and the team’s model and data are here.

Comments (1)

Fabio

26th September 2021 at 1:42 pm

This technology is so amazing.
There is an AI-based video generator platform called Synthesia which is awesome too,
both for the video and for the voice, which is almost indistinguishable from a human’s.
