Scroll Top

Companies are turning to synthetic data to fill in gaps in their AI models



When you don’t have real data to work with why not create some? That’s synthetic data …


Love the Exponential Future? Join our XPotential Community, future proof yourself with courses from XPotential University, read about exponential tech and trendsconnect, watch a keynote, or browse my blog.

Deep learning has pushed the capabilities of Artificial Intelligence (AI) to new levels, but there are still some kinks to straighten out, such as AI bias, as well as how to best train AI models to handle safety-critical applications such as self-driving cars.

If an AI recommendation engine gets its predictions wrong and puts a strange advert in your browser window, you might raise an eyebrow. But no long-term damage would have been done, but things are very different when algorithms get behind the wheel and encounter something they’ve never seen before.


DeepMind's protein folding AI conquers biology's 50 year grand challenge


Unsurprisingly rare events, or edge cases as they’re known, represent an especially tricky problem for self-driving car developers so now many of them are using Synthetic Data – which is “fake” data generated by AI’s that’s based on lifelike simulations of real-world events – to help them train their AI’s better and faster.

Examples include solutions such as NVIDIA’s Omniverse Replicator, which I’ve spoken about before, that lets developers augment real-world environments with digitally rendered scenarios such as a layer of thick snow covering the road or obscuring street signs. Another illustration of its capabilities is to digitally simulate a child running into the road chasing after a ball.

Developers could, of course, use crash test dummies and various props to achieve the same thing, but the time and expense of training an AI this way generally outweighs any benefits. Plus, if things went wrong, you’d risk damaging the vehicle and its sensors, whereas in a digitally simulated environment everything can be simply refreshed and rerun if things don’t go according to plan.


MIT's newest chip could bring speech recognition to every device


Elsewhere firms such as Synthesis AI have shown how synthetic data can be used to test the effectiveness of vehicle driver safety monitoring systems. These tools work by tracking the driver’s face to identify signs of drowsiness or distraction and then the output can be linked to Advanced Driver Assistance Systems (ADAS), for example to prime pre-crash mechanisms if the safety monitoring alerts fail to trigger a response from the driver.

Naturally, developers wouldn’t ask a test driver to fall asleep at the wheel on purpose – as a vehicle speeds along the road – so that they could put a potential facial detection algorithm and the mitigations that go with it to the test instead.

The result of Synthesis AI’s newest algorithm is a service called FaceAPI. The tool allows users to create millions of unique 3D driver models with different facial expressions “FaceAPI is already able to produce a wide variety of emotions, including, of course, closed eyes and drowsiness,” write the creators. Now, expanding on the capabilities of the synthetic data-generating software, the model can also represent a driver looking down at their phone or turning to talk to a passenger rather than focus on the road ahead.


DARPA wants to genetically engineer smart plants to act as surveillance devices


Undoubtedly the availability of realistic synthetic data for AI training can give firms a helping hand in entering markets where competitors may hold large datasets that would otherwise provide a high barrier to entry. Making it straightforward for start-ups to generate useful AI training sets based on synthetic data gives newer companies the capacity to quickly build momentum without needing to invest large amounts of capital.

Synthetic data also goes beyond just recreating scenarios that would be problematic in the real world, it can let developers dig into all manner of real world areas where training AI’s would ordinarily be expensive, time-consuming, or both.

Related Posts

Leave a comment


1000's of articles about the exponential future, 1000's of pages of insights, 1000's of videos, and 100's of exponential technologies: Get The Email from 311, your no-nonsense briefing on all the biggest stories in exponential technology and science.

You have Successfully Subscribed!

Pin It on Pinterest

Share This