Major AI research breakthrough helps AIs forget copyrighted content

By Matthew Griffin Security and Privacy 31st March 2024

WHY THIS MATTERS IN BRIEF

Previously, getting an AI to forget anything was impossible because it retained “memories” of what it had learned. But now it might actually be able to forget …

One of the big problems companies face when they realise that their Artificial Intelligences (AIs) have infringed copyright, or when they are asked to make them forget something, perhaps as the result of a GDPR request, is that they can’t find an effective way to actually get their AIs to forget what they’ve learned. “Old memories” remain no matter how much you try to get the models to forget, or how much you retrain them.

When people learn things they should not know, getting them to forget that information can be tough. This is also true of rapidly growing AI programs that are trained to think as we do, and it has become a problem as they run into challenges over privacy and the use of copyright-protected material.

To respond to this challenge, researchers at the University of Texas at Austin have developed what they believe is the first “Machine Unlearning” method applied to image-based Generative AI. This method offers the ability to look under the hood and actively block and remove any violent images or copyrighted works without losing the rest of the information in the model. The study is published on the arXiv preprint server.

This new tool becomes especially valuable when you realise that, under new EU AI laws, European lawmakers can request that AIs that aren’t compliant with the letter of the law, whether from a copyright, ethics, or even safety perspective, be deleted. Bearing in mind that companies are spending billions training their AIs, that is understandably causing some companies to freak out.

“When you train these models on such massive data sets, you’re bound to include some data that is undesirable,” said Radu Marculescu, a professor in the Cockrell School of Engineering’s Chandra Family Department of Electrical and Computer Engineering and one of the leaders on the project.

“Previously, the only way to remove problematic content was to scrap everything, start anew, manually take out all that data and retrain the model. Our approach offers the opportunity to do this without having to retrain the model from scratch.”

Generative AI models are trained primarily on data from the internet because of the unrivalled amount of information it contains. But the internet also contains massive amounts of data that is protected by copyright, in addition to personal information and inappropriate content.

Underscoring this issue, The New York Times, and separately hundreds of artists, recently sued OpenAI, maker of ChatGPT. The Times argues that the AI company illegally used its articles as training data to help its chatbots generate content.

“If we want to make generative AI models useful for commercial purposes, this is a step we need to build in, the ability to ensure that we’re not breaking copyright laws or abusing personal information or using harmful content,” said Guihong Li, a graduate research assistant in Marculescu’s lab who worked on the project as an intern at JPMorgan Chase and finalized it at UT.

Image-to-image models are the primary focus of this research. They take an input image and transform it – such as creating a sketch, changing a particular scene and more – based on a given context or instruction.

This new machine unlearning algorithm gives a machine learning model the ability to “forget” or remove content that is flagged for any reason, without the need to retrain the model from scratch. Human teams handle the moderation and removal of content, providing an extra check on the model and the ability to respond to user feedback.

Machine unlearning is an evolving branch of the field that has been primarily applied to classification models. Those models are trained to sort data into different categories, such as whether an image shows a dog or a cat.

Applying machine unlearning to generative models is “relatively unexplored,” the researchers write in the paper, especially when it comes to images.
