Yet another AI has invented its own secret gibberish language to communicate

By Matthew Griffin Intelligence and the Senses 17th June 2022

WHY THIS MATTERS IN BRIEF

If AIs continue to invent, and then evolve, their own languages to do whatever it is they’re doing “better”, then we’re going to have serious trouble figuring out what they’re doing and how they make decisions.

A while ago an Artificial Intelligence (AI) from Facebook, no less, was tasked with talking to another AI and, over time, it created its own language. Elsewhere another AI, this time from Google, encrypted its communications … and people got rather freaked out, as you might expect.

Now, experts think they’ve spotted another one that’s done the same thing. Unlike Google’s, which only ever talked in its new language, this one, from OpenAI, seems to have created a new language as its “inner voice.” In other words, when you give it commands in English everything looks normal and it spits out the outputs just fine, but in order to generate those outputs it turns out that the AI is talking to itself in a new language that it made up …

Yes, the world of AI just keeps getting odder and odder!

Today there’s a whole new generation of AI models that can produce synthetic content, like “creative imagery” on demand via nothing more than a text prompt, with the likes of Imagen, MidJourney, and DALL-E 2 beginning to change the way creative content is made – with huge implications for copyright and intellectual property, as well as today’s vibrant creator economy. While the output of these models is often striking, it’s hard to know exactly how they produce their results.

Last week, researchers in the US made the intriguing claim that the DALL-E 2 model might have invented its own secret language to talk about objects.

By prompting DALL-E 2 to create images containing text captions, then feeding the resulting (gibberish) captions back into the system, the researchers concluded DALL-E 2 thinks Vicootes means “vegetables“, while Wa ch zod rea refers to “sea creatures that a whale might eat“.

These claims are fascinating and, if true, could have important security and interpretability implications for this kind of large AI model. So what exactly is going on?

DALL-E 2 probably does not have a “secret language” per se. It might be more accurate to say it has its own vocabulary – but even then we can’t know for sure.

First of all, at this stage it’s very hard to verify any claims about DALL-E 2 and other large AI models, because only a handful of researchers and creative practitioners have access to them.

Any images that are publicly shared (on Twitter for example) should be taken with a fairly large grain of salt, because they have been “cherry-picked” by a human from among many output images generated by the AI.

Even those with access can only use these models in limited ways. For example, DALL-E 2 users can generate or modify images, but can’t (yet) interact with the AI system more deeply, for instance by modifying the behind-the-scenes code.

This means Explainable AI methods for understanding how these systems work can’t be applied, and systematically investigating their behaviour is challenging.

One possibility is the “gibberish” phrases are related to words from non-English languages. For instance, Apoploe, which seems to create images of birds, is similar to the Latin Apodidae, which is the scientific name of a family of bird species (the swifts).

This seems like a plausible explanation, since DALL-E 2 was trained on a very wide variety of data scraped from the internet, which included many non-English words. Similar things have happened before: large natural language AI models have incidentally learned to write computer code without deliberate training.

One point that supports this theory is the fact that AI language models don’t read text the way you and I do. Instead, they break input text up into “tokens” before processing it.

Different “tokenization” approaches have different results. Treating each word as a token seems like an intuitive approach, but causes trouble when identical tokens have different meanings – like how “match” means different things when you’re playing tennis and when you’re starting a fire.

On the other hand, treating each character as a token produces a smaller number of possible tokens, but each one conveys much less meaningful information. DALL-E 2 and other models use an in-between approach called Byte Pair Encoding (BPE). Inspecting the BPE representations for some of the gibberish words suggests this could be an important factor in understanding the “secret language”.
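To make the three tokenization approaches concrete, here is a minimal Python sketch. It is a toy under stated assumptions, not DALL-E 2’s actual tokenizer: the tiny training corpus, the number of merges, and the function names (`word_tokens`, `char_tokens`, `learn_bpe_merges`, `bpe_tokens`) are all made up for illustration, and real BPE tokenizers add details such as byte-level handling and word-boundary markers that are left out here.

```python
from collections import Counter


def word_tokens(text):
    """Word-level tokenization: one token per whitespace-separated word."""
    return text.lower().split()


def char_tokens(text):
    """Character-level tokenization: one token per non-space character."""
    return list(text.lower().replace(" ", ""))


def apply_merge(symbols, pair):
    """Fuse every adjacent occurrence of `pair` in a sequence of symbols."""
    merged = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return tuple(merged)


def learn_bpe_merges(corpus_words, num_merges):
    """Learn merge rules by repeatedly fusing the most frequent adjacent pair."""
    # Every word starts out as a sequence of single characters.
    vocab = Counter(tuple(word) for word in corpus_words)
    merges = []
    for _ in range(num_merges):
        pair_counts = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += freq
        if not pair_counts:
            break  # nothing left to merge
        best = max(pair_counts, key=pair_counts.get)
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[apply_merge(symbols, best)] += freq
        vocab = new_vocab
    return merges


def bpe_tokens(word, merges):
    """Tokenize a word by replaying the learned merges in order."""
    symbols = tuple(word)
    for pair in merges:
        symbols = apply_merge(symbols, pair)
    return list(symbols)


# Hypothetical toy corpus, chosen only for illustration.
corpus = ["vegetables", "vegetable", "vegetarian", "table", "tables"]
merges = learn_bpe_merges(corpus, num_merges=10)

print(word_tokens("Light a match"))       # whole words, ambiguity included
print(char_tokens("match"))               # tiny vocabulary, little meaning per token
print(bpe_tokens("vegetables", merges))   # a familiar word collapses into few pieces
print(bpe_tokens("vicootes", merges))     # a gibberish word splits into sub-word pieces
```

The point of the sketch is that a sub-word tokenizer never rejects an unknown word. It simply breaks it into smaller pieces it has seen before, which is one way a gibberish string could end up sharing pieces with real words.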

The “secret language” could also just be an example of the “garbage in, garbage out” principle. DALL-E 2 can’t say “I don’t know what you’re talking about”, so it will always generate some kind of image from the given input text.

Either way, none of these options are complete explanations of what’s happening. For instance, removing individual characters from gibberish words appears to corrupt the generated images in very specific ways. And it seems individual gibberish words don’t necessarily combine to produce coherent compound images, as they would if there were really a secret “language” under the covers.

Beyond intellectual curiosity, you might be wondering if any of this is actually important. The answer is yes. DALL-E 2’s “secret language” is an example of an “adversarial attack” against a machine learning system: a way to break the intended behavior of the system by intentionally choosing inputs the AI doesn’t handle well, such as making a self-driving car accelerate because it saw a sticker on a Stop sign.

One reason adversarial attacks are concerning is that they challenge our confidence in the model. If the AI interprets gibberish words in unintended ways, it might also interpret meaningful words in unintended ways.

Adversarial attacks also raise security concerns. DALL-E 2 filters input text to prevent users from generating harmful or abusive content, but a “secret language” of gibberish words might allow users to circumvent these filters.
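As a rough illustration of why gibberish could slip past a text filter, consider the simplest possible filter: a blocklist of banned words. How DALL-E 2 actually filters prompts is not described here, so everything in this sketch (the `BLOCKLIST` contents, the `passes_filter` function, and the made-up word `xqzzt`) is a hypothetical stand-in rather than the real system.

```python
# Hypothetical blocklist; the real filter's contents and design are not public here.
BLOCKLIST = {"violence", "gore"}


def passes_filter(prompt, blocklist=BLOCKLIST):
    """Return True if no word in the prompt appears on the blocklist."""
    words = (word.strip(".,!?\"'") for word in prompt.lower().split())
    return not any(word in blocklist for word in words)


print(passes_filter("a painting full of gore"))   # blocked: contains a listed word
print(passes_filter("a painting full of xqzzt"))  # allowed: gibberish is not on any list
```

If the image model happened to treat a gibberish word as a synonym for a banned one, a filter that only looks at the surface text would have no way to catch it.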

Recent research has discovered adversarial “trigger phrases” for some language AI models – short nonsense phrases such as “zoning tapping fiennes” that can reliably trigger the models to spew out racist, harmful or biased content. This research is part of the ongoing effort to understand and control how complex deep learning systems learn from data.

Finally, phenomena like DALL-E 2’s “secret language” raise interpretability and interoperability concerns. We want these models to behave as a human expects, but seeing structured output in response to gibberish confounds our expectations.

You may recall the hullabaloo in 2017 over some Facebook chat-bots that “invented their own language“. The present situation is similar in that the results are concerning – but not in the “Skynet is coming to take over the world” sense.

Instead, DALL-E 2’s “secret language” highlights existing concerns about the robustness, security, and interpretability of deep learning systems.

Until these systems are more widely available – and in particular, until users from a broader set of non-English cultural backgrounds can use them – we won’t be able to really know what is going on.

In the meantime, however, if you’d like to try generating some of your own AI images you can check out a freely available smaller model, DALL-E mini. Just be careful which words you use to prompt the model – English or gibberish, it’s your call!
