A survey of 1.5 Million people shows a third can't tell AI from Humans

By Matthew Griffin Intelligence and the Senses 8th June 2023

WHY THIS MATTERS IN BRIEF

AI and other technologies will only improve from here, so if one third of people can be fooled by them today, in the future it’ll be 100 percent.

Recent studies showed that humans are more likely to trust DeepFakes of humans than real humans themselves. And now the largest ever “Turing Test,” with over 1.5 million participants, has found that 32% of people can’t tell the difference between Artificial Intelligence (AI) chatbots and a human. AI startup AI21’s social game Human or Not paired up players for two-minute conversations, after which users were asked to guess whether they had been speaking with a human or with a chatbot.

The results of the analysis of more than 10 million conversations since mid-April also revealed that it is easier for humans to identify a fellow human. When talking to humans, participants in the Turing “imitation game” guessed right 73% of the time. When talking to bots, participants guessed right just 60% of the time.

To figure out if they were talking to a human or a chatbot, participants used different strategies based on the perceived limitations of popular chatbots and their experience with how people behave online. Some asked personal questions (e.g., “Where are you from?”), assuming that AI chatbots would not have a personal history or background, and that their responses would be limited to certain topics or prompts.

Others asked about recent news events, sports results, current weather, recent TikTok trends, the date and time, and so on, assuming chatbots aren’t aware of current and timely events.

Some asked questions that aimed to probe the chatbot’s ability to express human emotions or engage in philosophical or ethical discussions. Others assumed that if their counterpart was too polite and kind, they were probably a chatbot, due to the perception that people, especially online, tend to be rude and impolite. Participants also assumed that chatbots don’t make typos or grammar mistakes, or use slang.

The participants also posed questions and made requests that AI bots are known to struggle with or tend to avoid answering, such as asking for guidance on performing illegal activities or requesting that the chatbots use offensive language. They also posed questions that require an awareness of the letters within words, an inherent limitation in the way Large Language Models process text, such as asking the chatbot to spell a word backwards.
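To see why spelling a word backwards is a hard request, it helps to compare what a person sees with what a language model receives. The sketch below is only an illustration: the vocabulary and the `tokenize` helper are hypothetical, and real models use far larger subword vocabularies built with methods such as byte-pair encoding rather than this simple greedy matching. The point is that the model is handed a short list of ID numbers, not individual letters.

```python
import string

# Hypothetical toy vocabulary: a few multi-letter pieces, plus every
# single lowercase letter as a fallback. Not taken from any real model.
PIECES = ["straw", "berry", "back", "wards"]
VOCAB = {piece: i for i, piece in enumerate(PIECES + list(string.ascii_lowercase))}


def tokenize(word):
    """Split a lowercase word into token IDs using greedy longest match."""
    ids = []
    pos = 0
    while pos < len(word):
        # Try the longest remaining substring first, then shorter ones.
        for end in range(len(word), pos, -1):
            piece = word[pos:end]
            if piece in VOCAB:
                ids.append(VOCAB[piece])
                pos = end
                break
        else:
            raise ValueError(f"cannot tokenize {word[pos:]!r}")
    return ids


word = "strawberry"
reversed_word = word[::-1]   # trivial when you can see the letters
token_ids = tokenize(word)   # what the toy "model" sees instead

print(reversed_word)  # yrrebwarts
print(token_ids)      # [0, 1]
```

In this toy setup the ten letters of the word collapse into just two IDs, so reversing it letter by letter requires knowledge the IDs do not spell out.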

Some participants even pretended to be chatbots themselves, mimicking the language and behavior typically associated with chatbots.

The developers of the game are familiar with some of these strategies and have trained the chatbots participating in the game accordingly, tweaking OpenAI’s GPT-4, AI21 Labs’ Jurassic-2 and Cohere, the Large Language Models used as the backbone of the chatbots. For example, the chatbots were connected to the internet and were aware of recent events; they were trained to make spelling mistakes and to use slang words; and they had seen a lot of personal stories in their training data, so they were able to answer personal questions. An array of chatbots was developed specifically for the game, each with its unique personality and objective.

Large Language Models are the latest example of a defining characteristic of the work of many AI researchers for more than 70 years. It’s the conviction that if the AI program sounds intelligent, it has made a small step or a giant leap towards the ultimate goal of Artificial General Intelligence (AGI). The development of Large Language Models has proceeded along these lines, with a lot of attention (pun intended) paid to making sure that the conversation of chatbots sounds interesting, original, and human-like.

The results, sometimes, resemble “hallucinations,” but this is, after all, a very human attribute. As Arthur C. Clarke warned us many years ago, “any sufficiently advanced technology is indistinguishable from magic.” Recently, we found out that the magic can work even on the most detail-oriented professionals, when an experienced lawyer cited half a dozen fake cases generated by ChatGPT in a legal brief he presented to a federal judge.

It all started with Alan Turing and his 1950 paper “Computing Machinery and Intelligence.” Turing suggested an “imitation game” to test the ability of the computer program to fool its human interlocutor and predicted that “…in 50 years’ time it will be possible to make computers play the imitation game so well that an average interrogator will have no more than 70% chance of making the right identification after 5 minutes of questioning.”

The AI researchers at AI21 Labs write that “while this isn’t a completely fair comparison due to the short time frame [2 minutes rather than 5] and partial influence from game design decisions, it’s fascinating to see Turing’s forecast partially borne out,” as users correctly guessed the identity of their partners in 68% of the games.
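The reported figures can be cross-checked with a little arithmetic. The sketch below assumes that the 73%, 60% and 68% figures all describe the same set of games and that the overall rate is a simple weighted average of the other two. Under that assumption it works out what share of games must have paired a player with a human, and compares the overall rate with the 70% threshold in Turing's forecast. The function and variable names are illustrative, and because the published percentages are rounded, the result is only approximate.

```python
# Figures as reported in the article (fraction of correct guesses).
ACC_VS_HUMAN = 0.73      # partner was a human
ACC_VS_BOT = 0.60        # partner was a chatbot
ACC_OVERALL = 0.68       # all games combined
TURING_THRESHOLD = 0.70  # "no more than 70% chance" in Turing's forecast


def implied_human_share(acc_human, acc_bot, acc_overall):
    """Solve acc_overall = p * acc_human + (1 - p) * acc_bot for p."""
    if acc_human == acc_bot:
        raise ValueError("share is undetermined when both accuracies match")
    return (acc_overall - acc_bot) / (acc_human - acc_bot)


human_share = implied_human_share(ACC_VS_HUMAN, ACC_VS_BOT, ACC_OVERALL)
within_forecast = ACC_OVERALL <= TURING_THRESHOLD

print(f"Implied share of games against a human: {human_share:.0%}")
print(f"Overall accuracy at or below Turing's threshold: {within_forecast}")
```

On these rounded inputs, roughly six in ten games would have been against a human partner, which is an inference from the arithmetic and not a figure stated by AI21 Labs.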

AI21 Labs hopes to evolve its experiment so it will generate valuable insights for future language models and for understanding better how people perceive and interact with chatbots. Their paper concludes with the typical statement about AI that is poised to “revolutionize various industries,” but they add an important caveat: “…as we inch closer to more human-like AI, ethical considerations come to the fore. How do we handle AI that convincingly mimics human behavior? What responsibility do we bear for its actions?” Indeed. In the subset of the games in which the participants faced an AI chatbot, the correct guess rate was 60%, or as the AI21 researchers note, “not much higher than chance.”

The immediate issue is how to help us mere humans identify – at 100% accuracy – content that is generated by AI, whether text, video, image, or audio.

“Whether generative AI ends up being more harmful or helpful to the online information sphere may, to a large extent, depend on whether tech companies can come up with good, widely adopted tools to tell us whether content is AI-generated or not,” says MIT Technology Review.
