WHY THIS MATTERS IN BRIEF
Many people think OpenAI’s products are a single AI, but they’re actually many AIs linked together in what’s known as a “Mixture of Experts” model, and this could form the foundation of future AGI.
In March, OpenAI launched GPT-4 with much fanfare, but a dark cloud loomed over the horizon. Scientists and Artificial Intelligence (AI) enthusiasts alike panned the company for not releasing any specifics about the model, such as its parameter count or architecture. Now, months later, a top AI researcher has speculated about the inner workings of GPT-4, revealing why OpenAI may have chosen to hide this information, and the answer is disappointing.
OpenAI CEO Sam Altman famously said of GPT-4 that “people are begging to be disappointed, and they will be,” speaking about the potential size of the model. Rumour mills ahead of the model’s launch suggested it would have trillions of parameters and be the best thing the world had ever seen. The reality, however, is different. In the process of making GPT-4 better than GPT-3.5, OpenAI might have bitten off more than it could possibly chew.
George Hotz, world-renowned hacker and software engineer, recently appeared on a podcast to speculate about the architectural nature of GPT-4. Hotz stated that the model might be a set of eight distinct models, each featuring 220 billion parameters. This speculation was later confirmed by Soumith Chintala, the co-founder of PyTorch.
While this puts the total parameter count of GPT-4 at roughly 1.76 trillion, making it one of the largest AI models out there, the notable part is that these models don’t all work at the same time. Instead, they are deployed in what is called a Mixture of Experts (MoE) architecture, which splits the system into separate components known as expert models. Each expert is fine-tuned for a specific purpose or domain and can provide better responses in that field, and the complete model then draws on the collective intelligence of these experts.
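To make the idea concrete, here is a minimal, hypothetical sketch of how an MoE layer routes an input to a few experts and blends their outputs. The toy gating network, the linear “experts” and the dimensions are all assumptions made for illustration; none of it reflects OpenAI’s actual, undisclosed implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

HIDDEN_DIM = 16   # toy hidden size (GPT-4's real dimensions are unpublished)
NUM_EXPERTS = 8   # the rumoured eight expert models
TOP_K = 2         # number of experts consulted per token in many MoE designs

# Each "expert" is reduced here to a single linear layer for illustration.
experts = [rng.standard_normal((HIDDEN_DIM, HIDDEN_DIM)) for _ in range(NUM_EXPERTS)]

# The gating network scores how relevant each expert is to the current input.
gate_weights = rng.standard_normal((HIDDEN_DIM, NUM_EXPERTS))

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route the input to the top-k experts and blend their outputs."""
    scores = x @ gate_weights                     # one relevance score per expert
    top = np.argsort(scores)[-TOP_K:]             # pick the k highest-scoring experts
    probs = np.exp(scores[top]) / np.exp(scores[top]).sum()  # softmax over the chosen experts
    # Only the selected experts run, which is why an MoE model can hold far more
    # parameters than it actually uses for any single token.
    return sum(p * (x @ experts[i]) for p, i in zip(probs, top))

token_state = rng.standard_normal(HIDDEN_DIM)
print(moe_layer(token_state).shape)  # (16,)
```

The point of the sketch is the sparsity: the gate activates only a couple of the eight experts per input, so the full 1.76 trillion parameters never run at once.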
This approach has many benefits. One is more accurate responses, since each model is fine-tuned on its own subject matter. The MoE architecture also lends itself to easy updating, because maintainers can improve the model in a modular fashion instead of retraining a monolithic model. Hotz also speculated that the model may rely on a process of iterative inference for better outputs, in which the output, or inference result, of the model is refined over multiple iterations.
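As a rough illustration of that refinement loop, the sketch below feeds the model’s previous draft back in and asks it to improve it. The `generate` function and the prompt wording are placeholders, not any real OpenAI API, and the loop structure is an assumption based on Hotz’s description rather than a confirmed detail.

```python
from typing import Callable

def iterative_inference(generate: Callable[[str], str], prompt: str, rounds: int = 4) -> str:
    """Refine an answer by repeatedly asking the model to improve its last draft."""
    draft = generate(prompt)
    for _ in range(rounds - 1):
        critique_prompt = (
            f"Question: {prompt}\n"
            f"Previous answer: {draft}\n"
            "Improve the previous answer, fixing any errors."
        )
        # Each pass costs another full inference, which is where the extra compute goes.
        draft = generate(critique_prompt)
    return draft
```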
This method might also allow GPT-4 to draw on inputs from each of its expert models, which could reduce hallucinations. Hotz stated that this process might be run 16 times, which would vastly increase the cost of operating the model. The approach has been likened to the old trope of three children in a trenchcoat masquerading as an adult, with many describing GPT-4 as eight GPT-3s in a trench coat trying to pull the wool over the world’s eyes.
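One hedged way to picture how pooling the experts’ outputs might cut down hallucinations is simple majority voting across repeated passes, sketched below. The round-robin routing, the expert callables and the figure of 16 passes come from the speculation above, not from anything OpenAI has confirmed.

```python
from collections import Counter
from typing import Callable, List

def aggregate_experts(experts: List[Callable[[str], str]], prompt: str, passes: int = 16) -> str:
    """Collect `passes` answers spread across the experts and keep the most common one."""
    answers = []
    for i in range(passes):
        expert = experts[i % len(experts)]   # round-robin over the eight rumoured experts
        answers.append(expert(prompt))
    # An answer that most experts agree on is less likely to be a one-off hallucination,
    # but every extra pass is another full inference, hence the 16x cost concern.
    return Counter(answers).most_common(1)[0][0]
```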
While GPT-4 aced benchmarks that GPT-3 had struggled with, the MoE architecture seems to have become a pain point for OpenAI. In a now-deleted interview, Altman admitted to the scaling issues OpenAI is facing, especially in terms of GPU shortages.
Running inference 16 times on an MoE model is sure to increase cloud costs on a similar scale, and when scaled up to ChatGPT’s millions of users, it’s no surprise that even Azure’s supercomputer fell short. This seems to be one of the biggest problems OpenAI is currently facing, with Altman stating that a cheaper and faster GPT-4 is the company’s top priority right now.
This has also resulted in a reported degradation of quality in ChatGPT’s output. All over the Internet, users have reported that the quality of even ChatGPT Plus’s responses has gone down.
I found a release note for ChatGPT that seems to confirm this: “We’ve updated performance of the ChatGPT model on our free plan in order to serve more users.” In the same note, OpenAI also said that Plus subscribers would be defaulted to the “Turbo” variant of the model, which has been optimised for inference speed.
API users, on the other hand, seem to have avoided this problem altogether. Reddit users have noticed that products built on the OpenAI API provide better answers to their queries than even ChatGPT Plus. This might be because API users are far fewer in number than ChatGPT users, leading OpenAI to cut costs on ChatGPT while leaving the API untouched.
In a mad rush to get GPT-4 to market, OpenAI seems to have cut corners, and just how many we’ll likely never know. While the purported MoE model is a good step towards making the GPT series more performant, the scaling issues it faces show that the company might just have bitten off more than it can chew.
Looking forwards to GPT-5, though, this approach might yield some interesting and problematic results for the company as it begins to connect these independent “expert” AI sub-models together to give GPT-5 the cross-domain knowledge it currently lacks, creating what many would regard as the first plausible prototype Artificial General Intelligence (AGI), which has been OpenAI’s stated goal all along.