WHY THIS MATTERS IN BRIEF

Many people think OpenAI’s products are a single AI, but they’re reportedly many AIs linked together to form what’s known as a “Mixture of Experts” model, and this could form the foundation of future AGI.

 


In March, OpenAI launched GPT-4 with much fanfare, but a dark cloud loomed over the horizon. Scientists and Artificial Intelligence (AI) enthusiasts alike panned the company for not releasing any specifics about the model, such as its parameter count or architecture. Now, months later, a top AI researcher has speculated about the inner workings of GPT-4, suggesting why OpenAI chose to withhold this information, and it’s disappointing.

 


 

OpenAI CEO Sam Altman famously said of GPT-4 that “people are begging to be disappointed, and they will be,” speaking about the potential size of the model. Rumour mills ahead of the model’s launch suggested that it would have trillions of parameters and be the best thing the world had ever seen. The reality, however, is different. In the process of making GPT-4 better than GPT-3.5, OpenAI might have bitten off more than it could chew.

George Hotz, world-renowned hacker and software engineer, recently appeared on a podcast to speculate about the architectural nature of GPT-4. Hotz stated that the model might be a set of eight distinct models, each featuring 220 billion parameters. This speculation was later confirmed by Soumith Chintala, the co-founder of PyTorch.

While this puts the parameter count of GPT-4 at 1.76 trillion (8 × 220 billion), which makes it one of the largest AI models out there, the notable part is that not all of these models work at the same time. Instead, they are deployed in a so-called Mixture of Experts (MoE) architecture. This architecture splits the overall model into separate components, known as expert models. Each of these models is fine-tuned for a specific purpose or domain, and is able to provide better responses for that field. The expert models then work together, with the complete model drawing on their collective intelligence.
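To make the idea concrete, here is a minimal Python sketch of how a Mixture of Experts model might route an input to only some of its experts. The eight experts and 220 billion parameters each are the figures Hotz speculated about; everything else (the top-2 softmax gate, the toy experts, the gate scores, and all function names) is a hypothetical illustration, not OpenAI's actual design.

```python
import math

NUM_EXPERTS = 8                          # figure speculated by Hotz
PARAMS_PER_EXPERT = 220_000_000_000      # figure speculated by Hotz
TOP_K = 2                                # assumption: experts run per input

def total_parameters(num_experts=NUM_EXPERTS, per_expert=PARAMS_PER_EXPERT):
    """Parameters stored across all experts."""
    return num_experts * per_expert

def active_parameters(top_k=TOP_K, per_expert=PARAMS_PER_EXPERT):
    """Parameters exercised for one input when only top_k experts run."""
    return top_k * per_expert

def softmax(scores):
    """Turn raw gate scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route(gate_scores, top_k=TOP_K):
    """Pick the top_k experts and renormalise their weights to sum to 1."""
    probs = softmax(gate_scores)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)
    return {i: probs[i] / norm for i in chosen}

def moe_output(x, experts, gate_scores, top_k=TOP_K):
    """Weighted sum of the chosen experts' outputs; the others never run."""
    weights = route(gate_scores, top_k)
    return sum(w * experts[i](x) for i, w in weights.items())

# Toy experts: each one just scales its input by a different factor.
experts = [lambda x, k=k: x * (k + 1) for k in range(NUM_EXPERTS)]
# Hypothetical router output for one input (higher = more relevant expert).
gate_scores = [0.1, 2.0, 0.3, 0.0, 1.5, 0.2, 0.1, 0.4]

weights = route(gate_scores)
result = moe_output(1.0, experts, gate_scores)
print(total_parameters(), active_parameters(), sorted(weights), result)
```

Under these assumptions the model stores 1.76 trillion parameters but only exercises a fraction of them for any one input, which is the main appeal of the design.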

 


 

This approach has many benefits. One is more accurate responses, because each model is fine-tuned on a particular subject matter. An MoE architecture also lends itself to being easily updated, as the maintainers of the model can improve it in a modular fashion rather than updating a monolithic model. Hotz also speculated that the model may be relying on iterative inference for better outputs. Through this process, the output, or inference result, of the model is refined over multiple iterations.

This method might also allow GPT-4 to get inputs from each of its expert models, which could reduce hallucinations. Hotz stated that this process might be done 16 times, which would vastly increase the operating cost of the model. This approach has been likened to the old trope of three children in a trench coat masquerading as an adult: many have described GPT-4 as eight GPT-3s in a trench coat, trying to pull the wool over the world’s eyes.

While GPT-4 aced benchmarks that GPT-3 had difficulty with, the MoE architecture seems to have become a pain point for OpenAI. In a now-deleted interview, Altman admitted to the scaling issues OpenAI is facing, especially in terms of GPU shortages.

 


 

Running inference 16 times on a model with MoE architecture is sure to increase cloud costs on a similar scale. When blown up to ChatGPT’s millions of users, it’s no surprise that even Azure’s supercomputer fell short of power. This seems to be one of the biggest problems that OpenAI is facing currently, with Altman stating that cheaper and faster GPT-4 is the company’s top priority as of now.
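The scaling argument can be shown with a few lines of Python. This is a rough sketch that assumes every inference pass costs about the same, so total compute grows linearly with the number of passes. The 16 passes come from Hotz's speculation; the request volume, the per-pass cost unit, and the function names are hypothetical placeholders, not OpenAI's real numbers.

```python
# All figures below are hypothetical placeholders, not OpenAI's real numbers.
PASSES_SPECULATED = 16  # number of inference passes Hotz speculated about

def cost_multiplier(passes: int) -> int:
    """If each pass costs about the same, compute grows linearly with passes."""
    if passes < 1:
        raise ValueError("passes must be at least 1")
    return passes

def total_compute(requests: int, compute_per_pass: float, passes: int) -> float:
    """Total compute, in arbitrary units, to serve `requests` requests."""
    return requests * compute_per_pass * cost_multiplier(passes)

single = total_compute(requests=1_000_000, compute_per_pass=1.0, passes=1)
iterated = total_compute(
    requests=1_000_000, compute_per_pass=1.0, passes=PASSES_SPECULATED
)
print(iterated / single)  # ratio between 16-pass and single-pass serving
```

Whatever the real per-request cost is, the ratio between the two scenarios stays the same under this assumption, which is why the multiplier matters more as the user base grows.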

This has also resulted in a reported degradation of quality in ChatGPT’s output. All over the Internet, users have reported that the quality of even ChatGPT Plus’ responses has gone down.

I found a release note for ChatGPT that seems to confirm this, which stated, “We’ve updated performance of the ChatGPT model on our free plan in order to serve more users.” In the same note, OpenAI also informed users that Plus users would be defaulted to the “Turbo” variant of the model, which has been optimised for inference speed.

API users, on the other hand, seem to have avoided this problem altogether. Reddit users have noticed that other products which use the OpenAI API provide better answers to their queries than even ChatGPT Plus. This might be because users of the OpenAI API are lower in volume when compared to ChatGPT users, resulting in OpenAI cutting costs at ChatGPT while ignoring the API.

 


 

In a mad rush to get GPT-4 out to the market, it seems that OpenAI has cut corners, and just how many we’ll likely never know. While the purported MoE model is a good step forward for making the GPT series more performant, the scaling issues it is facing show that the company might just have bitten off more than it can chew.

Looking forward to GPT-5, though, this approach might yield some interesting and problematic results for the company as it begins to connect these independent “expert” AI sub-models together. The result could be what many would regard as the first plausible prototype Artificial General Intelligence (AGI), which has been the company’s stated goal all along, and it would give GPT-5 cross-domain knowledge, something it lacks at the moment.

About author

Matthew Griffin

