An AI just aced a university maths course for the first time

By Matthew Griffin | Intelligence and the Senses | 7th October 2022

WHY THIS MATTERS IN BRIEF

It’s often said that “maths” underpins everything in the known universe, and AI is getting steadily better at mastering it.

Multivariable calculus, differential equations, linear algebra — topics that many MIT students can ace without breaking a sweat — have consistently stumped machine learning models. The best models have only been able to answer elementary or high school-level math questions, and they don’t always find the correct solutions.

Now, though, a multidisciplinary team of researchers from MIT and elsewhere, led by Iddo Drori, has used a neural network model to solve university-level math problems in a few seconds at a human level.

The model also automatically explains solutions and rapidly generates new problems in university math subjects. When the researchers showed these machine-generated questions to university students, the students were unable to tell whether the questions were generated by an algorithm or a human.

This work could be used to streamline content generation for courses, which could help revolutionise education, especially adaptive education where Artificial Intelligence (AI) assesses each student’s skills and responses and adjusts the level of difficulty accordingly. It could be especially useful in large residential courses and Massive Open Online Courses (MOOCs), which in many cases have hundreds of thousands of students. The system could also be combined with Digital Human teachers like Will, who has already taught more than 250,000 students about energy, to create an automated tutor that shows students the steps involved in solving undergraduate math problems.

“We think this will improve higher education,” says Drori, the work’s lead author who is also an adjunct associate professor in the Department of Computer Science at Columbia University, and who will join the faculty at Boston University this summer. “It will help students improve, and it will help teachers create new content, and it could help increase the level of difficulty in some courses. It also allows us to build a graph of questions and courses, which helps us understand the relationship between courses and their pre-requisites, not just by historically contemplating them, but based on data.”

The work is a collaboration including students, researchers, and faculty at MIT, Columbia University, Harvard University, and the University of Waterloo. The senior author is Gilbert Strang, a professor of mathematics at MIT. The research appears this week in the Proceedings of the National Academy of Sciences.

Drori and his students and colleagues have been working on this project for nearly two years. They found that models pretrained using text only could not do better than 8 percent accuracy on high school math problems, and that those using graph neural networks could ace machine learning course questions but would take a week to train.

Then Drori had what he describes as a “eureka” moment: he decided to try taking questions from undergraduate math courses offered by MIT and one from Columbia University that had never been seen before by a model, turning them into programming tasks, and applying techniques known as program synthesis and few-shot learning. Turning a question into a programming task could be as simple as rewriting the question “find the distance between two points” as “write a program that finds the distance between two points,” or providing a few question-program pairs as examples.
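As a rough illustration of that rewriting step, the Python sketch below turns a question into a programming task and places a question-program pair in front of it as an example. The function names, the prompt layout, and the example pair are assumptions made for illustration; the article does not show the researchers' actual prompts.

```python
import math

# Hypothetical few-shot examples: (question, program) pairs. These are
# illustrative and not taken from the researchers' dataset.
FEW_SHOT_EXAMPLES = [
    (
        "Find the midpoint of the points (0, 0) and (2, 4).",
        "def solve():\n    return ((0 + 2) / 2, (0 + 4) / 2)",
    ),
]


def to_programming_task(question: str) -> str:
    """Rephrase a math question as an instruction to write a program."""
    body = question.strip().rstrip(".")
    # Lowercase the first letter so the sentence reads naturally.
    body = body[0].lower() + body[1:]
    return f"Write a program to {body}."


def build_prompt(question: str, examples=FEW_SHOT_EXAMPLES) -> str:
    """Assemble a few-shot prompt: worked examples first, new task last."""
    parts = []
    for ex_question, ex_program in examples:
        parts.append(f'"""{to_programming_task(ex_question)}"""\n{ex_program}\n')
    parts.append(f'"""{to_programming_task(question)}"""\n')
    return "\n".join(parts)


def distance(p, q):
    """The kind of program a code model might return for the task below."""
    return math.dist(p, q)


prompt = build_prompt("Find the distance between the points (0, 0) and (3, 4).")
print(prompt)
print(distance((0, 0), (3, 4)))  # 5.0
```

The prompt ends with the new task and no program, which is the gap a code-trained model would be asked to fill.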

Before feeding those programming tasks to a neural network, however, the researchers added a new step that enabled it to vastly outperform their previous attempts.

In the past, they and others who’ve approached this problem have used a neural network, such as GPT-3, that was pretrained on text only, meaning it was shown millions of examples of text to learn the patterns of natural language. This time, they used a neural network pretrained on text that was also “fine-tuned” on code. This network, called Codex, was produced by OpenAI. Fine-tuning is essentially another pretraining step that can improve the performance of a machine learning model.

The pretrained model was shown millions of examples of code from online repositories. Because this model’s training data included millions of natural language words as well as millions of lines of code, it learned the relationships between pieces of text and pieces of code.

Many math problems can be solved using a computational graph or tree, but it is difficult to turn a problem written in text into this type of representation, Drori explains. Because this model has learned the relationships between text and code, however, it can turn a text question into code, given just a few question-code examples, and then run the code to answer the problem.
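The "run the code to answer the problem" step might look something like the Python sketch below. The program text is hard-coded as a stand-in for model output, and the `solve()` convention and the helper name are assumptions for illustration rather than details from the research.

```python
# Stand-in for text returned by a code model. It is hard-coded here so the
# sketch runs without any model or network access.
GENERATED_PROGRAM = """
def solve():
    # Determinant of the 2x2 matrix [[1, 2], [3, 4]]: a*d - b*c
    a, b, c, d = 1, 2, 3, 4
    return a * d - b * c
"""


def run_generated_program(source: str):
    """Execute program text and return the value of its solve() function.

    exec() on untrusted model output is unsafe; a real system would need
    sandboxing and time limits, which this sketch omits.
    """
    namespace = {}
    exec(source, namespace)
    return namespace["solve"]()


answer = run_generated_program(GENERATED_PROGRAM)
print(answer)  # -2
```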

“When you just ask a question in text, it is hard for a machine-learning model to come up with an answer, even though the answer may be in the text,” he says. “This work fills in that missing piece of using code and program synthesis.”

“This work is the first to solve undergraduate math problems and moves the needle from 8 percent accuracy to over 80 percent,” Drori adds.

Turning math questions into programming tasks is not always simple, Drori says. Some problems require researchers to add context so the neural network can process the question correctly. A student would pick up this context while taking the course, but a neural network doesn’t have this background knowledge unless the researchers specify it.

For instance, they might need to clarify that the “network” in a question’s text refers to “neural networks” rather than “communications networks.” Or they might need to tell the model which programming package to use. They may also need to provide certain definitions; in a question about poker hands, they may need to tell the model that each deck contains 52 cards.
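A minimal Python sketch of adding that kind of context to a question follows. The `add_context` helper, the "Context:" and "Question:" labels, and the sample poker question are hypothetical; only the 52-card fact and the idea of naming a package come from the text above.

```python
import math


def add_context(question: str, context: list[str]) -> str:
    """Prefix a question with the background facts a student would know."""
    lines = [f"Context: {fact}" for fact in context]
    lines.append(f"Question: {question}")
    return "\n".join(lines)


task = add_context(
    "How many distinct 5-card poker hands can be dealt from one deck?",
    [
        "A standard deck contains 52 cards.",
        "Use Python's built-in math module.",
    ],
)
print(task)

# With the deck size stated, the task reduces to a single library call.
hands = math.comb(52, 5)
print(hands)  # 2598960
```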

They automatically feed these programming tasks, with the included context and examples, to the pretrained and fine-tuned neural network, which outputs a program that usually produces the correct answer. It was correct for more than 80 percent of the questions.
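An accuracy figure of this kind could be computed by comparing each program's output with the known answer, as in the Python sketch below. The tolerance-based comparison and the toy answer pairs are assumptions for illustration; the pairs are made-up placeholders, not results from the study.

```python
import math


def is_correct(predicted, expected, rel_tol=1e-6):
    """Compare answers, allowing small floating-point differences."""
    if isinstance(predicted, (int, float)) and isinstance(expected, (int, float)):
        return math.isclose(predicted, expected, rel_tol=rel_tol)
    return predicted == expected


def accuracy(results):
    """Fraction of (predicted, expected) pairs that match."""
    if not results:
        return 0.0
    return sum(is_correct(p, e) for p, e in results) / len(results)


# Made-up placeholder answers, not results from the study.
toy_results = [(5.0, 5), (-2, -2), (0.3333333333, 1 / 3), (7, 8)]
print(accuracy(toy_results))  # 0.75
```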

The researchers also used their model to generate questions by giving the neural network a series of math problems on a topic and then asking it to create a new one.
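One plausible way to phrase such a request is to list the existing questions by number and leave the next number open for the model to complete. The Python sketch below assumes that layout; the helper name, the prompt format, and the seed questions are illustrative rather than taken from the research.

```python
def build_generation_prompt(topic: str, questions: list[str]) -> str:
    """List known questions by number and leave the next number open."""
    lines = [f"Questions on {topic}:"]
    for number, question in enumerate(questions, start=1):
        lines.append(f"{number}. {question}")
    # The trailing number invites the model to continue the list.
    lines.append(f"{len(questions) + 1}.")
    return "\n".join(lines)


# Hypothetical seed questions written for this sketch.
seed_questions = [
    "Find the eigenvalues of the matrix [[2, 0], [0, 3]].",
    "Compute the determinant of the matrix [[1, 2], [3, 4]].",
]
generation_prompt = build_generation_prompt("linear algebra", seed_questions)
print(generation_prompt)
```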

“In some topics, it surprised us. For example, there were questions about quantum detection of horizontal and vertical lines, and it generated new questions about quantum detection of diagonal lines. So, it is not just generating new questions by replacing values and variables in the existing questions,” Drori says.

The researchers tested the machine-generated questions by showing them to university students. The researchers gave students 10 questions from each undergraduate math course in a random order; five were created by humans and five were machine-generated.

Students were unable to tell whether the machine-generated questions were produced by an algorithm or a human, and they gave human-generated and machine-generated questions similar marks for level of difficulty and appropriateness for the course.

Drori is quick to point out that this work is not intended to replace human professors.

“Automation is now at 80 percent, but automation will never be 100 percent accurate. Every time you solve something, someone will come up with a harder question. But this work opens the field for people to start solving harder and harder questions with machine learning. We think it will have a great impact on higher education,” he says.

The team is excited by the success of their approach and has extended the work to handle math proofs, but there are some limitations they plan to tackle. Currently, the model cannot answer questions with a visual component or solve problems that are computationally intractable.

In addition to overcoming these hurdles, they are working to scale the model up to hundreds of courses. With those hundreds of courses, they will generate more data that can enhance automation and provide insights into course design and curricula.

Matthew Griffin / About Author

Matthew Griffin is a multi-award winning Futurist and expert in Disruption and Innovation, Geopolitics, Leadership, and Technology, who NASA have described as a "walking encyclopaedia of the future" and a "futurist Polymath." 15-time best selling author of the "Codex of the Future" series, Matthew is the Founder and Futurist in Chief of the 311 Institute, a global Futures and Deep Futures advisory firm working with royal households, world leaders, G7, G20, and G77 governments, NGOs, and multi-national mid and mega cap firms to help them explore, shape, and lead the next 50 years of business and society.

An award-winning YouTube creator with over a million followers, with an unrivalled global reach and impact, Matthew is a highly sought-after international keynote speaker, lecturer, and mentor who collaborates with global leaders through the United Nations Alliance of Civilizations (UNAOC) and United Nations General Assembly (UNGA) to shape pivotal initiatives such as the UN’s AI for Humanity program, the United Nations Conference of the Parties (UN COP), and the World Economic Forum in Davos.

As the former Global Head of Cloud, National Security, and Enterprise Sales for companies including Atos, Dell-EMC, and IBM, Matthew has a proven track record of building multi-billion dollar business units and turning failing divisions into market leaders. His ability to identify, analyse, and communicate the implications of hundreds of emerging technologies and trends is unparalleled, and his insights are trusted by many of the world’s most respected organisations, including ABB, Accenture, Adidas, AON, ARM, BCG, Centrica, Citi, Coca-Cola, Dentons, Deloitte, Dow Jones, EY, Google, KPMG, Lego, Legal & General, LinkedIn, Microsoft, PepsiCo, Qualcomm, RWE, Samsung, Siemens AG and Siemens Energy, T-Mobile, UBS, VISA, Walmart, Workday, Worldpay and many others.

Regularly featured in the global media including the AP, BBC, Bloomberg, CNBC, Discovery, Forbes, Khaleej Times, Telegraph, TIME, ViacomCBS, WIRED, and the WSJ, Matthew’s mission is to help organisations create a fair and sustainable future whose benefits are shared by everyone irrespective of their ability, background, or circumstances.