WHY THIS MATTERS IN BRIEF
Materials are a fundamental building block of everything in existence today, whether it’s cells, our power plants or our smartphones and devices. Now researchers have built an AI to help them automate the discovery of new materials, and it could revolutionise everything.
Over the past few years research efforts such as the Materials Genome Initiative and the Materials Project have developed loads of computational tools that come in very handy when researchers are trying to discover new materials with properties that could be useful for a range of applications, from energy and electronics to aerospace and civil engineering. Overall, though, the processes for producing these next gen materials have continued to depend on a combination of human experience, intuition, and manual literature reviews, which, let’s face it, are all thoroughly exciting.
Now though a team of researchers at MIT, the University of Massachusetts at Amherst, and the University of California Berkeley are changing the game and closing what they’re catchily calling the “materials-science automation gap,” using a new Artificial Intelligence (AI) system that analyses research papers in the hope of finding new “recipes” to create new materials, whether those materials are mundane or exotic, like the ones theoretical mathematicians say we need to create in order to realise time travel. But I digress, slightly, so back to the story.
“Computational materials scientists have made a lot of progress in the ‘what’ to make, such as what material to design based on [a material’s] desired properties,” says Elsa Olivetti, Assistant Professor of Energy Studies at MIT’s Department of Materials Science and Engineering, “but because of that success, the bottleneck has shifted to, ‘Okay, now how do I make it?’”
Ultimately the team envision a database that contains materials recipes that have been extracted from millions of papers, and hope that one day scientists and engineers could simply enter the name of a target material and any other criteria, such as precursor materials, reaction conditions, fabrication processes, and so on, and pull up suggested recipes.
As a step toward realising that vision, Olivetti and her team have developed an AI that can analyse a research paper, figure out which of its paragraphs contain materials recipes, and classify the words in those paragraphs according to their roles within the recipes, such as names of target materials, numeric quantities, names of pieces of equipment, operating conditions, descriptive adjectives, and so on.
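To make that a little more concrete, here is a minimal sketch of what the output of such word classification might look like. The sentence, the role names and the `group_by_role` helper are all hypothetical, loosely based on the categories listed above, and are not taken from the team's system.

```python
from collections import defaultdict

# Hypothetical (word, role) pairs for one recipe sentence.
# Role names mirror the categories mentioned in the article; "other" is
# an assumed catch-all for words that play no role in the recipe.
labelled_words = [
    ("fine", "descriptive_adjective"),
    ("TiO2", "target_material"),
    ("powder", "other"),
    ("was", "other"),
    ("heated", "other"),
    ("in", "other"),
    ("a", "other"),
    ("furnace", "equipment"),
    ("at", "other"),
    ("500", "numeric_quantity"),
    ("C", "other"),
]


def group_by_role(pairs):
    """Collect words under their role, skipping the 'other' catch-all."""
    grouped = defaultdict(list)
    for word, role in pairs:
        if role != "other":
            grouped[role].append(word)
    return dict(grouped)


recipe_fields = group_by_role(labelled_words)
for role, words in recipe_fields.items():
    print(role, words)
```

Grouping the labelled words like this is one plausible way to turn free text into the kind of structured fields a recipe database would need.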
In a paper appearing in the latest issue of the journal Chemistry of Materials, they also demonstrate that their new AI can analyse the extracted data to infer the general characteristics of classes of materials, such as the different temperature ranges that their synthesis requires, or the particular characteristics of individual materials, like the different physical forms they will take when their fabrication conditions vary.
In order to achieve their goals the team trained their system using a mixture of supervised and unsupervised machine learning techniques. “Supervised” simply means that the training data fed to the system was first annotated by humans, and that the system tried to find correlations between the data and the annotations, while “unsupervised” means that the training data was unannotated, and the system instead had to learn how to cluster data together according to its structural similarities.
Because “materials-recipe extraction” is a new area of AI research, though, Olivetti and her team didn’t have the luxury of large, annotated data sets accumulated over years by diverse teams of researchers, and instead had to annotate the data themselves. Ultimately they annotated about 100 papers, which by AI standards is a very small data set.
To improve on that, though, they then used an algorithm called Word2vec that was first developed at Google. Word2vec looks at the contexts in which words occur, like the words’ syntactic roles within sentences and the words around them, and groups together words that tend to have similar contexts. So, for instance, if one paper contained the sentence “We heated the Titanium Tetrachloride to 500C,” and another contained the sentence “The Sodium Hydroxide was heated to 500C,” Word2vec would group “Titanium Tetrachloride” and “Sodium Hydroxide” together.
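The idea of grouping words by their contexts can be sketched in a few lines. To be clear, this is not Word2vec itself, which trains a small neural network to learn word vectors. It is a much simpler stand-in that counts neighbouring words and compares the counts with cosine similarity. The third sentence, the underscore-joined chemical names and the window size of 2 are assumptions made purely for illustration.

```python
from collections import Counter, defaultdict
from math import sqrt


def context_vectors(sentences, window=2):
    """Count, for each word, the words appearing within `window` positions of it."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Multi-word chemical names are joined with underscores (an assumption)
# so that each one is treated as a single token.
sentences = [
    "we heated the titanium_tetrachloride to 500c".split(),
    "the sodium_hydroxide was heated to 500c".split(),
    "the sample was stored in a glass vial".split(),  # made-up contrast sentence
]

vectors = context_vectors(sentences)
sim_chemicals = cosine(vectors["titanium_tetrachloride"], vectors["sodium_hydroxide"])
sim_unrelated = cosine(vectors["titanium_tetrachloride"], vectors["vial"])

print(f"chemical vs chemical: {sim_chemicals:.2f}")
print(f"chemical vs vial:     {sim_unrelated:.2f}")
```

Because the two chemicals share neighbours such as “heated” and “the”, they score as more similar to each other than either does to “vial”, which is the intuition behind grouping them together.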
Using Word2vec the team were suddenly able to massively expand their training data set, and in the end they were able to train the system using not 100 papers, but 640,000.
However, in order to test the system’s accuracy they had to rely on the labelled data, since they had no criteria for evaluating its performance on the unlabelled data.
In those tests, the system was able to identify with 99 percent accuracy the paragraphs that contained recipes, and to label with 86 percent accuracy the words within those paragraphs.
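For a sense of how a word-level accuracy figure like that can be calculated, here is a minimal sketch that compares predicted labels against human labels. The labels are invented for illustration, and the article doesn't say exactly how the team computed its scores, so treat the `label_accuracy` helper as an assumption rather than their method.

```python
def label_accuracy(predicted, gold):
    """Fraction of positions where the predicted label equals the human label."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("cannot score an empty label sequence")
    correct = sum(1 for p, g in zip(predicted, gold) if p == g)
    return correct / len(gold)


# Hypothetical labels for a five-word snippet: the human annotation (gold)
# and what a system might have predicted, with one mistake in position 3.
gold = ["target_material", "other", "other", "numeric_quantity", "other"]
predicted = ["target_material", "other", "equipment", "numeric_quantity", "other"]

score = label_accuracy(predicted, gold)
print(f"{score:.0%} of word labels match")  # 4 of 5 labels agree
```

This only works on data that humans have already labelled, which is why the unlabelled papers couldn't be used for testing.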
The team hope that further work will improve the system’s accuracy, and they’re now exploring a number of deep learning techniques that they hope could come up with recipes for new materials that aren’t mentioned in the existing literature.
“This is landmark work,” says Ram Seshadri, the Fred and Linda R. Wudl Professor of Materials Science at the University of California at Santa Barbara. “The authors have taken on the difficult and ambitious challenge of capturing, using AI, strategies used for the preparation of new materials. The work demonstrates the power of machine learning, but it would be accurate to say that the eventual judge of success or failure would require convincing practitioners that the utility of such methods can enable them to abandon their more instinctual approaches.”
So who knows, as MIT and AI continue to push the boundaries we might one day soon see the emergence of new, previously unimaginable metamaterials, programmable materials and maybe even new self-healing materials, to name just a few. And since materials are at the heart of everything that mankind produces, it’s quite easy to see how MIT’s work here could one day, quite literally, touch every person on Earth.