You may have recently bought something from Amazon.com or watched a video on YouTube. A few days later, you did it again. As soon as the page loaded on your computer, you saw a few suggestions about what you might like to buy or watch next. You may have been thrilled to see something that you had long been searching for. You smiled and said to yourself, “How the heck did this website know what I was looking for?” It’s a miracle, right?
But how did the website actually learn your personal choices?
Every click you made during your visit to the website and every piece of data you provided were stored, and then an algorithm, a piece of computer software, ran an intricate analysis on your personal data. First, the website learned from your behavior, the traces of information you left behind, and then on future visits it suggested something it thought would match your preferences!
We are living at the height of a technological era! The very process by which a computer algorithm learns and then suggests a future outcome is called machine learning, a branch of artificial intelligence.
Let’s be more scientific. Machine-learning software contains algorithms that learn from datasets. The software builds a ‘model’ from what it has learned. Finally, the software predicts a future outcome. It’s similar to how we predict our future from our knowledge of the past.
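To make that learn-model-predict loop concrete, here is a minimal sketch in Python using the scikit-learn library. The visit data and labels are entirely made up, and this is an illustration of the general idea, not any particular company’s system.

```python
# A toy "learn, model, predict" loop. The visit data and labels below are
# made up purely to illustrate the workflow.
from sklearn.linear_model import LogisticRegression

# Each row describes one past visit: [pages viewed, minutes spent on the site]
past_visits = [[3, 2], [12, 15], [1, 1], [20, 30], [5, 4], [15, 22]]
# 1 means that visitor ended up buying something, 0 means they did not
bought = [0, 1, 0, 1, 0, 1]

model = LogisticRegression()
model.fit(past_visits, bought)       # learn: fit a model to the stored data

new_visit = [[10, 12]]               # a fresh visit the model has never seen
print(model.predict(new_visit))      # predict: is this visitor likely to buy?
```

The point is simply the three steps: the software is shown past data, it fits a model, and the model is then asked about something it has never seen before.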
But how are researchers funded by the Department of Energy Office of Science’s Energy Frontier Research Centers (EFRCs) solving serious problems in energy and climate change using machine learning?
Well, here’s how.
Knowing how molecules hang out
On a sunny afternoon, as a deep blue sky hung above Seattle in Washington State, I walked from my laboratory to another department to meet with my colleague, François Baneyx, who is also the Director of the Center for the Science of Synthesis Across Scales (CSSAS), an EFRC led by the University of Washington. CSSAS is going big with machine learning, and I wanted to learn more about it. Once I reached the director’s office, he led me to his computer. The screen displayed obscure images of various protein molecules, each a few nanometers in size, captured using a technique called atomic force microscopy. The proteins were supposed to assemble into nicely ordered structures, like a military parade, which could then be used to make materials that address various energy-related problems. I peered at the images but could not distinguish the supposed orderliness. Confused, I looked at him. Baneyx responded to my confusion, smiling: “That’s why we are using machine learning, to make them distinguishable.”
Molecules like to hang out together with their friends and families, just like us. They do amazing new things when they get together. In nature, such as in our body, certain building block materials, like proteins, hang out to form larger, ordered materials (a process known as self-assembly). This orderliness gives them new functions.
The primary aim of CSSAS is to understand this hanging-out process: how do these molecules, with their numerous odd behaviors and attitudes, get together to form orderly assemblies? It’s a tough problem, because the building blocks carry obscure, massive amounts of molecular information, and it is often difficult to understand what kind of molecular information governs this hanging-out process.
Scientists at CSSAS use machine learning to find patterns in the hidden properties of these materials. For example, when I was looking at one of the images on Baneyx’s computer, it was hard for me to detect any assembly patterns. But machine-learning software extracted information from hundreds of thousands of such images and predicted how the building blocks were assembled. This helped scientists learn about the emergence of order from individual building blocks. Such knowledge will eventually help them make new materials to solve important problems in energy, as Baneyx noted.
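For readers who like to peek under the hood, here is a hedged sketch of what such image analysis can look like in Python, using the PyTorch library. It is not CSSAS’s actual pipeline; the network architecture, the image size, and the two “ordered versus disordered” labels are assumptions made for illustration, and random numbers stand in for real atomic force microscopy images.

```python
# A hedged sketch (not CSSAS's actual pipeline): a small convolutional network
# that takes an AFM-style grayscale image and scores whether it shows an
# ordered assembly. Random tensors stand in for real microscopy data.
import torch
import torch.nn as nn

class OrderClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(8, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(16 * 16 * 16, 2)  # 2 classes: disordered, ordered

    def forward(self, x):
        x = self.features(x)           # extract texture and pattern features
        x = x.flatten(start_dim=1)
        return self.classifier(x)      # a score for each class

model = OrderClassifier()
fake_afm_images = torch.randn(4, 1, 64, 64)  # four fake 64x64 grayscale images
scores = model(fake_afm_images)
print(scores.argmax(dim=1))                  # predicted class for each image
```

In practice, a network like this would first be trained on many labeled images before its predictions mean anything; the sketch only shows the shape of the idea: images go in, a judgment about order comes out.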
Knowing a reaction’s journey
Many of us who work or have worked in synthetic chemistry know how hard it is to make a new molecule in just one trial. Scientists often use cook-and-look techniques: they first seal chemical reactants in a vessel and “cook” to initiate a reaction, and then after some time, they “look” at the recovered products to see if these products are of any use. If they find something interesting, they move on. Otherwise, they start over—again and again.

A few thousand miles away from Washington State, in New York, John Parise leads a center where scientists are using machine learning to learn from this “failure again and again.” Their essential question: can we eventually get a product made in just one shot? At GENESIS (A Next Generation Synthesis Center), scientists are working toward understanding the journey of a reaction. GENESIS uses a two-track approach: first, understanding how the environment around a reaction leads to desirable products, and second, understanding what happens to the molecules during the journey from reactants to products.
By learning about what happens during a chemical reaction, they can make new materials for applications in energy. To do that, they first need to collect data on how a reaction progresses. A reaction travels through many intermediate states (and often stops for a while), just as we pass through many stations when we travel by train and often have to transfer multiple times to get to our destination. For reactions, though, the intermediate stops determine which direction the reaction takes next. How long a reaction spends at an intermediate stop depends on variables in the surrounding environment, such as pressure and temperature. Scientists at GENESIS collect all these data points, creating an enormous dataset, and then train machine-learning algorithms to learn the possible routes to a destination, or the final product. This approach is powerful because, with a better understanding of the reaction, thanks to machine learning, researchers can try to get the desired product in one shot.
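As a rough illustration of that idea, and not GENESIS’s actual workflow, a classifier can be trained to map reaction conditions onto the product that came out of past “cook-and-look” runs. In the Python sketch below, built on scikit-learn, the temperatures, pressures, times, and product names are all invented.

```python
# A hedged sketch: train a classifier that maps synthesis conditions
# (temperature, pressure, reaction time) to the product phase that was
# recovered. All numbers and labels below are invented for illustration.
from sklearn.ensemble import RandomForestClassifier

# Each row is one past run: [temperature in C, pressure in bar, time in hours]
conditions = [
    [120, 1, 24], [180, 1, 24], [180, 5, 12],
    [220, 5, 12], [220, 10, 6], [150, 10, 48],
]
# Which product came out of each "cook-and-look" run
products = ["amorphous", "phase_A", "phase_A", "phase_B", "phase_B", "amorphous"]

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(conditions, products)

# Ask the trained model which product a new, untried set of conditions should give
print(model.predict([[200, 5, 18]]))
```

With enough real data behind it, a model like this can suggest which conditions are worth trying first, which is exactly what “one shot” synthesis is after.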
The acid gas problem
Acid gases, like carbon dioxide and hydrogen sulfide, which accompany natural gas as impurities from deep underground or simply hang around us in the atmosphere, can be notorious for the way they interact with other materials. For example, these gases can interact with everyday materials around us and cause corrosion. However, the details of how these materials interact with acid gases are poorly understood. Scientists at the Center for Understanding and Control of Acid Gas-induced Evolution of Materials for Energy (UNCAGE-ME) have collected huge amounts of data on the interactions of acid gases with materials like metal-organic frameworks over the past couple of years, and they are now employing machine learning to understand those interactions.
As David Sholl, Deputy Director of UNCAGE-ME, told me over the phone, “We strongly believe there is an enormous amount to learn from the data we have seen in our experiments in the last couple of years—that’s why we are using machine learning to learn more.” Eventually, they want to design materials to selectively remove these harmful gases from around us or from crude natural gas.
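In spirit, that kind of design can be sketched as a regression problem: describe each material with a few numbers and learn how those numbers relate to how much acid gas it soaks up. The Python example below, using scikit-learn, is only an illustration; the descriptors, the uptake values, and the choice of model are assumptions, not UNCAGE-ME’s actual approach.

```python
# A hedged sketch: fit a regression that relates simple, hypothetical
# descriptors of a metal-organic framework to how much hydrogen sulfide it
# takes up. Every number below is invented for illustration.
from sklearn.linear_model import Ridge

# Each row: [surface area in m2/g, pore diameter in angstroms, open-metal-site density]
mof_descriptors = [
    [1200, 8.0, 0.5], [2500, 12.0, 1.2], [800, 6.5, 0.2],
    [3100, 15.0, 1.8], [1800, 9.5, 0.9],
]
h2s_uptake = [2.1, 4.0, 1.2, 5.3, 3.0]   # made-up uptake values, in mmol/g

model = Ridge(alpha=1.0)
model.fit(mof_descriptors, h2s_uptake)

# Screen a candidate material on paper before ever making it in the lab
print(model.predict([[2000, 10.0, 1.0]]))
```

The appeal is that a model trained on years of experiments can screen new candidate materials cheaply, pointing the chemists toward the ones most likely to capture the gases they want to remove.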
The climate connection
On yet another afternoon, I made a phone call to the Director of the Center for Mechanistic Control of Water-Hydrocarbon-Rock Interactions in Unconventional and Tight Oil Formations (CMC-UF) to discuss their research. The sky was gloomy in Seattle, a city notorious for its unpredictable weather. But when I called Tony Kovscek, a professor at Stanford University and Director of CMC-UF, my weather-induced despondency disappeared, and I was delighted to hear what he had to say about the center’s mission.
CMC-UF studies materials with tiny holes, commonly referred to as shales, extracted from oil and gas wells. But ultimately, the researchers at the center are aiming to solve one of the grandest challenges of our time—global warming—by removing carbon and reducing the impacts of fossil fuel usage.
But how is CMC-UF using machine learning as part of their attack on that problem?
They investigate the molecular signatures of shale materials, such as their structure and chemical composition, using sophisticated technologies like X-ray computed tomography, X-ray fluorescence, and scanning electron microscopy. These studies generate a huge amount of data, but it is difficult to understand the relationships between the different data types.
This is where machine learning comes into play: they want to extract useful information from the datasets to learn how water and gases like carbon dioxide interact with shales. Understanding how carbon dioxide interacts with shales could be greatly beneficial for humanity. With this understanding, one day atmospheric carbon dioxide could be drawn down into oil and gas wells through a technique called hydraulic fracturing.
In hydraulic fracturing, water is injected into shale to open pathways so that crude oil or other fluids can be extracted effectively. CMC-UF wants to replace the water with carbon dioxide in the future, which would help remove carbon dioxide from the atmosphere, a potential way to reduce global warming.
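To give a flavor of how such different measurements might be stitched together, here is a hedged Python sketch, again not CMC-UF’s actual analysis: each shale sample gets one row of invented features from the different instruments, and a scikit-learn model is fit to relate those features to an invented measure of how much carbon dioxide the sample takes up.

```python
# A hedged sketch: merge made-up features derived from different instruments
# (X-ray CT, XRF, SEM) into one table per shale sample, then fit a model
# relating them to how much CO2 the sample adsorbed. All values are invented.
from sklearn.ensemble import GradientBoostingRegressor

# Each row is one shale sample:
# [porosity from CT, clay fraction from XRF, mean pore size in nm from SEM]
samples = [
    [0.05, 0.30, 12.0], [0.08, 0.45, 20.0], [0.03, 0.20, 8.0],
    [0.10, 0.50, 25.0], [0.06, 0.35, 15.0],
]
co2_uptake = [0.8, 1.5, 0.4, 1.9, 1.1]    # invented adsorption values, in mg/g

model = GradientBoostingRegressor(random_state=0)
model.fit(samples, co2_uptake)

# Estimate how a newly measured sample would interact with CO2
print(model.predict([[0.07, 0.40, 18.0]]))
```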
A right move
The world around us, and hence the problems we face, are becoming too complicated to solve on our own. We need help from machines, first to help us understand the problems, and then to help us find solutions. When the philosopher René Descartes described animals as machina animata in the 17th century, meaning that animals are machines and hence cannot think, he might have gotten it wrong, because in the 21st century even a real machine seems to think, and then learn, and then give us solutions.
And multiple EFRCs are leveraging the potential of artificial intelligence for solving problems in energy and climate.