Using machine learning to interpret complex infrared spectra of catalysts
Cora Went

What do FM radio, your microwave, 5G cell phone service, light from a lightbulb, and X-rays have in common?

They are all forms of electromagnetic waves: oscillating electric and magnetic fields that can travel through space. Electromagnetic waves are everywhere. The wavelength of the radiation, or the distance between consecutive peaks of the wave, determines what kind of radiation it is. FM radio transmits on electromagnetic waves with wavelengths of about three meters. The microwave oven in your kitchen uses waves with wavelengths about the length of a credit card. 5G service propagates to our cell phones with wavelengths as small as the diameter of a pea. Visible light coming out of a lightbulb has wavelengths that are much smaller, a bit under 1 micron, or less than one-tenth the thickness of plastic wrap. And medical X-rays use electromagnetic waves with wavelengths over one thousand times smaller than that!
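The wavelengths above follow from the relation wavelength = (speed of light) / frequency. The short Python sketch below applies it to a few representative frequencies. The specific frequencies are assumptions chosen for illustration; they do not come from the article, and real devices operate over ranges of frequencies.

```python
# Free-space wavelength from frequency: wavelength = c / f.
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact by definition of the meter


def wavelength_m(frequency_hz: float) -> float:
    """Return the free-space wavelength in meters for a frequency in hertz."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return SPEED_OF_LIGHT_M_PER_S / frequency_hz


# Representative frequencies (illustrative assumptions, not from the article).
EXAMPLES_HZ = {
    "FM radio (100 MHz)": 100e6,
    "microwave oven (2.45 GHz)": 2.45e9,
    "millimeter-wave 5G (28 GHz)": 28e9,
    "green visible light (545 THz)": 545e12,
}

for name, frequency in EXAMPLES_HZ.items():
    print(f"{name}: {wavelength_m(frequency):.3g} m")
```

With these assumed frequencies, the results come out to roughly 3 meters, 12 centimeters, 1 centimeter, and half a micron, respectively.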

Whether an electromagnetic wave with a certain wavelength can travel through a material depends on whether that material absorbs light at that wavelength. For example, air absorbs very little light at cell phone wavelengths, which is why cell phone service can propagate over long distances. Visible light, however, can be absorbed when it boosts electrons in a material to higher energies. Infrared light, which has wavelengths between those of visible light and microwaves, can be absorbed by molecules that use the energy to rotate and vibrate.

The ability of a material to absorb light is based on its physical and chemical properties. The precise wavelengths of light that a material absorbs can tell you a lot about that material—down to the level of chemical bonds. For example, materials that absorb light with wavelengths around five microns may have carbon-oxygen bonds, and materials that absorb light with wavelengths close to three microns may have carbon-hydrogen bonds.
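Spectroscopists usually report IR absorption positions as wavenumbers (in inverse centimeters) rather than wavelengths. As a minimal sketch, the conversion for the two wavelengths mentioned above looks like this; the function name is hypothetical and chosen only for this example.

```python
def wavelength_um_to_wavenumber_cm(wavelength_um: float) -> float:
    """Convert a wavelength in microns to a wavenumber in cm^-1.

    One centimeter is 10,000 microns, so wavenumber = 10,000 / wavelength.
    """
    if wavelength_um <= 0:
        raise ValueError("wavelength must be positive")
    return 1e4 / wavelength_um


# The two wavelengths quoted in the text, in microns.
for label, wavelength_um in [("carbon-oxygen", 5.0), ("carbon-hydrogen", 3.0)]:
    wavenumber = wavelength_um_to_wavenumber_cm(wavelength_um)
    print(f"{label}: {wavelength_um} microns is about {wavenumber:.0f} cm^-1")
```

So absorption near five microns corresponds to about 2000 inverse centimeters, and absorption near three microns to about 3300 inverse centimeters.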

Scientists often use infrared (IR) spectroscopy to understand a material by studying the bonds present in it and how the surrounding environment changes the way those bonds vibrate. By shining different wavelengths of IR light on a material, they can collect a spectrum of absorption (or transmission) versus wavelength. However, with so many bonds in a material, all of them rotating and vibrating in different ways, IR spectra quickly get complicated and difficult to interpret.

Enter machine learning. Scientists at the Catalysis Center for Energy Innovation Energy Frontier Research Center (CCEI EFRC) recently showed that they can better predict material structures from their infrared spectra using a combination of physics modeling and machine learning. By applying these techniques, they can determine the precise surface structures of complex catalysts, helping to close the long-standing “materials gap.”

The “materials gap”

Catalysts are essential components of our energy future, from their use in fuel cells to their role in the sustainable production of chemicals. Generally, a catalyst is any material that speeds up a reaction without being permanently changed by it. In one class of catalysts, called heterogeneous catalysts, a molecule that participates in the reaction adsorbs onto the surface of a catalyst at an active site. The interaction between the catalyst and the reactant lowers the energy barrier of the reaction, easing the bottleneck and allowing the reaction to go faster. In this work specifically, the researchers examined platinum nanoparticle catalysts and how they interact with carbon monoxide (CO) molecules.

The types of active sites on the surface of a catalyst can be described by two characteristics: the binding-type and the coordination number. The binding-type tells you where on the catalyst the molecule sits. For example, an “atop” binding-type means that the molecule sits directly on top of a platinum atom, and a “bridge” binding-type means that it sits between two platinum atoms. The coordination number tells you how many platinum atoms are near the adsorbed molecule, which depends on the structure of the platinum surface.

Different possible active sites. (a,b) “Bridge” (a) versus “atop” (b) binding-types. (c,d) Different platinum surface structures have different coordination numbers, or numbers of surrounding platinum atoms.
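One way to picture this description is as a small data structure: each active site carries a binding-type and a coordination number, and a surface is summarized by the fraction of sites with each value. The sketch below is only an illustration. The class name, the helper function, and the list of sites are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ActiveSite:
    binding_type: str         # e.g. "atop" or "bridge"
    coordination_number: int  # number of platinum atoms near the adsorbed molecule


def distribution(values):
    """Return the fraction of sites that share each distinct value."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one site")
    return {value: count / total for value, count in counts.items()}


# Hypothetical surface: this list of sites is made up for illustration.
sites = [
    ActiveSite("atop", 9),
    ActiveSite("atop", 7),
    ActiveSite("bridge", 9),
    ActiveSite("atop", 9),
]

binding_fractions = distribution(site.binding_type for site in sites)
coordination_fractions = distribution(site.coordination_number for site in sites)
print(binding_fractions)
print(coordination_fractions)
```

For this made-up surface, three quarters of the sites are atop and one quarter are bridge. A real catalyst would contain far more sites and a broader spread of coordination numbers.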

To understand how a catalyst works, researchers need to characterize these active sites. However, doing so is complicated, because many different types of active sites exist within a single catalyst. Until now, researchers have attempted to characterize the surface structures with computationally intensive calculations of a single active site on a perfect crystal. These calculations can then be verified by single crystal experiments done under high vacuum conditions. But catalysts change under working conditions, so these experiments are often not representative.

“The disparity between single crystal experiments, and associated calculations, and real-world materials is known as the materials gap,” the researchers write in their paper.

IR spectroscopy is, in theory, a good technique for addressing the materials gap. It allows researchers to study the surface of their catalysts under working conditions, also known as operando conditions. However, determining the microstructure—or the distribution of binding-types and coordination numbers of the active sites—of a catalyst from complex IR spectra is a daunting task. So, these researchers turned to machine learning, a technique that allows a computer to learn from experience without being directly programmed.

The “materials gap” describes the discrepancy between perfect crystals in high-vacuum conditions and real catalysts in working conditions. The authors used machine learning and IR spectroscopy to bridge the materials gap.

Machine learning bridges the gap

These researchers had to somehow bridge the gap from using physics to model a single active site on a perfect crystal to detecting the entire microstructure of a catalyst from an IR spectrum. They broke it down into several steps in their paper.

First, they used physics-based theory to generate IR spectra from CO adsorbed on single active sites on perfect crystals. They looked at different binding-types and different platinum surface structures to cover all possible active sites. Second, they looked at how these spectra changed when they increased the CO coverage, or the density of CO molecules on the surface. Third, they generated synthetic spectra representative of real materials, each combining many different active sites, at many different levels of CO coverage. Fourth, they fed these synthetic spectra into a machine learning algorithm composed of two neural networks so it could learn the mapping between spectrum and microstructure. Finally, they fed their algorithm real spectra and asked it to work backward to the microstructure. Testing against the very limited data that does bridge the materials gap, they showed that their algorithm could accurately predict the distribution of binding-types and coordination numbers of the active sites in their catalyst.
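The synthetic-spectrum step can be pictured with a toy example: take one spectrum per site type, mix them with random weights, and keep the weights as the known answer. The sketch below is not the researchers' code. The peak positions and widths are placeholder numbers, the Lorentzian peak shape is an assumption, and coverage effects and the neural networks themselves are left out.

```python
import random

# Hypothetical single-site CO spectra: (peak center, peak half-width) in cm^-1.
# These numbers are illustrative placeholders, not values from the paper.
SITE_PEAKS = {
    "atop": (2080.0, 15.0),
    "bridge": (1850.0, 25.0),
}
WAVENUMBERS = [1700.0 + 5.0 * i for i in range(101)]  # 1700 to 2200 cm^-1


def lorentzian(x, center, width):
    """Lorentzian peak with height 1 at `center` and half-width `width`."""
    return 1.0 / (1.0 + ((x - center) / width) ** 2)


def single_site_spectrum(site):
    """Spectrum of one site type evaluated on the WAVENUMBERS grid."""
    center, width = SITE_PEAKS[site]
    return [lorentzian(x, center, width) for x in WAVENUMBERS]


def random_fractions(rng, count):
    """Draw `count` positive fractions that sum to 1."""
    # The small offset guards against the (very unlikely) all-zero draw.
    draws = [rng.random() + 1e-12 for _ in range(count)]
    total = sum(draws)
    return [draw / total for draw in draws]


def synthetic_example(rng):
    """Return (spectrum, fractions): a mixed spectrum and its known site fractions."""
    sites = list(SITE_PEAKS)
    fractions = random_fractions(rng, len(sites))
    basis = [single_site_spectrum(site) for site in sites]
    spectrum = [
        sum(fraction * spectrum_i[k] for fraction, spectrum_i in zip(fractions, basis))
        for k in range(len(WAVENUMBERS))
    ]
    return spectrum, dict(zip(sites, fractions))


rng = random.Random(0)  # fixed seed so the toy data set is reproducible
training_set = [synthetic_example(rng) for _ in range(1000)]
```

Each entry of `training_set` pairs a spectrum with the site fractions that produced it, which is the kind of labeled pair a neural network could be trained on to learn the mapping from spectrum to microstructure.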

“In total, hundreds of thousands of synthetic spectra were necessary to learn the mapping between spectra and microstructure,” they explain.

A general method

Although the researchers applied this technique specifically to platinum catalysts and CO adsorbates, their methodology is broadly applicable. The technique (using physics-based models, coverage scaling, and machine learning to determine microstructure from IR spectra) could be applied to many combinations of catalyst and adsorbate. By identifying the active sites present in the most successful catalysts, researchers could accelerate the rational design of new catalysts.

By making sense of complex infrared spectra using machine learning, these scientists have made major strides toward closing the “materials gap.”

More Information

Lansford, J. L. & Vlachos, D. G. Infrared spectroscopy data- and physics-driven machine learning for characterizing surface microstructure of complex materials. Nat. Commun. 11, 1513 (2020).


“This material is based upon work supported as part of the Catalysis Center for Energy Innovation, an Energy Frontier Research Center funded by the U.S. Department of Energy, Office of Science, Office of Basic Energy Sciences under Award Number DE-SC0001004. Computational time from the Blue Waters sustained-petascale computing project, which is supported by the National Science Foundation (awards OCI-0725070 and ACI-1238993) and the state of Illinois, is gratefully acknowledged. Blue Waters is a joint effort of the University of Illinois at Urbana-Champaign and its National Center for Supercomputing Applications. The 2019-2020 Blue Waters Graduate Fellowship to J.L.L. is also gratefully acknowledged.”

About the author(s):

Cora Went is a graduate student in physics in Harry Atwater’s group at Caltech. She studies two-dimensional transition metal dichalcogenides for photovoltaic applications. Through her collaboration with other researchers in the Photonics at Thermodynamic Limits EFRC, she is working to understand how fundamental properties of these materials affect their performance as solar cells.
