ChemRxiv
These are preliminary reports that have not been peer-reviewed. They should not be regarded as conclusive, guide clinical practice/health-related behavior, or be reported in news media as established information. For more information, please see our FAQs.

Machine Learning of Optical Properties of Materials - Predicting Spectra from Images and Images from Spectra

preprint
submitted on 10.07.2018 and posted on 11.07.2018 by Helge S. Stein, Dan Guevarra, Paul F. Newhouse, Edwin Soedarmadji, John Gregoire
As the materials science community seeks to capitalize on recent advancements in computer science, the sparsity of well-labelled experimental data and the limited throughput with which such data can be generated have inhibited deployment of machine learning algorithms to date. Several successful examples in computational chemistry have inspired further adoption of machine learning algorithms. Here we present autoencoding algorithms for measured optical properties of metal oxides, which can serve as an exemplar for the breadth and depth of data required for modern algorithms to learn the underlying structure of experimental materials science data. Our set of 180,902 distinct materials samples spans 78 distinct composition spaces, includes 45 elements, and contains more than 80,000 unique quinary oxide and 67,000 unique quaternary oxide compositions, making it the largest and most diverse experimental materials set utilized in machine learning studies. This extensive dataset enabled training and validation of three distinct models for mapping between sample images and absorption spectra, including a conditional variational autoencoder that generates images of hypothetical materials with tailored absorption properties. The absorption patterns auto-generated from sample images capture the salient features of the ground truth spectra. Direct band gap energies extracted from these auto-generated patterns have a mean absolute error of 240 meV, which is approximately the uncertainty of traditional band gap extraction from measurements of the full transmission and reflection spectra. Optical properties of materials are not only ubiquitous in materials applications but also emblematic of the confluence of underlying physical phenomena that yields the type of complex data relationships that merit and benefit from neural network-type modelling.
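
The abstract reports direct band gap energies extracted from absorption spectra and compares them by mean absolute error, but does not say how the extraction is done. As a minimal sketch only, the Python below assumes a Tauc-style analysis, in which (αE)^2 is fitted with a straight line over its steepest region and the band gap is read off where that line crosses zero. The function names, the window size and the synthetic spectrum are hypothetical illustrations and are not taken from the preprint.

```python
import numpy as np


def direct_band_gap_tauc(energy_ev, absorption, window=15):
    """Estimate a direct band gap (eV) from an absorption spectrum.

    Assumption (not from the preprint): a Tauc-style analysis where
    (alpha * E)^2 is linear in E just above a direct gap.
    """
    energy_ev = np.asarray(energy_ev, dtype=float)
    absorption = np.asarray(absorption, dtype=float)
    tauc = (absorption * energy_ev) ** 2

    best_slope, best_gap = 0.0, float("nan")
    # Slide a fixed-size window over the spectrum and keep the steepest linear fit.
    for start in range(len(energy_ev) - window + 1):
        segment = slice(start, start + window)
        slope, intercept = np.polyfit(energy_ev[segment], tauc[segment], 1)
        if slope > best_slope:
            best_slope = slope
            best_gap = -intercept / slope  # energy where the fitted line reaches zero
    return best_gap


def mean_absolute_error(predicted, reference):
    """Mean absolute difference between two sets of band gap energies."""
    predicted = np.asarray(predicted, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return float(np.mean(np.abs(predicted - reference)))


# Synthetic, idealized spectrum with a known direct gap (illustration only).
energy = np.linspace(1.0, 4.0, 301)
true_gap = 2.0
alpha = np.sqrt(np.clip(energy - true_gap, 0.0, None)) / energy

estimated_gap = direct_band_gap_tauc(energy, alpha)
print(f"estimated direct band gap: {estimated_gap:.3f} eV")
```

On measured or model-generated spectra the choice of fitting region is the main source of uncertainty, so the window size here would need tuning. Comparing gaps estimated from generated spectra against gaps from measured spectra with `mean_absolute_error` gives a figure of the same kind as the 240 meV quoted above.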

Funding

Department of Energy, Office of Science, DE-SC0004993

History

Email Address of Submitting Author

gregoire@caltech.edu

Email Address(es) for Other Author(s)

stein@caltech.edu, guevarra@caltech.edu, paulfnew@caltech.edu, edwin@caltech.edu

Institution

California Institute of Technology

Country

United States of America

ORCID For Submitting Author

0000-0002-2863-5265

Declaration of Conflict of Interest

no conflicts of interest

Version Notes

initial version
