ChemRxiv
These are preliminary reports that have not been peer-reviewed. They should not be regarded as conclusive, used to guide clinical practice or health-related behavior, or reported in news media as established information.

Applying Machine Learning Techniques to Predict the Properties of Energetic Materials

preprint
revised on 15.02.2018 and posted on 16.02.2018 by Daniel Elton, Zois Boukouvalas, Mark S. Butrico, Mark D. Fuge, Peter W. Chung
We present a proof of concept that machine learning techniques can be used to predict the properties of CNOHF energetic molecules from their molecular structures. We focus on a small but diverse dataset consisting of 109 molecular structures spread across ten compound classes. Until now, candidate molecules for energetic materials have been screened using predictions from expensive quantum simulations and thermochemical codes. We present a comprehensive comparison of machine learning models and several molecular featurization methods: sum over bonds, custom descriptors, Coulomb matrices, bag of bonds, and fingerprints. The best featurization was sum over bonds (bond counting), and the best model was kernel ridge regression. Despite the small dataset, we obtain acceptable errors and Pearson correlations for out-of-sample prediction of detonation pressure, detonation velocity, explosive energy, heat of formation, density, and other properties. By adding a second dataset of 309 molecules to the training set, we show that the error can be pushed lower, although convergence with the number of molecules is slow. Our work paves the way for future applications of machine learning in this domain, including automated lead generation and the interpretation of machine learning models to obtain novel chemical insights.
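To make the best-performing combination named in the abstract more concrete, the following is a minimal sketch of sum over bonds (bond counting) featurization followed by kernel ridge regression. It is not the authors' pipeline. The bond lists are written by hand for a few small molecules, the linear kernel and the regularization value `lam` are assumptions, and the targets are synthetic numbers produced from made-up weights (`TOY_WEIGHTS`), not measured or computed properties.

```python
from collections import Counter

# Hand-written bond lists (one Lewis structure each); illustrative only.
MOLECULES = {
    "methane":      ["C-H"] * 4,
    "ethane":       ["C-H"] * 6 + ["C-C"],
    "nitromethane": ["C-H"] * 3 + ["C-N", "N=O", "N-O"],
    "nitroethane":  ["C-H"] * 5 + ["C-C", "C-N", "N=O", "N-O"],
}

# Made-up weights used only to generate synthetic targets (NOT real chemistry).
TOY_WEIGHTS = {"C-C": 1.0, "C-H": 0.5, "C-N": 2.0, "N-O": 3.0, "N=O": 4.0}


def bond_vocabulary(molecules):
    """Sorted list of every bond type seen in the training molecules."""
    return sorted({b for bonds in molecules.values() for b in bonds})


def sum_over_bonds(bonds, vocab):
    """Feature vector: how many times each bond type occurs in a molecule."""
    counts = Counter(bonds)
    return [float(counts[b]) for b in vocab]


def linear_kernel(x, y):
    """Dot product kernel (an assumed choice for this sketch)."""
    return sum(a * b for a, b in zip(x, y))


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_krr(X, y, lam=1e-6, kernel=linear_kernel):
    """Kernel ridge regression: solve (K + lam * I) alpha = y."""
    n = len(X)
    K = [[kernel(X[i], X[j]) + (lam if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    return solve(K, y)


def predict_krr(X_train, alpha, x, kernel=linear_kernel):
    """Prediction is a kernel-weighted sum over the training molecules."""
    return sum(a * kernel(xi, x) for a, xi in zip(alpha, X_train))


vocab = bond_vocabulary(MOLECULES)
X = [sum_over_bonds(bonds, vocab) for bonds in MOLECULES.values()]
y = [sum(TOY_WEIGHTS[b] for b in bonds) for bonds in MOLECULES.values()]
alpha = fit_krr(X, y)

# Held-out molecule: propane has 8 C-H bonds and 2 C-C bonds.
propane = sum_over_bonds(["C-H"] * 8 + ["C-C"] * 2, vocab)
pred = predict_krr(X, alpha, propane)
print(f"toy prediction for propane: {pred:.3f}")
```

Because the synthetic targets are linear in the bond counts, this toy model recovers the held-out value almost exactly, which says nothing about accuracy on real data. In practice one would likely generate bond counts with a cheminformatics toolkit and use a library implementation such as scikit-learn's `KernelRidge`, with the kernel and regularization chosen by cross-validation.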

Email Address of Submitting Author

delton@umd.edu

Email Address(es) for Other Author(s)

pchung15@umd.edu

Institution

University of Maryland, College Park

Country

United States

ORCID For Submitting Author

0000-0003-0249-1387

Declaration of Conflict of Interest

The authors declare no competing financial interests.
