Machine Learning Transition Temperatures from 2D Structure

20 November 2020, Version 3
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

A priori knowledge of physicochemical properties such as melting and boiling points could expedite materials discovery. However, theoretical modeling from first principles poses a challenge for efficient virtual screening of potential candidates.

As an alternative, the tools of data science are becoming increasingly important for exploring chemical datasets and predicting material properties. Herein, we extend a molecular representation, or set of descriptors, known as the Unified Physicochemical Property Estimation Relationships (UPPER), first developed by Yalkowsky and coworkers for quantitative structure-property relationship modeling. This molecular representation has group-constitutive and geometrical descriptors that map to enthalpy and entropy, the two thermodynamic quantities that drive thermal phase transitions. Our extension adds information about sp2-bonded fragments to the UPPER representation. Additionally, instead of using the UPPER descriptors in a series of thermodynamically inspired calculations, as per Yalkowsky, we use the descriptors to construct a vector representation for use with machine learning techniques. The concise and easy-to-compute representation, combined with a gradient-boosting decision tree model, provides an appealing framework for predicting experimental transition temperatures in a diverse chemical space. An application to energetic materials shows that the method is predictive, despite a relatively modest energetics reference dataset. We also report competitive results on diverse public datasets of melting points (i.e., OCHEM, Enamine, Bradley, and Bergstrom) comprising over 47k structures. Open-source software is available at https://github.com/USArmyResearchLab/ARL-UPPER.
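The two ideas in the abstract can be illustrated with a minimal sketch, which is not the ARL-UPPER implementation. It assumes the standard thermodynamic relation that at a phase transition the Gibbs energy change vanishes, so T = ΔH/ΔS, and it shows how named descriptors might be laid out as a fixed-order vector for a tree-based regressor. The descriptor names, the helper functions, and the toy descriptor values are hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Hypothetical descriptor names, for illustration only; the actual UPPER
# descriptor set is defined in the paper and the ARL-UPPER repository.
DESCRIPTOR_ORDER: List[str] = [
    "n_CH3",
    "n_CH2",
    "n_aromatic_CH",
    "symmetry_number",
    "flexibility",
]


def transition_temperature(delta_h: float, delta_s: float) -> float:
    """Return T = dH / dS in kelvin (dH in J/mol, dS in J/(mol K)).

    At a phase transition dG = dH - T * dS = 0, which gives this ratio.
    """
    if delta_s <= 0:
        raise ValueError("entropy of transition must be positive")
    return delta_h / delta_s


def to_feature_vector(descriptors: Dict[str, float]) -> List[float]:
    """Lay descriptors out in a fixed order; missing entries default to 0."""
    return [float(descriptors.get(name, 0.0)) for name in DESCRIPTOR_ORDER]


# Melting of water ice: dH_fus is roughly 6010 J/mol and dS_fus roughly
# 22.0 J/(mol K), so the ratio lands near the familiar 273 K.
t_melt = transition_temperature(6010.0, 22.0)

# Toy descriptor values for a made-up molecule (not taken from the paper).
x = to_feature_vector({"n_CH3": 2, "n_CH2": 4, "symmetry_number": 2})
```

In a machine learning setting, vectors like `x` would be stacked into a feature matrix and passed, together with measured transition temperatures, to a gradient-boosting regressor, so the model learns the mapping directly rather than evaluating the ΔH/ΔS ratio explicitly.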

Keywords

Machine Learning
Melting Point
Boiling Point
Enthalpy of Transition
Entropy of Transition
XGBoost
Molecular Featurization

