ChemRxiv
These are preliminary reports that have not been peer-reviewed. They should not be regarded as conclusive, guide clinical practice/health-related behavior, or be reported in news media as established information.

Using Machine Learning to Predict the pKa of C–H Bonds. Relevance to Catalytic Methane Functionalization

preprint
submitted on 13.07.2020 and posted on 14.07.2020 by Christopher Zhou, William Grumbles, Thomas Cundari
Six machine learning models (random forest, neural network, support vector machine, k-nearest neighbors, Bayesian ridge regression, and least squares linear regression) were trained on a dataset of 3d transition metal-methyl and -methane complexes to predict pKa(C–H), a property shown to be important for catalytic activity and selectivity. The models are promising: root-mean-square error (RMSE) values range from 4.6 to 8.8 pKa units despite the relatively modest amount of training data. Importantly, the models agreed that (a) properties of the conjugate base were more impactful than those of the corresponding conjugate acid, and (b) the energy of the highest occupied molecular orbital of the conjugate base was the most significant input feature for predicting pKa(C–H). Furthermore, additional testing on an external dataset of Sc-methyl complexes demonstrated the robustness of all models, with RMSE values ranging from 1.5 to 6.6 pKa units. Overall, this research demonstrates the potential of machine learning models in organometallic catalyst development.
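
For readers unfamiliar with the error metric quoted above, the following is a minimal sketch of how an RMSE in pKa units could be computed for the simplest of the six model types, a least squares linear regression on a single input feature. It is not the authors' code or workflow. The function names (rmse, fit_line), the choice of one feature, and all numerical values are assumptions made purely for illustration; none of the numbers come from the preprint.

```python
import math

def rmse(y_true, y_pred):
    """Root-mean-square error between reference and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))

def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept (one feature)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Placeholder numbers for illustration only; NOT data from the preprint.
homo_energy = [-6.0, -5.0, -4.0, -3.0]  # hypothetical conjugate-base HOMO energies (eV)
pka_ref = [10.0, 16.0, 20.0, 26.0]      # hypothetical reference pKa(C-H) values

slope, intercept = fit_line(homo_energy, pka_ref)
pka_pred = [slope * e + intercept for e in homo_energy]
print(f"RMSE = {rmse(pka_ref, pka_pred):.2f} pKa units")
```

In practice the error would be evaluated on complexes held out from training (or on an external set, as with the Sc-methyl complexes), rather than on the fitted points as in this toy example.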

Funding

CHE-1464943

CHE-1531468

Email Address of Submitting Author

t@unt.edu

Institution

University of North Texas

Country

United States

ORCID For Submitting Author

0000-0003-1822-6473

Declaration of Conflict of Interest

No conflict of interest