Submitted on 31.08.2019 and posted on 03.09.2019 by Akber Raza, Lihua Xu, Sharma Yamijala, Chao Lian, Hyuna Kwon, Bryan Wong
We present the first application of machine learning to per- and polyfluoroalkyl substances (PFAS) for predicting and rationalizing carbon-fluorine (C–F)
bond dissociation energies to aid in their efficient treatment and removal. Using
a variety of machine learning algorithms (including Random Forest, Least
Absolute Shrinkage and Selection Operator Regression, and Feed-forward Neural
Networks), we were able to obtain extremely accurate predictions for C–F bond dissociation energies (with
deviations less than 0.70 kcal/mol) that are within chemical accuracy of
the PFAS reference data. In addition, we show that our machine learning
approach is extremely efficient (requiring less than 10 minutes to train on the
data and less than a second to predict the C–F bond dissociation energy of a
new compound) and needs only knowledge
of the simple chemical connectivity in a PFAS structure to yield reliable
results – without recourse to a computationally expensive quantum
mechanical calculation or a three-dimensional structure. Finally, we present an
unsupervised machine learning algorithm that can automatically classify and
rationalize chemical trends in PFAS structures that would otherwise be
difficult to visualize or process manually. Collectively, these studies (1)
constitute the first application
of machine learning techniques to PFAS structures for predicting and rationalizing C–F bond dissociation energies and (2) show immense
promise for assisting experimentalists in the targeted defluorination of
specific bonds in PFAS structures (or other unknown environmental contaminants)
of increasing complexity.
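
To make the idea of predicting from "simple chemical connectivity" concrete, the following is a minimal sketch of how per-bond descriptors could be derived from a molecular graph alone, with no three-dimensional structure. It is not the authors' featurization: the function names, the three descriptors (fluorine count on the carbon, carbon-neighbor count, graph distance to the nearest oxygen), and the hand-written perfluoropropanoic acid graph are all assumptions chosen for illustration.

```python
from collections import deque


def build_adjacency(n_atoms, bonds):
    """Return an adjacency list for an undirected molecular graph."""
    adj = {i: [] for i in range(n_atoms)}
    for a, b in bonds:
        adj[a].append(b)
        adj[b].append(a)
    return adj


def distance_to_element(adj, elements, start, target):
    """Breadth-first search: number of bonds from `start` to the nearest
    atom of element `target`, or -1 if no such atom is reachable."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if elements[node] == target:
            return dist
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return -1


def cf_bond_features(elements, bonds):
    """Build one descriptor row per C-F bond using connectivity only.
    The descriptors are hypothetical, not those used in the preprint."""
    adj = build_adjacency(len(elements), bonds)
    rows = []
    for a, b in bonds:
        if {elements[a], elements[b]} != {"C", "F"}:
            continue  # skip every bond that is not C-F
        carbon, fluorine = (a, b) if elements[a] == "C" else (b, a)
        neighbours = [elements[n] for n in adj[carbon]]
        rows.append({
            "bond": (carbon, fluorine),
            "n_fluorine": neighbours.count("F"),
            "n_carbon": neighbours.count("C"),
            "dist_to_oxygen": distance_to_element(adj, elements, carbon, "O"),
        })
    return rows


# Illustrative input: CF3-CF2-COOH written out by hand as a graph.
elements = ["C", "F", "F", "F", "C", "F", "F", "C", "O", "O", "H"]
bonds = [(0, 1), (0, 2), (0, 3), (0, 4), (4, 5), (4, 6),
         (4, 7), (7, 8), (7, 9), (9, 10)]
features = cf_bond_features(elements, bonds)
```

Rows like these, paired with reference bond dissociation energies, are the kind of input that a regressor such as a Random Forest or LASSO model could be trained on, which is why prediction for a new compound can be fast once training is done.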