Abstract
Introducing fluorine into a compound plays a crucial role in drug development because it strongly influences the compound’s pharmacokinetic and pharmacodynamic properties. With the increasing prevalence of fluorine in FDA-approved drugs in recent years, identifying the mechanisms that drive the chemical transformations of fluorinated compounds has become essential in drug discovery. 19F NMR spectroscopy is a powerful analytical technique for examining fluorine-containing compounds, offering valuable information about their structure, dynamics, and reactivity. Consequently, it has become a cornerstone in the mechanistic evaluation of fluorine-containing chemical transformations. NMR spectra can be interpreted with the aid of Density Functional Theory (DFT), an ab initio modeling method that can predict NMR chemical shifts. However, the computational cost of DFT severely limits compound screening and the discovery of feasible drug candidates. Here, we present a machine learning approach to accelerate the prediction of DFT-calculated 19F NMR chemical shifts. The features for each fluorine atom were derived from its local three-dimensional structural environment, representing the neighboring atoms within a radius of n Å of that fluorine atom. A comparative analysis of thirteen regression models was conducted using features extracted from 501 fluorinated compounds in our laboratory’s chemical inventory. The target chemical shift values were calculated using DFT with the quantum chemistry software ORCA. Among the models evaluated, Gradient Boosting Regression (GBR) performed best, achieving a mean absolute error of 2.89 to 3.73 ppm with a local environment radius of 3 Å. This accuracy is comparable to that of DFT calculations, while the computational time per compound drops from several hundred seconds to milliseconds. A radius of 3 Å was also found to be optimal across all models when encoding features for local atomic environments.
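As an illustration of the local-environment idea described in the abstract, the sketch below collects the atoms lying within a cutoff radius of a chosen fluorine atom and computes a mean absolute error between predicted and reference shifts. It is a minimal sketch under stated assumptions, not the authors' feature encoding: the function names, the tuple layout for atoms, and the example coordinates are all hypothetical, and the abstract does not specify how neighbors are turned into model features.

```python
import math

def local_environment(atoms, center_index, radius=3.0):
    """Return (element, distance) pairs for atoms within `radius` angstroms
    of the atom at `center_index`, sorted nearest first.

    `atoms` is a list of (element, x, y, z) tuples; this layout is an
    assumption made for the example.
    """
    _, cx, cy, cz = atoms[center_index]
    neighbors = []
    for i, (element, x, y, z) in enumerate(atoms):
        if i == center_index:
            continue  # the central fluorine is not its own neighbor
        distance = math.dist((cx, cy, cz), (x, y, z))
        if distance <= radius:
            neighbors.append((element, distance))
    neighbors.sort(key=lambda pair: pair[1])
    return neighbors

def mean_absolute_error(predicted, reference):
    """Average absolute difference between two equal-length sequences."""
    if len(predicted) != len(reference):
        raise ValueError("sequences must have the same length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)

# Hypothetical coordinates in angstroms, chosen only to exercise the cutoff.
example_atoms = [
    ("F", 0.00, 0.00, 0.00),
    ("C", 1.39, 0.00, 0.00),
    ("H", 1.75, 1.03, 0.00),
    ("C", 5.00, 0.00, 0.00),  # lies outside a 3 angstrom radius
]
environment = local_environment(example_atoms, center_index=0, radius=3.0)
```

In a pipeline like the one the abstract describes, a neighbor list of this kind would be encoded as a fixed-length feature vector and passed to a regression model, with the DFT-calculated shifts as targets and the mean absolute error in ppm as the evaluation metric.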
Supplementary materials
Title
Supplementary Information for “Evaluation of machine learning models for the accelerated prediction of Density Functional Theory calculated 19F chemical shifts based on local atomic environments”
Description
Data documentation and source code for results disclosed in the manuscript.
Supplementary weblinks
Title
GitHub Repository for “Evaluation of machine learning models for the accelerated prediction of Density Functional Theory calculated 19F chemical shifts based on local atomic environments”
Description
GitHub repository containing the raw data files and source code for the manuscript.