A General Protocol for the Accurate Predictions of Molecular 13C/1H NMR Chemical Shifts via Machine Learning

10 December 2019, Version 1
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Accurate prediction of NMR chemical shifts at affordable computational cost is of great importance for rigorous structural assignments in experimental studies. However, the most popular computational schemes for NMR calculation, which are based on density functional theory (DFT) and gauge-including atomic orbital (GIAO) methods, still suffer from ambiguities in structural assignments. Using state-of-the-art machine learning (ML) techniques, we have developed a DFT+ML model that is capable of predicting 13C/1H NMR chemical shifts of organic molecules with high accuracy. The input for this generalizable DFT+ML model contains two critical parts: one is a vector describing the chemical environment, which can be evaluated without knowing the exact geometry of the molecule; the other is the DFT-calculated isotropic shielding constant. The DFT+ML model was trained with a dataset containing 476 13C and 270 1H experimental chemical shifts. For the DFT methods used here, the root-mean-square deviations (RMSDs) between predicted and experimental 13C/1H chemical shifts are as small as 2.10/0.18 ppm, which are much lower than those of the typical DFT (5.54/0.25 ppm) or DFT+linear regression (4.77/0.23 ppm) approaches. The model also has smaller RMSDs and maximum absolute errors than two previously reported NMR-predicting ML models. We tested the robustness of the model on two classes of organic molecules (TIC10 and hyacinthacines), for which we unambiguously assigned the correct isomers to the experimental spectra. This DFT+ML model is a promising way of predicting NMR chemical shifts and can be easily adapted to calculate shifts for any chemical compound.
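To make the RMSD metric and the DFT+linear regression baseline mentioned in the abstract concrete, the following is a minimal Python sketch. It assumes the common practice of fitting experimental shifts as a straight line in the DFT-calculated isotropic shielding constants; the function names and the toy numbers are hypothetical and are not taken from the preprint, and the sketch does not reproduce the DFT+ML model itself.

```python
import numpy as np


def rmsd(predicted, experimental):
    """Root-mean-square deviation between two sets of chemical shifts (ppm)."""
    predicted = np.asarray(predicted, dtype=float)
    experimental = np.asarray(experimental, dtype=float)
    return float(np.sqrt(np.mean((predicted - experimental) ** 2)))


def fit_linear_scaling(shieldings, experimental_shifts):
    """Least-squares fit of shift = slope * shielding + intercept.

    Assumption: this is the usual form of a DFT+linear regression
    correction; the preprint's exact fitting procedure is not given here.
    """
    slope, intercept = np.polyfit(
        np.asarray(shieldings, dtype=float),
        np.asarray(experimental_shifts, dtype=float),
        deg=1,
    )
    return float(slope), float(intercept)


def predict_shifts(shieldings, slope, intercept):
    """Map isotropic shielding constants to predicted chemical shifts."""
    return slope * np.asarray(shieldings, dtype=float) + intercept


# Toy, made-up values for illustration only (not data from the preprint).
toy_shieldings = np.array([160.0, 120.0, 60.0, 30.0])
toy_shifts = np.array([25.0, 62.0, 125.0, 150.0])

slope, intercept = fit_linear_scaling(toy_shieldings, toy_shifts)
toy_rmsd = rmsd(predict_shifts(toy_shieldings, slope, intercept), toy_shifts)
print(f"slope={slope:.3f}, intercept={intercept:.2f}, RMSD={toy_rmsd:.2f} ppm")
```

In this picture, the ML model described in the abstract replaces the two-parameter line with a learned function that also takes the chemical-environment vector as input, which is how it can correct errors that a single global slope and intercept cannot.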

Keywords

organic molecule
neural network
chemical environment
NMR chemical shift
