Holistic Prediction of Nucleophilicity and Electrophilicity Based on a Machine Learning Approach

12 January 2023, Version 1
This content is a preprint and has not undergone peer review at the time of posting.


Nucleophilicity and electrophilicity govern reactivity in polar organic reactions. Over the past decades, Mayr et al. established quantitative scales for nucleophilicity (N) and electrophilicity (E), which have proved to be useful tools for rationalizing chemical reactivity. In this study, a holistic prediction model was developed through a machine-learning approach. rSPOC, an ensemble molecular representation combining structural, physicochemical, and solvent features, was developed for this purpose. With 1115 nucleophiles, 285 electrophiles, and 22 solvents, the dataset is currently the largest one for reactivity prediction. The rSPOC model trained with the Extra Trees algorithm predicted Mayr's N and E parameters with high accuracy, giving R² values of 0.96 and 0.92 and MAE values of 0.99 and 1.47, respectively. Furthermore, practical applications of the model, for instance predicting the nucleophilicity of NAD(P)H and of a series of enamines, showed its potential for estimating the reactivity of molecules with unknown parameters within seconds. An online prediction platform (http://isyn.luoszgroup.com/) was built on the current model and is freely available to the scientific community.
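
To make the reported metrics concrete, the following is a minimal sketch of how R² and MAE are computed for predicted versus reference N values, and of how N and E enter the Mayr-Patz relation log k(20 °C) = s_N(N + E). The relation is the standard definition of Mayr's scale and is not quoted in the abstract; the function names and all numerical values are hypothetical and chosen only for illustration, not taken from the paper's dataset or code.

```python
from statistics import mean


def mayr_log_k(N, s_N, E):
    """Mayr-Patz relation: log10 k(20 degC) = s_N * (N + E)."""
    return s_N * (N + E)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mae(y_true, y_pred):
    """Mean absolute error between reference and predicted values."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


# Hypothetical reference and predicted N values (illustration only).
n_true = [5.0, 10.0, 15.0, 20.0]
n_pred = [6.0, 9.0, 15.5, 19.5]

print("R2 :", r2_score(n_true, n_pred))
print("MAE:", mae(n_true, n_pred))

# Hypothetical nucleophile (N = 10, s_N = 0.8) and electrophile (E = -5).
print("log k:", mayr_log_k(N=10.0, s_N=0.8, E=-5.0))
```

Under the Mayr-Patz relation, an error of ΔN in a predicted nucleophilicity shifts the estimated log k by s_N·ΔN, which gives one way to read an MAE on the N scale in terms of rate constants.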


Machine learning
Molecular descriptors

