DeePEST-OS: A Generic Machine Learning Potential for Accelerating Transition State Search in Organic Synthesis

11 June 2025, Version 1
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Organic synthesis, central to modern chemistry, relies on a precise understanding of reaction kinetics, for which accurate transition state structures and energies are essential. While density functional theory (DFT) remains the mainstream method for transition state searches, it carries an inherent trade-off between accuracy and computational cost. To bridge this gap, DeePEST-OS, a generic machine learning potential that integrates Δ-learning with a high-order equivariant message passing neural network, is developed to enable rapid and precise transition state searches for organic synthesis. The challenge of data scarcity is addressed by establishing a novel reaction database containing ~75,000 DFT-calculated transition states for model training. DeePEST-OS rapidly predicts potential energy surfaces along intrinsic reaction coordinate pathways, running nearly three orders of magnitude faster than rigorous DFT computations. High accuracy is maintained at the same time: across 1,000 external test reactions, the root mean square deviation is 0.14 Å for transition state geometries and the mean absolute error is 0.64 kcal/mol for reaction barriers, a significant improvement over semi-empirical quantum chemistry methods. Comparative analysis against the state-of-the-art React-OT model further highlights the superior precision and computational efficiency of DeePEST-OS. A case study involving the retrosynthesis of the drug Zatosetron is also presented to demonstrate the practical utility of DeePEST-OS in accelerating the exploration of complex reaction networks.
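
The abstract names Δ-learning and reports two error metrics without spelling out how they are computed. The sketch below illustrates the general idea only: a Δ-learning prediction is a cheap baseline energy plus a learned correction, the mean absolute error averages absolute barrier differences, and the root mean square deviation compares atom positions. The function names, the choice of baseline, and all numeric inputs are hypothetical and are not taken from the preprint. The RMSD helper assumes the two structures are already aligned and share atom ordering, which may differ from the authors' protocol.

```python
from math import sqrt


def delta_learning_energy(baseline_energy, predicted_correction):
    """Generic Delta-learning form: cheap baseline plus a learned correction.

    Both arguments are hypothetical inputs in the same energy unit.
    """
    return baseline_energy + predicted_correction


def mean_absolute_error(predicted, reference):
    """Mean absolute error between two equal-length sequences of values."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


def rmsd(coords_a, coords_b):
    """Root mean square deviation over atoms.

    Assumes both structures are pre-aligned and list atoms in the same
    order; no rotation or translation fitting is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum(
        (a - b) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for a, b in zip(atom_a, atom_b)
    )
    return sqrt(squared / len(coords_a))


# Made-up numbers purely for illustration (not results from the preprint).
corrected = delta_learning_energy(10.0, 1.5)
barrier_mae = mean_absolute_error([1.0, 2.0], [1.5, 1.0])
geometry_rmsd = rmsd([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
                     [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)])
```

In practice, geometry comparisons of this kind are usually preceded by an optimal superposition of the two structures; that step is omitted here to keep the sketch short.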

Keywords

Machine learning potential
Organic synthesis
Transition state search
Quantum chemistry

Supplementary materials

Supporting materials
The folder has three subfolders and a Supporting Information file. The Supporting Information file includes supplemental figures and tables. The Cross-dataset validation of DeePEST-OS subfolder contains all geometries from that section. The Conformational isomer and multi-step organic reactions subfolders contain geometries from their respective sections in the main text.
