Challenges and Opportunities for Machine Learning Potentials in Transition Path Sampling: Alanine Dipeptide and Azobenzene Studies

23 July 2024, Version 1
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

The growing interest in machine learning (ML) tools within chemistry and materials science stems from their novelty and ability to predict properties almost as accurately as underlying electronic structure calculations or experiments. Transition path sampling (TPS) offers a practical way to explore transition routes between metastable minima such as conformers and isomers on the multidimensional potential energy surface. However, TPS has historically suffered from the computational cost vs. accuracy trade-off between affordable force-field simulations and expensive high-fidelity quantum mechanical calculations. ML interatomic potentials combined with TPS offer a new approach for the exploration of transition pathways at near-quantum mechanical accuracy, while keeping the computational cost comparable to classical force fields. In this study, we employ the HIP-NN-TS and ANI-1x neural network-based ML potentials, both trained on the ANI-1x dataset of 5 million HCNO structures. We first verify the correctness of our approach by applying it to alanine dipeptide and comparing the resulting energy surface and transition paths to the literature. Our findings suggest that the proposed approach holds promise for conformational searches, as evidenced by the chemical accuracy (errors ≲ 1 kcal/mol) for thermal molecular dynamics trajectories of alanine dipeptide. While we were able to successfully reconstruct alanine dipeptide’s potential landscape using both the HIP-NN-TS and ANI-1x frameworks, we observed that ML models with lower accuracy may still recover additional important conformations. We also find that active learning, which augments the training data with structures taken from TPS trajectories, improves the accuracy by ~30% with small amounts of additional data. Finally, we evaluate a more intricate case, azobenzene, and reiterate that seemingly simple torsions may pose a challenge for ML potentials and limit their applications in TPS. The inability of HIP-NN-TS to correctly describe the energetics of the major rotational pathway in azobenzene isomerization highlights deficiencies of the reference method in describing the electronic degrees of freedom. Our study underscores the importance of domain expertise in selecting physically meaningful pathways for benchmarking ML potentials, especially considering the intricacies of electronic structure in chemical dynamics and non-equilibrium processes.
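
The two accuracy figures quoted in the abstract (errors ≲ 1 kcal/mol, and a ~30% improvement from active learning) can be made concrete with a short sketch. The abstract does not say which error metric the authors use, so the mean absolute error below is an assumption, as are the function names. The energies are synthetic numbers chosen only for illustration; they are not results from the study.

```python
# Sketch only: the metric (mean absolute error) and all names are assumptions,
# and the energies below are synthetic, not data from the study.

CHEMICAL_ACCURACY = 1.0  # kcal/mol, the bound quoted in the abstract


def mean_absolute_error(predicted, reference):
    """Mean absolute deviation between two equal-length energy lists (kcal/mol)."""
    if len(predicted) != len(reference) or not reference:
        raise ValueError("energy lists must be non-empty and of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(reference)


def relative_improvement(error_before, error_after):
    """Fractional error reduction, e.g. 0.3 for a 30% improvement."""
    if error_before <= 0:
        raise ValueError("error_before must be positive")
    return (error_before - error_after) / error_before


# Synthetic relative energies along a hypothetical trajectory (kcal/mol).
e_reference = [0.0, 1.2, 2.9, 5.1]   # stand-in for the reference method
e_ml_before = [0.5, 0.4, 3.9, 4.4]   # stand-in for the ML model before retraining
e_ml_after = [0.3, 0.7, 3.6, 4.5]    # stand-in for the model after adding data

mae_before = mean_absolute_error(e_ml_before, e_reference)
mae_after = mean_absolute_error(e_ml_after, e_reference)
gain = relative_improvement(mae_before, mae_after)

print(f"MAE before: {mae_before:.3f} kcal/mol")
print(f"MAE after:  {mae_after:.3f} kcal/mol")
print(f"within chemical accuracy: {mae_after <= CHEMICAL_ACCURACY}")
print(f"relative improvement: {gain:.0%}")
```

In practice the two lists would hold ML and reference energies evaluated on the same structures, for example frames drawn from TPS trajectories.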

Keywords

transition path sampling
machine learning
potential energy surface
benchmark
