Extending the Applicability of the ANI Deep Learning Molecular Potential to Sulfur and Halogens

Machine learning (ML) methods have become powerful, predictive tools in a wide range of applications, such as facial recognition and autonomous vehicles. In the sciences, computational chemists and physicists have been using ML to predict physical phenomena, such as atomistic potential energy surfaces and reaction pathways. Transferable ML potentials, such as ANI-1x, have been developed with the goal of accurately simulating organic molecules containing the chemical elements H, C, N, and O. Here we provide an extension of the ANI-1x model. The new model, dubbed ANI-2x, is trained on three additional chemical elements: S, F, and Cl. Additionally, ANI-2x underwent torsional refinement training to better predict molecular torsion profiles. These new features open a wide range of new applications within organic chemistry and drug development. The seven supported elements (H, C, N, O, F, Cl, S) make up ~90% of drug-like molecules. To show that these additions do not sacrifice accuracy, we have tested this model across a range of organic molecules and applications, including the COMP6 benchmark, dihedral rotations, conformer scoring, and non-bonded interactions. ANI-2x is shown to accurately predict molecular energies compared to DFT with a speedup factor of ~10^6 and a negligible slowdown compared to ANI-1x. The resulting model is a valuable tool for drug development that can potentially replace both quantum calculations and classical force fields for myriad applications.