AIMNet2: A Neural Network Potential to Meet your Neutral, Charged, Organic, and Elemental-Organic Needs

20 December 2024, Version 3
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Machine learned interatomic potentials (MLIPs) are reshaping computational chemistry practices because of their ability to overcome the tradeoff between accuracy and accessible length/time scales. Despite this attraction, the benefits of such efficiency are only impactful when an MLIP uniquely enables insight into a target system or is broadly transferable outside of the training dataset; models achieving the latter are seldom reported. In this work, we present the 2nd generation of our atoms-in-molecules neural network potential (AIMNet2), which is applicable to species composed of up to 14 chemical elements in both neutral and charged states, making it a valuable method for modeling the majority of non-metallic compounds. Using an exhaustive dataset of 2 × 10⁷ quantum chemical calculations at the hybrid DFT level of theory, AIMNet2 combines ML-parameterized short-range and physics-based long-range terms to attain generalizability that reaches from simple organics to diverse molecules with “exotic” element-organic bonding. We show that AIMNet2 outperforms semi-empirical GFN-xTB and is on par with reference density functional theory for interaction energy contributions, conformer search tasks, torsion rotation profiles, and molecular-to-macromolecular geometry optimization. Overall, the demonstrated chemical coverage and computational efficiency of AIMNet2 are a significant step toward providing access to MLIPs that avoid the crucial limitation of curating additional quantum chemical data and retraining with each new application.

Supplementary materials

Supplementary Information
Supplementary Table 1: Number of molecules and conformers in training and test datasets.
Supplementary Figure 1: Distribution of molecule sizes in training and test datasets.
Supplementary Figure 2: Distribution of elements in training and test datasets.
Supplementary Figure 3: Distribution of molecular charges for training and test datasets.
Supplementary Note 1: Diverse element-organic CSD benchmark set.
Supplementary Table 2: Benchmark performance statistics of GFN2-xTB and two AIMNet2 variants against experimentally observed geometries in the diverse element CSD conformation benchmark set.
Supplementary Figure 4: Distribution of RMSD for dihedral angles of GFN2-xTB and two AIMNet2 variants against experimentally observed geometries in the diverse element CSD conformation benchmark set.
Supplementary Note 2: CSD conformer benchmark set.
Supplementary Table 3: Benchmark performance of various methods on the CSD conformer benchmark set.
Supplementary Figure 5: Distribution of RMSE and MAE errors for various
Supplementary Table 4: MAE for energy predictions (kcal mol⁻¹) on GMTKN55 subsets.
