Modeling molecular ensembles with gradient-domain machine learning force fields

Alex M. Maldonado; Igor Poltavsky; Valentin Vassilev-Galindo; Alexandre Tkatchenko; John A. Keith

doi:10.26434/chemrxiv-2023-wdd1r-v2

Theoretical and Computational Chemistry

03 May 2023, Version 2

Working Paper

This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Gradient-domain machine learning (GDML) force fields have shown excellent accuracy, data efficiency, and applicability for molecules with hundreds of atoms, but the employed global descriptor limits transferability to ensembles of molecules. Many-body expansions (MBEs) should provide a rigorous procedure for size-transferable GDML by training models on fundamental n-body interactions. We developed many-body GDML (mbGDML) force fields for water, acetonitrile, and methanol by training 1-, 2-, and 3-body models on only 1000 MP2/def2-TZVP calculations each. Our mbGDML force field includes intramolecular flexibility and intermolecular interactions, provided that the reference data adequately describe these effects. Energy and force predictions of clusters containing up to 20 molecules are within 0.38 kcal/mol per monomer and 0.06 kcal/(mol Å) per atom of reference supersystem calculations. This deviation partially arises from the restriction of the mbGDML model to 3-body interactions. GAP and SchNet in this MBE framework achieved similar accuracies but occasionally had abnormally high errors of up to 17 kcal/mol. NequIP trained on total energies and forces of trimers experienced much larger energy errors (at least 15 kcal/mol) as the number of monomers increased, demonstrating the effectiveness of size transferability with MBEs. Given these approximations, our automated mbGDML training schemes also resulted in fair agreement with reference radial distribution functions (RDFs) of bulk solvents. These results highlight mbGDML as valuable for modeling explicitly solvated systems with quantum-mechanical accuracy.
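To make the many-body expansion truncated at 3-body terms concrete, the following is a minimal Python sketch. It is not the mbGDML package's API. The names `toy_energy` and `mbe_energy` are hypothetical, monomers are reduced to single 1-D coordinates, and the toy energy function stands in for reference calculations. In the approach described in the abstract, trained 1-, 2-, and 3-body models would supply the n-body terms that this sketch computes by subtraction.

```python
from itertools import combinations
from math import isclose


def toy_energy(positions):
    """Hypothetical total energy with explicit 1-, 2-, and 3-body terms only."""
    e = sum(0.5 * x * x for x in positions)  # 1-body terms
    e += sum(-1.0 / (1.0 + abs(a - b)) for a, b in combinations(positions, 2))  # 2-body
    e += sum(0.01 * a * b * c for a, b, c in combinations(positions, 3))  # 3-body
    return e


def mbe_energy(monomers, energy_fn, max_order=3):
    """Sum n-body interaction energies up to max_order (assumed illustration)."""
    memo = {}  # maps a tuple of monomer indices to its n-body interaction energy

    def interaction(idx):
        # n-body term = energy of the sub-cluster minus all lower-order terms in it
        if idx not in memo:
            e = energy_fn([monomers[i] for i in idx])
            for k in range(1, len(idx)):
                for sub in combinations(idx, k):
                    e -= interaction(sub)
            memo[idx] = e
        return memo[idx]

    indices = range(len(monomers))
    return sum(
        interaction(idx)
        for order in range(1, max_order + 1)
        for idx in combinations(indices, order)
    )


cluster = [0.0, 0.9, 2.1, 3.4, 4.0]  # five toy "monomers"
full = toy_energy(cluster)  # supersystem reference
e3 = mbe_energy(cluster, toy_energy, max_order=3)
e2 = mbe_energy(cluster, toy_energy, max_order=2)
print(isclose(e3, full, abs_tol=1e-9), full - e2)
```

Because the toy energy contains nothing beyond 3-body terms, the expansion truncated at third order reproduces the supersystem energy, while truncating at second order leaves out exactly the 3-body contribution. For real molecular clusters, higher-order terms are generally nonzero, which is consistent with the abstract's remark that part of the deviation arises from restricting the model to 3-body interactions.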

Supplementary materials

Title: Supplementary information: Modeling molecular ensembles with gradient-domain machine learning force fields

Description: Supplementary information for "Modeling molecular ensembles with gradient-domain machine learning force fields".

Supplementary weblinks

Title: mbGDML Python package

Description: Foundational Python code for preparing, training, and analyzing many-body machine learning models.

Now Published

Modeling molecular ensembles with gradient-domain machine learning force fields

Alex M. Maldonado, Igor Poltavsky, Valentin Vassilev-Galindo, Alexandre Tkatchenko, John A. Keith

Journal article: Digital Discovery, Volume 2, Issue 3

Online publication date: 2023

Version History

May 03, 2023 Version 2

Jan 12, 2023 Version 1

Version Notes

Incorporates reviewer feedback primarily on mentioning other works and clarifying potentially confusing points. Other typos and grammatical errors are corrected.

License

The content is available under CC BY 4.0

Funding

U.S. National Science Foundation: CBET-1653392

U.S. National Science Foundation: CBET-1705592

U.S. National Science Foundation: CHE-1856460

Luxembourg National Research Fund: C19/MS/13718694/QML-FLEX

Author’s competing interest statement

The author(s) have declared they have no conflict of interest with regard to this content

Ethics

The author(s) have declared ethics committee/IRB approval is not relevant to this content
