Data-Driven Many-Body Potential Energy Functions for Generic Molecules: Linear Alkanes as a Proof-of-Concept Application

22 June 2022, Version 1
This content is a preprint and has not undergone peer review at the time of posting.


We present a generalization of the many-body energy (MB-nrg) theoretical/computational framework that enables the development of data-driven potential energy functions (PEFs) for generic covalently bonded molecules, with arbitrary quantum mechanical accuracy. The “nearsightedness of electronic matter” is exploited to define monomers as “natural building blocks” based on their distinct chemical identity. The energy of a generic molecule is then expressed as a sum of individual many-body energies of incrementally larger subsystems. The MB-nrg PEFs represent the low-order n-body energies, with n = 1–4, using permutationally invariant polynomials derived from electronic structure data computed at an arbitrary quantum mechanical level of theory, while all higher-order n-body terms (n > 4) are represented by a classical many-body polarization term. As a proof-of-concept application of the general MB-nrg framework, we present MB-nrg PEFs for linear alkanes. These PEFs are shown to accurately reproduce reference energies, harmonic frequencies, and potential energy scans of alkanes, independent of chain length. Since, by construction, the MB-nrg framework introduced here can be applied to generic covalently bonded molecules, we envision future computer simulations of complex molecular systems using data-driven MB-nrg PEFs with arbitrary quantum mechanical accuracy.
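
The decomposition described in the abstract, where the total energy is written as a sum of n-body energies of incrementally larger subsystems, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `toy_energy` is a hypothetical stand-in for an electronic structure calculation, each "monomer" is reduced to a single point, and `n_body_terms` is a made-up helper name. It does not reproduce the permutationally invariant polynomials or the classical polarization term of the MB-nrg PEFs; it only shows how n-body energies follow from subsystem energies by subtracting all lower-order contributions.

```python
from itertools import combinations
from math import dist


def toy_energy(coords):
    """Hypothetical stand-in for a reference energy calculation.

    Contains explicit 1-, 2-, and 3-body contributions only, so every
    4-body energy extracted from it should vanish.
    """
    e = -1.0 * len(coords)                      # one constant per monomer
    for a, b in combinations(coords, 2):        # pairwise attraction
        e += -1.0 / dist(a, b)
    for a, b, c in combinations(coords, 3):     # triple-dependent repulsion
        e += 0.1 / (dist(a, b) * dist(b, c) * dist(a, c))
    return e


def n_body_terms(coords, energy=toy_energy):
    """Return {order: sum of all n-body energies of that order}.

    The n-body energy of a subset is its total energy minus the
    n-body energies of all of its proper (non-empty) subsets.
    """
    n = len(coords)
    eps = {}      # index tuple -> n-body energy of that subset
    totals = {}   # order -> summed n-body energy
    for order in range(1, n + 1):
        totals[order] = 0.0
        for idx in combinations(range(n), order):
            e = energy([coords[i] for i in idx])
            for k in range(1, order):
                for sub in combinations(idx, k):
                    e -= eps[sub]
            eps[idx] = e
            totals[order] += e
    return totals


if __name__ == "__main__":
    # Four point-like "monomers" in an arbitrary zigzag arrangement.
    chain = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (2.2, 1.3, 0.0), (3.7, 1.3, 0.5)]
    terms = n_body_terms(chain)
    for order, value in terms.items():
        print(f"{order}-body sum: {value: .6f}")
    print(f"expansion total: {sum(terms.values()): .6f}")
    print(f"direct energy:   {toy_energy(chain): .6f}")
```

Summing the terms over all orders recovers the direct energy exactly, which is the identity the expansion rests on. In the framework described above, the explicit terms stop at n = 4 and the remainder is handled by a classical polarization model; that truncation is not modeled in this sketch.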


many-body interactions
data-driven models
electronic structure
machine learning

Supplementary materials

Supporting Information
Details on the composition of the n-body permutationally invariant polynomials and training sets, description of the MB-nrg parameters, and additional correlation plots between the RI-MP2/avdz reference n-body energies and corresponding MB-nrg values.

