Theoretical and Computational Chemistry

Data-driven analysis of the number of Lennard-Jones types needed in a force field

Abstract

We optimized force fields with smaller and larger sets of chemically motivated Lennard-Jones (LJ) types against the experimental properties of organic liquids. Surprisingly, exceedingly simple sets of LJ types gave results as good as or better than those from much more complex typing schemes. One such model has only two types of hydrogen and one type each for carbon, nitrogen and oxygen.

The results justify sharply limiting the number of parameters to be optimized in future force field development work, thus reducing the risk of overfitting and the difficulty of reaching a global optimum in the multidimensional parameter space. This in turn increases our chances of arriving at well-optimized force fields that improve predictive accuracy, with applications in biomolecular modeling and computer-aided drug design. The results also demonstrate the feasibility and value of a rigorous, data-driven approach to advancing the science of force field development.

Supplementary material

SI LJ-Types-Paper-2a
SI optimized parameters
SI training test set molecules
SI RangeDataSchauperl-LJ-Paper