Data-driven analysis of the number of Lennard-Jones types needed in a force field

We optimized force fields with smaller and larger sets of chemically motivated Lennard-Jones (LJ) types against the experimental properties of organic liquids. Surprisingly, exceedingly simple sets of LJ types gave results as good as or better than those from much more complex typing schemes; for example, a model with only two types of hydrogen and one type each for carbon, nitrogen, and oxygen.
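To make the size of such a minimal model concrete, the sketch below counts its adjustable LJ parameters and evaluates a standard 12-6 pair energy. It is an illustration under stated assumptions, not the procedure used in this work: the type labels (`H_a`, `H_b`, `C`, `N`, `O`), the numeric values, the units, and the Lorentz-Berthelot combining rules are all placeholders chosen for the example.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class LJType:
    """One Lennard-Jones type with its two adjustable parameters."""
    name: str
    sigma: float    # size parameter (nm assumed; placeholder value)
    epsilon: float  # well depth (kJ/mol assumed; placeholder value)


# Hypothetical minimal typing scheme: two hydrogen types and one type each
# for carbon, nitrogen and oxygen. The numbers are NOT fitted parameters.
MINIMAL_TYPES = [
    LJType("H_a", 0.25, 0.10),
    LJType("H_b", 0.10, 0.05),
    LJType("C", 0.34, 0.40),
    LJType("N", 0.33, 0.70),
    LJType("O", 0.30, 0.80),
]


def n_free_parameters(types):
    """Each LJ type contributes two parameters: sigma and epsilon."""
    return 2 * len(types)


def lj_energy(r, a, b):
    """12-6 LJ energy between types a and b at separation r.

    Cross terms use Lorentz-Berthelot combining rules, which is an
    assumption of this sketch and not taken from the text.
    """
    sigma = 0.5 * (a.sigma + b.sigma)
    epsilon = math.sqrt(a.epsilon * b.epsilon)
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


if __name__ == "__main__":
    print(n_free_parameters(MINIMAL_TYPES))  # 10
    carbon = MINIMAL_TYPES[2]
    print(lj_energy(0.40, carbon, carbon))
```

With five types the optimizer has only ten LJ parameters to search over; every additional type adds two more dimensions to that space.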

The results justify sharply limiting the number of parameters to be optimized in future force field development work, which reduces both the risk of overfitting and the difficulty of reaching a global optimum in the multidimensional parameter space. This in turn increases our chances of arriving at well-optimized force fields with improved predictive accuracy, with applications in biomolecular modeling and computer-aided drug design. The results also demonstrate the feasibility and value of a rigorous, data-driven approach to advancing the science of force field development.