- Simon Boothroyd, University of Colorado Boulder & Memorial Sloan Kettering Cancer Center & Boothroyd Scientific Consulting Ltd.
- Lee-Ping Wang, University of California, Davis
- David Mobley, University of California, Irvine
- John Chodera, Memorial Sloan Kettering Cancer Center
- Michael Shirts, University of Colorado Boulder
Developing accurate classical force field representations of molecules is key to realizing the full potential of molecular simulations, both as a powerful route to gaining fundamental insight into a broad spectrum of chemical and biological phenomena, and for predicting physicochemical and mechanical properties of substances. The Open Force Field Consortium is an industry-funded open science effort to this end, developing open source tools for rapidly generating new, high-quality small molecule force fields. An integral aspect of this is the parameterization and assessment of force fields against high-quality, condensed phase physical property data, curated from open data sources such as the NIST ThermoML Archive, alongside quantum chemical data. The quantity of such experimental data in open data archives alone would require an onerous amount of human and compute resources to curate and estimate manually, especially when estimations must be made for numerous sets of force field parameters. Here we present an entirely automated, highly scalable framework for evaluating physical properties and their gradients with respect to force field parameters. It is written as a modular and extensible Python framework, which employs an intelligent multiscale estimation approach that allows for the automated estimation of properties from simulation and cached simulation data, and provides a pluggable API for the estimation of new properties. In this study we demonstrate the utility of the framework by benchmarking the OpenFF 1.0.0 small molecule force field and the GAFF 1.8 and GAFF 2.1 force fields against a test set of binary density and enthalpy of mixing measurements curated using the framework's utilities.
Further, we demonstrate the framework's utility as part of force field optimization by using it alongside ForceBalance, a framework for systematic force field optimization, to retrain a set of non-bonded van der Waals parameters against a training set of density and enthalpy of vaporization measurements.