Development and Benchmarking of Open Force Field v1.0.0, the Parsley Small Molecule Force Field



We present a methodology for defining and optimizing a general force field for classical molecular simulations, and we describe its use to derive the Open Force Field 1.0.0 small molecule force field, code-named Parsley. Rather than using traditional atom typing, our approach builds on the SMIRKS-native Open Force Field (SMIRNOFF) parameter assignment formalism, which handles increases in the diversity and specificity of the force field definition without needlessly increasing the complexity of the specification. Parameters are optimized with the ForceBalance tool against reference quantum chemical data that include torsion potential energy profiles, optimized gas-phase structures, and vibrational frequencies. These quantum reference data are computed and maintained with QCArchive, an open-source and freely available distributed computing and database software ecosystem. In this initial application of the method, we present essentially a full optimization of all valence parameters and report tests of the resulting force field against compounds and data types outside the training set. These tests show improvements in optimized geometries and conformational energetics and demonstrate that Parsley's accuracy for liquid properties is similar to that of other general force fields. Parsley also affords accuracy similar to that of other general force fields when used to calculate relative binding free energies spanning 199 protein-ligand systems. Additionally, the resulting infrastructure allows us to rapidly optimize an entire new force field with minimal human intervention.
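
To illustrate how direct chemical perception differs from atom typing, the following is a minimal sketch of hierarchical parameter assignment in the spirit of SMIRNOFF, where parameters are listed from general to specific and the last matching entry takes precedence. It is a toy model under stated assumptions, not the Open Force Field toolkit: the `Bond` and `BondRule` types, the `assign` function, and all numerical values are hypothetical placeholders, and the SMIRKS strings serve only as labels because real substructure matching is replaced by simple predicates.

```python
from typing import Callable, List, NamedTuple, Tuple


class Bond(NamedTuple):
    """A bond described only by its two elements and bond order (toy model)."""
    elem_a: str
    elem_b: str
    order: int


class BondRule(NamedTuple):
    """One parameter entry: a SMIRKS-like label plus placeholder values."""
    label: str                       # SMIRKS string, used here only as a label
    matches: Callable[[Bond], bool]  # stand-in for real substructure matching
    k: float                         # placeholder force constant, not a Parsley value
    length: float                    # placeholder equilibrium length, not a Parsley value


def elements(bond: Bond) -> Tuple[str, str]:
    """Return the bond's elements in sorted order so matching is symmetric."""
    first, second = sorted((bond.elem_a, bond.elem_b))
    return (first, second)


# Rules run from most general to most specific.
RULES: List[BondRule] = [
    BondRule("[*:1]~[*:2]", lambda b: True, 500.0, 1.50),
    BondRule("[#6:1]-[#6:2]",
             lambda b: elements(b) == ("C", "C") and b.order == 1, 600.0, 1.52),
    BondRule("[#6:1]=[#6:2]",
             lambda b: elements(b) == ("C", "C") and b.order == 2, 900.0, 1.35),
]


def assign(bond: Bond, rules: List[BondRule]) -> BondRule:
    """Return the last rule in the list that matches the bond."""
    chosen = None
    for rule in rules:
        if rule.matches(bond):
            chosen = rule  # later, more specific matches override earlier ones
    if chosen is None:
        raise ValueError(f"no parameter matches {bond}")
    return chosen


if __name__ == "__main__":
    for bond in (Bond("C", "C", 1), Bond("C", "C", 2), Bond("C", "O", 1)):
        rule = assign(bond, RULES)
        print(bond, "->", rule.label)
```

In this sketch, a new chemistry can be covered by appending one more specific rule, while the generic first entry remains a fallback for anything that is not matched more specifically. No new atom types need to be defined and propagated through every parameter table.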

Version notes

Minor edits, mainly involving more detailed explanations of methodological choices.


Supplementary material

Supporting Information for Parsley Force Field Preprint
This document provides key additional details relating to the Parsley force field, including information on datasets and tools used in training and testing the force field as well as details on how to access these datasets and reproduce the calculations done in training and testing. Much of this information is provided in software/scripts available on GitHub and datasets available in QCArchive and elsewhere, as detailed herein.