Current state of open source force fields in protein-ligand binding affinity predictions

29 August 2023, Version 1
This content is a preprint and has not undergone peer review at the time of posting.


In drug discovery, in-silico prediction of binding affinity is one of the major means of prioritizing compounds for synthesis. Alchemical relative binding free energy (RBFE) calculations based on molecular dynamics (MD) simulations are now a popular approach for accurate affinity ranking of compounds. MD simulations rely on empirical force field parameters, which strongly influence the accuracy of the predicted affinities. Here, we evaluate the ability of six different small-molecule force fields to predict experimental protein-ligand binding affinities in RBFE calculations on a set of 598 ligands and 22 protein targets. The public force fields OpenFF Parsley and Sage, GAFF and CGenFF show comparable accuracy, while OPLS3e is significantly more accurate. However, a Consensus approach using Sage, GAFF and CGenFF leads to accuracies comparable to OPLS3e. While Parsley and Sage perform comparably based on aggregated statistics across the whole dataset, they differ in terms of outliers. Analysis of the force fields reveals that improved parameters lead to significant improvement in the accuracy of affinity predictions on subsets of the dataset involving those parameters. Lower accuracy cannot be attributed to the force field parameters alone, but also depends on input preparation and sampling convergence of the calculations. Large perturbations and non-converged simulations in particular lead to less accurate predictions. The input structures, GROMACS force field files and the Python analysis notebooks are available on GitHub.
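To make the idea of a Consensus prediction concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that a consensus value is the arithmetic mean of the relative binding free energies predicted by the individual force fields for the same ligand pair, and that accuracy is summarized as mean unsigned error (MUE) and root-mean-square error (RMSE) against experiment. The abstract does not state how the consensus is formed or which statistics are used, so both are assumptions. The force field labels, ligand names and numbers are hypothetical toy values, not results from the study.

```python
from math import sqrt

def consensus_ddg(predictions):
    """Average ddG (kcal/mol) per ligand pair over all force fields.

    predictions: dict mapping force field label -> {ligand pair: ddG}.
    Only pairs predicted by every force field are kept.
    ASSUMPTION: consensus = arithmetic mean (not confirmed by the source).
    """
    shared = set.intersection(*(set(p) for p in predictions.values()))
    n = len(predictions)
    return {pair: sum(p[pair] for p in predictions.values()) / n
            for pair in sorted(shared)}

def mue(pred, exp):
    """Mean unsigned error over ligand pairs present in both dicts."""
    pairs = [k for k in pred if k in exp]
    return sum(abs(pred[k] - exp[k]) for k in pairs) / len(pairs)

def rmse(pred, exp):
    """Root-mean-square error over ligand pairs present in both dicts."""
    pairs = [k for k in pred if k in exp]
    return sqrt(sum((pred[k] - exp[k]) ** 2 for k in pairs) / len(pairs))

# Hypothetical toy data: three force fields, two perturbations.
predictions = {
    "ff_a": {("lig1", "lig2"): -1.0, ("lig2", "lig3"): 0.5},
    "ff_b": {("lig1", "lig2"): -0.4, ("lig2", "lig3"): 1.1},
    "ff_c": {("lig1", "lig2"): -0.7, ("lig2", "lig3"): 0.2},
}
experiment = {("lig1", "lig2"): -0.8, ("lig2", "lig3"): 0.6}

consensus = consensus_ddg(predictions)
for label, pred in {**predictions, "consensus": consensus}.items():
    print(f"{label}: MUE={mue(pred, experiment):.2f} "
          f"RMSE={rmse(pred, experiment):.2f}")
```

Averaging can cancel errors that differ in sign between force fields, which is one plausible reason a consensus might approach the accuracy of a better single force field. The toy numbers above only illustrate the mechanics and carry no evidential weight.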


Keywords

Open Force Field
Force Field
Free Energy Calculation
Molecular Dynamics
Binding Affinity
Small Molecule Force Field
Drug Discovery

Supplementary materials

Supplementary Information: Current state of open source force fields in protein-ligand binding affinity predictions
The Supplementary Information lists details about the employed target set, shows additional graphs and tables containing aggregated statistics and correlations with experiment in greater detail, and shows various properties of the simulated perturbations.
