The maximal and current accuracy of rigorous protein-ligand binding free energy calculations

13 October 2023, Version 2
This content is a preprint and has not undergone peer review at the time of posting.


It is now well recognized that computational techniques can greatly speed up the identification of hits and accelerate the optimization of such hits to lead series and development candidate molecules. In particular, a class of rigorous physics-based methods known as free energy perturbation (FEP) has emerged as the most consistently accurate relative affinity prediction tool available to support such efforts. Yet, there still remains uncertainty about how accurate these techniques are, and indeed, how accurate they can ever be. In this study, we assemble what we believe to be the largest publicly available data set of proteins and congeneric series of small molecules to date and assess the accuracy of the leading FEP workflow in predicting relative binding affinities. To ascertain the limit of achievable accuracy, we survey the reproducibility of experimental relative affinity measurements by comparing chemical series that have been assayed by two or more different techniques. We find a wide variability in experimental accuracy and a general correspondence between binding and functional assays. When the protein and ligand structures from our data set are prepared with care, we find that FEP can achieve a level of accuracy close to what we find in our experimental survey. Throughout, we highlight reliable protocols that can help maximize the accuracy of FEP in prospective studies.


free energy perturbation
experimental accuracy
binding free energy
molecular dynamics
computer aided drug design

Supplementary materials

Supporting information for the maximal and current accuracy of rigorous protein-ligand binding free energy calculations
The supporting information contains details on the sources of data used in the experimental reproducibility survey, as well as extensive details of the structural modifications and additional calculations that were applied to each protein and congeneric series in the free energy perturbation study.

