Enhancing hit discovery in virtual screening through accurate calculation of absolute protein-ligand binding free energies


In the hit identification stage of drug discovery, a diverse chemical space needs to be explored to identify initial hits. In contrast to empirical scoring functions, absolute protein-ligand binding free energy perturbation (ABFEP) provides a theoretically more rigorous and accurate description of protein-ligand binding thermodynamics and could in principle greatly improve hit rates in virtual screening. In this work, we describe an implementation of an accurate and reliable ABFEP method in FEP+. We validated the ABFEP method on eight congeneric compound series binding to eight protein receptors, including both neutral and charged ligands. For ligands with net charges, the alchemical ion approach is adopted to avoid artifacts in electrostatic potential energy calculations. The calculated binding free energies are highly correlated with experimental results, with a weighted average R² of 0.55 for the entire dataset and an overall RMSE of 1.1 kcal/mol when the protein reorganization effect upon ligand binding is accounted for. Through ABFEP calculations using apo versus holo protein structures, we demonstrated that the protein conformational and protonation state changes between the apo and holo proteins are the main physical factors contributing to the protein reorganization free energy, which manifests as an overestimation of the raw ABFEP binding free energies calculated using the holo structures of the proteins. Furthermore, we performed ABFEP calculations in three virtual screening applications for hit enrichment. ABFEP greatly improves the hit rates compared with docking scores or other methods such as metadynamics. The highly accurate ABFEP results demonstrated in this work position it as a useful tool to improve hit rates in virtual screening, thus facilitating hit discovery.
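
The two accuracy metrics quoted above, a weighted average R² across the compound series and an overall RMSE, can be illustrated with a minimal Python sketch. This is not the authors' analysis code. The function names are hypothetical, and the choice to weight each series by its number of ligands is an assumption, since the abstract does not state the weighting scheme.

```python
import math

def rmse(calc, expt):
    """Root-mean-square error between calculated and experimental values (kcal/mol)."""
    return math.sqrt(sum((c - e) ** 2 for c, e in zip(calc, expt)) / len(calc))

def r_squared(calc, expt):
    """Squared Pearson correlation coefficient between two equal-length lists."""
    n = len(calc)
    mean_c = sum(calc) / n
    mean_e = sum(expt) / n
    cov = sum((c - mean_c) * (e - mean_e) for c, e in zip(calc, expt))
    var_c = sum((c - mean_c) ** 2 for c in calc)
    var_e = sum((e - mean_e) ** 2 for e in expt)
    return cov * cov / (var_c * var_e)

def weighted_average_r2(series):
    """Average the per-series R^2 values.

    ASSUMPTION: each series is weighted by its ligand count; the source
    does not specify how the weights were chosen.
    """
    total = sum(len(calc) for calc, _ in series)
    return sum(len(calc) * r_squared(calc, expt) for calc, expt in series) / total

def overall_rmse(series):
    """RMSE pooled over every ligand in every series."""
    calc = [c for cs, _ in series for c in cs]
    expt = [e for _, es in series for e in es]
    return rmse(calc, expt)
```

Here `series` is a list of `(calculated, experimental)` pairs of binding free energies, one pair of lists per congeneric series. Computing R² within each series before averaging keeps differences in the mean binding free energy between targets from inflating the correlation.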

Version notes

Results of ABFEP calculations on three virtual screening applications were added. The title, author list, abstract, main text, figures, and tables were modified.


Supplementary material

Supporting Information
The Supporting Information includes supplementary method descriptions, figures, tables, and the web link for input structures.
ABFEP data on congeneric compound series
The spreadsheet contains the raw data for ABFEP on congeneric compound series.
ABFEP data on JAK2 virtual screening
The spreadsheet contains the raw data for ABFEP on JAK2 virtual screening.