Abstract
The structure-based technologies most widely used to rank the affinities of candidate small-molecule drugs for proteins range from faster but less reliable docking methods to slower but more accurate explicit-solvent free energy methods. In recent years, we have advanced another technology, called mining minima because it “mines” out the main contributions to the chemical potentials of the free and bound molecular species by identifying and characterizing their main local energy minima. The present study provides systematic benchmarks of the accuracy and computational speed of mining minima, as implemented in the VeraChem Mining Minima Generation 2 (VM2) code, across two well-regarded protein-ligand benchmark datasets for which benchmark data already exist for docking, free energy, and other computational methods. A core result is that VM2’s accuracy approaches that of explicit-solvent free energy methods at far lower computational cost. In finer-grained analyses, we also examine how various run settings, such as the treatment of crystallographic water molecules, influence accuracy, and we quantify the costs in time and dollars of representative runs on Amazon Web Services (AWS) compute instances with various CPU and GPU combinations. We also use the benchmark data to assess the importance of VM2’s correction from generalized Born to finite-difference Poisson-Boltzmann results for each energy well and find that this correction affords a remarkably consistent improvement in accuracy at modest computational cost. The present results establish VM2 as a distinctive technology for early-stage drug discovery, one that provides a strong combination of efficiency and predictivity.
Supplementary materials
Title
Overview of materials in the Zenodo SI archive
Description
Describes and summarizes the input and output files provided on Zenodo to accompany this preprint.
Supplementary weblinks
Title
Supporting Information
Description
Detailed VM2 input and output files, including timing data, with corresponding experimental data, for all replicates of all systems.