Identifying hit compounds is a principal step in early-stage drug discovery. While many machine learning (ML) approaches have been proposed, in the absence of binding data, molecular docking remains the most widely used option for predicting binding modes and scoring hundreds of thousands of compounds for binding affinity to the target protein. Docking's effectiveness depends critically on the protein-ligand (P-L) scoring function (SF); thus, re-scoring with more rigorous SFs is common practice. In this pilot study, we assess the PM6-D3H4X/COSMO semi-empirical quantum mechanical (SQM) method as a docking pose re-scoring tool on 17 diverse receptors and ligand decoy sets, totaling 1.5 million P-L complexes. We investigate the effect of explicitly computed ligand conformational entropy and ligand deformation energy on SQM P-L scoring in a virtual screening (VS) setting, and we compare molecular mechanics (MM) with hybrid SQM/MM structure optimization prior to re-scoring. Our results indicate that there is no obvious benefit from computing ligand conformational entropies or deformation energies, and that optimizing only the ligand's geometry at the SQM level is sufficient to achieve the best possible scores. Instead, we leverage ML to implicitly incorporate the missing entropy terms into the SQM score using ligand topology, physicochemical, and P-L interaction descriptors. Our new hybrid scoring function, named SQM-ML, is transparent and explainable. It achieves on average 9% higher AUC-ROC than PM6-D3H4X/COSMO and 3% higher than Glide SP, and its performance is consistent and predictable across all test sets, unlike that of the other two SFs, whose performance is considerably target-dependent and sometimes resembles that of a random classifier. The code to prepare and train SQM-ML models is available at https://github.com/tevang/sqm-ml.git, and we believe it will pave the way for a new generation of hybrid SQM/ML protein-ligand scoring functions.
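For readers unfamiliar with the AUC-ROC metric used above to compare scoring functions, the following is a minimal sketch of how it could be computed for one target from a list of scores and active/decoy labels. The function name `auc_roc`, its parameters, and the example numbers are hypothetical illustrations and are not taken from the sqm-ml repository. The sketch assumes that a lower (more negative) score means a better predicted binder, as is typical for docking and SQM energies.

```python
from typing import Sequence


def auc_roc(
    scores: Sequence[float],
    is_active: Sequence[bool],
    lower_is_better: bool = True,
) -> float:
    """Probability that a random active is ranked above a random decoy.

    Pairwise (Mann-Whitney) formulation of AUC-ROC; ties count as 0.5.
    0.5 corresponds to a random classifier, 1.0 to a perfect ranking.
    """
    if len(scores) != len(is_active):
        raise ValueError("scores and is_active must have the same length")

    # Flip the sign so that "larger is better" holds in the comparison below.
    sign = -1.0 if lower_is_better else 1.0
    actives = [sign * s for s, a in zip(scores, is_active) if a]
    decoys = [sign * s for s, a in zip(scores, is_active) if not a]
    if not actives or not decoys:
        raise ValueError("need at least one active and one decoy")

    wins = 0.0
    for a in actives:
        for d in decoys:
            if a > d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(actives) * len(decoys))


# Hypothetical scores for five compounds of a single target (made-up numbers).
example_scores = [-9.1, -7.4, -8.0, -5.2, -6.3]
example_labels = [True, False, True, False, False]
example_auc = auc_roc(example_scores, example_labels)
```

The pairwise loop is quadratic in the number of compounds and is meant only to make the definition explicit; for screening sets of the size discussed here, a rank-based implementation such as the one in scikit-learn would be the practical choice.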