Evaluation of Binding Site Comparison Algorithms and Proteometric Machine Learning Models in the Detection of Protein Pockets Capable of Binding the Same Ligand
Nonlinearities in biological networks present ample opportunity for synergistic combinations of protein targets. Yet, to date, our ability to design multi-target inhibitors and to predict polypharmacological binding profiles remains limited. Herein, we present a systematic benchmark of protein pocket comparison algorithms from the literature, as well as novel machine learning models developed to predict whether two proteins will bind the same ligand. The results demonstrate that previously reported performance metrics could be inflated by a bias towards proteins of similar fold when identifying proteins capable of binding the same ligand. This observation motivated a more in-depth evaluation of the methods on two subsets of protein pairs: same-fold comparisons and cross-fold comparisons. In a head-to-head comparison on the cross-fold subset, we found that the proteometric machine learning models were the best-performing models overall.
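The effect of fold bias on a pooled metric can be illustrated with a minimal sketch. It assumes each protein pair carries a similarity score, a label for whether the two pockets bind the same ligand, and a flag for whether the two proteins share a fold. The choice of ROC AUC, the function names, and the toy values are all assumptions made for illustration; they are not the metrics, methods, or data of this study.

```python
from typing import Dict, List, Tuple

# Hypothetical record layout: (similarity_score, binds_same_ligand, same_fold).
Pair = Tuple[float, bool, bool]


def roc_auc(scores: List[float], labels: List[bool]) -> float:
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative pair")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def auc_by_fold_subset(pairs: List[Pair]) -> Dict[str, float]:
    """Compute ROC AUC on all pairs, then on same-fold and cross-fold pairs."""
    result = {"all": roc_auc([p[0] for p in pairs], [p[1] for p in pairs])}
    for name, flag in (("same_fold", True), ("cross_fold", False)):
        subset = [p for p in pairs if p[2] == flag]
        result[name] = roc_auc([p[0] for p in subset], [p[1] for p in subset])
    return result


# Toy data (made up): same-fold pairs score high and mostly share a ligand,
# while cross-fold pairs score low regardless of their label.
toy_pairs: List[Pair] = [
    (0.90, True, True),
    (0.80, True, True),
    (0.70, True, True),
    (0.60, False, True),
    (0.30, True, False),
    (0.10, True, False),
    (0.40, False, False),
    (0.20, False, False),
    (0.05, False, False),
]

results = auc_by_fold_subset(toy_pairs)
for subset_name, auc in results.items():
    print(f"{subset_name}: {auc:.2f}")
```

In this toy set the pooled score looks considerably better than the cross-fold score, because the easy same-fold positives dominate the ranking. Stratifying by fold is one simple way to expose that kind of inflation.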