Abstract
Nonlinearities in biological networks present ample opportunity for synergistic protein-targeting combinations. Yet, to date, our ability to design multi-target inhibitors and predict polypharmacology binding profiles remains limited. Herein, we present a systematic benchmark of protein pocket comparison algorithms from the literature, alongside novel machine learning models developed to predict whether two proteins will bind the same ligand. The results indicate that previously reported performance metrics may be inflated by a bias towards proteins of similar fold when identifying proteins capable of binding the same ligand. This observation motivated a more in-depth evaluation of the methods on two subsets: same-fold and cross-fold protein comparisons. In a head-to-head comparison on the cross-fold subset, the proteometric machine learning models performed best overall.