A Comparison of Scaling Methods to Obtain Calibrated Probabilities of Activity for Ligand-Target Predictions
Lewis Mervin
Avid M. Afzal
Ola Engkvist
Andreas Bender
10.26434/chemrxiv.11526132.v1
https://chemrxiv.org/articles/A_Comparison_of_Scaling_Methods_to_Obtain_Calibrated_Probabilities_of_Activity_for_Ligand-Target_Predictions/11526132
In the context of bioactivity prediction, the question of how to calibrate a score produced by a machine learning method into a reliable probability of binding to a protein target has not yet been satisfactorily addressed. In this study, we compared the performance of three such calibration methods, namely Platt Scaling, Isotonic Regression and Venn-ABERS, in calibrating prediction scores from ligand-target prediction models built with the Naïve Bayes, Support Vector Machine and Random Forest algorithms, using bioactivity data available at AstraZeneca (40 million compound-target pairs across 2112 targets). Performance was assessed using Stratified Shuffle Split (SSS) and Leave 20% of Scaffolds Out (L20SO) validation.
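To make two of the calibration methods named above concrete, the following is a minimal sketch and not the authors' implementation. It assumes a simplified Platt Scaling (a logistic curve fitted by plain gradient descent, without Platt's target smoothing) and Isotonic Regression via the pool-adjacent-violators algorithm. The function names `platt_fit` and `isotonic_fit` and the toy scores and labels are hypothetical and chosen only for illustration. Venn-ABERS is omitted; it builds on isotonic regression and returns a pair of probabilities rather than a single value.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def platt_fit(scores, labels, lr=0.1, n_iter=5000):
    """Fit p = sigmoid(a * score + b) by gradient descent on the log loss.

    Simplified assumption: no label smoothing, fixed learning rate.
    """
    a, b = 0.0, 0.0
    n = len(scores)
    for _ in range(n_iter):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            err = sigmoid(a * s + b) - y
            grad_a += err * s
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b

def isotonic_fit(scores, labels):
    """Pool-adjacent-violators fit of labels against scores.

    Returns the sorted scores and a non-decreasing list of fitted
    probabilities. Tied scores are not pooled specially in this sketch.
    """
    pairs = sorted(zip(scores, labels))
    blocks = []  # each block holds [sum of labels, number of points]
    for _, y in pairs:
        blocks.append([float(y), 1])
        # Merge backwards while the previous block mean exceeds the current one.
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return [s for s, _ in pairs], fitted

# Hypothetical uncalibrated classifier scores and activity labels (1 = active).
scores = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
labels = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1]

a, b = platt_fit(scores, labels)
platt_probs = [sigmoid(a * s + b) for s in scores]

sorted_scores, iso_probs = isotonic_fit(scores, labels)

for s, p_platt, p_iso in zip(sorted_scores, platt_probs, iso_probs):
    print(f"score={s:+.1f}  platt={p_platt:.3f}  isotonic={p_iso:.3f}")
```

In this sketch the Platt fit yields a smooth sigmoid over the scores, while the isotonic fit yields a step function that only constrains the calibrated probabilities to be non-decreasing in the score.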
2020-01-08 08:48:30
in silico target prediction
chemoinformatics
cheminformatics
QSAR Modeling
probability threshold
probability
probability calibration
probability scaling
venn abers
venn predictors
isotonic regression
platt scaling