These are preliminary reports that have not been peer-reviewed. They should not be regarded as conclusive, guide clinical practice/health-related behavior, or be reported in news media as established information. For more information, please see our FAQs.

A Comparison of Scaling Methods to Obtain Calibrated Probabilities of Activity for Ligand-Target Predictions

revised on 06.05.2020, 16:13 and posted on 07.05.2020, 09:55 by Lewis Mervin, Avid M. Afzal, Ola Engkvist, Andreas Bender
In the context of bioactivity prediction, the question of how to calibrate a score produced by a machine learning method into a reliable probability of binding to a protein target has not yet been satisfactorily addressed. In this study, we compared the performance of three such calibration methods, namely Platt Scaling, Isotonic Regression and Venn-ABERS, in calibrating prediction scores for ligand-target prediction from the Naïve Bayes, Support Vector Machine and Random Forest algorithms, using bioactivity data available at AstraZeneca (40 million compound-target data points across 2112 targets). Performance was assessed using Stratified Shuffle Split (SSS) and Leave 20% of Scaffolds Out (L20SO) validation.
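To illustrate what calibrating a raw score into a probability means, the following minimal sketch fits two of the three methods named above on a made-up toy set of scores and activity labels. It is not the authors' implementation: the function names, the toy data, and the use of plain gradient descent for Platt scaling are all assumptions made for illustration, the isotonic fit is a simple step-function version of the pool-adjacent-violators algorithm, and Venn-ABERS and the inductive (cross-validated) variants are not shown.

```python
import math

def fit_platt(scores, labels, lr=0.1, n_iter=5000):
    """Fit p = 1 / (1 + exp(-(a*s + b))) by gradient descent on the log loss.
    Simplification: no target smoothing and no Newton solver."""
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(n_iter):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            p = 1.0 / (1.0 + math.exp(-(a * s + b)))
            grad_a += (p - y) * s
            grad_b += (p - y)
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b

def platt_predict(a, b, s):
    """Map a raw score to a calibrated probability with the fitted sigmoid."""
    return 1.0 / (1.0 + math.exp(-(a * s + b)))

def fit_isotonic(scores, labels):
    """Pool-adjacent-violators: merge neighbouring blocks until the block
    means are non-decreasing. Returns a list of (max_score, mean_label)."""
    blocks = []  # each block is [sum_of_labels, count, max_score]
    for s, y in sorted(zip(scores, labels)):
        blocks.append([float(y), 1, s])
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            top = blocks.pop()
            blocks[-1][0] += top[0]
            blocks[-1][1] += top[1]
            blocks[-1][2] = top[2]
    return [(blk[2], blk[0] / blk[1]) for blk in blocks]

def isotonic_predict(model, s):
    """Step-function lookup: value of the first block whose upper bound covers s."""
    for upper, value in model:
        if s <= upper:
            return value
    return model[-1][1]

# Hypothetical toy data: raw classifier scores and binary activity labels.
scores = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
labels = [0, 0, 1, 0, 0, 1, 0, 1, 1]

a, b = fit_platt(scores, labels)
iso_model = fit_isotonic(scores, labels)

for s in (-2.0, 0.75, 2.0):
    print(s, round(platt_predict(a, b, s), 3), isotonic_predict(iso_model, s))
```

Platt scaling assumes a sigmoid relationship between score and probability, whereas isotonic regression only assumes that the probability does not decrease as the score increases, which is why the two can disagree on the same data.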



United Kingdom

Declaration of Conflict of Interest

None declared

Version Notes

Fixed an error in Figure 1 and inaccuracies in the description of the inductive (cross-validated) Platt scaling and Isotonic Regression scaling methods. General improvements to the flow and main body of the text.