Abstract
Relative binding affinity prediction is a critical component of computer-aided drug design. A significant amount of effort has been dedicated to developing rapid and reliable in silico methods. However, robust assessment of their performance remains complicated, as it requires a performance measure applicable in the prospective setting and, more importantly, a true null model that objectively defines the performance expected from random prediction. Although many performance metrics, such as the correlation coefficient (r2), mean unsigned error (MUE), and root mean square error (RMSE), are frequently used in the literature, a true and non-trivial null model has yet to be identified. To address this problem, we introduce an interval estimate as an additional measure, namely the prediction interval (PI), which can be estimated from the error distribution of the predictions. The interval estimate has two benefits: 1) it provides the uncertainty range of the predicted activities, which is important in prospective applications; and 2) it allows a true null model with a well-defined PI to be established. We provide one such example, termed the Gaussian Random Affinity Model (GRAM), which is based on the empirical observation that the affinity changes in a typical lead optimization effort tend to be normally distributed, N(0, s). Because its PI is analytically defined and depends only on the variation in the activities, GRAM should in principle allow the performance of relative binding affinity prediction methods to be compared in a standard way, which is ultimately critical to measuring progress in algorithm development.
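As an illustration of the idea, the following minimal Python sketch estimates a PI half-width from a set of prediction errors and compares it with the interval implied by a Gaussian null. It is not the formulation used in this work: the function names, the 95% coverage level, the simulated data, and the choice of a null predictor that always predicts zero affinity change (so that its errors follow N(0, s)) are all assumptions made for the example.

```python
import math
import random
from statistics import NormalDist


def empirical_pi_half_width(errors, coverage=0.95):
    """Half-width w such that a fraction `coverage` of |errors| is <= w.

    Hypothetical helper: a percentile-based estimate taken directly from
    the observed error distribution (predicted minus experimental).
    """
    ordered = sorted(abs(e) for e in errors)
    n = len(ordered)
    # Index of the smallest value that covers at least `coverage` of the data.
    index = min(n - 1, max(0, math.ceil(coverage * n) - 1))
    return ordered[index]


def gaussian_null_pi_half_width(s, coverage=0.95):
    """Analytic half-width for errors distributed as N(0, s).

    Assumption for this sketch: the null predictor always predicts zero
    change, so its error has the same spread s as the activities.
    """
    z = NormalDist().inv_cdf(0.5 + coverage / 2.0)
    return z * s


# Simulated example (not real data): affinity changes drawn from N(0, s).
random.seed(0)
s = 1.0  # assumed standard deviation of the affinity changes
true_changes = [random.gauss(0.0, s) for _ in range(20000)]

# Errors of the assumed null predictor, which predicts 0 for every pair.
null_errors = [0.0 - t for t in true_changes]

empirical = empirical_pi_half_width(null_errors)
analytic = gaussian_null_pi_half_width(s)
print(f"empirical half-width: {empirical:.3f}")
print(f"analytic half-width:  {analytic:.3f}")
```

In this sketch, a method whose empirical half-width is not clearly narrower than the analytic null value would be offering little beyond the spread of the activities themselves.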