ChemRxiv
These are preliminary reports that have not been peer-reviewed. They should not be regarded as conclusive, guide clinical practice/health-related behavior, or be reported in news media as established information.

An Analysis of Proteochemometric and Conformal Prediction Machine Learning Protein-Ligand Binding Affinity Models

preprint
submitted on 28.01.2020 and posted on 29.01.2020 by Conor Parks, Zied Gaieb, Rommie Amaro

Protein-ligand binding affinity is a key pharmacodynamic endpoint in drug discovery. Sole reliance on experimental design-make-test cycles is costly and time-consuming, providing an opportunity for computational methods to assist. Herein, we present results comparing random forest and feed-forward neural network proteochemometric models for their ability to predict pIC50 measurements for held-out generic Bemis-Murcko scaffolds. In addition, we assess the ability of conformal prediction to provide calibrated prediction intervals in both a retrospective and a semi-prospective test, using the recently released Grand Challenge 4 data set as an external test set. Overall, the random forest and deep neural network proteochemometric models perform well retrospectively but degrade in the semi-prospective setting. However, the conformal prediction intervals prove to be well calibrated both retrospectively and semi-prospectively, showing that they can be used to guide hit discovery and lead optimization campaigns.
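To make the notion of a calibrated prediction interval concrete, the following is a minimal sketch of split (inductive) conformal regression. It is an illustration under stated assumptions, not the authors' implementation: the nonconformity score is taken to be the plain absolute residual, the function names `conformal_half_width` and `empirical_coverage` are hypothetical, and the "pIC50" values are synthetic numbers generated for the example rather than data from the paper.

```python
import math
import random


def conformal_half_width(calib_true, calib_pred, alpha=0.2):
    """Half-width of a split-conformal interval at miscoverage level alpha."""
    # Nonconformity score (assumed here): absolute residual on the calibration set.
    scores = sorted(abs(y - p) for y, p in zip(calib_true, calib_pred))
    n = len(scores)
    # Finite-sample-corrected rank: ceil((n + 1) * (1 - alpha)).
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf  # too few calibration points for this alpha
    return scores[rank - 1]


def empirical_coverage(y_true, y_pred, half_width):
    """Fraction of true values that fall inside [pred - w, pred + w]."""
    hits = sum(abs(y - p) <= half_width for y, p in zip(y_true, y_pred))
    return hits / len(y_true)


# Synthetic stand-in for model predictions and measured pIC50 values.
# These numbers are hypothetical and are not taken from the paper.
rng = random.Random(0)


def simulate(n):
    y_pred = [rng.uniform(4.0, 9.0) for _ in range(n)]  # model output
    y_true = [p + rng.gauss(0.0, 0.7) for p in y_pred]  # "measured" value
    return y_true, y_pred


calib_true, calib_pred = simulate(500)   # held-out calibration set
test_true, test_pred = simulate(2000)    # separate test set

half_width = conformal_half_width(calib_true, calib_pred, alpha=0.2)
coverage = empirical_coverage(test_true, test_pred, half_width)
```

With `alpha=0.2` the intervals target 80% coverage, so `coverage` on the test set should land close to 0.8 when calibration and test data come from the same distribution. The semi-prospective question raised in the abstract is whether that still holds when the test set (there, Grand Challenge 4) differs from the training data.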

Email Address of Submitting Author

coparks2012@gmail.com

Institution

University of California San Diego

Country

USA

ORCID For Submitting Author

0000-0001-8158-5116

Declaration of Conflict of Interest

REA has equity interest in and is a co-founder and scientific advisor of Actavalon, Inc.
