Abstract
We present a novel computational approach for predicting human pharmacokinetics (PK) that addresses the challenges of early-stage drug design. Our study introduces and describes a large-scale dataset of 11 clinical PK endpoints, encompassing over 2700 unique chemical structures, which we use to train machine learning models. To that end, multiple advanced training strategies are compared, including the integration of in vitro data and a novel self-supervised pre-training task. In addition to the predictions, our final model provides meaningful epistemic uncertainties for every data point. This allows us to identify regions of exceptional predictive performance, with an Absolute Average Fold Error (AAFE/GMFE) of less than 2.5 across multiple endpoints. These advancements represent a significant step towards actionable PK predictions, which can be used early in the drug design process to expedite development and reduce reliance on nonclinical studies.
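For readers unfamiliar with the fold-error metric quoted above, the following is a minimal sketch that assumes the commonly used definition, AAFE = 10^(mean |log10(predicted / observed)|). The function name `aafe` and the example values are illustrative assumptions and are not taken from the study.

```python
import math


def aafe(predicted, observed):
    """Absolute average fold error, assuming the common definition
    10 ** mean(|log10(predicted / observed)|).

    Both inputs must be equal-length sequences of strictly positive values.
    """
    if len(predicted) != len(observed) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(p <= 0 or o <= 0 for p, o in zip(predicted, observed)):
        raise ValueError("all values must be strictly positive")

    # Absolute log10 fold error per compound; over- and under-prediction
    # by the same factor contribute equally.
    log_errors = [abs(math.log10(p / o)) for p, o in zip(predicted, observed)]

    # Back-transform the mean log error to a fold-error scale.
    return 10 ** (sum(log_errors) / len(log_errors))


# Illustrative values only: one 2-fold over-prediction and one
# 2-fold under-prediction.
example_score = aafe([2.0, 0.5], [1.0, 1.0])
```

Under this definition a value of 1 corresponds to perfect agreement, and a value below 2.5 means that predictions deviate from observations by less than a factor of 2.5 on (geometric) average.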
Supplementary materials
Title
Supporting Tables & Figures
Description
PDF file with additional descriptors of the human PK data set per source, tables containing all discussed metrics from the cross-validation experiments, additional descriptors and metrics for the test set predictions, and an overview of the GPS architecture and hyperparameters.