ChemProp multi-task models for predicting ADME properties in the Polaris challenge

10 June 2025, Version 1
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Accurate prediction of ADME (Absorption, Distribution, Metabolism, and Excretion) properties is a key challenge in drug discovery. For the Polaris Antiviral ADME Prediction Challenge, we developed and benchmarked multi-task directed message passing neural network (D-MPNN) models using ChemProp, trained exclusively on a curated collection of public datasets comprising over 55 tasks. We demonstrate that high-quality data curation, coupled with multi-task learning, yields strong predictive performance on the ADME endpoints in this challenge. Our final models incorporated both experimental assay data and calculated properties. Using only public data, our approach achieved second place among 39 participants, surpassed only by a model that used proprietary data. These results highlight the value of combining well-curated data with multi-task ChemProp models for robust ADME prediction, and they set a benchmark for future community-driven efforts. To support continued progress, we publicly release the dataset and data processing scripts used in this work.
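
When many public datasets are merged into one multi-task table, most molecules have measurements for only a few of the tasks. A common way to train on such sparse labels is to compute the loss over observed entries only. The abstract does not specify the loss used in this work, so the following is a generic minimal sketch under that assumption, not the authors' implementation; the function name `masked_multitask_mse` and the toy arrays are hypothetical.

```python
import numpy as np


def masked_multitask_mse(preds, targets):
    """Mean squared error over observed entries only.

    preds, targets: arrays of shape (n_molecules, n_tasks).
    A NaN in `targets` marks a task that was not measured for that molecule
    (an assumed convention for this illustration).
    """
    preds = np.asarray(preds, dtype=float)
    targets = np.asarray(targets, dtype=float)
    mask = ~np.isnan(targets)  # True where a label exists
    if not mask.any():
        raise ValueError("no observed targets")
    # Replace missing labels with 0 before subtracting so NaNs never propagate,
    # then zero out the corresponding residuals.
    safe_targets = np.where(mask, targets, 0.0)
    residuals = np.where(mask, preds - safe_targets, 0.0)
    return float((residuals ** 2).sum() / mask.sum())


# Toy data: 2 molecules, 2 tasks, each molecule measured on one task only.
preds = np.array([[1.0, 2.0], [3.0, 4.0]])
targets = np.array([[1.5, np.nan], [np.nan, 3.0]])
loss = masked_multitask_mse(preds, targets)
```

Averaging over observed entries keeps sparsely measured tasks from being diluted by missing values, although in practice per-task weighting or scaling may also be needed.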

Keywords

D-MPNN
multi-task machine learning
ADME prediction
