Abstract
Accurate prediction of ADME (Absorption, Distribution, Metabolism, and Excretion) properties is a key challenge in drug discovery. For the Polaris Antiviral ADME Prediction Challenge, we developed and benchmarked multi-task directed message passing neural network (D-MPNN) models using ChemProp, trained exclusively on a curated collection of public datasets comprising over 55 tasks. We show that high-quality data curation, coupled with multi-task learning, is effective for predicting the ADME endpoints in this challenge. Our final models incorporated both experimental assay data and calculated properties. Using only public data, our approach placed second among 39 participants, surpassed only by a model that used proprietary data. These results highlight the power of combining well-curated data with multi-task ChemProp models for robust ADME prediction and set a benchmark for future community-driven efforts. To support continued progress, we publicly release the dataset and data processing scripts used in this work.