Abstract
Quantifying uncertainty in machine learning is especially important in emerging research areas where high-quality data are scarce. In this work, we develop an explainable uncertainty quantification method for deep learning-based molecular property prediction. The method captures aleatoric and epistemic uncertainties separately and attributes them to the atoms present in a molecule. This atom-based uncertainty method provides an extra layer of explainability to the estimated uncertainties: one can analyze individual atomic uncertainty values to diagnose which chemical component introduces uncertainty into the prediction. Our experiments demonstrate that atomic uncertainties can detect unseen chemical structures and identify chemical species whose data are likely affected by significant noise. Furthermore, we propose a post-hoc calibration method that refines the uncertainty quantified by ensemble models, yielding better confidence interval estimates. This work improves uncertainty calibration and provides a framework for assessing whether, and why, a prediction should be considered unreliable.
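The separation of aleatoric and epistemic uncertainty mentioned above is commonly obtained from an ensemble of mean-variance networks. The sketch below illustrates that standard ensemble decomposition only; it is not the paper's specific atom-based implementation, and the function name `decompose_uncertainty` and the numerical values are hypothetical.

```python
# Minimal sketch (assumed, not the paper's exact method) of the standard
# ensemble decomposition of predictive uncertainty, where each ensemble
# member predicts both a mean and an (aleatoric) variance for a molecule.
import numpy as np

def decompose_uncertainty(means: np.ndarray, variances: np.ndarray):
    """means, variances: per-model predictions, each of shape (n_models,).

    Returns (aleatoric, epistemic) variance estimates:
      aleatoric = average of the per-model predicted variances,
      epistemic = variance of the per-model predicted means.
    """
    aleatoric = variances.mean()
    epistemic = means.var()
    return aleatoric, epistemic

# Hypothetical example: five ensemble members predicting one property.
means = np.array([1.02, 0.98, 1.10, 0.95, 1.05])
variances = np.array([0.04, 0.05, 0.03, 0.06, 0.04])
aleatoric, epistemic = decompose_uncertainty(means, variances)
print(f"aleatoric={aleatoric:.3f}, epistemic={epistemic:.3f}, "
      f"total={aleatoric + epistemic:.3f}")
```

In the atom-based formulation described in the abstract, an analogous decomposition would be applied to per-atom contributions rather than to the molecule as a whole, so that each atom carries its own aleatoric and epistemic estimate.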
Supplementary materials
Supporting Information
Additional information as noted in the text, including confidence- and error-based calibration curves for the datasets listed in Table 2, correlation coefficient matrices for atomic uncertainties, and complete lists of atom and bond features, is provided in the Supporting Information.