Deep Graph Kernel Learning for Material and Atomic Level Uncertainty Quantification in Adsorption Energy Prediction

21 March 2025, Version 2
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

For high-throughput catalytic material discovery, Graph Neural Networks (GNNs) provide an efficient method for predicting the adsorption energies of adsorbates on transition metal surfaces. While GNNs perform well on in-domain prediction tasks, they often struggle to generalize to out-of-domain scenarios. This limitation necessitates a robust method for quantifying prediction uncertainty to enable informed catalyst discovery. Gaussian Processes (GPs) offer a principled approach to uncertainty quantification within a Bayesian framework. However, traditional implementations suffer from key challenges, including cubic time complexity, high memory requirements, and an inability to learn meaningful representations from graph structures. To address these issues, we introduce Deep Graph Kernel Learning (DGKL), a scalable framework that integrates a GNN backbone with sparse variational Gaussian Processes (SVGP) for uncertainty quantification in adsorption energy prediction. We benchmark DGKL against state-of-the-art methods such as ensemble/query-by-committee and Monte Carlo dropout, using both ranking-based metrics (e.g., Spearman's rank correlation, negative log-likelihood, miscalibration area) and error-based metrics (e.g., RMSE vs. RMV and error vs. standard deviation plots). DGKL consistently outperforms existing methods across all evaluation metrics while maintaining computational efficiency and scalability. For example, the correlation coefficient between RMSE and RMV for DGKL ranges from 0.98 to 1.00, slightly exceeding the next best method (ensemble learning). More significantly, the expected normalized calibration error (ENCE) for DGKL ranges from 0.06 to 0.15 across different datasets and GNN backbones, while the ensemble method exhibits a wider range of 0.36 to 1.55. DGKL can be incorporated into an active learning framework to iteratively explore catalytic material space, guiding the discovery of novel active catalysts. 
Additionally, we propose a variation of DGKL capable of predicting atomic-level uncertainty, a feature absent in existing methods. This enables fine-grained insights into out-of-domain data and provides a pathway for enhancing predictive model performance.
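
The error-based calibration metrics named above (RMSE vs. RMV and ENCE) can be illustrated with a minimal sketch. It assumes the common binned definition: samples are sorted by predicted standard deviation and split into bins, the root mean variance (RMV) and RMSE are computed per bin, and ENCE is the mean of |RMV - RMSE| / RMV over bins. The equal-count binning, the number of bins, the function names, and the synthetic data are all assumptions made for illustration; they are not taken from the paper and the values produced are not its results.

```python
import math
import random


def binned_rmv_rmse(y_true, y_pred, y_std, n_bins=10):
    """Sort samples by predicted std, split them into equal-count bins,
    and return one (RMV, RMSE) pair per bin."""
    order = sorted(range(len(y_std)), key=lambda i: y_std[i])
    n = len(order)
    pairs = []
    for b in range(n_bins):
        idx = order[b * n // n_bins:(b + 1) * n // n_bins]
        if not idx:
            continue
        # Root mean variance: sqrt of the mean predicted variance in the bin.
        rmv = math.sqrt(sum(y_std[i] ** 2 for i in idx) / len(idx))
        # Root mean squared error of the predictions in the same bin.
        rmse = math.sqrt(
            sum((y_true[i] - y_pred[i]) ** 2 for i in idx) / len(idx)
        )
        pairs.append((rmv, rmse))
    return pairs


def ence(pairs):
    """Expected normalized calibration error: mean of |RMV - RMSE| / RMV."""
    return sum(abs(rmv - rmse) / rmv for rmv, rmse in pairs) / len(pairs)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Synthetic data (illustrative only): the true noise level varies per sample.
rng = random.Random(0)
true_std = [rng.uniform(0.05, 0.5) for _ in range(5000)]
y_pred = [0.0] * len(true_std)
y_true = [rng.gauss(0.0, s) for s in true_std]

# Case 1: the model reports the true noise level (well calibrated).
pairs_calibrated = binned_rmv_rmse(y_true, y_pred, true_std)
ence_calibrated = ence(pairs_calibrated)
r_calibrated = pearson(*zip(*pairs_calibrated))

# Case 2: the model reports half the true noise level (overconfident).
half_std = [0.5 * s for s in true_std]
ence_overconfident = ence(binned_rmv_rmse(y_true, y_pred, half_std))

print(f"calibrated:    ENCE={ence_calibrated:.3f}, r(RMV, RMSE)={r_calibrated:.3f}")
print(f"overconfident: ENCE={ence_overconfident:.3f}")
```

In this sketch the overconfident case still ranks samples correctly, so its RMSE vs. RMV correlation would remain high, yet its ENCE is much larger. That is one reason to report both a correlation and ENCE, as the abstract does.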

Keywords

Adsorption Energy Prediction
Graph Kernel Learning
Uncertainty Quantification
Gaussian Processes
