Abstract
For high-throughput catalytic material discovery, Graph Neural Networks (GNNs) provide an efficient method for predicting the adsorption energies of adsorbates on transition metal surfaces. While GNNs perform well on in-domain prediction tasks, they often struggle to generalize to out-of-domain scenarios. This limitation necessitates a robust method for quantifying prediction uncertainty to enable informed catalyst discovery.
Gaussian Processes (GPs) offer a principled approach to uncertainty quantification within a Bayesian framework. However, standard GP implementations face key limitations, including time complexity that is cubic in the number of training points, high memory requirements, and an inability to learn meaningful representations from graph-structured inputs.
To address these issues, we introduce Deep Graph Kernel Learning (DGKL), a scalable framework that couples a GNN backbone with sparse variational Gaussian Processes (SVGPs) for uncertainty quantification in adsorption energy prediction. We benchmark DGKL against state-of-the-art methods, including ensemble/query-by-committee and Monte Carlo dropout, using ranking- and calibration-based metrics (e.g., Spearman's rank correlation, negative log-likelihood, miscalibration area) as well as error-based diagnostics (e.g., RMSE vs. root mean variance (RMV) and error vs. standard deviation plots).
DGKL consistently outperforms existing methods across all evaluation metrics while maintaining computational efficiency and scalability. For example, the correlation coefficient between RMSE and RMV for DGKL ranges from 0.98 to 1.00, slightly exceeding the next best method (ensemble learning). More significantly, the expected normalized calibration error (ENCE) for DGKL ranges from 0.06 to 0.15 across different datasets and GNN backbones, while the ensemble method exhibits a wider range of 0.36 to 1.55.
DGKL can be incorporated into an active learning framework to iteratively explore catalytic material space, guiding the discovery of novel active catalysts. Additionally, we propose a variant of DGKL capable of predicting atomic-level uncertainty, a capability absent from existing methods. This enables fine-grained insights into out-of-domain data and provides a pathway for enhancing predictive model performance.
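To make the scalability claim concrete, the following is a minimal NumPy sketch of how inducing points reduce GP inference from cubic to O(nm²) cost, using the subset-of-regressors approximation rather than the paper's full SVGP training scheme; the 1-D toy features, the `rbf` kernel, and all hyperparameter values stand in for the learned GNN embeddings and are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def rbf(A, B, lengthscale=1.0):
    # Squared-exponential kernel between the rows of A (n, d) and B (m, d).
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def sparse_gp_predict(X, y, Z, Xs, noise=1e-2, lengthscale=1.0):
    """Sparse GP regression via the subset-of-regressors approximation.

    Only the (m, m) matrix over inducing points Z is inverted, so the
    cost is O(n m^2) instead of the O(n^3) of an exact GP.
    Returns the predictive mean and variance at test inputs Xs.
    """
    Kzx = rbf(Z, X, lengthscale)                       # (m, n)
    Kzz = rbf(Z, Z, lengthscale)                       # (m, m)
    Ksz = rbf(Xs, Z, lengthscale)                      # (s, m)
    Sigma = np.linalg.inv(Kzz + Kzx @ Kzx.T / noise)   # (m, m) posterior cov. factor
    mean = Ksz @ Sigma @ (Kzx @ y) / noise             # predictive mean
    var = np.einsum("ij,jk,ik->i", Ksz, Sigma, Ksz)    # predictive variance (diag)
    return mean, var

# Toy usage: 1-D features standing in for GNN embeddings.
X = np.linspace(0, 6, 50)[:, None]
y = np.sin(X[:, 0])
Z = np.linspace(0, 6, 10)[:, None]   # 10 inducing points instead of 50 data points
mean, var = sparse_gp_predict(X, y, Z, X)
```

In DGKL-style deep kernel learning, `X` would be the embeddings produced by the GNN backbone, and the inducing locations and kernel hyperparameters would be optimized jointly with the network under a variational objective.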