Abstract
Graph neural networks (GNNs) are a natural choice for representing chemical data because of their inherent ability to handle arbitrary input topologies. They avoid the need to convert molecules into fixed-length molecular fingerprints. However, like most deep learning models, GNNs are not readily interpretable, and common explainability methods fail because of the variable input size. We introduce a novel method to interpret the predictions of GNNs based on Myerson values from cooperative game theory. Myerson values are closely related to Shapley values, which have been adapted to explain the predictions of a wide variety of machine learning models. Applying these approaches to GNNs has, however, proven challenging because of the varying size of the input graphs. Our approach treats a GNN as a coalition game and the nodes of the input graph as players. The Myerson value of a node then quantifies its contribution to the model's prediction, with only connected nodes contributing to coalitions. All Myerson values sum to the predicted value of the model, allowing for a simple and intuitive interpretation of the prediction. Because the exact calculation of Myerson values becomes computationally infeasible for large graphs, we have also implemented a scalable approximation based on Monte Carlo sampling. We developed the technique for applications in cheminformatics and drug discovery, but it can be used in any domain in which GNNs are applied. The effectiveness of our approach is validated through successful applications to two proof-of-concept datasets (logP and molecular weight) as well as a real-world dataset of kinase inhibitors, highlighting its broad applicability and its promise for explaining graph-based cheminformatic models.
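For orientation, the following sketch recalls the standard Myerson value from cooperative game theory; the notation ($N$, $v$, $G$, $v^G$) is chosen here for illustration and need not match the symbols used later in the paper. For a game with player set $N$, characteristic function $v$, and communication graph $G$, the Myerson value of player $i$ is the Shapley value of the graph-restricted game $v^G$:
\[
  v^G(S) \;=\; \sum_{C \,\in\, S/G} v(C),
  \qquad
  \mathrm{MV}_i(v, G) \;=\; \sum_{S \subseteq N \setminus \{i\}}
    \frac{|S|!\,\bigl(|N| - |S| - 1\bigr)!}{|N|!}
    \Bigl( v^G\bigl(S \cup \{i\}\bigr) - v^G(S) \Bigr),
\]
where $S/G$ denotes the set of connected components of $S$ in $G$, so that only connected subsets of nodes contribute as coalitions. For a connected graph the Myerson values satisfy $\sum_{i \in N} \mathrm{MV}_i(v, G) = v(N)$, which is the efficiency property, stated in the abstract, that the node contributions add up to the model's prediction.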