Global interpretability and geometry of graph convolutional neural networks for chemistry in terms of chemical moieties

05 September 2023, Version 1
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Graph convolutional neural nets such as SchNet [Schütt et al., Journal of Chemical Physics, 2018, 148, 241722] provide accurate predictions of chemical quantities without invoking any direct physical or chemical principles. These methods learn a hidden statistical representation of molecular systems in an end-to-end fashion, from xyz coordinates to molecular properties, with many hidden layers in between. This naturally raises the interpretability question: what underlying chemical model determines the algorithm’s accurate decision-making? To answer this question, we analyze the hidden-layer activations of QM9-trained SchNet, also known as “embedding vectors”, using dimension reduction, linear discriminant analysis, and Euclidean distance measures. The result is a quantifiable geometry of the model’s decision-making that identifies chemical moieties and spans a low-dimensional parameter space: ∼5 important parameters out of the fully trained 128-parameter embedding. The geometry of the embedding space organizes these moieties with sharp linear boundaries that can classify each chemical environment with an error below 5 × 10⁻⁴. The Euclidean distance between embedding vectors serves as a versatile molecular similarity measure, outperforming other popular hand-crafted representations such as Smooth Overlap of Atomic Positions (SOAP). We also show that the embedding vectors can be used to extract observables related to chemical environments, such as pKa and NMR. This work is in line with the recent push for explainable AI and offers insight into the depth of modern statistical representations of chemistry, such as graph convolutional neural nets, in this rapidly evolving field.
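The three analyses named in the abstract (dimension reduction, linear discriminant analysis, and Euclidean distance) can be illustrated with a minimal NumPy sketch. This is not the authors' code and uses no SchNet or QM9 data: the 128-dimensional "embeddings" are synthetic, generated under the assumption that four hypothetical moiety classes live in a 5-dimensional linear subspace plus small noise. All function names, class centers, and noise levels are illustrative choices.

```python
import numpy as np

def make_synthetic_embeddings(n_per_class=200, n_classes=4, dim=128,
                              latent_dim=5, noise=0.05, seed=0):
    """Synthetic stand-in for atom embeddings (NOT real SchNet activations)."""
    rng = np.random.default_rng(seed)
    # Orthonormal basis of a latent_dim-dimensional subspace of R^dim.
    basis = np.linalg.qr(rng.normal(size=(dim, latent_dim)))[0]
    # Assumed, well-separated class centers in the latent space.
    centers = 5.0 * np.eye(n_classes, latent_dim)
    X, y = [], []
    for label in range(n_classes):
        latent = centers[label] + rng.normal(size=(n_per_class, latent_dim))
        X.append(latent @ basis.T + noise * rng.normal(size=(n_per_class, dim)))
        y.append(np.full(n_per_class, label))
    return np.vstack(X), np.concatenate(y)

def explained_variance_ratio(X):
    """PCA via SVD of the centered data: fraction of variance per component."""
    s = np.linalg.svd(X - X.mean(axis=0), compute_uv=False)
    return s**2 / np.sum(s**2)

def fit_lda(X, y, ridge=1e-6):
    """Linear discriminant analysis with a shared covariance and equal priors."""
    classes = np.unique(y)
    means = np.stack([X[y == c].mean(axis=0) for c in classes])
    Xw = X - means[np.searchsorted(classes, y)]       # within-class residuals
    cov = Xw.T @ Xw / (len(X) - len(classes))
    cov += ridge * np.eye(X.shape[1])                 # small ridge for stability
    W = np.linalg.solve(cov, means.T)                 # shape (dim, n_classes)
    b = -0.5 * np.sum(means.T * W, axis=0)
    return classes, W, b

def predict_lda(model, X):
    """Assign each row to the class with the largest linear score."""
    classes, W, b = model
    return classes[np.argmax(X @ W + b, axis=1)]

def pairwise_euclidean(X):
    """All pairwise Euclidean distances, computed through the Gram matrix."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.sqrt(np.clip(d2, 0.0, None))

X, y = make_synthetic_embeddings()

# 1) Dimension reduction: how many components carry 95% of the variance?
ratio = explained_variance_ratio(X)
n_keep = int(np.searchsorted(np.cumsum(ratio), 0.95) + 1)

# 2) Linear boundaries: training accuracy of the linear discriminant.
accuracy = float(np.mean(predict_lda(fit_lda(X, y), X) == y))

# 3) Euclidean similarity: same-class pairs should be closer than mixed pairs.
D = pairwise_euclidean(X)
same = y[:, None] == y[None, :]
off_diag = ~np.eye(len(X), dtype=bool)
intra = float(D[same & off_diag].mean())
inter = float(D[~same].mean())

print(f"components for 95% variance: {n_keep} of {X.shape[1]}")
print(f"LDA training accuracy: {accuracy:.4f}")
print(f"mean distance, same class: {intra:.2f}; different class: {inter:.2f}")
```

On real data, `X` would be replaced by the per-atom hidden-layer activations and `y` by moiety labels; the classification error should then be measured on held-out atoms rather than on the training set as done here for brevity.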

Keywords

Graph Convolutional Neural Networks
Explainable Artificial Intelligence
Euclidean geometry
pKa
13C NMR
