Abstract
Molecular structures are readily represented as graphs whose nodes are the constituent atoms and whose edges are the chemical bonds, and such graphs can serve as input to machine learning (ML) models designed for graph data, such as graph neural networks (GNNs). GNNs have shown reasonable performance in predicting the properties of chemical systems. In typical applications of GNNs to chemistry-related fields, the main objective is to create an optimal molecular representation by aggregating atomic features and pooling them over the graph. In this study, we investigated two approaches that could potentially generate better molecular representations. In the first approach, we created intermolecular edges to predict the photochemical properties of chromophore molecules in solution. These intermolecular edges were constructed using atomic partial charges, motivated by the fact that electrostatic interaction is the main component of the solute-solvent interaction. In the second approach, we investigated the effect of the aggregation and pooling functions. The results showed that intermolecular electrostatic interactions based on ground-state charges hindered the GNN model from generating more effective molecular representations. In contrast, the model performed better when averaging and summation were combined in a hybrid manner for the aggregation and pooling functions.
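The distinction between aggregation (combining neighbor features at each atom) and pooling (combining atom features into one molecular vector) can be illustrated with a minimal NumPy sketch. This is not the eelGNN implementation: the function names `aggregate` and `pool`, the toy three-atom graph, and the particular pairing of mean aggregation with sum pooling are all assumptions chosen for illustration, since the abstract does not state which operation is used at which stage.

```python
import numpy as np


def aggregate(features, adjacency, mode):
    """Combine the features of each node's neighbors by 'sum' or 'mean'."""
    summed = adjacency @ features  # row i holds the sum over neighbors of node i
    if mode == "sum":
        return summed
    if mode == "mean":
        degree = adjacency.sum(axis=1, keepdims=True)
        return summed / np.maximum(degree, 1)  # guard against isolated nodes
    raise ValueError(f"unknown aggregation mode: {mode}")


def pool(node_features, mode):
    """Reduce all node features to a single graph-level vector."""
    if mode == "sum":
        return node_features.sum(axis=0)
    if mode == "mean":
        return node_features.mean(axis=0)
    raise ValueError(f"unknown pooling mode: {mode}")


# Hypothetical toy molecule: a chain of three atoms, 0-1-2, with 2 features each.
adjacency = np.array([[0, 1, 0],
                      [1, 0, 1],
                      [0, 1, 0]], dtype=float)
features = np.array([[1.0, 0.0],
                     [0.0, 2.0],
                     [3.0, 1.0]])

# One possible hybrid: average over neighbors, then add over atoms.
hybrid = pool(aggregate(features, adjacency, "mean"), "sum")
# Non-hybrid baselines for comparison.
sum_sum = pool(aggregate(features, adjacency, "sum"), "sum")
mean_mean = pool(aggregate(features, adjacency, "mean"), "mean")
```

Mean aggregation keeps each atom's message independent of its number of neighbors, while sum pooling lets the molecular vector grow with the number of atoms; a hybrid setting mixes these two behaviors, whereas the sum-sum and mean-mean baselines apply the same behavior at both stages.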
Supplementary materials
Title
Supporting Information for Revealing the Impact of Aggregations in the Graph-based Molecular Machine Learning: Electrostatic Interaction versus Pooling Methods
Description
Supporting Information containing the set of optimized hyperparameters and additional figures
Supplementary weblinks
Title
Revealing the Impact of Aggregations in the Graph-based Molecular Machine Learning: Electrostatic Interaction versus Pooling Methods
Description
Code for the implementation of eelGNN