When Do Quantum Mechanical Descriptors Help Graph Neural Networks Predict Chemical Properties?

04 April 2024, Version 2
This content is a preprint and has not undergone peer review at the time of posting.


Deep graph neural networks are extensively utilized to predict chemical reactivity and molecular properties. However, because of the complexity of chemical space, such models often have difficulty extrapolating beyond the chemistry contained in the training set. Augmenting a model with quantum mechanical (QM) descriptors is anticipated to improve its generalizability. However, obtaining QM descriptors often requires CPU-intensive computational chemistry calculations. To identify when QM descriptors help graph neural networks predict chemical properties, we conduct a systematic investigation of the impact of atom, bond, and molecular QM descriptors on the performance of directed message passing neural networks (D-MPNNs) for predicting 16 molecular properties. The analysis surveys computational and experimental targets, classification and regression tasks, and dataset sizes ranging from several hundred to hundreds of thousands of data points. Our results indicate that QM descriptors are mostly beneficial for D-MPNN performance on small datasets, provided that the descriptors correlate well with the targets and can be readily computed with high accuracy. Otherwise, using QM descriptors can add cost without benefit or even introduce unwanted noise that can degrade model performance. Strategic integration of QM descriptors with D-MPNNs unlocks potential for physics-informed, data-efficient modeling with some interpretability that can streamline de novo drug and materials design. To facilitate the use of QM descriptors in machine learning workflows for chemistry, we provide a set of guidelines regarding when and how to best leverage QM descriptors, a high-throughput workflow to compute them, and an enhancement to Chemprop, a widely adopted open-source D-MPNN implementation for chemical property prediction.
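The abstract's condition that descriptors should "correlate well with the targets" can be illustrated with a simple pre-screen. The sketch below is not the authors' procedure: it assumes a plain Pearson correlation as the measure, and the function names, the descriptor names, the toy values, and the `min_abs_r` cutoff are all hypothetical choices made for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant column carries no linear signal; treat it as uncorrelated.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def screen_descriptors(descriptors, targets, min_abs_r=0.3):
    """Return {name: r} for descriptors whose |r| with the target meets the cutoff.

    `descriptors` maps a descriptor name to one value per molecule.
    `min_abs_r` is an arbitrary illustrative threshold, not a value from the paper.
    """
    kept = {}
    for name, values in descriptors.items():
        r = pearson_r(values, targets)
        if abs(r) >= min_abs_r:
            kept[name] = r
    return kept


if __name__ == "__main__":
    # Toy numbers for illustration only; these are not real QM descriptors.
    toy_targets = [1.0, 2.0, 3.0, 4.0]
    toy_descriptors = {
        "descriptor_a": [2.0, 4.0, 6.0, 8.0],    # tracks the target linearly
        "descriptor_b": [1.0, -1.0, 1.0, -1.0],  # weakly related
        "descriptor_c": [5.0, 5.0, 5.0, 5.0],    # constant
    }
    print(screen_descriptors(toy_descriptors, toy_targets, min_abs_r=0.5))
```

A linear correlation screen of this kind only detects linear relationships, so a descriptor that fails it may still carry useful nonlinear signal. It is best read as a cheap first check before paying for expensive QM calculations, not as a substitute for comparing model performance with and without the descriptor.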


Quantum Mechanical Descriptors
Graph Neural Networks
Message-Passing Neural Networks
Chemical Property Prediction

Supplementary materials

Supporting Information
Additional results and discussions.
