Abstract
Here we evaluate the robustness and utility of quantum mechanical descriptors for machine learning with transition metal complexes. We utilize ab initio information from the quantum theory of atoms-in-molecules (QTAIM) for 60k transition metal complexes at multiple levels of theory (LOT), presented here in the tmQM+ dataset, to inform flexible graph neural network (GNN) models. We evaluate these models with several experiments, including training on limited charge and elemental compositions and testing on unseen charges and elements, as well as training on smaller portions of the dataset. Results show that additional quantum chemical information improves performance on unseen regimes and smaller training sets. Furthermore, we leverage the tmQM+ dataset to analyze how QTAIM descriptors vary across different LOT and probe machine learning performance with less computationally expensive LOT. We determine that ab initio descriptors provide benefits across LOT, thereby motivating the use of lower-level DFT descriptors, particularly for predicting expensive or experimental molecular properties.