Abstract
Common methods for assigning atom-centered partial charges in computational chemistry, such as RESP and AM1-BCC, rely on quantum mechanical or semi-empirical calculations of the molecule of interest, which are expensive to compute and depend on the choice of input molecular conformer(s). Graph neural network (GNN) based continuous atom embeddings have been shown to be a fast and flexible solution for partial charge assignment, but those developed so far for condensed phase modeling have usually been trained to reproduce AM1-BCC charges, which themselves seek to reproduce the HF/6-31G(d) molecular electrostatic potential (ESP). Here, we investigate the suitability of various common charge assignment schemes, including ESP and atoms-in-molecule (AIM) based approaches, as training targets for new GNN-based charge models. We show that the strengths of both approaches can be combined by co-training GNN models to AIM charges, molecular dipoles, and electrostatic potentials. We collect a dataset of quantum mechanical AIM properties computed at a high level of theory (ωB97X-D/def2-TZVPP), in both vacuum and implicit solvent, and train new GNN charge models to each. Charges can be scaled between the vacuum and solvated charge sets, and combined with Lennard-Jones parameters optimized using the Open Force Field infrastructure, to yield force fields that are suitably polarized for condensed phase modeling. We further demonstrate that the charge models may be applied to explore electrostatics-driven structure-activity relationships in medicinal chemistry. The charge models are freely available at: https://github.com/cole-group/nagl-mbis/.
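As an illustration of scaling charges between the vacuum and solvated charge sets, the following minimal sketch interpolates two per-atom charge sets and evaluates the resulting point-charge dipole. The linear mixing form, the parameter `lam`, the function names, and the three-atom geometry and charges are all assumptions made for illustration; they are not taken from the paper or from the nagl-mbis package.

```python
import math


def scale_charges(q_vacuum, q_solvent, lam):
    """Mix two per-atom charge sets (assumed linear form, not the paper's stated method).

    lam = 0 returns the vacuum charges, lam = 1 the solvated charges.
    If both sets carry the same total charge, so does the mixture.
    """
    if len(q_vacuum) != len(q_solvent):
        raise ValueError("charge sets must cover the same atoms")
    return [(1.0 - lam) * qv + lam * qs for qv, qs in zip(q_vacuum, q_solvent)]


def dipole(charges, coords):
    """Point-charge dipole mu = sum_i q_i * r_i, in e*Angstrom for coords in Angstrom."""
    return tuple(
        sum(q * xyz[k] for q, xyz in zip(charges, coords)) for k in range(3)
    )


# Hypothetical three-atom, water-like example (illustrative numbers only).
coords = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]
q_vac = [-0.80, 0.40, 0.40]   # stand-in for vacuum-model charges
q_sol = [-0.90, 0.45, 0.45]   # stand-in for implicit-solvent-model charges

q_half = scale_charges(q_vac, q_sol, 0.5)
mu_half = dipole(q_half, coords)
mu_norm = math.sqrt(sum(c * c for c in mu_half))
```

Because the mixture is linear in the charges, the point-charge dipole of the scaled set is the same linear mixture of the two end-point dipoles, so a single parameter tunes the degree of polarization.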
Supplementary materials
Title
Supporting Information for: A graph neural network charge model targeting accurate electrostatic properties of organic molecules
Description
Dataset statistics, list of molecules used for conformer and QM benchmarking, QM method comparisons, analysis of re-building the ESP with MBIS multipole moments, analysis of EspalomaCharge errors, training set performance for NAGL gas phase models, speed of NAGL charge assignment, breakdown of physical properties for mixtures containing water, list of SMIRKS types trained and their parameters, and further analysis of structure-activity relationships.