Using Deep Graph Neural Networks Improves Physics-Based Hydration Free Energy Predictions Even for Molecules Outside of the Training Set Distribution

04 June 2025, Version 1
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

The accuracy of computational models of water is key to atomistic simulations of biomolecules. Here we explore a decoupled framework that combines classical physics-based models with deep neural networks (DNNs) to correct residual error in hydration free energy (HFE) prediction. Our main goal is to evaluate this framework on out-of-distribution data (molecules that differ significantly from those used in training), where DNNs are known to struggle. Several common physics-based solvation models are used in the evaluation. Graph neural network architectures are tested for their ability to generalize using multiple dataset splits, including out-of-distribution HFEs and unseen molecular scaffolds. Our most important finding is that for out-of-distribution data, where DNNs alone often struggle, the physics + DNN models consistently improve physics model predictions. For in-distribution data, the DNN corrections significantly improve the accuracy of physics-based models, with a final RMSE below 1 kcal/mol and a relative improvement between 40% and 65% in most cases. The accuracy of physics + DNN models tends to improve when the 6% of molecules with the highest experimental uncertainty are removed. This study provides insights into the potential and limitations of combining physics and machine learning for molecular modeling, offering a practical and generalizable strategy.
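The residual-correction idea and the "relative improvement" metric mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names are hypothetical, the numbers are invented for illustration only, and the "correction" is a stand-in that simply recovers half of each residual, where the paper would use a trained graph neural network. Relative improvement is assumed here to mean the percentage reduction in RMSE relative to the physics-only model.

```python
import math


def rmse(predicted, reference):
    """Root-mean-square error between two equal-length sequences (kcal/mol)."""
    if len(predicted) != len(reference):
        raise ValueError("sequences must have the same length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


def residual_targets(physics, experiment):
    """Residual error of the physics model: experiment minus physics prediction.

    In a decoupled setup, a correction model would be trained on these values.
    """
    return [e - p for p, e in zip(physics, experiment)]


def apply_correction(physics, correction):
    """Physics + correction: add the predicted residual to the physics HFE."""
    return [p + c for p, c in zip(physics, correction)]


def relative_improvement(rmse_before, rmse_after):
    """Percentage reduction in RMSE (assumed definition, see lead-in)."""
    return 100.0 * (rmse_before - rmse_after) / rmse_before


# Toy values in kcal/mol, invented for illustration; NOT data from the paper.
experiment = [-6.3, -2.0, 1.5, -4.1]
physics = [-5.0, -3.2, 2.4, -5.5]

# Stand-in for a trained DNN: a hypothetical model that recovers
# exactly half of each residual.
correction = [0.5 * r for r in residual_targets(physics, experiment)]

corrected = apply_correction(physics, correction)
before = rmse(physics, experiment)
after = rmse(corrected, experiment)

print(f"physics-only RMSE: {before:.2f} kcal/mol")
print(f"physics + correction RMSE: {after:.2f} kcal/mol")
print(f"relative improvement: {relative_improvement(before, after):.0f}%")
```

Because the physics model and the correction are kept separate, the physics prediction remains available as a fallback when the correction is unreliable, which is one plausible reading of why the combined model degrades more gracefully on out-of-distribution molecules.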

Keywords

water models
hydration free energy
deep learning
physics-based models
small molecules

Supplementary materials

Supplementary Materials: Details of the physics-based models of solvation; additional tables and figures; access to open source code.
