On the Reproducibility of Free Energy Surfaces in Machine-Learned Collective Variable Spaces

10 March 2025, Version 1
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

As machine-learned collective variables (MLCVs) become increasingly relevant in the molecular simulation literature, we discuss the conditions necessary for reproducible calculation and representation of free energy surfaces (FES). We note that the variability of the training process, together with the roughness of the hyperparameter space, imposes inherent limits on the reproducibility of results, even when the mathematical structure of the model defining a CV is consistent. To address this, we propose adopting a geometric (gauge-invariant) free energy representation to obtain consistent free energy differences across training instances and architectures. Further, we introduce a normalisation factor applied to model gradients for biased enhanced sampling. This factor effectively unifies free energy definitions and addresses practical issues that prevent the widespread use and deployment of MLCVs.
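
The gauge-invariance argument can be illustrated with a toy calculation. The sketch below is not the preprint's implementation. It assumes the commonly used definition in which the geometric free energy weights each configuration by the norm of the CV gradient, and it uses a hypothetical one-dimensional double-well potential with two CVs, s = x and s = exp(x), that describe the same coordinate in different gauges. All function names and parameter values are illustrative.

```python
import numpy as np

kT = 1.0  # thermal energy in arbitrary units (illustrative value)


def potential(x):
    """Hypothetical symmetric double well with minima at x = -1 and x = +1."""
    return 5.0 * (x**2 - 1.0) ** 2


def fes_from_weights(cv, weights, n_bins):
    """Free energy along a CV from a weighted histogram: F = -kT ln(density)."""
    hist, edges = np.histogram(cv, bins=n_bins, weights=weights)
    centers = 0.5 * (edges[:-1] + edges[1:])
    density = hist / np.diff(edges)
    with np.errstate(divide="ignore"):
        fes = -kT * np.log(density)
    return centers, fes


def well_difference(cv_fn, cv_grad_fn, geometric, n_bins=2000):
    """Free energy difference F(right well) - F(left well) in CV space.

    A dense uniform grid in x with Boltzmann weights stands in for
    converged sampling. With geometric=True each point is additionally
    weighted by |d(cv)/dx| (the assumed geometric definition).
    """
    x = np.linspace(-2.0, 2.0, 400001)
    weights = np.exp(-potential(x) / kT)
    if geometric:
        weights = weights * np.abs(cv_grad_fn(x))
    centers, fes = fes_from_weights(cv_fn(x), weights, n_bins)
    left = fes[np.argmin(np.abs(centers - cv_fn(-1.0)))]
    right = fes[np.argmin(np.abs(centers - cv_fn(1.0)))]
    return right - left


def identity(x):
    return x


def identity_grad(x):
    return np.ones_like(x)


if __name__ == "__main__":
    gauges = {"s = x": (identity, identity_grad), "s = exp(x)": (np.exp, np.exp)}
    for name, (cv_fn, cv_grad_fn) in gauges.items():
        std = well_difference(cv_fn, cv_grad_fn, geometric=False)
        geo = well_difference(cv_fn, cv_grad_fn, geometric=True)
        print(f"{name:12s} standard: {std:+.3f} kT   geometric: {geo:+.3f} kT")
```

In this toy setting the two wells are equivalent by symmetry. The standard definition nevertheless reports a difference of about 2 kT for s = exp(x), which comes entirely from the Jacobian term kT ln|ds/dx|, while the gradient-weighted estimate stays near zero in both gauges.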

Keywords

Free Energy Surfaces
Machine Learned CVs
Geometric Free Energy
Enhanced Sampling
