Lattice free energies of molecular crystals using normalizing flow

08 April 2025, Version 1
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Ranking computationally predicted molecular crystal polymorphs by thermodynamic stability requires anharmonic free energy calculations that are scalable and robust. Although classical free energy perturbation (FEP) methods provide very accurate lattice free energy estimates, their scalability in large polymorphic landscapes is limited by the need to ergodically sample one chain of overlapping alchemical Boltzmann distributions per polymorph. In contrast, targeted free energy perturbation (TFEP) can converge free energies to the same accuracy directly from the physical data already sampled for each polymorph. The only major challenge in TFEP is obtaining an accurate bijective mapping between each polymorph and a common reference distribution. To achieve a general mapping strategy, we turn to normalising flow neural networks. In this work, we demonstrate the feasibility of normalising-flow-enhanced TFEP in molecular crystals. Specifically, we describe a normalising flow architecture, based on spline coupling, that achieves a significant overlap between the Boltzmann distribution of a molecular crystal supercell and the artificial distribution learned by a probabilistic generative model (PGM) trained by example on relatively small amounts of molecular dynamics (MD) data. The accuracy and efficiency of this approach were assessed by comparing PGM-based free energy estimates with reference free energy estimates from the Einstein crystal method (ECM). Both methods were applied to three molecules of practical relevance to the pharmaceutical industry: succinic acid (a salt former and starting material) and two drug molecules, Veliparib and Mivebresib. A compelling agreement between the PGM and the ECM was observed in the polymorphic landscapes of these three compounds across all supercell sizes that were considered.
In addition to showing that normalising flows can provide reliable free energy estimates, at a significantly lower cost than ECM, in systems significantly more complex than those previously studied, we also discuss how the cost-effectiveness of this approach can be improved further.
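
As an illustration of the TFEP estimator referred to above, the following is a minimal sketch on a one-dimensional toy problem, not the spline-coupling architecture or crystal systems of this work. It assumes a harmonic well as the sampled state, a unit harmonic well as the common reference, beta = 1, and hand-written affine maps standing in for a trained flow; the names `tfep_delta_f`, `affine_map`, `K_A` and `X0` are hypothetical.

```python
import math
import random


def tfep_delta_f(samples, u_a, u_b, mapping, log_abs_jac, beta=1.0):
    """Estimate F_B - F_A from samples of state A mapped bijectively towards B."""
    # Negative generalised work per sample:
    #   -(beta * [u_B(M(x)) - u_A(x)] - ln|dM/dx|)
    neg_work = [
        -(beta * (u_b(mapping(x)) - u_a(x)) - log_abs_jac(x)) for x in samples
    ]
    # Numerically stable log-mean-exp of the negative work values.
    top = max(neg_work)
    mean_exp = sum(math.exp(w - top) for w in neg_work) / len(neg_work)
    return -(top + math.log(mean_exp)) / beta


# Toy sampled state A: harmonic well (assumed values, beta = 1).
K_A, X0 = 4.0, 1.5


def u_a(x):
    return 0.5 * K_A * (x - X0) ** 2


def u_b(z):
    # Common reference B: unit harmonic well.
    return 0.5 * z ** 2


def affine_map(scale, shift):
    """Return (map, log|Jacobian|) for the bijection x -> scale * (x - shift)."""
    return (lambda x: scale * (x - shift)), (lambda x: math.log(abs(scale)))


# Stand-in for MD data: exact Boltzmann samples of A.
rng = random.Random(0)
samples = [rng.gauss(X0, 1.0 / math.sqrt(K_A)) for _ in range(20000)]

# Analytical result for two harmonic wells: F_B - F_A = -0.5 * ln(K_A).
exact = -0.5 * math.log(K_A)

m, j = affine_map(math.sqrt(K_A), X0)   # exact map: A lands exactly on B
perfect = tfep_delta_f(samples, u_a, u_b, m, j)

m, j = affine_map(1.8, 1.45)            # deliberately imperfect map
imperfect = tfep_delta_f(samples, u_a, u_b, m, j)

print(exact, perfect, imperfect)
```

With the exact map every sample yields the same work value, so the estimate has no statistical noise. The imperfect map still converges to the same free energy difference but with sampling error, which is why the quality of the learned mapping governs the efficiency of the method.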

Keywords

Lattice Free Energy
Probabilistic Generative Models
TFEP
Molecular Crystals
Pharmaceutical Solid-State

Supplementary materials

Title: Supplementary Materials
Description: Figures S1-S22, Tables S1-S5. Additional discussion and methodology details.
