Chemical Space Exploration with Active Learning and Alchemical Free Energies

13 July 2022, Version 1
This content is a preprint and has not undergone peer review at the time of posting.


Drug discovery can be thought of as a search for a needle in a haystack. From finding the initial active hit molecules, through optimizing the decoration of lead molecule analogues, to selecting the final clinical candidate, it is an ongoing trade-off between applying the best methods and the cost of assessing the large available chemical space. Computational techniques can help by narrowing the search space, but some preferred methods, such as binding affinity calculations, can still only be performed on a small fraction of the possible molecules. For that reason, machine learning (ML) strategies are being developed to complement experiments and computationally more expensive approaches in navigating and triaging large chemical libraries. In the current study, we explore how an active learning protocol can be combined with first-principles-based alchemical free energy calculations to identify high-affinity phosphodiesterase 2 (PDE2) inhibitors. First, we calibrate the procedure using a large set of experimentally characterised PDE2 binders. The optimized protocol is then used prospectively on a large chemical library to navigate towards potent inhibitors. In the active learning cycle, a small fraction of compounds is probed by alchemical calculations at every iteration, and the obtained affinities are used to train ML models. Over successive rounds, high-affinity binders are identified by explicitly evaluating only a small subset of compounds in a large chemical library, thus providing an efficient protocol that robustly identifies a large fraction of true positives.
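
The active learning cycle described in the abstract can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the study: `oracle_dg` is a hypothetical toy function standing in for an expensive alchemical free energy calculation, the two-dimensional random descriptors stand in for real molecular features, the ML model is a simple k-nearest-neighbour average, and compounds are selected greedily by lowest predicted binding free energy. The abstract does not specify the model or the selection strategy that the authors use.

```python
import random


def oracle_dg(x):
    """Hypothetical stand-in for an alchemical free energy calculation.

    Returns a mock binding free energy (kcal/mol); lower means tighter binding.
    """
    return -8.0 - 3.0 * x[0] + 2.0 * (x[1] - 0.5) ** 2


def knn_predict(x, train, k=3):
    """Predict dG for x as the mean dG of its k nearest labelled compounds."""
    nearest = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(x, item[0])),
    )[:k]
    return sum(dg for _, dg in nearest) / len(nearest)


def active_learning(library, n_rounds=5, batch_size=10, seed=0):
    """Label n_rounds batches of compounds; return {library index: dG}."""
    rng = random.Random(seed)
    unlabelled = list(range(len(library)))
    rng.shuffle(unlabelled)

    # First round: no model exists yet, so the batch is chosen at random.
    batch, unlabelled = unlabelled[:batch_size], unlabelled[batch_size:]
    labelled = {i: oracle_dg(library[i]) for i in batch}

    for _ in range(n_rounds - 1):
        # "Train" the model on all affinities obtained so far.
        train = [(library[i], dg) for i, dg in labelled.items()]
        # Greedy selection (an assumption): lowest predicted dG first.
        unlabelled.sort(key=lambda i: knn_predict(library[i], train))
        batch, unlabelled = unlabelled[:batch_size], unlabelled[batch_size:]
        # Only the selected batch is evaluated by the expensive oracle.
        labelled.update({i: oracle_dg(library[i]) for i in batch})
    return labelled


# Toy library: 500 "compounds", each described by two random descriptors.
lib_rng = random.Random(42)
library = [(lib_rng.random(), lib_rng.random()) for _ in range(500)]

labelled = active_learning(library)
best = min(labelled, key=labelled.get)
print(f"evaluated {len(labelled)} of {len(library)} compounds")
print(f"best mock dG found: {labelled[best]:.2f} kcal/mol (compound {best})")
```

With these settings only 50 of the 500 toy compounds are passed to the oracle, which mirrors the idea of explicitly evaluating a small subset of a large library. A purely greedy rule is the simplest choice; strategies that also sample uncertain or diverse compounds could be swapped in at the sorting step.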


active learning
machine learning
free energy calculations
computational alchemy
molecular dynamics

Supplementary materials

Supplementary Information: Chemical Space Exploration with Active Learning and Alchemical Free Energies
Supplementary figures and tables

