Abstract
We present a systematic approach for the identification of statistically relevant conformational macrostates of organic molecules from molecular dynamics trajectories. The approach applies to molecules characterised by an arbitrary number of torsional degrees of freedom and enables the transferability of the macrostates definition across different environments. We formulate a dissimilarity measure between molecular configurations that incorporates information on the characteristic energetic cost associated with transitions along all relevant torsional degrees of freedom. Such metric is employed to perform unsupervised clustering of molecular configurations based on the fast search and find of density peaks algorithm. We apply this method to investigate the equilibrium conformational ensemble of Sildenafil, a conformationally complex pharmaceutical compound, in different environments including the crystal bulk, the gas phase and three different solvents (acetonitrile, 1-butanol, and toluene). We demonstrate that, while Sildenafil can adopt more than one hundred metastable conformational configurations, only 12 are significantly populated across all the environments investigated. Despite the complexity of the conformational space, we find that the most abundant conformers in solution are the closest to the conformers found in the most common Sildenafil crystal phase.