Molecular dynamics simulations and diversity selection by extended continuous similarity indices

01 April 2022, Version 1
This content is a preprint and has not undergone peer review at the time of posting.


Molecular dynamics (MD) is a core methodology of molecular modeling and computational design for the study of the dynamics and temporal evolution of molecular systems. MD simulations have particularly benefited from the rapid increase of computational power that has characterized the past decades of computational chemical research, being the first method to be successfully migrated to GPU infrastructure. While new-generation MD softwares are capable of delivering simulations on an ever-increasing scale, relatively less effort is invested in developing post-processing methods that can keep up with the quickly expanding volumes of data that are being generated. Here, we introduce a new idea for sampling frames from large MD trajectories, based on the recently introduced framework of extended similarity indices. Our approach presents a new, linearly scaling alternative to the traditional approach of applying a clustering algorithm that usually scales as a quadratic function of the number of frames. When showcasing its usage on case studies with different system sizes and simulation lengths, we have registered speedups of up to two orders of magnitude, as compared to traditional clustering algorithms. The conformational diversity of the selected frames is also noticeably higher, which is a further advantage for certain applications, such as the selection of structural ensembles for ligand docking. The method is available open source at:


extended similarity
Molecular Dynamics
conformational search
data analysis
dimensionality reduction

Supplementary materials

Supplementary Information
Supplementary Information

Supplementary weblinks


Comments are not moderated before they are posted, but they can be removed by the site moderators if they are found to be in contravention of our Commenting Policy [opens in a new tab] - please read this policy before you post. Comments should be used for scholarly discussion of the content in question. You can find more information about how to use the commenting feature here [opens in a new tab] .
This site is protected by reCAPTCHA and the Google Privacy Policy [opens in a new tab] and Terms of Service [opens in a new tab] apply.