Abstract
Chemical diversity is challenging to describe objectively. Despite this, various notions of chemical diversity are used throughout the medicinal chemistry optimization process in drug discovery. In this work, we show that considering exploited vectors during different phases of the drug design process provides a quantitative and objective description of chemical diversity. We have developed a concise and fast approach to enumerate and analyze the exploited vector patterns (EVPs) of compound series, which can then be used in archetypal compound selection tasks ranging from hit matter identification to hit expansion and lead optimization. We first show that EVPs can be used to assess the progressibility of compounds in a fragment library design exercise. We then show how a set of compounds can be prioritized for hit expansion using customizable, EVP-based diversity sampling approaches, reducing the time taken and mitigating human bias. Finally, we show that EVPs are a useful tool for analyzing structure-activity relationship (SAR) data, offering the chance to uncover correlations between different vectors without predetermining the molecular scaffold structures. The code used to perform these tasks is provided as easy-to-use Jupyter notebooks, which can be readily adapted for related tasks.
Supplementary weblinks
Code and data: All code (Jupyter notebooks) and data needed to reproduce the results.