These are preliminary reports that have not been peer-reviewed. They should not be regarded as conclusive, guide clinical practice/health-related behavior, or be reported in news media as established information.
Assessing Methods and Obstacles in Chemical Space Exploration
Benchmarking the performance of generative methods for drug design is complex and multifaceted. In this report, we propose a separation of concerns for de novo drug design, dividing the task into three main categories: generation, discrimination, and exploration. We demonstrate that changes to any of these three concerns affect benchmark performance on drug design tasks. We present Deriver, an open-source Python package that acts as a modular framework for molecule generation, with a focus on integrating multiple generative methods. Using Deriver, we demonstrate that changing parameters related to each of these three concerns significantly affects chemical space traversal, and that the freedom to adjust each independently is critical for real-world applications with conflicting priorities. We find that combining multiple generative methods can improve the optimization of molecular properties and lower the chance of becoming trapped in local minima. Additionally, filtering molecules for drug-likeness (based on physicochemical properties and SMARTS pattern matching) before they are scored can hinder exploration, but can improve the quality of the final molecules. Finally, we demonstrate that any given task has an exploration algorithm best suited to it, though in practice linear probabilistic sampling generally gives better outcomes than Monte Carlo sampling or greedy sampling. We intend Deriver, which is being made freely available, to be helpful to others interested in collaboratively improving existing methods in de novo drug design centered around inheritance of molecular structure, modularity, extensibility, and separation of concerns.
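To make the exploration concern more concrete, the following is a minimal sketch of how the three sampling strategies named above might differ when choosing which scored molecules to carry forward. It is not Deriver's API. The function names, the proportional-to-score reading of "linear probabilistic sampling", the Metropolis-style reading of "Monte Carlo sampling", and the toy SMILES scores are all assumptions made for illustration.

```python
import math
import random
from typing import Dict, List


def greedy_select(scores: Dict[str, float], k: int) -> List[str]:
    """Greedy: keep the k highest-scoring candidates."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


def linear_probabilistic_select(
    scores: Dict[str, float], k: int, rng: random.Random
) -> List[str]:
    """Assumed interpretation: draw k distinct candidates, each with
    probability proportional to its (non-negative) score."""
    pool = dict(scores)
    chosen: List[str] = []
    for _ in range(min(k, len(pool))):
        names = list(pool)
        weights = [pool[name] for name in names]
        if sum(weights) <= 0:
            # No positive weight left: fall back to a uniform draw.
            pick = rng.choice(names)
        else:
            pick = rng.choices(names, weights=weights, k=1)[0]
        chosen.append(pick)
        del pool[pick]  # sample without replacement
    return chosen


def metropolis_step(
    scores: Dict[str, float], current: str, temperature: float, rng: random.Random
) -> str:
    """Assumed interpretation of Monte Carlo sampling: propose a random
    candidate and accept it if it scores at least as well, or otherwise
    with probability exp(delta / temperature)."""
    others = [name for name in scores if name != current]
    if not others:
        return current
    proposal = rng.choice(others)
    delta = scores[proposal] - scores[current]
    if delta >= 0 or rng.random() < math.exp(delta / temperature):
        return proposal
    return current


# Hypothetical scores (higher is better) for a few toy SMILES strings.
toy_scores = {"CCO": 0.1, "c1ccccc1": 0.9, "CC(=O)O": 0.5}
rng = random.Random(0)
print(greedy_select(toy_scores, 2))
print(linear_probabilistic_select(toy_scores, 2, rng))
print(metropolis_step(toy_scores, "CCO", temperature=0.5, rng=rng))
```

Under these assumptions, greedy selection always exploits the current best candidates, while the two stochastic strategies occasionally keep lower-scoring molecules, which is one way an exploration algorithm can trade final score against the risk of becoming trapped in local minima.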