Abstract
A critical step in structure-based drug discovery is predicting whether and how a candidate molecule binds to a model of a therapeutic target. However, substantial protein side-chain movements prevent current screening methods, such as docking, from accurately predicting ligand conformations, so expensive refinement is required to produce viable candidates. We present a high-throughput and flexible ligand pose refinement workflow called "tinyIFD". The main features of the workflow include the use of mdgx.cuda, a specialized high-throughput MD simulation code for small systems, and an active-learning model zoo approach. We apply this workflow to a large test set of diverse protein targets, achieving 70% and 78% success rates for finding a crystal-like pose within the top-2 and top-5 poses, respectively. We also apply this workflow to SARS-CoV-2 main protease (Mpro) inhibitors, where we demonstrate the benefit of the workflow's active learning component.
Supplementary materials
Title
Supporting Information for the manuscript
Description
This document includes: (1) a list of the software and versions used in this manuscript, (2) a detailed explanation of the features extracted from MD snapshots for the classifiers, (3) a list of cross-docking cases used in the model zoo, (4) a list of cross-docking cases in the training set but excluded from the model zoo, (5) test set refinement results, and (6) a list of PDB entries used for the Mpro dataset.