Abstract
Molecular docking has become an essential tool in the early stages of structure-based drug discovery, enabling rapid virtual screening of large compound libraries against biological targets. However, the accuracy of binder selection is often limited by the available scoring functions. Here, we present SMARTDock (Scoring with Machine learning and Activity for Ranking Targeted Docking), a novel workflow that enhances the virtual screening capabilities of GOLD docking by integrating publicly available bioactivity data, a protein-ligand interaction fingerprint (PADIF), and machine learning classification models within a user-friendly Docker environment. This platform-independent approach runs on different operating systems and is accessible to both computational and medicinal chemists. With only a ChEMBL target ID, a protein structure file, and a list of test compounds as SMILES, users can build and apply target-specific scoring models to improve the enrichment of active compounds in the top ranks. We demonstrate that this workflow reliably enhances virtual screening performance in a high-throughput screening context and provides significant benefits for early-stage drug discovery. Finally, we outline the advantages and disadvantages of bioactivity classification in virtual screening tasks.
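As an illustration of what "enrichment of active compounds in the top ranks" can mean in practice, the sketch below computes the enrichment factor (EF), one commonly used virtual screening metric. The abstract does not state which metric SMARTDock reports, so the choice of EF, the function name `enrichment_factor`, and the convention that a higher score means a better rank are all assumptions made for this example only.

```python
def enrichment_factor(scores, labels, top_fraction=0.01):
    """Enrichment factor for the top-ranked fraction of a screened library.

    scores: one score per compound (assumption: higher = better rank).
    labels: 1 for a known active, 0 for an inactive or decoy.
    top_fraction: fraction of the ranked library to inspect, e.g. 0.01 for 1%.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    n_total = len(scores)
    n_actives = sum(labels)
    if n_total == 0 or n_actives == 0:
        raise ValueError("need at least one compound and one active")

    # Number of compounds in the inspected top slice (at least one).
    n_top = max(1, round(n_total * top_fraction))

    # Rank compounds from best to worst score and count actives in the slice.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits_in_top = sum(label for _, label in ranked[:n_top])

    # Hit rate in the top slice divided by the hit rate of the whole library.
    return (hits_in_top / n_top) / (n_actives / n_total)


if __name__ == "__main__":
    # Toy data (hypothetical): 10 compounds, 2 actives, both ranked first.
    toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
    toy_labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
    print(enrichment_factor(toy_scores, toy_labels, top_fraction=0.2))
```

An EF of 1 corresponds to random selection. Values above 1 indicate that actives are concentrated in the inspected top fraction, which is the behavior a target-specific scoring model aims for.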
Supplementary materials
Title: Supporting Information
Description: Supporting Information for SMARTDock
- Figures S1-S5
- Table S1