SMARTDock: A toolkit for the automated development of target-specific scoring functions using bioactivity data

01 July 2025, Version 1
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Molecular docking has become an essential tool in the early stages of structure-based drug discovery, enabling rapid virtual screening of large compound libraries against biological targets. However, the accuracy of binder selection is often limited by the available scoring functions. Here, we present SMARTDock (Scoring with Machine learning and Activity for Ranking Targeted Docking), a novel workflow that enhances the virtual screening capabilities of GOLD docking by integrating publicly available bioactivity data, a protein-ligand interaction fingerprint (PADIF), and machine learning classification models within a user-friendly Docker environment. This platform-independent approach runs on different operating systems and is accessible to both computational and medicinal chemists. With only a ChEMBL target ID, a protein structure file, and a list of test compounds as SMILES, users can build and apply target-specific scoring models to improve the enrichment of active compounds in the top ranks. We demonstrate that this workflow reliably enhances virtual screening performance in a high-throughput screening context and provides significant benefits for early-stage drug discovery. Finally, we discuss the advantages and disadvantages of bioactivity classification in virtual screening tasks.
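
To make "enrichment of active compounds in the top ranks" concrete, the following is a minimal sketch of the enrichment factor, a metric commonly used to evaluate virtual screening. The abstract does not state which metric the authors report, so the choice of metric, the function name `enrichment_factor`, and the convention that a higher score means "more likely active" are all assumptions made for illustration.

```python
from typing import Sequence


def enrichment_factor(
    scores: Sequence[float],
    is_active: Sequence[bool],
    top_fraction: float = 0.01,
) -> float:
    """Enrichment factor of a ranked list (illustrative, not the authors' code).

    EF = (actives in top / size of top) / (total actives / library size).
    Assumes that a higher score means the compound is ranked better.
    """
    if len(scores) != len(is_active):
        raise ValueError("scores and is_active must have the same length")
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")

    n_total = len(scores)
    total_actives = sum(bool(a) for a in is_active)
    if n_total == 0 or total_actives == 0:
        raise ValueError("need at least one compound and one active")

    # Keep at least one compound in the selected top slice.
    n_top = max(1, round(n_total * top_fraction))

    # Rank compounds from best (highest score) to worst.
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0], reverse=True)
    top_actives = sum(bool(active) for _, active in ranked[:n_top])

    return (top_actives / n_top) / (total_actives / n_total)
```

An enrichment factor of 1 corresponds to random selection. Values above 1 mean that the scoring model places more actives in the selected top fraction than chance would.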

Keywords

Molecular Docking
Bioactivity Data
Machine Learning
Scoring Functions
Target-Specific Scoring Functions

Supplementary materials

Title: Supporting Information
Description: Supporting Information for SMARTDock (Figures S1-S5, Table S1)
