Abstract
Identifying promising lead compounds that show pharmacological activity toward a biological target is essential in early-stage drug discovery. With the recent growth in available small-molecule databases, virtual high-throughput screening using physics-based molecular docking has emerged as an essential tool for fast and cost-efficient lead discovery and optimization. However, the best-scored docking poses are often suboptimal, resulting in incorrect screening and chemical property calculation. We address the pose classification problem by leveraging data-driven machine learning approaches to identify correct docking poses from AutoDock Vina and Glide screens. To enable effective classification of docking poses, we present two convolutional neural network approaches: a 3D convolutional neural network (3D-CNN) and an attention-based point cloud network (PCN), both trained on the PDBbind refined set. We demonstrate the effectiveness of the proposed classifiers on multiple evaluation datasets, including the standard PDBbind CASF-2016 benchmark dataset and compound libraries with structurally different protein targets: an ion-channel dataset extracted from the Protein Data Bank (PDB) and an in-house KCa3.1 inhibitor dataset. Our experiments show that excluding false-positive docking poses using the proposed classifiers improves virtual high-throughput screening for novel molecules against each target protein, compared with the initial screen based on the docking scores.
Supplementary materials
Title
Supporting Information: Pose Classification using 3D Atomic Structure-Based Neural Networks Applied to Ion Channel-Ligand Docking
Description
All the compound structures of our in-house KCa3.1 inhibitor datasets. Correlations between the Vina docking scores and the binding affinities of the PDB ion-channel complexes using all 7 pose classification models. Correlations between the docking scores and the binding affinities of the KCa3.1 channel inhibitor complexes using all 7 pose classification models. Pearson correlations between binding affinity and docking scores of the top 10, 20, 30, and 40 ranked compounds, ranked by the confidence scores of our pose classifier models, on the KCa3.1 channel inhibitor dataset with Vina and Glide. Top 10 ranked compounds in the KCa3.1 channel inhibitor dataset using the 4 best-performing pose classifiers.