Graph neural networks and molecular docking as two complementary approaches for virtual screening: a case study on Cruzain

21 December 2022, Version 1
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

The idea behind virtual screening is to test compounds computationally first, reducing the number of compounds that must be screened experimentally and thus the time and cost of physical experiments. Molecular docking, the most popular virtual screening technique, predicts the binding of candidate compounds to the protein target by modeling the interactions at the binding pocket. Despite its wide use, docking accuracy is often low because inherently complex biological systems are difficult to model. On the other hand, state-of-the-art deep neural networks, such as Graph Convolutional Networks (GCNs), can capture the complex non-linear relationships between structural and biological data, but they lack the interpretability of structure-based modeling. In this work, we took advantage of the activity data from a quantitative High Throughput Screen (HTS) of ~200K compounds against Cruzain (Cz) to retrospectively evaluate the ability of a docking algorithm and a GCN to prioritize the active compounds in the dataset. We then propose strategies to combine both techniques in a single virtual screening pipeline in order to exploit their orthogonal benefits. Plugging the atomic embeddings learned by the GCN into the docking algorithm by means of pharmacophoric restraints enhanced the docking algorithm's ability to retrieve the active ligands. Moreover, applying the GCN as a pre-docking filter enriched the compound library in active molecules, and subsequent docking of the filtered library achieved significantly higher hit rates. This work aims to be a proof of concept of the usefulness of combination strategies involving deep learning and classical molecular docking techniques in the context of drug discovery.
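
The pre-docking filter strategy and the hit-rate comparison described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `keep_fraction` parameter, the score conventions (higher GCN score means more likely active, lower docking score means better pose) and the eight-compound toy library are all assumptions made for illustration, and the numbers are invented rather than taken from the study.

```python
from typing import List, Tuple

# Hypothetical record: (gcn_score, docking_score, is_active).
# Assumed conventions: higher GCN score = more likely active,
# lower (more negative) docking score = better predicted binding.
Compound = Tuple[float, float, bool]


def hit_rate(compounds: List[Compound]) -> float:
    """Fraction of the given compounds that are experimentally active."""
    if not compounds:
        return 0.0
    return sum(1 for c in compounds if c[2]) / len(compounds)


def gcn_prefilter(library: List[Compound], keep_fraction: float) -> List[Compound]:
    """Keep the top `keep_fraction` of the library ranked by GCN score."""
    n_keep = max(1, int(len(library) * keep_fraction))
    return sorted(library, key=lambda c: c[0], reverse=True)[:n_keep]


def top_by_docking(library: List[Compound], n: int) -> List[Compound]:
    """Select the n compounds with the best (lowest) docking scores."""
    return sorted(library, key=lambda c: c[1])[:n]


# Invented toy library, NOT data from the Cruzain screen.
library: List[Compound] = [
    (0.95, -9.1, True),
    (0.90, -6.0, True),
    (0.80, -6.5, False),
    (0.70, -7.2, True),
    (0.40, -9.5, False),
    (0.30, -8.8, False),
    (0.20, -5.0, False),
    (0.10, -9.0, False),
]

docking_only = top_by_docking(library, 2)          # docking on the full library
filtered = gcn_prefilter(library, keep_fraction=0.5)  # GCN as pre-docking filter
combined = top_by_docking(filtered, 2)             # docking on the filtered library

print("baseline hit rate:     ", hit_rate(library))
print("docking-only hit rate: ", hit_rate(docking_only))
print("filtered library rate: ", hit_rate(filtered))
print("GCN + docking hit rate:", hit_rate(combined))
```

The toy scores were chosen so that the filter removes inactive compounds with good docking scores; on real data the size of any improvement depends on how well the GCN ranks the actives.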

Keywords

Graph Neural Network
Docking
Chagas Disease
Cruzain
Structure Based Virtual Screening
Ligand Based Virtual Screening
