Encoding Protein-Ligand Interactions: Binding Affinity Prediction with Multigraph-based Modeling and Graph Convolutional Network

18 December 2023, Version 2
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Machine learning models are used in drug discovery to speed up the process and provide novel insights, owing to their demonstrated effectiveness in predicting small-molecule properties such as pKa, solubility, and binding affinity. These approaches accelerate drug discovery by helping researchers efficiently identify, prioritize, and optimize compounds. Nonetheless, for properties that depend on the interaction between a ligand and its target protein, the models also need to incorporate information about the protein. Recently, graph neural networks (GNNs) have been developed to incorporate 3D structural information and improve our understanding of the underlying protein-ligand interactions. However, incorporating 3D information into GNNs is not always straightforward. To address this challenge, we introduce InterGraph, a model that represents protein-ligand interactions as topological multigraphs. By leveraging a topological representation, InterGraph provides a graph representation of the spatial organization and connectivity patterns within protein-ligand systems. We introduce interaction spheres that assign varying edge densities, capturing the proximity-based influence of interactions. This approach enables us to capture the characteristics of the interaction network while filtering out interactions beyond 9 Å from the ligand, which are not considered relevant or established. Finally, we trained the model using a ligand binding dataset from PDBbind and tested it on a hold-out test set, achieving an RMSE value of 1.34. Our findings demonstrate the power of the multigraph to encode the importance of close interactions, a factor that is relevant to binding affinity.
On average, our model accurately predicts binding affinity values for several protein-ligand complexes and exhibits higher accuracy for hydrolases, lyases, and families of proteins involved in mediating protein-protein interactions. Additionally, the InterGraph method displayed sensitivity to the binding mode when compared against a set of complexes that had undergone redocking calculations.
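
The idea of interaction spheres that assign varying edge densities can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives only the 9 Å outer cutoff, so the inner shell radii, the number of parallel edges per shell, and the names `SHELLS`, `edge_multiplicity`, and `build_multigraph_edges` are all hypothetical choices made for illustration.

```python
import numpy as np

# Hypothetical shells: (outer radius in angstroms, number of parallel edges).
# Only the 9 A outer cutoff comes from the abstract; the inner radii and
# the edge counts are illustrative assumptions.
SHELLS = ((3.0, 3), (6.0, 2), (9.0, 1))


def edge_multiplicity(distance, shells=SHELLS):
    """Return how many parallel edges a ligand-protein atom pair receives."""
    for radius, count in shells:
        if distance <= radius:
            return count
    return 0  # beyond the outermost sphere: the pair is filtered out


def build_multigraph_edges(ligand_xyz, protein_xyz, shells=SHELLS):
    """List (ligand_index, protein_index) pairs, repeated once per parallel edge."""
    ligand_xyz = np.asarray(ligand_xyz, dtype=float)
    protein_xyz = np.asarray(protein_xyz, dtype=float)
    # Pairwise Euclidean distances, shape (n_ligand, n_protein).
    diffs = ligand_xyz[:, None, :] - protein_xyz[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    edges = []
    for i in range(dists.shape[0]):
        for j in range(dists.shape[1]):
            edges.extend([(i, j)] * edge_multiplicity(dists[i, j], shells))
    return edges


# Toy coordinates: one ligand atom and four protein atoms at 2, 5, 8, and 12 A.
ligand = [[0.0, 0.0, 0.0]]
protein = [[2.0, 0.0, 0.0], [5.0, 0.0, 0.0], [8.0, 0.0, 0.0], [12.0, 0.0, 0.0]]
edges = build_multigraph_edges(ligand, protein)
```

In this toy example, closer protein atoms contribute more parallel edges to the ligand atom, and the atom at 12 Å contributes none. A graph convolution that aggregates over edges would therefore weight close contacts more heavily, which is one plausible reading of how a multigraph can encode proximity.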

Keywords

GCN
Binding affinity
Multigraph
Protein-ligand binding

Supplementary materials

Title: Supporting Information
Description: The supporting information for the paper "Encoding Protein-Ligand Interactions: Binding Affinity Prediction with Multigraph-based Modeling and Graph Convolutional Network" provides additional details and resources that complement the primary research manuscript. It comprises key visuals, including K-fold cross-validation results, probability density distributions, a pie chart depicting successful predictions, and comparison plots. These visuals clarify the model's performance and the data distribution.
