A Minimalistic Deep Graph Learning Approach for Protein-Ligand Binding Affinity: One Step Towards Generalization

21 March 2025, Version 1
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Predicting protein-ligand binding affinity is a fundamental challenge in structure-based drug design. Although deep learning models have significantly improved affinity prediction, many state-of-the-art approaches rely on complex architectures with tens or hundreds of thousands of trainable parameters, which may lead to overfitting and reduced generalizability. In this study, we introduce ECIF-GCN, a minimalist deep graph learning model that extends the Extended Connectivity Interaction Features (ECIF) framework with a fully connected graph representation and uses Graph Convolutional Networks (GCNs) to process molecular interactions. ECIF-GCN was trained and evaluated on LP-PDBbind, a benchmark specifically designed to minimize protein and ligand similarity across dataset splits and thereby provide a rigorous assessment of model generalization. Despite having significantly fewer trainable parameters than more complex architectures, ECIF-GCN achieved the lowest RMSE (1.52) on the LP-PDBbind test set, outperforming models such as InteractionGraphNet and RF-Score, which contain a substantially larger number of parameters. These results suggest that a carefully designed low-parameter model can achieve state-of-the-art performance while balancing predictive power, computational efficiency, and generalization ability, and that overparameterization is not a prerequisite for accurate protein-ligand binding affinity prediction.
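
To make the ingredients named in the abstract concrete, the following is a minimal NumPy sketch of a graph convolution applied to a fully connected graph, followed by mean pooling, a linear readout, and the RMSE metric. It is an illustration under stated assumptions, not the authors' ECIF-GCN implementation: the layer uses the standard symmetric-normalized GCN propagation rule, and the function names, layer sizes, random node features, and example affinity values are all hypothetical.

```python
import numpy as np


def fully_connected_adjacency(n_nodes):
    """Adjacency matrix linking every node to every other node (no self-loops)."""
    return np.ones((n_nodes, n_nodes)) - np.eye(n_nodes)


def gcn_layer(adj, feats, weight):
    """One standard graph convolution: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    a_hat = adj + np.eye(adj.shape[0])      # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(norm @ feats @ weight, 0.0)


def predict_affinity(feats, w1, w2, w_out):
    """Two GCN layers, mean pooling over nodes, then a linear readout."""
    adj = fully_connected_adjacency(feats.shape[0])
    hidden = gcn_layer(adj, feats, w1)
    hidden = gcn_layer(adj, hidden, w2)
    pooled = hidden.mean(axis=0)            # graph-level embedding
    return float(pooled @ w_out)


def rmse(y_true, y_pred):
    """Root-mean-square error between measured and predicted affinities."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Hypothetical sizes: 6 nodes with 8 features each, hidden width 4.
rng = np.random.default_rng(0)
node_feats = rng.normal(size=(6, 8))
w1 = rng.normal(size=(8, 4))
w2 = rng.normal(size=(4, 4))
w_out = rng.normal(size=4)

n_params = w1.size + w2.size + w_out.size   # 32 + 16 + 4 weights in this toy model
prediction = predict_affinity(node_feats, w1, w2, w_out)
error = rmse([6.2, 7.5], [6.0, 8.0])        # made-up affinity values
```

Note that on an unweighted fully connected graph with self-loops, the normalized adjacency averages over all nodes, so every node receives the same message and this sketch reduces to pooling node features. The abstract does not say how ECIF-GCN weights or featurizes edges, so that part is left out here.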

Keywords

Graph Neural Networks
Protein-Ligand Binding Affinity Prediction
Machine Learning for Drug Discovery
