Abstract
Predicting protein-ligand binding affinity is a fundamental challenge in structure-based drug design. Although deep learning models have substantially improved affinity prediction, many state-of-the-art approaches rely on complex architectures with tens or hundreds of thousands of trainable parameters, which may lead to overfitting and reduced generalizability. In this study, we introduce ECIF-GCN, a minimalist deep graph learning model that extends the Extended Connectivity Interaction Features (ECIF) framework with a fully connected graph representation processed by Graph Convolutional Networks (GCNs). ECIF-GCN was trained and evaluated on LP-PDBbind, a benchmark designed to minimize protein and ligand similarity across dataset splits and thus to provide a rigorous assessment of model generalization. Despite having far fewer trainable parameters, ECIF-GCN achieved the lowest RMSE (1.52) on the LP-PDBbind test set, outperforming models such as InteractionGraphNet and RF-Score, which contain substantially more parameters. These results demonstrate that high predictive accuracy in binding affinity estimation does not require highly overparameterized deep learning models. They also highlight the potential of minimalist architectures to balance predictive power, computational efficiency, and generalization ability, and suggest that a carefully designed low-parameter model can achieve state-of-the-art performance.
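
For readers unfamiliar with graph convolutions, the following is a minimal sketch of one standard GCN propagation step, ReLU(D^-1/2 (A + I) D^-1/2 H W), applied to a small fully connected graph. It is not the ECIF-GCN architecture itself: the node count, feature sizes, and random edge weights are hypothetical placeholders chosen only to show how few trainable parameters a single such layer needs.

```python
import numpy as np

def gcn_layer(H, A, W):
    """One standard graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    A_hat = A + np.eye(A.shape[0])                  # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))   # D^-1/2 as a vector
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ H @ W, 0.0)          # ReLU activation

rng = np.random.default_rng(0)
n_nodes, n_feat, n_hidden = 6, 4, 8                 # hypothetical sizes

# Fully connected graph: every pair of distinct nodes shares an edge.
# The edge weights are random placeholders (an assumption, not ECIF values).
raw = rng.uniform(0.1, 1.0, size=(n_nodes, n_nodes))
A = (raw + raw.T) / 2.0
np.fill_diagonal(A, 0.0)

H = rng.normal(size=(n_nodes, n_feat))              # node feature matrix
W = rng.normal(size=(n_feat, n_hidden))             # the layer's only trainable weights

out = gcn_layer(H, A, W)
n_params = W.size                                   # n_feat * n_hidden
```

In this sketch the weight matrix `W` is shared across all nodes, so the parameter count depends only on the feature dimensions and not on the number of nodes or edges in the graph.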