Abstract
Machine learning holds significant promise for accelerating biomarker discovery in clinical proteomics, yet its real-world impact remains limited by widespread methodological pitfalls and unrealistic expectations. In this perspective, we critically examine the integration of machine learning into clinical proteomics workflows, emphasizing that algorithmic novelty alone cannot compensate for issues such as small sample sizes, batch effects, overfitting, data leakage, and poor model generalization. We caution against the uncritical application of complex models, such as deep learning architectures, which often exacerbate these problems while offering limited interpretability and negligible performance gains on typical clinical proteomics datasets. Instead, we advocate for the realistic and responsible use of machine learning, grounded in rigorous study design, appropriate validation strategies, and transparent, reproducible modeling practices. Emphasizing simplicity, interpretability, and domain awareness over hype-driven complexity is essential if machine learning is to fulfill its translational potential in the clinic.