Deep learning methods for de novo peptide sequencing

Wout Bittremieux; Varun Ananth; William E. Fondrie; Carlo Melendez; Marina Pominova; Justin Sanders; Bo Wen; Melih Yilmaz; William Stafford Noble

doi:10.26434/chemrxiv-2024-l6wnt-v2

Theoretical and Computational Chemistry


07 October 2024, Version 2

Review


This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Protein tandem mass spectrometry data is most often interpreted by matching observed mass spectra to a protein database derived from the reference genome of the sample being analyzed. In many application domains, however, a relevant protein database is unavailable or incomplete, and in such settings de novo sequencing is required. Since the introduction of the DeepNovo algorithm in 2017, the field of de novo sequencing has been dominated by deep learning methods, which use large amounts of labeled mass spectrometry data to train multi-layer neural networks to translate from observed mass spectra to corresponding peptide sequences. Here, we describe these deep learning methods, outline procedures for evaluating their performance, and discuss the challenges in the field, both in terms of methods development and evaluation protocols.


Version History

Oct 07, 2024 Version 2

May 27, 2024 Version 1

Version Notes

- Additional de novo tools.
- Discussion of PTMs.
- Discussion of resource requirements.


License

The content is available under CC BY 4.0


Author’s competing interest statement

The author(s) have declared they have no conflict of interest with regard to this content

Ethics

The author(s) have declared ethics committee/IRB approval is not relevant to this content
