Single molecule identification and quantification of whole proteins without purification, proteolysis, or labeling: a computational model

20 June 2024, Version 1
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

A recent report shows that, with a suitably designed buffer solution, proteins can be unfolded and translocated through a nanopore unidirectionally and uniformly, with residues exiting the pore in sequence order at a roughly constant rate of one residue per µs (Nature Biotechnology 41, 1130–1139, 2023). The present work shows in theory that, by sampling the pore exclusion volume signal (a proxy for the measured blockade current) at a low frequency of 10–20 kHz and digitizing the sampled signal at a volume precision of 70 Å³, a substantial majority of the proteins in a proteome can be identified and counted without labeling. Computations on the full set of sequences in the human proteome (UniProt ID UP000005640_9606) show that ~70% of the proteins can be identified; the result generally holds even when post-translational modifications (PTMs) are present. The identification rate can be increased to better than 95% with modified algorithms; with an array of 100 pores, ~10⁹ proteins can be identified and counted in about 1.5 hours. This is a minimalist, non-destructive, label-free single-molecule approach based on unmodified nanopores; it is a potential alternative to mass spectrometry that overcomes many of the latter's limitations. In principle it can work with whole proteins in mixtures over the full dynamic range of a proteome, without purification or separation, proteolytic degradation, or enzymes for translocation control.
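The figures quoted in the abstract can be cross-checked with a short back-of-envelope calculation. The sketch below is not part of the preprint's model: the function and variable names are illustrative, the protein count is read as 10⁹ (an assumption), and any idle time between successive proteins in a pore is ignored.

```python
# Back-of-envelope check of the figures quoted in the abstract.
# All names here are illustrative and are not taken from the preprint.

RESIDUE_RATE_PER_S = 1e6          # ~1 residue per microsecond (abstract)
SAMPLING_RATES_HZ = (10e3, 20e3)  # 10-20 kHz sampling (abstract)
N_PORES = 100                     # pore array size (abstract)
N_PROTEINS = 1e9                  # assumed reading of the quoted protein count
RUN_TIME_S = 1.5 * 3600           # about 1.5 hours (abstract)


def residues_per_sample(sampling_rate_hz, residue_rate_per_s=RESIDUE_RATE_PER_S):
    """Number of residues that exit the pore during one sampling interval."""
    return residue_rate_per_s / sampling_rate_hz


def time_budget_per_protein_us(n_proteins, n_pores, run_time_s):
    """Average time, in microseconds, that each pore can spend on one protein."""
    proteins_per_pore = n_proteins / n_pores
    return run_time_s / proteins_per_pore * 1e6


for rate in SAMPLING_RATES_HZ:
    print(f"{rate / 1e3:.0f} kHz: {residues_per_sample(rate):.0f} residues per sample")

budget_us = time_budget_per_protein_us(N_PROTEINS, N_PORES, RUN_TIME_S)
print(f"time budget per protein: {budget_us:.0f} us")
```

Under these figures each sample spans roughly 50 to 100 residues, and each pore has about 540 µs per protein, which at the quoted translocation rate corresponds to about 540 residues.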

Keywords

Protein identification
Protein quantification
Nanopores
Post-translational modifications
Dynamic range

Supplementary materials

Title: Protein id file for human proteome
Description: Contains ids for 20538 proteins in the human proteome (UniProt ID UP000005640_9606), obtained with the computational model

Title: Protein sequences in human proteome
Description: Reduced file of all 20598 sequences in the human proteome (UniProt ID UP000005640_9606)
