Abstract
One aim of the international Human Proteome Organization (HUPO) Human Proteome Project (HPP) is to obtain high-confidence translation evidence for every human protein-coding gene in its target list of 19433 entries, which is based on the protein-coding genes from Ensembl-GENCODE. However, 76 of these entries are annotated in UniProtKB (as of release 2024_06) with PE5, indicating a manual curator’s skepticism about the protein’s existence, so it is unclear whether these entries belong in the HPP target list. Here we review these 76 entries by assembling evidence from the literature, reference databases, and genome alignments with other species to determine whether they should be moved from PE5 to PE1-4 in UniProtKB. We find that 17 of these have credible translation evidence and therefore should be upgraded to PE1. Another 15 lack translation evidence but have transcription evidence and the evolutionary hallmarks of protein-coding genes, and are presumed to produce functional proteins. A further 41 have neither translational nor transcriptional evidence, although they still bear the evolutionary hallmarks of protein-coding genes; it currently remains unclear whether these are protein-coding, so their representation becomes a matter of policy. Only 3 entries still seem best categorized as PE5 and excluded from the HUPO-HPP target list.
Supplementary materials
Supplementary Table S1: Output of the HPP target list processing pipeline providing 19433 entries and their attributes.
Supplementary Table S2: Union of four separate lists of PE5 proteins that are examined here.
Supplementary Table S3: Extension of Table 1 with additional columns for each of the 76 entries.