Evolutionary Relationships and Sequence-Structure Determinants in Human SARS Coronavirus-2 Spike Proteins for Host Receptor Recognition
2020-04-27T07:23:30Z (GMT)
Coronavirus disease 2019 (COVID-19) is a pandemic infectious disease caused by the novel Severe Acute Respiratory Syndrome coronavirus-2 (SARS CoV-2). SARS CoV-2 is transmitted more rapidly and readily than SARS CoV. Both SARS CoV and SARS CoV-2 recognize the human angiotensin converting enzyme-2 (ACE-2) receptor via their glycosylated spike proteins. We generated multiple sequence alignments and phylogenetic trees for representative spike proteins of CoV and CoV-2 from various host sources to analyze the specificity in SARS CoV-2 spike proteins required for causing infection in humans. Our results show that two sequence motifs in the N-terminal domain, "MESEFR" and "SYLTPG", are specific to human SARS CoV-2. In the receptor binding domain (RBD), two sequence motifs, "VGGNY" and "EIYQAGSTPCNGV", and a disulfide bridge connecting 480C and 488C in the extended loop are structural determinants for the recognition of the human ACE-2 receptor. Complete genome analysis of representative SARS CoVs from bat, civet, and human hosts, together with human SARS CoV-2, identified the bat genome (GenBank code: MN996532.1) as the closest to the recent novel human SARS CoV-2 genomes. The bat CoV genomes (GenBank codes: MG772933 and MG772934) are evolutionary intermediates in the mutagenesis progression towards becoming human SARS CoV-2.
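As an illustration of how the motifs named above could be located in a protein sequence, the following is a minimal Python sketch, not the authors' pipeline (which relied on multiple sequence alignments and phylogenetic trees). The function name find_motifs and the toy fragment are hypothetical and introduced only for this example; only the four motif strings come from the abstract.

```python
# Motifs reported in the abstract, grouped by the domain they were found in.
NTD_MOTIFS = ["MESEFR", "SYLTPG"]          # N-terminal domain
RBD_MOTIFS = ["VGGNY", "EIYQAGSTPCNGV"]    # receptor binding domain


def find_motifs(sequence, motifs):
    """Return {motif: [1-based start positions]} for exact matches.

    Hypothetical helper: plain substring search, no alignment or gaps.
    """
    seq = sequence.upper()
    hits = {}
    for motif in motifs:
        positions = []
        start = seq.find(motif)
        while start != -1:
            positions.append(start + 1)  # convert to 1-based residue numbering
            start = seq.find(motif, start + 1)
        hits[motif] = positions
    return hits


# Hypothetical toy fragment for demonstration, NOT a real spike sequence.
toy_fragment = "AAAVGGNYAAAEIYQAGSTPCNGVAAA"
rbd_hits = find_motifs(toy_fragment, RBD_MOTIFS)
ntd_hits = find_motifs(toy_fragment, NTD_MOTIFS)
```

Exact substring matching is used here only for simplicity. Comparing motifs across divergent sequences from different hosts would, as in the study itself, require a multiple sequence alignment so that equivalent positions line up.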