Rapid protein sequence evolution via compensatory frameshift is widespread in RNA virus genomes

Bibliographic Details
Main Authors: Hahn, Y. (Author), Park, D. (Author)
Format: Article
Language: English
Published: BioMed Central Ltd 2021
Subjects:
RNA
Online Access: View Fulltext in Publisher
LEADER 03110nam a2200565Ia 4500
001 10.1186-s12859-021-04182-9
008 220427s2021 CNT 000 0 und d
020 |a 14712105 (ISSN) 
245 1 0 |a Rapid protein sequence evolution via compensatory frameshift is widespread in RNA virus genomes 
260 0 |b BioMed Central Ltd  |c 2021 
856 |z View Fulltext in Publisher  |u https://doi.org/10.1186/s12859-021-04182-9 
520 3 |a Background: RNA viruses possess remarkable evolutionary versatility driven by the high mutability of their genomes. Frameshifting nucleotide insertions or deletions (indels), which cause the premature termination of proteins, are frequently observed in the coding sequences of various viral genomes. When a secondary indel occurs near the primary indel site, the open reading frame can be restored to produce functional proteins, a phenomenon known as the compensatory frameshift. Results: In this study, we systematically analyzed publicly available viral genome sequences and identified compensatory frameshift events in hundreds of viral protein-coding sequences. Compensatory frameshift events resulted in large-scale amino acid differences between the compensatory frameshift form and the wild type even though their nucleotide sequences were almost identical. Phylogenetic analyses revealed that the evolutionary distances between proteins with and without a compensatory frameshift were significantly overestimated because amino acid mismatches caused by compensatory frameshifts were counted as substitutions. Further, this could cause compensatory frameshift forms to branch in different locations in the protein and nucleotide trees, which may obscure the correct interpretation of phylogenetic relationships between variant viruses. Conclusions: Our results imply that the compensatory frameshift is one of the mechanisms driving the rapid protein evolution of RNA viruses and potentially assisting their host-range expansion and adaptation. © 2021, The Author(s). 
650 0 4 |a amino acid sequence 
650 0 4 |a Amino Acid Sequence 
650 0 4 |a Amino acids 
650 0 4 |a Amino-acids 
650 0 4 |a Coding sequences 
650 0 4 |a Compensatory frameshift 
650 0 4 |a Frameshift 
650 0 4 |a frameshift mutation 
650 0 4 |a Frameshift Mutation 
650 0 4 |a Genes 
650 0 4 |a genetics 
650 0 4 |a Genome, Viral 
650 0 4 |a Nucleotides 
650 0 4 |a Open reading frame 
650 0 4 |a phylogeny 
650 0 4 |a Phylogeny 
650 0 4 |a Premature termination 
650 0 4 |a Protein evolution 
650 0 4 |a Protein sequences 
650 0 4 |a Proteins 
650 0 4 |a RNA 
650 0 4 |a RNA virus 
650 0 4 |a RNA Viruses 
650 0 4 |a RNA, Viral 
650 0 4 |a Viral genome 
650 0 4 |a virus genome 
650 0 4 |a virus RNA 
650 0 4 |a Viruses 
700 1 |a Hahn, Y.  |e author 
700 1 |a Park, D.  |e author 
773 |t BMC Bioinformatics
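
Illustrative note: the compensatory frameshift described in the abstract can be sketched with a toy example. The coding sequence below is invented for illustration and is not taken from the article's data; the helper names (`translate`, `delete_base`, `insert_base`, `count_mismatches`) are hypothetical, and the standard genetic code is assumed. The sketch applies a one-nucleotide deletion followed by a nearby one-nucleotide insertion, then compares the resulting protein with the wild type.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered over T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate from the first base, stopping at the first stop codon.

    A trailing incomplete codon is ignored.
    """
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def delete_base(seq, pos):
    """Return seq with the base at 0-based index pos removed."""
    return seq[:pos] + seq[pos + 1:]


def insert_base(seq, pos, base):
    """Return seq with base inserted before 0-based index pos."""
    return seq[:pos] + base + seq[pos:]


def count_mismatches(a, b):
    """Count position-wise differences over the shared length of a and b."""
    return sum(x != y for x, y in zip(a, b))


# Hypothetical wild-type coding sequence (assumption, not real viral data).
WILD_TYPE = "ATGGCTGAACGTCTCAAAGGTTTCGACCCGAAACTGTAA"

# Primary indel: a single-base deletion shifts the reading frame.
frameshifted = delete_base(WILD_TYPE, 7)

# Secondary indel: a single-base insertion downstream restores the frame.
compensated = insert_base(frameshifted, 21, "A")

wt_protein = translate(WILD_TYPE)
cf_protein = translate(compensated)
mismatches = count_mismatches(wt_protein, cf_protein)

print("wild type  :", wt_protein)
print("compensated:", cf_protein)
print("amino acid mismatches:", mismatches, "of", len(wt_protein))
```

In this toy case the two nucleotide sequences differ by only two single-base indels, yet 5 of the 12 residues differ between the proteins. A substitution-only distance measure would count every one of those mismatches as an independent substitution, which is the overestimation effect the abstract describes.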