Profiling SARS-CoV-2 mutation fingerprints that range from the viral pangenome to individual infection quasispecies

Abstract Background: The genome of SARS-CoV-2 is susceptible to mutations during viral replication due to errors generated by RNA-dependent RNA polymerases. These mutations enable SARS-CoV-2 to evolve into new strains. Viral quasispecies emerge from de novo mutations that occur in individual...


Bibliographic Details
Main Authors: Billy T. Lau, Dmitri Pavlichin, Anna C. Hooker, Alison Almeda, Giwon Shin, Jiamin Chen, Malaya K. Sahoo, Chun Hong Huang, Benjamin A. Pinsky, Ho Joon Lee, Hanlee P. Ji
Format: Article
Language: English
Published: BMC 2021-04-01
Series: Genome Medicine
Subjects: COVID-19; SARS-CoV-2; Pandemic; Viral mutations; Quasispecies; Genetic fingerprints
Online Access: https://doi.org/10.1186/s13073-021-00882-2
id doaj-6c281c23b3f34308a56ed35fdc580c16
record_format Article
spelling doaj-6c281c23b3f34308a56ed35fdc580c16 2021-04-25T11:10:36Z eng BMC Genome Medicine 1756-994X 2021-04-01 13 1 1 23 10.1186/s13073-021-00882-2
Billy T. Lau, Division of Oncology, Department of Medicine, Stanford University School of Medicine
Dmitri Pavlichin, Division of Oncology, Department of Medicine, Stanford University School of Medicine
Anna C. Hooker, Division of Oncology, Department of Medicine, Stanford University School of Medicine
Alison Almeda, Division of Oncology, Department of Medicine, Stanford University School of Medicine
Giwon Shin, Division of Oncology, Department of Medicine, Stanford University School of Medicine
Jiamin Chen, Division of Oncology, Department of Medicine, Stanford University School of Medicine
Malaya K. Sahoo, Department of Pathology, Stanford University School of Medicine
Chun Hong Huang, Department of Pathology, Stanford University School of Medicine
Benjamin A. Pinsky, Department of Pathology, Stanford University School of Medicine
Ho Joon Lee, Division of Oncology, Department of Medicine, Stanford University School of Medicine
Hanlee P. Ji, Division of Oncology, Department of Medicine, Stanford University School of Medicine
collection DOAJ
language English
format Article
sources DOAJ
author Billy T. Lau
Dmitri Pavlichin
Anna C. Hooker
Alison Almeda
Giwon Shin
Jiamin Chen
Malaya K. Sahoo
Chun Hong Huang
Benjamin A. Pinsky
Ho Joon Lee
Hanlee P. Ji
author_sort Billy T. Lau
title Profiling SARS-CoV-2 mutation fingerprints that range from the viral pangenome to individual infection quasispecies
publisher BMC
series Genome Medicine
issn 1756-994X
publishDate 2021-04-01
description Abstract
Background: The genome of SARS-CoV-2 is susceptible to mutations during viral replication due to errors generated by RNA-dependent RNA polymerases. These mutations enable SARS-CoV-2 to evolve into new strains. Viral quasispecies emerge from de novo mutations that occur in individual patients. In combination, these sets of viral mutations provide distinct genetic fingerprints that reveal the patterns of transmission and have utility in contact tracing.
Methods: Leveraging thousands of sequenced SARS-CoV-2 genomes, we performed a viral pangenome analysis to identify conserved genomic sequences. We used a rapid and highly efficient computational approach that relies on k-mers, short tracts of sequence, instead of conventional sequence alignment. Using this method, we annotated viral mutation signatures that were associated with specific strains. Based on these highly conserved viral sequences, we developed a rapid and highly scalable targeted sequencing assay to identify mutations, detect quasispecies variants, and identify mutation signatures from patients. These results were compared to the pangenome genetic fingerprints.
Results: We built a k-mer index for thousands of SARS-CoV-2 genomes and identified conserved genomic regions and the landscape of mutations across them. We delineated mutation profiles spanning common genetic fingerprints (the combination of mutations in a viral assembly) and combinations of mutations that appear in only a small number of patients. We developed a targeted sequencing assay by selecting primers from the conserved viral genome regions to flank frequent mutations. Using a cohort of 100 SARS-CoV-2 clinical samples, we identified genetic fingerprints consisting of strain-specific mutations seen across populations and de novo quasispecies mutations localized to individual infections. We compared the mutation profiles of viral samples undergoing analysis with the features of the pangenome.
Conclusions: We conducted an analysis of viral mutation profiles that provide the basis of genetic fingerprints. Our study linked pangenome analysis with targeted deep sequencing of SARS-CoV-2 clinical samples. We identified quasispecies mutations occurring within individual patients and determined their general prevalence when compared to over 70,000 other strains. Analysis of these genetic fingerprints may provide a way of conducting molecular contact tracing.
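The alignment-free idea described in the abstract (indexing genomes by k-mers and looking for sequence shared across them) can be illustrated with a small sketch. This is not the authors' pipeline: the function names `kmers` and `conserved_kmers`, the `min_fraction` parameter, and the toy sequences are all assumptions made for illustration, and a real analysis over thousands of genomes would need a far more memory-efficient index.

```python
from collections import Counter


def kmers(seq, k):
    """Return the set of distinct k-mers (length-k substrings) in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def conserved_kmers(genomes, k, min_fraction=1.0):
    """Return the k-mers present in at least min_fraction of the genomes.

    Hypothetical helper: 'conserved' here simply means 'shared by most
    or all input sequences', with no alignment step involved.
    """
    counts = Counter()
    for genome in genomes:
        # Each genome contributes a given k-mer at most once.
        counts.update(kmers(genome, k))
    threshold = min_fraction * len(genomes)
    return {kmer for kmer, n in counts.items() if n >= threshold}


# Toy sequences, made up for illustration (not real SARS-CoV-2 data).
# The second sequence differs from the others at a single position.
toy_genomes = [
    "ACGTACGGTCA",
    "ACGTACGATCA",
    "ACGTACGGTCA",
]

conserved = conserved_kmers(toy_genomes, k=4)
print(sorted(conserved))
```

In this toy input, the k-mers that overlap the single differing position drop out of the conserved set, which is the kind of signal that could flag a candidate mutation site, while the k-mers shared by every sequence mark candidate conserved regions.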
topic COVID-19
SARS-CoV-2
Pandemic
Viral mutations
Quasispecies
Genetic fingerprints
url https://doi.org/10.1186/s13073-021-00882-2