NG-Tax, a highly accurate and validated pipeline for analysis of 16S rRNA amplicons from complex biomes [version 2; referees: 2 approved, 1 approved with reservations, 1 not approved]

Background: Massive high-throughput sequencing of short, hypervariable segments of the 16S ribosomal RNA (rRNA) gene has transformed the methodological landscape describing microbial diversity within and across complex biomes. However, several studies have shown that the methodology rather than the...

Bibliographic Details
Main Authors: Javier Ramiro-Garcia, Gerben D. A. Hermes, Christos Giatsis, Detmer Sipkema, Erwin G. Zoetendal, Peter J. Schaap, Hauke Smidt
Format: Article
Language: English
Published: F1000 Research Ltd 2018-11-01
Series: F1000Research
Subjects: Bioinformatics; Genomics; Microbial Evolution & Genomics
Online Access: https://f1000research.com/articles/5-1791/v2
id doaj-e75784d326eb4d9b8693be61e6833ef8
record_format Article
spelling doaj-e75784d326eb4d9b8693be61e6833ef8 2020-11-25T03:30:20Z eng F1000 Research Ltd F1000Research 2046-1402 2018-11-01 5 10.12688/f1000research.9227.2 18667
Javier Ramiro-Garcia (0), TI Food and Nutrition (TIFN), Wageningen, 6703 HB, The Netherlands
Gerben D. A. Hermes (1), TI Food and Nutrition (TIFN), Wageningen, 6703 HB, The Netherlands
Christos Giatsis (2), Aquaculture and Fisheries Group, Wageningen University, Wageningen, 6708 WD, The Netherlands
Detmer Sipkema (3), Laboratory of Microbiology, Wageningen University, Wageningen, 6708 WE, The Netherlands
Erwin G. Zoetendal (4), TI Food and Nutrition (TIFN), Wageningen, 6703 HB, The Netherlands
Peter J. Schaap (5), TI Food and Nutrition (TIFN), Wageningen, 6703 HB, The Netherlands
Hauke Smidt (6), Laboratory of Microbiology, Wageningen University, Wageningen, 6708 WE, The Netherlands
collection DOAJ
language English
format Article
sources DOAJ
author Javier Ramiro-Garcia
Gerben D. A. Hermes
Christos Giatsis
Detmer Sipkema
Erwin G. Zoetendal
Peter J. Schaap
Hauke Smidt
title NG-Tax, a highly accurate and validated pipeline for analysis of 16S rRNA amplicons from complex biomes [version 2; referees: 2 approved, 1 approved with reservations, 1 not approved]
publisher F1000 Research Ltd
series F1000Research
issn 2046-1402
publishDate 2018-11-01
description Background: Massive high-throughput sequencing of short, hypervariable segments of the 16S ribosomal RNA (rRNA) gene has transformed the methodological landscape describing microbial diversity within and across complex biomes. However, several studies have shown that the methodology rather than the biological variation is responsible for the observed sample composition and distribution. This compromises meta-analyses, although this fact is often disregarded. Results: To facilitate true meta-analysis of microbiome studies, we developed NG-Tax, a pipeline for 16S rRNA gene amplicon sequence analysis that was validated with different mock communities and benchmarked against QIIME as a frequently used pipeline. The microbial composition of 49 independently amplified mock samples was characterized by sequencing two variable 16S rRNA gene regions, V4 and V5-V6, in three separate sequencing runs on Illumina’s HiSeq2000 platform. This allowed for the evaluation of important causes of technical bias in taxonomic classification: 1) run-to-run sequencing variation, 2) PCR error, and 3) region/primer-specific amplification bias. Despite the short read length (~140 nt) and all technical biases, the average specificity of the taxonomic assignment for the phylotypes included in the mock communities was 97.78%. On average 99.95% and 88.43% of the reads could be assigned to at least family or genus level, respectively, while assignment to ‘spurious genera’ represented on average only 0.21% of the reads per sample. Analysis of α- and β-diversity confirmed conclusions guided by biology rather than the aforementioned methodological aspects, which was not achieved with QIIME. Conclusions: Different biological outcomes are commonly observed due to 16S rRNA region-specific performance. NG-Tax demonstrated high robustness against choice of region and other technical biases associated with 16S rRNA gene amplicon sequencing studies, diminishing their impact and providing accurate qualitative and quantitative representation of the true sample composition. This will improve comparability between studies and facilitate efforts towards standardization.
topic Bioinformatics
Genomics
Microbial Evolution & Genomics
url https://f1000research.com/articles/5-1791/v2