Development of an Analysis Pipeline Characterizing Multiple Hypervariable Regions of 16S rRNA Using Mock Samples.

OBJECTIVES: There is much speculation on which hypervariable region provides the highest bacterial specificity in 16S rRNA sequencing. The optimum solution to prevent bias and to obtain a comprehensive view of complex bacterial communities would be to sequence the entire 16S rRNA gene; however, this...


Bibliographic Details
Main Authors: Jennifer J Barb, Andrew J Oler, Hyung-Suk Kim, Natalia Chalmers, Gwenyth R Wallen, Ann Cashion, Peter J Munson, Nancy J Ames
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2016-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC4734828?pdf=render
id doaj-69540054c5ec4b1aa3925ac7f82c3a69
record_format Article
spelling doaj-69540054c5ec4b1aa3925ac7f82c3a69; 2020-11-25T01:58:56Z; eng; Public Library of Science (PLoS); PLoS ONE; ISSN 1932-6203; 2016-01-01; vol. 11, no. 2, e0148047; doi:10.1371/journal.pone.0148047; http://europepmc.org/articles/PMC4734828?pdf=render
collection DOAJ
language English
format Article
sources DOAJ
author Jennifer J Barb
Andrew J Oler
Hyung-Suk Kim
Natalia Chalmers
Gwenyth R Wallen
Ann Cashion
Peter J Munson
Nancy J Ames
title Development of an Analysis Pipeline Characterizing Multiple Hypervariable Regions of 16S rRNA Using Mock Samples.
publisher Public Library of Science (PLoS)
series PLoS ONE
issn 1932-6203
publishDate 2016-01-01
description OBJECTIVES: There is much speculation on which hypervariable region provides the highest bacterial specificity in 16S rRNA sequencing. The optimum solution to prevent bias and to obtain a comprehensive view of complex bacterial communities would be to sequence the entire 16S rRNA gene; however, this is not possible with second-generation standard library design and short-read next-generation sequencing technology. METHODS: This paper examines a new process using seven hypervariable (V) regions of the 16S rRNA (six amplicons: V2, V3, V4, V6-7, V8, and V9) processed simultaneously on the Ion Torrent Personal Genome Machine (Life Technologies, Grand Island, NY). Four mock samples were amplified using the 16S Ion Metagenomics Kit™ (Life Technologies), and their sequencing data were subjected to a novel analytical pipeline. RESULTS: Results are presented at the family and genus levels. The Kullback-Leibler divergence (DKL), a measure of the departure of the computed bacterial distribution from the nominal distribution in the mock samples, was used to infer which region performed best at the family and genus levels. Three hypervariable regions, V2, V4, and V6-7, produced the lowest divergence from the known mock sample composition. The V9 region gave the highest (worst) average DKL, while the V4 region gave the lowest (best) average DKL. In addition to having a high DKL, the V9 region performed the worst in both the forward and reverse directions, finding only 17% and 53% of the known family-level bacteria and 12% and 47% of the known genus-level bacteria, whereas the forward and reverse V4 reads identified all 17 family-level bacteria. CONCLUSIONS: The results of our analysis show that our sequencing method using 6 hypervariable regions of the 16S rRNA and the subsequent analysis are valid. This method also allowed the performance of each variable region to be assessed simultaneously. Our findings will provide the basis for future work intended to assess microbial abundance at different time points throughout a clinical protocol.
url http://europepmc.org/articles/PMC4734828?pdf=render
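
The abstract ranks each hypervariable region by the Kullback-Leibler divergence (DKL) between the nominal mock-sample composition and the composition computed from the reads. The following is a minimal sketch of that kind of comparison, not the authors' pipeline. It assumes the direction D_KL(nominal || observed), a base-2 logarithm, and a small pseudocount for taxa that a region fails to detect; this record does not state which of these choices the paper made. The function name, the taxon labels, and all proportions are hypothetical toy values, not data from the article.

```python
import math


def kl_divergence(nominal, observed, pseudocount=1e-6):
    """Return D_KL(nominal || observed) in bits.

    Both arguments map taxon name -> relative abundance. The pseudocount
    (an assumption of this sketch) keeps the logarithm finite when a
    taxon is absent from one of the two distributions.
    """
    taxa = sorted(set(nominal) | set(observed))
    p = [nominal.get(t, 0.0) + pseudocount for t in taxa]
    q = [observed.get(t, 0.0) + pseudocount for t in taxa]
    p_total, q_total = sum(p), sum(q)  # renormalize after smoothing
    return sum(
        (pi / p_total) * math.log2((pi / p_total) / (qi / q_total))
        for pi, qi in zip(p, q)
    )


# Toy composition of a mock sample (hypothetical values).
nominal = {"family_A": 0.5, "family_B": 0.3, "family_C": 0.2}

# Hypothetical per-region results: one close to nominal, one that
# misses a family entirely.
region_close = {"family_A": 0.45, "family_B": 0.35, "family_C": 0.20}
region_poor = {"family_A": 0.90, "family_B": 0.10}

for name, observed in [("close", region_close), ("poor", region_poor)]:
    print(name, round(kl_divergence(nominal, observed), 4))
```

A lower value means the computed distribution is closer to the nominal one, which is the sense in which the abstract calls the lowest average DKL "best". A region that misses a known taxon is penalized heavily, and the size of that penalty depends on the chosen pseudocount.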