A Next-Generation Sequencing Data Analysis Pipeline for Detecting Unknown Pathogens from Mixed Clinical Samples and Revealing Their Genetic Diversity.

Forty-two cytopathic effect (CPE)-positive isolates were collected from 2008 to 2012. None of the isolates could be identified as a known viral pathogen by routine diagnostic assays. They were pooled into 8 groups of 5-6 isolates to reduce the sequencing cost. Next-generation sequencing (NGS) was conduc...

Bibliographic Details
Main Authors: Yu-Nong Gong, Guang-Wu Chen, Shu-Li Yang, Ching-Ju Lee, Shin-Ru Shih, Kuo-Chien Tsao
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2016-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC4795770?pdf=render
id doaj-d2afee50e3e548caa7c8ef79029c66bf
record_format Article
spelling doaj-d2afee50e3e548caa7c8ef79029c66bf 2020-11-24T20:50:52Z eng Public Library of Science (PLoS) PLoS ONE 1932-6203 2016-01-01 11 3 e0151495 10.1371/journal.pone.0151495
collection DOAJ
language English
format Article
sources DOAJ
author Yu-Nong Gong
Guang-Wu Chen
Shu-Li Yang
Ching-Ju Lee
Shin-Ru Shih
Kuo-Chien Tsao
title A Next-Generation Sequencing Data Analysis Pipeline for Detecting Unknown Pathogens from Mixed Clinical Samples and Revealing Their Genetic Diversity.
publisher Public Library of Science (PLoS)
series PLoS ONE
issn 1932-6203
publishDate 2016-01-01
description Forty-two cytopathic effect (CPE)-positive isolates were collected from 2008 to 2012. None of the isolates could be identified as a known viral pathogen by routine diagnostic assays. They were pooled into 8 groups of 5-6 isolates to reduce the sequencing cost. Next-generation sequencing (NGS) was conducted for each group of mixed samples, and the proposed data analysis pipeline was used to identify viral pathogens in these mixed samples. Polymerase chain reaction (PCR) or enzyme-linked immunosorbent assay (ELISA) was then conducted individually for each of these 42 isolates, depending on the viral types predicted in its group. Two isolates remained unknown after these tests. Iteration mapping was then implemented for each of these 2 isolates and predicted human parechovirus (HPeV) in both. In summary, our NGS pipeline detected the following viruses among the 42 isolates: 29 human rhinoviruses (HRVs), 10 HPeVs, 1 human adenovirus (HAdV), 1 echovirus, and 1 rotavirus. We then focused on the 10 identified Taiwanese HPeVs because they are reported to be of greater clinical significance than HRVs. Their genomes were assembled and their genetic diversity was explored. One novel 6-bp deletion was found in one HPeV-1 virus. In terms of nucleotide heterogeneity, 64 genetic variants were detected in these HPeVs using the mapped NGS reads. Most importantly, a recombination event was found between our HPeV-3 and a known HPeV-4 strain in the database. A similar event was detected in the other HPeV-3 strains in the same clade of the phylogenetic tree. These findings demonstrated that the proposed NGS data analysis pipeline identified unknown viruses from the mixed clinical samples, revealed their genetic identity and variants, and characterized their genetic features in terms of viral evolution.
url http://europepmc.org/articles/PMC4795770?pdf=render
_version_ 1716803364206936064