TAR-VIR: a pipeline for TARgeted VIRal strain reconstruction from metagenomic data

Abstract. Background: Strain-level RNA virus characterization is essential for developing prevention and treatment strategies. Viral metagenomic data, which can contain sequences of both known and novel viruses, provide new opportunities for characterizing RNA viruses. Although there are a number of p...


Bibliographic Details
Main Authors: Jiao Chen, Jiating Huang, Yanni Sun
Format: Article
Language: English
Published: BMC 2019-06-01
Series: BMC Bioinformatics
Subjects: RNA virus; Read classification; Strain assembly; Viral metagenomics
Online Access: http://link.springer.com/article/10.1186/s12859-019-2878-2
id doaj-cc3a27397e864c298af5fec5eee54506
record_format Article
spelling doaj-cc3a27397e864c298af5fec5eee54506; 2020-11-25T03:20:04Z; eng; BMC; BMC Bioinformatics; ISSN 1471-2105; 2019-06-01; volume 20, issue 1, pages 1-14; DOI 10.1186/s12859-019-2878-2; Jiao Chen (Computer Science and Engineering, Michigan State University); Jiating Huang (Institute of Clinical Pharmacology, Guangzhou University of Chinese Medicine); Yanni Sun (Electronic Engineering, City University of Hong Kong)
collection DOAJ
language English
format Article
sources DOAJ
author Jiao Chen
Jiating Huang
Yanni Sun
title TAR-VIR: a pipeline for TARgeted VIRal strain reconstruction from metagenomic data
publisher BMC
series BMC Bioinformatics
issn 1471-2105
publishDate 2019-06-01
description Abstract. Background: Strain-level RNA virus characterization is essential for developing prevention and treatment strategies. Viral metagenomic data, which can contain sequences of both known and novel viruses, provide new opportunities for characterizing RNA viruses. Although a number of pipelines exist for analyzing viruses in metagenomic data, they have notable limitations. First, viruses that lack closely related reference genomes cannot be detected with high sensitivity. Second, strain-level analysis is usually missing. Results: In this study, we developed a hybrid pipeline named TAR-VIR that reconstructs viral strains without relying on complete or high-quality reference genomes. It is optimized for identifying RNA viruses from metagenomic data by combining an effective read classification method with our in-house strain-level de novo assembly tool. TAR-VIR was tested on both simulated and real viral metagenomic data sets. The results demonstrated that TAR-VIR competes favorably with the other tested tools. Conclusion: TAR-VIR can be used standalone for viral strain reconstruction from metagenomic data. Alternatively, its read recruiting stage can be combined with other de novo assembly tools for superior viral functional and taxonomic analyses. The source code and documentation of TAR-VIR are available at https://github.com/chjiao/TAR-VIR.
topic RNA virus
Read classification
Strain assembly
Viral metagenomics
url http://link.springer.com/article/10.1186/s12859-019-2878-2