VDJPipe: a pipelined tool for pre-processing immune repertoire sequencing data
Abstract Background: Pre-processing of high-throughput sequencing data for immune repertoire profiling is essential to ensure high-quality input for downstream analysis. VDJPipe is a flexible, high-performance tool that can perform multiple pre-processing tasks with just a single pass over the data f...
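Main Authors: | Scott Christley, Mikhail K. Levin, Inimary T. Toby, John M. Fonner, Nancy L. Monson, William H. Rounds, Florian Rubelt, Walter Scarborough, Richard H. Scheuermann, Lindsay G. Cowell |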
---|---|
Format: | Article |
Language: | English |
Published: | BMC, 2017-10-01 |
Series: | BMC Bioinformatics |
Subjects: | Rep-seq, Immune repertoire analysis, Bioinformatics |
Online Access: | http://link.springer.com/article/10.1186/s12859-017-1853-z |
id |
doaj-26c993d5cb694e45b75adc076995ae4a |
---|---|
record_format |
Article |
spelling |
doaj-26c993d5cb694e45b75adc076995ae4a; 2020-11-24T21:17:08Z; eng; BMC; BMC Bioinformatics; 1471-2105; 2017-10-01; 18115; 10.1186/s12859-017-1853-z; VDJPipe: a pipelined tool for pre-processing immune repertoire sequencing data; Scott Christley (Department of Clinical Sciences, UT Southwestern Medical Center); Mikhail K. Levin (Bank of America Corporate Center); Inimary T. Toby (Department of Clinical Sciences, UT Southwestern Medical Center); John M. Fonner (Texas Advanced Computing Center); Nancy L. Monson (Department of Neurology and Neurotherapeutics, UT Southwestern Medical Center); William H. Rounds (Department of Clinical Sciences, UT Southwestern Medical Center); Florian Rubelt (Department of Microbiology and Immunology, Stanford University School of Medicine); Walter Scarborough (Texas Advanced Computing Center); Richard H. Scheuermann (J. Craig Venter Institute); Lindsay G. Cowell (Department of Clinical Sciences, UT Southwestern Medical Center); Abstract Background: Pre-processing of high-throughput sequencing data for immune repertoire profiling is essential to ensure high-quality input for downstream analysis. VDJPipe is a flexible, high-performance tool that can perform multiple pre-processing tasks with just a single pass over the data files. Results: Processing tasks provided by VDJPipe include base composition statistics calculation, read quality statistics calculation, quality filtering, homopolymer filtering, length and nucleotide filtering, paired-read merging, barcode demultiplexing, 5′ and 3′ PCR primer matching, and duplicate read collapsing. VDJPipe uses a pipeline approach in which multiple processing steps are performed in a sequential workflow, with the output of each step passed automatically as input to the next. The workflow is flexible enough to handle the complex barcoding schemes used in many immunosequencing experiments. Because VDJPipe is designed for computational efficiency, we evaluated its performance by comparing execution times with those of pRESTO, a widely used pre-processing tool for immune repertoire sequencing data. We found that VDJPipe requires <10% of the run time required by pRESTO. Conclusions: VDJPipe is a high-performance tool that is optimized for pre-processing large immune repertoire sequencing data sets.; http://link.springer.com/article/10.1186/s12859-017-1853-z; Rep-seq; Immune repertoire analysis; Bioinformatics |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Scott Christley Mikhail K. Levin Inimary T. Toby John M. Fonner Nancy L. Monson William H. Rounds Florian Rubelt Walter Scarborough Richard H. Scheuermann Lindsay G. Cowell |
author_sort |
Scott Christley |
title |
VDJPipe: a pipelined tool for pre-processing immune repertoire sequencing data |
publisher |
BMC |
series |
BMC Bioinformatics |
issn |
1471-2105 |
publishDate |
2017-10-01 |
description |
Abstract Background: Pre-processing of high-throughput sequencing data for immune repertoire profiling is essential to ensure high-quality input for downstream analysis. VDJPipe is a flexible, high-performance tool that can perform multiple pre-processing tasks with just a single pass over the data files. Results: Processing tasks provided by VDJPipe include base composition statistics calculation, read quality statistics calculation, quality filtering, homopolymer filtering, length and nucleotide filtering, paired-read merging, barcode demultiplexing, 5′ and 3′ PCR primer matching, and duplicate read collapsing. VDJPipe uses a pipeline approach in which multiple processing steps are performed in a sequential workflow, with the output of each step passed automatically as input to the next. The workflow is flexible enough to handle the complex barcoding schemes used in many immunosequencing experiments. Because VDJPipe is designed for computational efficiency, we evaluated its performance by comparing execution times with those of pRESTO, a widely used pre-processing tool for immune repertoire sequencing data. We found that VDJPipe requires <10% of the run time required by pRESTO. Conclusions: VDJPipe is a high-performance tool that is optimized for pre-processing large immune repertoire sequencing data sets. |
topic |
Rep-seq Immune repertoire analysis Bioinformatics |
url |
http://link.springer.com/article/10.1186/s12859-017-1853-z |