pTrimmer: An efficient tool to trim primers of multiplex deep sequencing data

Abstract. Background: With the widespread use of multiplex amplicon sequencing (MAS) in genetic variation detection, an efficient tool is required to remove primer sequences from short reads to ensure the reliability of downstream analysis. Although some tools are currently available, their efficiency...

Bibliographic Details
Main Authors: Xiaolong Zhang, Yanyan Shao, Jichao Tian, Yuwei Liao, Peiying Li, Yu Zhang, Jun Chen, Zhiguang Li
Format: Article
Language: English
Published: BMC 2019-05-01
Series: BMC Bioinformatics
Subjects: Primer trimming; Target sequencing; Multiplex amplicon sequencing
Online Access: http://link.springer.com/article/10.1186/s12859-019-2854-x
id doaj-233416a888c54152be5b22fca66db17d
record_format Article
spelling doaj-233416a888c54152be5b22fca66db17d 2020-11-25T03:07:27Z eng BMC BMC Bioinformatics 1471-2105 2019-05-01 20116 10.1186/s12859-019-2854-x
pTrimmer: An efficient tool to trim primers of multiplex deep sequencing data
Xiaolong Zhang (0), Yanyan Shao (1), Jichao Tian (2), Yuwei Liao (3), Peiying Li (4), Yu Zhang (5), Jun Chen (6), Zhiguang Li (7)
Affiliation (0-5, 7): Center of Genome and Personalized Medicine, Institute of Cancer Stem Cell, Dalian Medical University
Affiliation (6): The Second Hospital of Dalian Medical University
http://link.springer.com/article/10.1186/s12859-019-2854-x
Primer trimming; Target sequencing; Multiplex amplicon sequencing
collection DOAJ
language English
format Article
sources DOAJ
author Xiaolong Zhang
Yanyan Shao
Jichao Tian
Yuwei Liao
Peiying Li
Yu Zhang
Jun Chen
Zhiguang Li
title pTrimmer: An efficient tool to trim primers of multiplex deep sequencing data
publisher BMC
series BMC Bioinformatics
issn 1471-2105
publishDate 2019-05-01
description Abstract. Background: With the widespread use of multiplex amplicon sequencing (MAS) in genetic variation detection, an efficient tool is required to remove primer sequences from short reads and so ensure the reliability of downstream analysis. Although some tools are currently available, their efficiency and accuracy need improvement when trimming large numbers of primers in high-throughput target genome sequencing. The issue is becoming more urgent given the potential clinical use of MAS for processing patient samples. We developed pTrimmer, which can handle thousands of primers simultaneously with greatly improved accuracy and performance. Results: pTrimmer combines a k-mer algorithm with the Needleman-Wunsch algorithm, which preserves its accuracy even in the presence of sequencing errors. pTrimmer improves sensitivity by 28.59% and accuracy by 11.87% compared with similar tools. In simulation, pTrimmer achieved a sensitivity of 99.96% and an accuracy of 97.38%, compared with 70.85% sensitivity and 58.73% accuracy for cutPrimers. pTrimmer is also notably faster: about 370 times faster than cutPrimers and 17,000 times faster than cutadapt per thread. Trimming 2158 primer pairs from 11 million reads (Illumina PE 150 bp) takes only 37 s and uses no more than 100 MB of memory. Conclusions: pTrimmer is designed to trim primer sequences from multiplex amplicon sequencing and target sequencing data. It is highly sensitive and specific compared with three other similar tools, which could help users obtain more reliable mutational information for downstream analysis.
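The abstract says only that pTrimmer combines k-mers with the Needleman-Wunsch algorithm; it does not describe the implementation. The Python sketch below is one way such a combination could work, not pTrimmer's actual code: a k-mer index proposes candidate primers for the start of a read, and a global alignment then confirms the candidate within an error budget. The names `build_kmer_index`, `nw_cost`, and `trim_primer`, the values `K = 8` and `MAX_ERRORS = 2`, and the unit-cost scoring are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

K = 8           # k-mer length (hypothetical value, not taken from pTrimmer)
MAX_ERRORS = 2  # maximum alignment cost accepted as a match (hypothetical)


def build_kmer_index(primers: List[str], k: int = K) -> Dict[str, List[Tuple[int, int]]]:
    """Map every k-mer to the (primer id, offset in primer) pairs where it occurs."""
    index: Dict[str, List[Tuple[int, int]]] = {}
    for pid, primer in enumerate(primers):
        for off in range(len(primer) - k + 1):
            index.setdefault(primer[off:off + k], []).append((pid, off))
    return index


def nw_cost(a: str, b: str) -> int:
    """Needleman-Wunsch global alignment cost with unit mismatch and gap penalties."""
    prev = list(range(len(b) + 1))  # cost of aligning an empty prefix of a with b[:j]
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j - 1] + (ca != cb),  # match or mismatch
                           prev[j] + 1,               # gap in b
                           cur[j - 1] + 1))           # gap in a
        prev = cur
    return prev[-1]


def trim_primer(read: str, primers: List[str],
                index: Dict[str, List[Tuple[int, int]]],
                k: int = K, max_errors: int = MAX_ERRORS) -> Tuple[Optional[int], str]:
    """Return (primer id, trimmed read), or (None, read) if no primer is confirmed."""
    if not primers:
        return None, read
    longest = max(len(p) for p in primers)
    # Scan k-mers near the 5' end so an error in the first k-mer is tolerated.
    for pos in range(min(len(read), longest) - k + 1):
        for pid, off in index.get(read[pos:pos + k], []):
            if off != pos:  # the hit must place the primer at the read start
                continue
            primer = primers[pid]
            if nw_cost(read[:len(primer)], primer) <= max_errors:
                return pid, read[len(primer):]
    return None, read


primers = ["ACGTACGTGGCCTTAA", "TTGGCCAATTGGAACC"]
index = build_kmer_index(primers)
# The read starts with primer 0 carrying one substitution (G -> C at position 2).
print(trim_primer("ACCTACGTGGCCTTAA" + "GATTACAGATTACA", primers, index))
# (0, 'GATTACAGATTACA')
```

The index lookup keeps the per-read cost close to constant regardless of how many primers are loaded, and the alignment step runs only on the few candidates that survive it. A real trimmer would also need to handle paired reads, reverse primers, and indels that shift the primer relative to the read start, none of which this sketch attempts.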
topic Primer trimming
Target sequencing
Multiplex amplicon sequencing
url http://link.springer.com/article/10.1186/s12859-019-2854-x