PaSS: a sequencing simulator for PacBio sequencing
Abstract. Background: Third-generation sequencing platforms, such as PacBio sequencing, have been developed rapidly in recent years. PacBio sequencing generates much longer reads than the second-generation sequencing (or the next generation sequencing, NGS) technologies and it has unique sequencing er...
Main Authors: | Wenmin Zhang, Ben Jia, Chaochun Wei |
---|---|
Format: | Article |
Language: | English |
Published: | BMC, 2019-06-01 |
Series: | BMC Bioinformatics |
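Subjects: | Third generation sequencing, Next generation sequencing, PacBio sequencing, Sequencing simulator, Sequencing error, Sequence pattern |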
Online Access: | http://link.springer.com/article/10.1186/s12859-019-2901-7 |
id |
doaj-31ea5ae4de5c43f2a29f3fee3c9a5958 |
---|---|
record_format |
Article |
spelling |
doaj-31ea5ae4de5c43f2a29f3fee3c9a5958; 2020-11-25T03:15:10Z; eng; BMC; BMC Bioinformatics; 1471-2105; 2019-06-01; 20117; 10.1186/s12859-019-2901-7; PaSS: a sequencing simulator for PacBio sequencing; Wenmin Zhang (0); Ben Jia (1); Chaochun Wei (2); Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University; Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University; Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University; Abstract. Background: Third-generation sequencing platforms, such as PacBio sequencing, have been developed rapidly in recent years. PacBio sequencing generates much longer reads than the second-generation sequencing (or the next generation sequencing, NGS) technologies and it has unique sequencing error patterns. An effective read simulator is essential to evaluate and promote the development of new bioinformatics tools for PacBio sequencing data analysis. Results: We developed a new PacBio Sequencing Simulator (PaSS). It can learn sequence patterns from PacBio sequencing data currently available. In addition to the distribution of read lengths and error rates, we included a context-specific sequencing error model. Compared to existing PacBio sequencing simulators such as PBSIM, LongISLND and NPBSS, PaSS performed better in many aspects. Assembly tests also suggest that reads simulated by PaSS are the most similar to experimental sequencing data. Conclusion: PaSS is an effective sequence simulator for PacBio sequencing. It will facilitate the evaluation and development of new analysis tools for the third-generation sequencing data.; http://link.springer.com/article/10.1186/s12859-019-2901-7; Third generation sequencing; Next generation sequencing; PacBio sequencing; Sequencing simulator; Sequencing error; Sequence pattern |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Wenmin Zhang; Ben Jia; Chaochun Wei |
title |
PaSS: a sequencing simulator for PacBio sequencing |
publisher |
BMC |
series |
BMC Bioinformatics |
issn |
1471-2105 |
publishDate |
2019-06-01 |
description |
Background: Third-generation sequencing platforms, such as PacBio sequencing, have developed rapidly in recent years. PacBio sequencing generates much longer reads than second-generation sequencing (also called next-generation sequencing, NGS) technologies, and it has unique sequencing error patterns. An effective read simulator is essential to evaluate and promote the development of new bioinformatics tools for PacBio sequencing data analysis. Results: We developed a new PacBio Sequencing Simulator (PaSS). It can learn sequence patterns from currently available PacBio sequencing data. In addition to the distributions of read lengths and error rates, we included a context-specific sequencing error model. Compared to existing PacBio sequencing simulators such as PBSIM, LongISLND and NPBSS, PaSS performed better in many aspects. Assembly tests also suggest that reads simulated by PaSS are the most similar to experimental sequencing data. Conclusion: PaSS is an effective sequence simulator for PacBio sequencing. It will facilitate the evaluation and development of new analysis tools for third-generation sequencing data. |
topic |
Third generation sequencing; Next generation sequencing; PacBio sequencing; Sequencing simulator; Sequencing error; Sequence pattern |
url |
http://link.springer.com/article/10.1186/s12859-019-2901-7 |