A Review of Scalable Bioinformatics Pipelines

Bibliographic Details
Main Authors: Bjørn Fjukstad, Lars Ailo Bongo
Format: Article
Language: English
Published: SpringerOpen 2017-10-01
Series: Data Science and Engineering
Online Access: http://link.springer.com/article/10.1007/s41019-017-0047-z
Description
Summary: Scalability is increasingly important for bioinformatics analysis services, since these must handle larger datasets, more jobs, and more users. The pipelines used to implement analyses must therefore scale with respect to the resources on a single compute node, the number of nodes on a cluster, and also to cost-performance. Here, we survey several scalable bioinformatics pipelines and compare their design and their use of underlying frameworks and infrastructures. We also discuss current trends for bioinformatics pipeline development.
ISSN: 2364-1185, 2364-1541