Analyzing big datasets of genomic sequences: fast and scalable collection of k-mer statistics

Abstract Background Distributed approaches based on the MapReduce programming paradigm have started to be proposed in the Bioinformatics domain, due to the large amount of data produced by the next-generation sequencing techniques. However, the use of MapReduce and related Big Data technologies and...

Full description

Bibliographic Details
Main Authors: Umberto Ferraro Petrillo, Mara Sorella, Giuseppe Cattaneo, Raffaele Giancarlo, Simona E. Rombo
Format: Article
Language:English
Published: BMC 2019-04-01
Series:BMC Bioinformatics
Subjects:
Online Access:http://link.springer.com/article/10.1186/s12859-019-2694-8