affyPara—a Bioconductor Package for Parallelized Preprocessing Algorithms of Affymetrix Microarray Data

Bibliographic Details
Main Authors: Markus Schmidberger, Esmeralda Vicedo, Ulrich Mansmann
Format: Article
Language: English
Published: SAGE Publishing 2009-01-01
Series: Bioinformatics and Biology Insights
Online Access: https://doi.org/10.4137/BBI.S3060
Description
Summary: Microarray data repositories and large clinical applications of gene expression make it possible to analyse several hundred microarrays at one time. Preprocessing such large numbers of microarrays is still a challenge because the algorithms are limited by the available computer hardware. For example, building classification or prognostic rules from large microarray sets is very time consuming. Here, preprocessing has to be part of the cross-validation and resampling strategy that is necessary to estimate the rule's prediction quality honestly. This paper proposes the new Bioconductor package affyPara for parallelized preprocessing of Affymetrix microarray data. The data can be partitioned by array, and parallelization of the algorithms follows as a straightforward consequence. Partitioning the data and distributing it to several nodes solves the main memory problems and accelerates preprocessing by up to a factor of 20 for 200 or more arrays. affyPara is a free and open source package, released under the GPL license and available from the Bioconductor project at www.bioconductor.org. A user guide and examples are provided with the package.
ISSN: 1177-9322
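
The summary's central idea, partitioning the data by array and handing each partition to a separate node, can be illustrated with a minimal sketch. This is not affyPara's API (affyPara is an R package); it is a Python illustration under stated assumptions. The names `partition`, `preprocess_chunk`, and `preprocess_parallel` are hypothetical, the log2 transform merely stands in for a real per-array preprocessing step, and worker threads stand in for cluster nodes.

```python
from concurrent.futures import ThreadPoolExecutor
from math import log2


def partition(items, n_parts):
    """Split items into n_parts contiguous chunks whose sizes differ by at most one."""
    if n_parts < 1:
        raise ValueError("n_parts must be at least 1")
    base, extra = divmod(len(items), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        # The first `extra` chunks each take one additional item.
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


def preprocess_chunk(chunk):
    """Hypothetical per-array step: log2-transform every intensity in every array."""
    return [[log2(x) for x in array] for array in chunk]


def preprocess_parallel(arrays, n_workers):
    """Partition the arrays, process each chunk on its own worker, rejoin in order."""
    chunks = partition(arrays, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # pool.map yields results in the same order as the input chunks.
        results = pool.map(preprocess_chunk, chunks)
    return [array for chunk in results for array in chunk]
```

Because each worker holds only its own chunk, no single worker needs all arrays in memory at once, which is the memory argument the summary makes. Steps that depend on all arrays together would additionally need partial results to be exchanged between workers, which this sketch does not attempt.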