A Web-based and Grid-enabled dChip version for the analysis of large sets of gene expression data

Abstract. Background: Microarray techniques are one of the main methods used to investigate thousands of gene expression profiles for enlightening complex biological processes responsible for serious diseases, with a great scientific impact and a wide app...

Bibliographic Details
Main Authors: Scaglione Silvia, Porro Ivan, Fato Marco, Corradi Luca, Torterolo Livia
Format: Article
Language: English
Published: BMC 2008-11-01
Series: BMC Bioinformatics
Online Access: http://www.biomedcentral.com/1471-2105/9/480
id doaj-b2f3e8f0aca84cc48e5b994237211534
record_format Article
spelling doaj-b2f3e8f0aca84cc48e5b994237211534; 2020-11-24T21:53:28Z; eng; BMC; BMC Bioinformatics; 1471-2105; 2008-11-01; 9; 1; 480; 10.1186/1471-2105-9-480; A Web-based and Grid-enabled dChip version for the analysis of large sets of gene expression data; Scaglione Silvia; Porro Ivan; Fato Marco; Corradi Luca; Torterolo Livia; http://www.biomedcentral.com/1471-2105/9/480
collection DOAJ
language English
format Article
sources DOAJ
author Scaglione Silvia
Porro Ivan
Fato Marco
Corradi Luca
Torterolo Livia
author_sort Scaglione Silvia
title A Web-based and Grid-enabled dChip version for the analysis of large sets of gene expression data
publisher BMC
series BMC Bioinformatics
issn 1471-2105
publishDate 2008-11-01
description Abstract
Background: Microarray techniques are one of the main methods used to investigate thousands of gene expression profiles in order to elucidate the complex biological processes responsible for serious diseases, with great scientific impact and a wide application area. Several standalone applications have been developed to analyze microarray data. Two of the best-known free analysis software packages are the R-based Bioconductor and dChip. The part of the dChip software concerning the calculation and analysis of gene expression has been modified to permit its execution on both cluster environments (supercomputers) and Grid infrastructures (distributed computing). This work is not aimed at replacing existing tools; rather, it provides researchers with a method to analyze large datasets without any hardware or software constraints.
Results: An application able to perform the computation and analysis of gene expression on large datasets has been developed using algorithms provided by dChip. Tests were carried out to validate the results and to compare the performance obtained on different infrastructures. Validation tests were performed using a small dataset comparing HUVEC (Human Umbilical Vein Endothelial Cells) and fibroblasts, derived from the same donors and treated with IFN-α. In addition, performance tests were run to compare the different environments, using a large dataset of about 1000 samples from breast cancer patients.
Conclusion: A Grid-enabled software application for the analysis of large microarray datasets has been proposed. The dChip software has been ported to the Linux platform and modified, using appropriate parallelization strategies, to permit its execution on both cluster environments and Grid infrastructures. The added value of Grid technologies is the possibility of exploiting both computational and data Grid infrastructures to analyze large sets of distributed data. The software has been validated, and its performance on cluster and Grid environments has been compared, showing quite good scalability.
url http://www.biomedcentral.com/1471-2105/9/480