Cross-platform analysis of cancer microarray data improves gene expression based classification of phenotypes

Abstract. Background: The extensive use of DNA microarray technology in the characterization of the cell transcriptome is leading to an ever increasing amount of microarray data from cancer studies. Although similar questions for the same type of cancer a...


Bibliographic Details
Main Authors: Eils Roland, Warnat Patrick, Brors Benedikt
Format: Article
Language: English
Published: BMC 2005-11-01
Series: BMC Bioinformatics
Subjects: gene expression profiling; DNA microarray; cross-platform analysis; classification; cancer
Online Access: http://www.biomedcentral.com/1471-2105/6/265
id doaj-9d3d083e39a54b47a7b273ccc4bfa278
record_format Article
spelling doaj-9d3d083e39a54b47a7b273ccc4bfa278 | 2020-11-25T00:04:53Z | eng | BMC | BMC Bioinformatics | 1471-2105 | 2005-11-01 | 6 | 1 | 265 | 10.1186/1471-2105-6-265
collection DOAJ
language English
format Article
sources DOAJ
author Eils Roland
Warnat Patrick
Brors Benedikt
title Cross-platform analysis of cancer microarray data improves gene expression based classification of phenotypes
publisher BMC
series BMC Bioinformatics
issn 1471-2105
publishDate 2005-11-01
description Abstract
Background: The extensive use of DNA microarray technology in the characterization of the cell transcriptome is leading to an ever increasing amount of microarray data from cancer studies. Although similar questions for the same type of cancer are addressed in these different studies, a comparative analysis of their results is hampered by the use of heterogeneous microarray platforms and analysis methods.
Results: In contrast to a meta-analysis approach where results of different studies are combined on an interpretative level, we investigate here how to directly integrate raw microarray data from different studies for the purpose of supervised classification analysis. We use median rank scores and quantile discretization to derive numerically comparable measures of gene expression from different platforms. These transformed data are then used for training of classifiers based on support vector machines. We apply this approach to six publicly available cancer microarray gene expression data sets, which consist of three pairs of studies, each examining the same type of cancer, i.e. breast cancer, prostate cancer or acute myeloid leukemia. For each pair, one study was performed by means of cDNA microarrays and the other by means of oligonucleotide microarrays. In each pair, high classification accuracies (> 85%) were achieved with training and testing on data instances randomly chosen from both data sets in a cross-validation analysis. To exemplify the potential of this cross-platform classification analysis, we use two leukemia microarray data sets to show that important genes with regard to the biology of leukemia are selected in an integrated analysis, which are missed in either single-set analysis.
Conclusion: Cross-platform classification of multiple cancer microarray data sets yields discriminative gene expression signatures that are found and validated on a large number of microarray samples, generated by different laboratories and microarray technologies. Predictive models generated by this approach are better validated than those generated on a single data set, while showing high predictive power and improved generalization performance.
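As a rough illustration of the quantile discretization step named in the abstract, the Python sketch below assigns each expression value in a sample to one of a few equal-frequency bins based on its rank. The function name `quantile_discretize`, the default of 8 bins, the per-sample ranking, and the toy values are all assumptions made for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
def quantile_discretize(values, n_bins=8):
    """Map each value to an equal-frequency bin index in 0..n_bins-1.

    Hypothetical helper: the bin count and per-sample ranking are
    assumptions, not details taken from the record above.
    """
    n = len(values)
    # Positions of the values in ascending order; ties keep input order.
    order = sorted(range(n), key=lambda i: values[i])
    bins = [0] * n
    for rank, idx in enumerate(order):
        # Integer arithmetic spreads the ranks evenly over the bins.
        bins[idx] = rank * n_bins // n
    return bins


# Toy values (invented): the same four genes measured on two platforms
# with very different numeric scales but the same ordering.
cdna_log_ratios = [-1.2, 0.3, 2.0, -0.1]
oligo_intensities = [50, 900, 5000, 400]

cdna_bins = quantile_discretize(cdna_log_ratios, n_bins=2)
oligo_bins = quantile_discretize(oligo_intensities, n_bins=2)
```

Because only the rank of each value matters, the two toy samples end up with identical bin vectors even though their raw scales differ, which is the sense in which the transformed measures become numerically comparable across platforms. A classifier such as a support vector machine could then be trained on the binned values from both data sets.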
topic gene expression profiling
DNA microarray
cross-platform analysis
classification
cancer
url http://www.biomedcentral.com/1471-2105/6/265