<it>In silico</it> microdissection of microarray data from heterogeneous cell populations

<p>Abstract</p> <p>Background</p> <p>Very few analytical approaches have been reported to resolve the variability in microarray measurements stemming from sample heterogeneity. For example, tissue samples used in cancer studies are usually contaminated with the surround...

Bibliographic Details
Main Authors: Yli-Harja Olli, Dunmire Valerie, Shmulevich Ilya, Lähdesmäki Harri, Zhang Wei
Format: Article
Language: English
Published: BMC 2005-03-01
Series: BMC Bioinformatics
Online Access: http://www.biomedcentral.com/1471-2105/6/54
id doaj-3c68326e795546308d4a2827f1b0e245
record_format Article
spelling doaj-3c68326e795546308d4a2827f1b0e245 2020-11-24T22:17:23Z eng BMC BMC Bioinformatics 1471-2105 2005-03-01 6 1 54 10.1186/1471-2105-6-54 http://www.biomedcentral.com/1471-2105/6/54
collection DOAJ
language English
format Article
sources DOAJ
author Yli-Harja Olli
Dunmire Valerie
Shmulevich Ilya
Lähdesmäki Harri
Zhang Wei
title <it>In silico</it> microdissection of microarray data from heterogeneous cell populations
publisher BMC
series BMC Bioinformatics
issn 1471-2105
publishDate 2005-03-01
description <p>Abstract</p> <p>Background</p> <p>Very few analytical approaches have been reported to resolve the variability in microarray measurements stemming from sample heterogeneity. For example, tissue samples used in cancer studies are usually contaminated with the surrounding or infiltrating cell types. This heterogeneity in the sample preparation hinders further statistical analysis, significantly so if different samples contain different proportions of these cell types. Thus, sample heterogeneity can result in the identification of differentially expressed genes that may be unrelated to the biological question being studied. Similarly, irrelevant gene combinations can be discovered in the case of gene-expression-based classification.</p> <p>Results</p> <p>We propose a computational framework for removing the effects of sample heterogeneity by "microdissecting" microarray data <it>in silico</it>. The computational method provides estimates of the expression values of the pure (non-heterogeneous) cell samples. The inversion of the sample heterogeneity can be facilitated by providing accurate estimates of the mixing percentages of different cell types in each measurement. For those cases where no such information is available, we develop an optimization-based method for joint estimation of the mixing percentages and the expression values of the pure cell samples. We also consider the problem of selecting the correct number of cell types.</p> <p>Conclusion</p> <p>The efficiency of the proposed methods is illustrated by applying them to carefully controlled cDNA microarray data obtained from heterogeneous samples. The results demonstrate that the methods are capable of reconstructing both the sample- and cell-type-specific expression values from heterogeneous mixtures and that the mixing percentages of different cell types can also be estimated. Furthermore, a general-purpose model selection method can be used to select the correct number of cell types.</p>
url http://www.biomedcentral.com/1471-2105/6/54