Self-contained gene-set analysis of expression data: an evaluation of existing and novel methods.

Gene set methods aim to assess the overall evidence of association of a set of genes with a phenotype, such as disease or a quantitative trait. Multiple approaches for gene set analysis of expression data have been proposed. They can be divided into two types: competitive and self-contained. Benefit...

Bibliographic Details
Main Authors:	Brooke L Fridley, Gregory D Jenkins, Joanna M Biernacka
Format:	Article
Language:	English
Published:	Public Library of Science (PLoS) 2010-09-01
Series:	PLoS ONE
Online Access:	http://europepmc.org/articles/PMC2941449?pdf=render

id	doaj-116d20bef5684f2b92362fa882ff4fbb
record_format	Article
volume	5
issue	9
doi	10.1371/journal.pone.0012693
collection	DOAJ
sources	DOAJ
issn	1932-6203
description	Gene set methods aim to assess the overall evidence of association of a set of genes with a phenotype, such as disease or a quantitative trait. Multiple approaches for gene set analysis of expression data have been proposed. They can be divided into two types: competitive and self-contained. Benefits of self-contained methods include that they can be used for genome-wide, candidate gene, or pathway studies, and have been reported to be more powerful than competitive methods. We therefore investigated ten self-contained methods that can be used for continuous, discrete and time-to-event phenotypes. To assess the power and type I error rate for the various previously proposed and novel approaches, an extensive simulation study was completed in which the scenarios varied according to: number of genes in a gene set, number of genes associated with the phenotype, effect sizes, correlation between expression of genes within a gene set, and the sample size. In addition to the simulated data, the various methods were applied to a pharmacogenomic study of the drug gemcitabine. Simulation results demonstrated that overall Fisher's method and the global model with random effects have the highest power for a wide range of scenarios, while the analysis based on the first principal component and Kolmogorov-Smirnov test tended to have lowest power. The methods investigated here are likely to play an important role in identifying pathways that contribute to complex traits.

Similar Items