Heading Down the Wrong Pathway: on the Influence of Correlation within Gene Sets

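Abstract

Background: Analysis of microarray experiments often involves testing for the overrepresentation of pre-defined sets of genes among lists of genes deemed individually significant. Most popular gene set testing methods assume the independence of genes within each set, an assumption that is seriously violated, as extensive correlation between genes is a well-documented phenomenon.

Results: We conducted a meta-analysis of over 200 datasets from the Gene Expression Omnibus in order to demonstrate the practical impact of strong gene correlation patterns that are highly consistent across experiments. We show that a common independence-assumption-based gene set testing procedure produces very high false positive rates when applied to datasets for which treatment groups have been randomized, and that gene sets with high internal correlation are more likely to be declared significant. A reanalysis of the same datasets using an array resampling approach properly controls false positive rates, leading to more parsimonious and high-confidence gene set findings, which should facilitate pathway-based interpretation of the microarray data.

Conclusions: These findings call into question many of the gene set testing results in the literature and argue strongly for the adoption of resampling-based gene set testing criteria in the peer-reviewed biomedical literature.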


Bibliographic Details
Main Authors: Nobel Andrew B, Barry William T, Gatti Daniel M, Rusyn Ivan, Wright Fred A
Format: Article
Language: English
Published: BMC 2010-10-01
Series: BMC Genomics
Online Access: http://www.biomedcentral.com/1471-2164/11/574
id doaj-eb95fe612f9e4291815012fd45f6a4f5
record_format Article
spelling doaj-eb95fe612f9e4291815012fd45f6a4f5; 2020-11-24T23:58:12Z; eng; BMC; BMC Genomics; 1471-2164; 2010-10-01; 11; 1; 574; 10.1186/1471-2164-11-574; Heading Down the Wrong Pathway: on the Influence of Correlation within Gene Sets; Nobel Andrew B; Barry William T; Gatti Daniel M; Rusyn Ivan; Wright Fred A; http://www.biomedcentral.com/1471-2164/11/574
collection DOAJ
language English
format Article
sources DOAJ
author Nobel Andrew B
Barry William T
Gatti Daniel M
Rusyn Ivan
Wright Fred A
author_sort Nobel Andrew B
title Heading Down the Wrong Pathway: on the Influence of Correlation within Gene Sets
publisher BMC
series BMC Genomics
issn 1471-2164
publishDate 2010-10-01
description Abstract. Background: Analysis of microarray experiments often involves testing for the overrepresentation of pre-defined sets of genes among lists of genes deemed individually significant. Most popular gene set testing methods assume the independence of genes within each set, an assumption that is seriously violated, as extensive correlation between genes is a well-documented phenomenon. Results: We conducted a meta-analysis of over 200 datasets from the Gene Expression Omnibus in order to demonstrate the practical impact of strong gene correlation patterns that are highly consistent across experiments. We show that a common independence-assumption-based gene set testing procedure produces very high false positive rates when applied to datasets for which treatment groups have been randomized, and that gene sets with high internal correlation are more likely to be declared significant. A reanalysis of the same datasets using an array resampling approach properly controls false positive rates, leading to more parsimonious and high-confidence gene set findings, which should facilitate pathway-based interpretation of the microarray data. Conclusions: These findings call into question many of the gene set testing results in the literature and argue strongly for the adoption of resampling-based gene set testing criteria in the peer-reviewed biomedical literature.
url http://www.biomedcentral.com/1471-2164/11/574
_version_ 1725451201823113216