Statistical correction of the Winner's Curse explains replication variability in quantitative trait genome-wide association studies.
Genome-wide association studies (GWAS) have identified hundreds of SNPs responsible for variation in human quantitative traits. However, genome-wide-significant associations often fail to replicate across independent cohorts, in apparent inconsistency with their apparent strong effects in discovery...
Main Authors: | Cameron Palmer, Itsik Pe'er |
---|---|
Format: | Article |
Language: | English |
Published: | Public Library of Science (PLoS), 2017-07-01 |
Series: | PLoS Genetics |
Online Access: | http://europepmc.org/articles/PMC5536394?pdf=render |
id |
doaj-036fffa4414f4787969636e8fe466d97 |
---|---|
record_format |
Article |
spelling |
doaj-036fffa4414f4787969636e8fe466d97; 2020-11-25T02:25:27Z; eng; Public Library of Science (PLoS); PLoS Genetics; ISSN 1553-7390, 1553-7404; 2017-07-01; vol. 13, no. 7, e1006916; doi:10.1371/journal.pgen.1006916; Statistical correction of the Winner's Curse explains replication variability in quantitative trait genome-wide association studies; Cameron Palmer; Itsik Pe'er; http://europepmc.org/articles/PMC5536394?pdf=render |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Cameron Palmer Itsik Pe'er |
title |
Statistical correction of the Winner's Curse explains replication variability in quantitative trait genome-wide association studies. |
publisher |
Public Library of Science (PLoS) |
series |
PLoS Genetics |
issn |
1553-7390 1553-7404 |
publishDate |
2017-07-01 |
description |
Genome-wide association studies (GWAS) have identified hundreds of SNPs responsible for variation in human quantitative traits. However, genome-wide-significant associations often fail to replicate across independent cohorts, in apparent inconsistency with their apparent strong effects in discovery cohorts. This limited success of replication raises pervasive questions about the utility of the GWAS field. We identify all 332 studies of quantitative traits from the NHGRI-EBI GWAS Database with attempted replication. We find that the majority of studies provide insufficient data to evaluate replication rates. The remaining papers replicate significantly worse than expected (p < 10^-14), even when adjusting for regression-to-the-mean of effect size between discovery- and replication-cohorts termed the Winner's Curse (p < 10^-16). We show this is due in part to misreporting replication cohort-size as a maximum number, rather than per-locus one. In 39 studies accurately reporting per-locus cohort-size for attempted replication of 707 loci in samples with similar ancestry, replication rate matched expectation (predicted 458, observed 457, p = 0.94). In contrast, ancestry differences between replication and discovery (13 studies, 385 loci) cause the most highly-powered decile of loci to replicate worse than expected, due to difference in linkage disequilibrium. |
url |
http://europepmc.org/articles/PMC5536394?pdf=render |
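
The description above compares a predicted number of replications with the observed number and attributes part of the gap to the Winner's Curse. As a rough illustration only, and not the authors' method, the Python sketch below simulates discovery estimates for a single locus, keeps those that reach genome-wide significance, and shows that their mean overstates the true effect, which in turn inflates a naive prediction of replication power. The effect size, standard errors, thresholds, two-sided replication test, and both function names are assumptions made for this example; none of them come from the record.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def replication_power(beta, se_rep, alpha=0.05):
    """Probability that a replication estimate ~ N(beta, se_rep^2) is
    significant at two-sided level alpha (assumed replication criterion)."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = beta / se_rep
    return STD_NORMAL.cdf(shift - z_crit) + STD_NORMAL.cdf(-shift - z_crit)


def mean_significant_estimate(true_beta, se_disc, alpha_disc=5e-8,
                              n_sim=50_000, seed=0):
    """Simulate discovery estimates ~ N(true_beta, se_disc^2) and return the
    mean of those passing the two-sided discovery threshold, plus their count."""
    rng = random.Random(seed)
    z_crit = STD_NORMAL.inv_cdf(1 - alpha_disc / 2)
    draws = (rng.gauss(true_beta, se_disc) for _ in range(n_sim))
    hits = [b for b in draws if abs(b) / se_disc > z_crit]
    if not hits:
        raise ValueError("no simulated estimate reached significance")
    return sum(hits) / len(hits), len(hits)


if __name__ == "__main__":
    # Hypothetical values chosen only to make the selection effect visible.
    true_beta, se_disc, se_rep = 0.05, 0.01, 0.02
    inflated, n_hits = mean_significant_estimate(true_beta, se_disc)
    n_loci = 100  # hypothetical number of loci sharing these parameters
    naive = n_loci * replication_power(inflated, se_rep)
    actual = n_loci * replication_power(true_beta, se_rep)
    print(f"significant discovery draws: {n_hits}")
    print(f"true effect {true_beta:.4f}, mean significant estimate {inflated:.4f}")
    print(f"expected replications, naive: {naive:.1f}, using true effect: {actual:.1f}")
```

Summing per-locus power gives an expected replication count that can be compared with an observed count. Using the uncorrected discovery estimate in that sum overstates the expectation, which is the kind of gap a Winner's Curse correction is meant to close.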