Robust prediction of expression differences among human individuals using only genotype information.
Many genetic variants that are significantly correlated to gene expression changes across human individuals have been identified, but the ability of these variants to predict expression of unseen individuals has rarely been evaluated. Here, we devise an algorithm that, given training expression and...
Main Authors: | Ohad Manor, Eran Segal |
---|---|
Format: | Article |
Language: | English |
Published: | Public Library of Science (PLoS), 2013-03-01 |
Series: | PLoS Genetics |
Online Access: | http://europepmc.org/articles/PMC3610805?pdf=render |
id |
doaj-ad6fdac17a8d47d59a3689a0ee6b66c4 |
---|---|
record_format |
Article |
spelling |
doaj-ad6fdac17a8d47d59a3689a0ee6b66c4; 2020-11-25T02:23:07Z; eng; Public Library of Science (PLoS); PLoS Genetics; ISSN 1553-7390, 1553-7404; 2013-03-01; 9(3): e1003396; doi:10.1371/journal.pgen.1003396; Robust prediction of expression differences among human individuals using only genotype information; Ohad Manor, Eran Segal; http://europepmc.org/articles/PMC3610805?pdf=render |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Ohad Manor, Eran Segal |
title |
Robust prediction of expression differences among human individuals using only genotype information. |
publisher |
Public Library of Science (PLoS) |
series |
PLoS Genetics |
issn |
1553-7390, 1553-7404 |
publishDate |
2013-03-01 |
description |
Many genetic variants that are significantly correlated to gene expression changes across human individuals have been identified, but the ability of these variants to predict expression of unseen individuals has rarely been evaluated. Here, we devise an algorithm that, given training expression and genotype data for a set of individuals, predicts the expression of genes of unseen test individuals given only their genotype in the local genomic vicinity of the predicted gene. Notably, the resulting predictions are remarkably robust in that they agree well between the training and test sets, even when the training and test sets consist of individuals from distinct populations. Thus, although the overall number of genes that can be predicted is relatively small, as expected from our choice to ignore effects such as environmental factors and trans sequence variation, the robust nature of the predictions means that the identity and quantitative degree to which genes can be predicted is known in advance. We also present an extension that incorporates heterogeneous types of genomic annotations to differentially weigh the importance of the various genetic variants, and we show that assigning higher weights to variants with particular annotations such as proximity to genes and high regional G/C content can further improve the predictions. Finally, genes that are successfully predicted have, on average, higher expression and more variability across individuals, providing insight into the characteristics of the types of genes that can be predicted from their cis genetic variation. |
url |
http://europepmc.org/articles/PMC3610805?pdf=render |