Inferring the ancestry of parents and grandparents from genetic data.

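Inference of admixture proportions is a classical statistical problem in population genetics. Standard methods implicitly assume that both parents of an individual have the same admixture fraction. However, this is rarely the case in real data. In this paper we show that the distribution of admixture tract lengths in a genome contains information about the admixture proportions of the ancestors of an individual. We develop a Hidden Markov Model (HMM) framework for estimating the admixture proportions of the immediate ancestors of an individual, i.e. a type of decomposition of an individual's admixture proportions into further subsets of ancestral proportions in the ancestors. Based on a genealogical model for admixture tracts, we develop an efficient algorithm for computing the sampling probability of the genome from a single individual, as a function of the admixture proportions of the ancestors of this individual. This allows us to perform probabilistic inference of admixture proportions of ancestors only using the genome of an extant individual. We perform extensive simulations to quantify the error in the estimation of ancestral admixture proportions under various conditions. To illustrate the utility of the method, we apply it to real genetic data.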

Bibliographic Details
Main Authors: Jingwen Pei, Yiming Zhang, Rasmus Nielsen, Yufeng Wu
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2020-08-01
Series: PLoS Computational Biology
Online Access: https://doi.org/10.1371/journal.pcbi.1008065
id doaj-764401046d504658b19e717cb11d9c4d
record_format Article
spelling doaj-764401046d504658b19e717cb11d9c4d; 2021-04-21T15:16:49Z; eng; Public Library of Science (PLoS); PLoS Computational Biology; 1553-734X; 1553-7358; 2020-08-01; 16(8): e1008065; 10.1371/journal.pcbi.1008065; Inferring the ancestry of parents and grandparents from genetic data.; Jingwen Pei, Yiming Zhang, Rasmus Nielsen, Yufeng Wu; https://doi.org/10.1371/journal.pcbi.1008065
collection DOAJ
language English
format Article
sources DOAJ
author Jingwen Pei
Yiming Zhang
Rasmus Nielsen
Yufeng Wu
title Inferring the ancestry of parents and grandparents from genetic data.
publisher Public Library of Science (PLoS)
series PLoS Computational Biology
issn 1553-734X
1553-7358
publishDate 2020-08-01
description Inference of admixture proportions is a classical statistical problem in population genetics. Standard methods implicitly assume that both parents of an individual have the same admixture fraction. However, this is rarely the case in real data. In this paper we show that the distribution of admixture tract lengths in a genome contains information about the admixture proportions of the ancestors of an individual. We develop a Hidden Markov Model (HMM) framework for estimating the admixture proportions of the immediate ancestors of an individual, i.e. a type of decomposition of an individual's admixture proportions into further subsets of ancestral proportions in the ancestors. Based on a genealogical model for admixture tracts, we develop an efficient algorithm for computing the sampling probability of the genome from a single individual, as a function of the admixture proportions of the ancestors of this individual. This allows us to perform probabilistic inference of admixture proportions of ancestors only using the genome of an extant individual. We perform extensive simulations to quantify the error in the estimation of ancestral admixture proportions under various conditions. To illustrate the utility of the method, we apply it to real genetic data.
url https://doi.org/10.1371/journal.pcbi.1008065