Comparison of Single Genome and Allele Frequency Data Reveals Discordant Demographic Histories

Inference of demographic history from genetic data is a primary goal of population genetics of model and nonmodel organisms. Whole genome-based approaches such as the pairwise/multiple sequentially Markovian coalescent methods use genomic data from one to four individuals to infer the demographic hi...


Bibliographic Details
Main Authors: Annabel C. Beichman, Tanya N. Phung, Kirk E. Lohmueller
Format: Article
Language: English
Published: Oxford University Press 2017-11-01
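Series: G3: Genes, Genomes, Genetics
Subjects: pairwise sequentially Markovian coalescent; site frequency spectrum; population genetics; demographic inference; nonmodel organisms
Online Access: http://g3journal.org/lookup/doi/10.1534/g3.117.300259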
id doaj-e4c663d2d0df4a648fc6ac8e9c0c4d13
record_format Article
spelling doaj-e4c663d2d0df4a648fc6ac8e9c0c4d13; 2021-07-02T01:41:29Z; eng; Oxford University Press; G3: Genes, Genomes, Genetics; ISSN 2160-1836; 2017-11-01; vol. 7, no. 11, pp. 3605-3620; doi:10.1534/g3.117.300259
collection DOAJ
language English
format Article
sources DOAJ
author Annabel C. Beichman
Tanya N. Phung
Kirk E. Lohmueller
title Comparison of Single Genome and Allele Frequency Data Reveals Discordant Demographic Histories
publisher Oxford University Press
series G3: Genes, Genomes, Genetics
issn 2160-1836
publishDate 2017-11-01
description Inference of demographic history from genetic data is a primary goal of population genetics of model and nonmodel organisms. Whole genome-based approaches such as the pairwise/multiple sequentially Markovian coalescent methods use genomic data from one to four individuals to infer the demographic history of an entire population, while site frequency spectrum (SFS)-based methods use the distribution of allele frequencies in a sample to reconstruct the same historical events. Although both methods are extensively used in empirical studies and perform well on data simulated under simple models, there have been only limited comparisons of them in more complex and realistic settings. Here we use published demographic models based on data from three human populations (Yoruba, descendants of northwest-Europeans, and Han Chinese) as an empirical test case to study the behavior of both inference procedures. We find that several of the demographic histories inferred by the whole genome-based methods do not predict the genome-wide distribution of heterozygosity, nor do they predict the empirical SFS. However, using simulated data, we also find that the whole genome methods can reconstruct the complex demographic models inferred by SFS-based methods, suggesting that the discordant patterns of genetic variation are not attributable to a lack of statistical power, but may reflect unmodeled complexities in the underlying demography. More generally, our findings indicate that demographic inference from a small number of genomes, routine in genomic studies of nonmodel organisms, should be interpreted cautiously, as these models cannot recapitulate other summaries of the data.
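The two data summaries named in the description, the site frequency spectrum (SFS) and the distribution of heterozygosity along a genome, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the 0/1 allele coding (0 = ancestral, 1 = derived), the toy haplotypes, and the window size are all assumptions made for illustration, and the input is assumed to list every site in the region, including invariant ones.

```python
from collections import Counter


def unfolded_sfs(haplotypes):
    """Unfolded SFS from a list of equal-length 0/1 haplotypes.

    Assumption: 0 = ancestral allele, 1 = derived allele.
    Returns a list whose entry k-1 is the number of sites at which the
    derived allele is carried by exactly k of the n sampled chromosomes,
    for k = 1 .. n-1 (fixed and absent sites are excluded).
    """
    n = len(haplotypes)
    n_sites = len(haplotypes[0])
    derived_counts = Counter(
        sum(hap[i] for hap in haplotypes) for i in range(n_sites)
    )
    return [derived_counts.get(k, 0) for k in range(1, n)]


def windowed_heterozygosity(hap_a, hap_b, window):
    """Per-window heterozygosity for one diploid individual.

    hap_a and hap_b are the individual's two haplotypes; a site is
    heterozygous when they differ. Windows are non-overlapping and the
    last window may be shorter than `window`.
    """
    het = [int(a != b) for a, b in zip(hap_a, hap_b)]
    result = []
    for start in range(0, len(het), window):
        chunk = het[start:start + window]
        result.append(sum(chunk) / len(chunk))
    return result


# Toy data: four sampled chromosomes, six sites (hypothetical values).
haplotypes = [
    [0, 1, 0, 1, 0, 0],
    [0, 1, 0, 0, 0, 1],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 1],
]

sfs = unfolded_sfs(haplotypes)
het = windowed_heterozygosity(haplotypes[0], haplotypes[1], window=3)
```

The SFS pools information across many sampled chromosomes, whereas windowed heterozygosity uses only the two haplotypes of a single individual, which is the kind of input the whole genome-based methods work from.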
topic pairwise sequentially Markovian coalescent
site frequency spectrum
population genetics
demographic inference
nonmodel organisms
url http://g3journal.org/lookup/doi/10.1534/g3.117.300259