Accuracy of imputation to whole-genome sequence in sheep


Bibliographic Details
Main Authors: Sunduimijid Bolormaa, Amanda J. Chamberlain, Majid Khansefid, Paul Stothard, Andrew A. Swan, Brett Mason, Claire P. Prowse-Wilkins, Naomi Duijvesteijn, Nasir Moghaddar, Julius H. van der Werf, Hans D. Daetwyler, Iona M. MacLeod
Format: Article
Language: deu
Published: BMC 2019-01-01
Series: Genetics Selection Evolution
Online Access: http://link.springer.com/article/10.1186/s12711-018-0443-5
id doaj-86960020a32d4d128e09accc32a9d310
record_format Article
spelling doaj-86960020a32d4d128e09accc32a9d310
2020-11-25T02:19:28Z
deu
BMC
Genetics Selection Evolution
ISSN 1297-9686
2019-01-01
Volume 51, issue 1, pages 1-17
doi:10.1186/s12711-018-0443-5
Accuracy of imputation to whole-genome sequence in sheep
Sunduimijid Bolormaa (Agriculture Victoria, AgriBio, Centre for AgriBioscience)
Amanda J. Chamberlain (Agriculture Victoria, AgriBio, Centre for AgriBioscience)
Majid Khansefid (Agriculture Victoria, AgriBio, Centre for AgriBioscience)
Paul Stothard (Faculty of Agricultural, Life and Environmental Sciences, University of Alberta)
Andrew A. Swan (Cooperative Research Centre for Sheep Industry Innovation)
Brett Mason (Agriculture Victoria, AgriBio, Centre for AgriBioscience)
Claire P. Prowse-Wilkins (Agriculture Victoria, AgriBio, Centre for AgriBioscience)
Naomi Duijvesteijn (Cooperative Research Centre for Sheep Industry Innovation)
Nasir Moghaddar (Cooperative Research Centre for Sheep Industry Innovation)
Julius H. van der Werf (Cooperative Research Centre for Sheep Industry Innovation)
Hans D. Daetwyler (Agriculture Victoria, AgriBio, Centre for AgriBioscience)
Iona M. MacLeod (Agriculture Victoria, AgriBio, Centre for AgriBioscience)
collection DOAJ
language deu
format Article
sources DOAJ
publisher BMC
series Genetics Selection Evolution
issn 1297-9686
publishDate 2019-01-01
description Abstract
Background: The use of whole-genome sequence (WGS) data for genomic prediction and association studies is highly desirable because the causal mutations should be present in the data. The sequencing of 935 sheep from a range of breeds provides the opportunity to impute sheep genotyped with single nucleotide polymorphism (SNP) arrays to WGS. This study evaluated the accuracy of imputation from SNP genotypes to WGS using this reference population of 935 sequenced sheep.
Results: The accuracy of imputation from the Ovine Infinium® HD BeadChip SNP (~500 k) to WGS was assessed for three target breeds: Merino, Poll Dorset and F1 Border Leicester × Merino. Imputation accuracy was highest for the Poll Dorset breed, although there were more Merino individuals in the sequenced reference population than Poll Dorset individuals. In addition, empirical imputation accuracies were higher (by up to 1.7%) when using larger multi-breed reference populations compared to using a smaller single-breed reference population. The mean accuracy of imputation across target breeds using the Minimac3 or the FImpute software was 0.94. The empirical imputation accuracy varied considerably across the genome; six chromosomes carried regions of one or more Mb with a mean imputation accuracy of < 0.7. Imputation accuracy in five variant annotation classes ranged from 0.87 (missense) up to 0.94 (intronic variants), where lower accuracy corresponded to higher proportions of rare alleles. The imputation quality statistic reported from Minimac3 (R²) had a clear positive relationship with the empirical imputation accuracy. Therefore, by first discarding imputed variants with an R² below 0.4, the mean empirical accuracy across target breeds increased to 0.97. Although accuracy of genomic prediction was less affected by filtering on R² in a multi-breed population of sheep with imputed WGS, the genomic heritability clearly tended to be lower when using variants with an R² ≤ 0.4.
Conclusions: The mean imputation accuracy was high for all target breeds and was increased by combining smaller breed sets into a multi-breed reference. We found that the Minimac3 software imputation quality statistic (R²) was a useful indicator of empirical imputation accuracy, enabling removal of very poorly imputed variants before downstream analyses.
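The filtering step described in the abstract can be illustrated with a small sketch. It is not the authors' pipeline and does not call Minimac3 or FImpute. It assumes that empirical accuracy is measured per variant as the Pearson correlation between true and imputed genotypes, which the abstract does not state. The function names (`pearson`, `mean_accuracy`), the record layout, and the toy genotypes are hypothetical; only the 0.4 threshold comes from the abstract.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences; None if either is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None  # monomorphic variant: correlation is undefined
    return sxy / sqrt(sxx * syy)


def mean_accuracy(variants, min_r2=None):
    """Mean per-variant accuracy, optionally keeping only variants with r2 >= min_r2.

    Each variant is a dict with hypothetical keys:
      'true'    - sequenced genotypes coded 0/1/2
      'imputed' - imputed allele dosages for the same animals
      'r2'      - quality statistic reported by the imputation software
    """
    accuracies = []
    for v in variants:
        if min_r2 is not None and v["r2"] < min_r2:
            continue  # discard poorly imputed variants before averaging
        r = pearson(v["true"], v["imputed"])
        if r is not None:
            accuracies.append(r)
    return sum(accuracies) / len(accuracies) if accuracies else None


# Toy data (invented for illustration, not taken from the study).
variants = [
    {"true": [0, 1, 2, 1], "imputed": [0.1, 1.0, 1.9, 1.1], "r2": 0.95},
    {"true": [0, 0, 1, 2], "imputed": [0.0, 0.2, 0.9, 1.8], "r2": 0.80},
    {"true": [0, 1, 0, 2], "imputed": [1.0, 0.2, 0.9, 0.5], "r2": 0.10},
]

unfiltered = mean_accuracy(variants)
filtered = mean_accuracy(variants, min_r2=0.4)
```

In this toy set the third variant has a low quality statistic and a poor match to the true genotypes, so dropping it raises the mean accuracy, which mirrors the direction of the effect reported in the abstract.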
url http://link.springer.com/article/10.1186/s12711-018-0443-5