How Population Structure Impacts Genomic Selection Accuracy in Cross-Validation: Implications for Practical Breeding

Bibliographic Details
Published in: Frontiers in Plant Science
Main Authors: Christian R. Werner, R. Chris Gaynor, Gregor Gorjanc, John M. Hickey, Tobias Kox, Amine Abbadi, Gunhild Leckband, Rod J. Snowdon, Andreas Stahl
Format: Article
Language: English
Published: Frontiers Media S.A. 2020-12-01
Subjects:
Online Access: https://www.frontiersin.org/articles/10.3389/fpls.2020.592977/full
author Christian R. Werner
R. Chris Gaynor
Gregor Gorjanc
John M. Hickey
Tobias Kox
Amine Abbadi
Gunhild Leckband
Rod J. Snowdon
Andreas Stahl
collection DOAJ
container_title Frontiers in Plant Science
description Over the last two decades, the application of genomic selection has been extensively studied in various crop species, and it has become a common practice to report prediction accuracies using cross validation. However, genomic prediction accuracies obtained from random cross validation can be strongly inflated due to population or family structure, a characteristic shared by many breeding populations. An understanding of the effect of population and family structure on prediction accuracy is essential for the successful application of genomic selection in plant breeding programs. The objective of this study was to make this effect and its implications for practical breeding programs comprehensible for breeders and scientists with a limited background in quantitative genetics and genomic selection theory. We, therefore, compared genomic prediction accuracies obtained from different random cross validation approaches and within-family prediction in three different prediction scenarios. We used a highly structured population of 940 Brassica napus hybrids coming from 46 testcross families and two subpopulations. Our demonstrations show how genomic prediction accuracies obtained from among-family predictions in random cross validation and within-family predictions capture different measures of prediction accuracy. While among-family prediction accuracy measures prediction accuracy of both the parent average component and the Mendelian sampling term, within-family prediction only measures how accurately the Mendelian sampling term can be predicted. With this paper we aim to foster a critical approach to different measures of genomic prediction accuracy and a careful analysis of values observed in genomic selection experiments and reported in literature.
format Article
id doaj-art-68e6be21435f4fb8a2eb6f3c301efc3e
institution Directory of Open Access Journals
issn 1664-462X
language English
publishDate 2020-12-01
publisher Frontiers Media S.A.
record_format Article
spelling doaj-art-68e6be21435f4fb8a2eb6f3c301efc3e
2025-08-19T20:31:34Z
eng
Frontiers Media S.A.
Frontiers in Plant Science
1664-462X
2020-12-01
11
10.3389/fpls.2020.592977
592977
Christian R. Werner: The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush Research Centre, Midlothian, United Kingdom
R. Chris Gaynor: The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush Research Centre, Midlothian, United Kingdom
Gregor Gorjanc: The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush Research Centre, Midlothian, United Kingdom
John M. Hickey: The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush Research Centre, Midlothian, United Kingdom
Tobias Kox: NPZ Innovation GmbH, Holtsee, Germany
Amine Abbadi: NPZ Innovation GmbH, Holtsee, Germany
Gunhild Leckband: German Seed Alliance GmbH, Hohenlieth, Germany
Rod J. Snowdon: Department of Plant Breeding, IFZ Research Centre for Biosystems, Land Use and Nutrition, Justus Liebig University, Giessen, Germany
Andreas Stahl: Department of Plant Breeding, IFZ Research Centre for Biosystems, Land Use and Nutrition, Justus Liebig University, Giessen, Germany; Julius Kuehn Institute (JKI), Federal Research Centre for Cultivated Plants, Institute for Resistance Research and Stress Tolerance, Quedlinburg, Germany
title How Population Structure Impacts Genomic Selection Accuracy in Cross-Validation: Implications for Practical Breeding
topic predictive breeding
genomic prediction
oilseed rape
nested association mapping
population structure
url https://www.frontiersin.org/articles/10.3389/fpls.2020.592977/full