Genomic Prediction Enhanced Sparse Testing for Multi-environment Trials

“Sparse testing” refers to reduced multi-environment breeding trials in which not all genotypes of interest are grown in each environment. Using genomic-enabled prediction and a model embracing genotype × environment interaction (GE), the non-observed genotype-in-environment combinations can be pred...

Bibliographic Details
Main Authors: Diego Jarquin, Reka Howard, Jose Crossa, Yoseph Beyene, Manje Gowda, Johannes W. R. Martini, Giovanny Covarrubias Pazaran, Juan Burgueño, Angela Pacheco, Martin Grondona, Valentin Wimmer, Boddupalli M. Prasanna
Format: Article
Language: English
Published: Oxford University Press 2020-08-01
Series: G3: Genes, Genomes, Genetics
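Subjects: genomic-enabled prediction accuracy; sparse testing methods; allocation of non-overlapping/overlapping genotypes in environments; random cross-validations; maize multi-environment trials; genotype-by-environment interaction ge; genpred; shared data resources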
Online Access: http://g3journal.org/lookup/doi/10.1534/g3.120.401349
id doaj-6021aeb04ef64db39adeebff4f7d532f
record_format Article
spelling doaj-6021aeb04ef64db39adeebff4f7d532f; 2021-07-02T12:09:47Z; eng; Oxford University Press; G3: Genes, Genomes, Genetics; ISSN 2160-1836; 2020-08-01; volume 10, issue 8, pages 2725-2739; doi:10.1534/g3.120.401349; Genomic Prediction Enhanced Sparse Testing for Multi-environment Trials
collection DOAJ
language English
format Article
sources DOAJ
author Diego Jarquin
Reka Howard
Jose Crossa
Yoseph Beyene
Manje Gowda
Johannes W. R. Martini
Giovanny Covarrubias Pazaran
Juan Burgueño
Angela Pacheco
Martin Grondona
Valentin Wimmer
Boddupalli M. Prasanna
title Genomic Prediction Enhanced Sparse Testing for Multi-environment Trials
publisher Oxford University Press
series G3: Genes, Genomes, Genetics
issn 2160-1836
publishDate 2020-08-01
description “Sparse testing” refers to reduced multi-environment breeding trials in which not all genotypes of interest are grown in each environment. Using genomic-enabled prediction and a model embracing genotype × environment interaction (GE), the non-observed genotype-in-environment combinations can be predicted. Consequently, the overall costs can be reduced and the testing capacities can be increased. The accuracy of predicting the unobserved data depends on different factors including (1) how many genotypes overlap between environments, (2) in how many environments each genotype is grown, and (3) which prediction method is used. In this research, we studied the predictive ability obtained when using a fixed number of plots and different sparse testing designs. The considered designs included the extreme cases of (1) no overlap of genotypes between environments, and (2) complete overlap of the genotypes between environments. In the latter case, the prediction set fully consists of genotypes that have not been tested at all. Moreover, we gradually go from one extreme to the other considering (3) intermediates between the two previous cases with varying numbers of different or non-overlapping (NO)/overlapping (O) genotypes. The empirical study is built upon two different maize hybrid data sets consisting of different genotypes crossed to two different testers (T1 and T2) and each data set was analyzed separately. For each set, phenotypic records on yield from three different environments are available. Three different prediction models were implemented, two main effects models (M1 and M2), and a model (M3) including GE. The results showed that the genome-based model including GE (M3) captured more phenotypic variation than the models that did not include this component. Also, M3 provided higher prediction accuracy than models M1 and M2 for the different allocation scenarios. 
Reducing the size of the calibration sets decreased the prediction accuracy under all allocation designs, with M3 being the least affected model; however, with the genome-enabled models (i.e., M2 and M3), the predictive ability is recovered when more genotypes are tested across environments. Our results indicate that a substantial part of the testing resources can be saved when using genome-based models including GE for optimizing sparse testing designs.
topic genomic-enabled prediction accuracy
sparse testing methods
allocation of non-overlapping/overlapping genotypes in environments
random cross-validations
maize multi-environment trials
genotype-by-environment interaction ge
genpred
shared data resources
url http://g3journal.org/lookup/doi/10.1534/g3.120.401349