Penalized classification for optimal statistical selection of markers from high-throughput genotyping: application in sheep breeds

The identification of individuals’ breed of origin has several practical applications in livestock and is useful in different biological contexts such as conservation genetics, breeding and authentication of animal products. In this paper, penalized multinomial regression was applied to identify the...


Bibliographic Details
Main Authors: G. Sottile, M.T. Sardina, S. Mastrangelo, R. Di Gerlando, M. Tolone, M. Chiodi, B. Portolano
Format: Article
Language: English
Published: Elsevier 2018-01-01
Series: Animal
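Subjects: penalized multinomial regression; stability selection; sheep breeds; livestock genetic resources; single nucleotide polymorphism markers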
Online Access: http://www.sciencedirect.com/science/article/pii/S175173111700266X
id doaj-60c0bd367816435d80d670f902be9820
record_format Article
spelling doaj-60c0bd367816435d80d670f902be9820 2021-06-06T04:54:06Z
eng
Elsevier
Animal, ISSN 1751-7311
2018-01-01, volume 12, issue 6, pages 1118-1125
Penalized classification for optimal statistical selection of markers from high-throughput genotyping: application in sheep breeds
G. Sottile (Dipartimento Scienze Economiche, Aziendali e Statistiche, University of Palermo, Palermo, Italy)
M.T. Sardina (Dipartimento Scienze Agrarie, Alimentari e Forestali, University of Palermo, Palermo, Italy)
S. Mastrangelo (Dipartimento Scienze Agrarie, Alimentari e Forestali, University of Palermo, Palermo, Italy)
R. Di Gerlando (Dipartimento Scienze Agrarie, Alimentari e Forestali, University of Palermo, Palermo, Italy)
M. Tolone (Dipartimento Scienze Agrarie, Alimentari e Forestali, University of Palermo, Palermo, Italy)
M. Chiodi (Dipartimento Scienze Economiche, Aziendali e Statistiche, University of Palermo, Palermo, Italy)
B. Portolano (Dipartimento Scienze Agrarie, Alimentari e Forestali, University of Palermo, Palermo, Italy)
collection DOAJ
language English
format Article
sources DOAJ
author G. Sottile
M.T. Sardina
S. Mastrangelo
R. Di Gerlando
M. Tolone
M. Chiodi
B. Portolano
title Penalized classification for optimal statistical selection of markers from high-throughput genotyping: application in sheep breeds
publisher Elsevier
series Animal
issn 1751-7311
publishDate 2018-01-01
description The identification of individuals’ breed of origin has several practical applications in livestock and is useful in different biological contexts such as conservation genetics, breeding and authentication of animal products. In this paper, penalized multinomial regression was applied to identify the minimum number of single nucleotide polymorphisms (SNPs) from high-throughput genotyping data for individual assignment to dairy sheep breeds reared in Sicily. The combined use of penalized multinomial regression and stability selection reduced the number of SNPs required to 48. A final validation step on an independent population was carried out obtaining 100% correctly classified individuals. The results using independent analysis, such as admixture, Fst, principal component analysis and random forest, confirmed the ability of these methods in selecting distinctive markers. The identified SNPs may constitute a starting point for the development of a SNP based identification test as a tool for breed assignment and traceability of animal products.
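The combination named in the description, penalized multinomial regression followed by stability selection, can be illustrated with a minimal sketch. This is not the authors' code: the record does not say which penalty, software or settings were used, so the L1 (lasso) penalty, the proximal-gradient fit, the stratified half-sampling, the 0.8 frequency threshold, the toy genotypes and all function names below are assumptions made for illustration only.

```python
import math
import random


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def soft_threshold(v, t):
    # Proximal operator of the L1 penalty: shrink toward zero, exact zero inside [-t, t].
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def fit_l1_multinomial(X, y, n_classes, lam=0.1, lr=0.1, n_iter=150):
    """L1-penalized multinomial (softmax) regression, no intercept, fitted by
    proximal gradient descent. Returns one weight row per class."""
    n, p = len(X), len(X[0])
    W = [[0.0] * p for _ in range(n_classes)]
    for _ in range(n_iter):
        grad = [[0.0] * p for _ in range(n_classes)]
        for xi, yi in zip(X, y):
            logits = [sum(w * x for w, x in zip(W[k], xi)) for k in range(n_classes)]
            probs = softmax(logits)
            for k in range(n_classes):
                r = (probs[k] - (1.0 if k == yi else 0.0)) / n
                for j in range(p):
                    grad[k][j] += r * xi[j]
        W = [[soft_threshold(W[k][j] - lr * grad[k][j], lr * lam) for j in range(p)]
             for k in range(n_classes)]
    return W


def selection_frequencies(X, y, n_classes, n_subsamples=10, seed=0, **fit_args):
    """Stability selection sketch: refit on random half-samples (stratified by
    breed, an assumption) and record how often each marker gets a nonzero weight."""
    rng = random.Random(seed)
    p = len(X[0])
    counts = [0] * p
    by_class = [[i for i, yi in enumerate(y) if yi == k] for k in range(n_classes)]
    for _ in range(n_subsamples):
        idx = [i for members in by_class
               for i in rng.sample(members, len(members) // 2)]
        W = fit_l1_multinomial([X[i] for i in idx], [y[i] for i in idx],
                               n_classes, **fit_args)
        for j in range(p):
            if any(W[k][j] != 0.0 for k in range(n_classes)):
                counts[j] += 1
    return [c / n_subsamples for c in counts]


def make_toy_genotypes(n_per_breed=20, n_noise=3, seed=1):
    """Hypothetical data: 3 breeds, genotypes coded 0/1/2. Markers 0-2 are
    idealized breed-diagnostic SNPs; the remaining markers are random noise."""
    rng = random.Random(seed)
    X, y = [], []
    for breed in range(3):
        for _ in range(n_per_breed):
            diagnostic = [2 if j == breed else 0 for j in range(3)]
            noise = [rng.choice([0, 1, 2]) for _ in range(n_noise)]
            X.append(diagnostic + noise)
            y.append(breed)
    return X, y


X, y = make_toy_genotypes()
freq = selection_frequencies(X, y, n_classes=3)
# Keep markers selected in at least 80% of subsamples (threshold is an assumption).
stable = [j for j, f in enumerate(freq) if f >= 0.8]
print("selection frequency per marker:", freq)
print("stable markers:", stable)
```

The idea is that a marker is retained only if the penalized fit keeps choosing it across many resampled data sets, which is how a large SNP panel could be narrowed to a small, reproducible subset. A real analysis would more likely rely on an established penalized-regression library and far more markers than this toy panel.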
topic penalized multinomial regression
stability selection
sheep breeds
livestock genetic resources
single nucleotide polymorphism markers
url http://www.sciencedirect.com/science/article/pii/S175173111700266X