Missing SNP Genotype Imputation

High-throughput single nucleotide polymorphism (SNP) genotyping technologies conveniently produce large SNP genotype datasets for genome-wide linkage and association studies. Various factors, including array design and hybridization, can give rise to a certain percentage of missing calls, and the problem...

Bibliographic Details
Main Author:	Wang, Yining
Other Authors:	Lin, Guohui (Computing Science)
Format:	Others
Language:	en_US
Published:	2011
Online Access:	http://hdl.handle.net/10048/1949

id	ndltd-LACETR-oai-collectionscanada.gc.ca-AEU.10048-1949
record_format	oai_dc
spelling	ndltd-LACETR-oai-collectionscanada.gc.ca-AEU.10048-1949; 2011-12-13T13:53:36Z; Lin, Guohui (Computing Science); Wang, Yining; 2011-06-02T19:07:51Z; 2011-06-02T19:07:51Z; 2011-06-02T19:07:51Z; http://hdl.handle.net/10048/1949; 465867 bytes; application/pdf; en_US; Missing SNP Genotype Imputation; Thesis; Master of Science; Master's; Department of Computing Science; University of Alberta; 2011-11; Science; Greiner, Russ (Computing Science); Li, Changxi (Agricultural, Food and Nutritional Science)
collection	NDLTD
language	en_US
format	Others
sources	NDLTD
description	High-throughput single nucleotide polymorphism (SNP) genotyping technologies conveniently produce large SNP genotype datasets for genome-wide linkage and association studies. Various factors, including array design and hybridization, can give rise to a certain percentage of missing calls, and the problem becomes severe when the target organism, such as cattle, does not have a high-resolution genomic sequence available. Missing calls in SNP genotype datasets undermine downstream data analysis, so effective methodologies for dealing with missing genotypes are urgently needed. In this dissertation, we start with a brief introduction to the relevant concepts in genetics, then present a collection of imputation methods, with a focus on machine learning algorithms, to tackle the missing SNP genotype problem. We demonstrate that these imputation approaches can achieve satisfactory accuracy when tested on real population SNP genotype datasets, and highlight the settings where our new methods prove useful. We conclude with some possible future directions for the genome-wide SNP genotype imputation problem.
author2	Lin, Guohui (Computing Science)
author_facet	Lin, Guohui (Computing Science) Wang, Yining
author	Wang, Yining
spellingShingle	Wang, Yining Missing SNP Genotype Imputation
author_sort	Wang, Yining
title	Missing SNP Genotype Imputation
title_short	Missing SNP Genotype Imputation
title_full	Missing SNP Genotype Imputation
title_fullStr	Missing SNP Genotype Imputation
title_full_unstemmed	Missing SNP Genotype Imputation
title_sort	missing snp genotype imputation
publishDate	2011
url	http://hdl.handle.net/10048/1949
work_keys_str_mv	AT wangyining missingsnpgenotypeimputation
_version_	1716389216239222784

Similar Items