Research and Application of Tag SNPs Selection Using Linkage Disequilibrium

Bibliographic Details
Main Authors: Yan-Jhu Chang, 張妍竹
Other Authors: Cheng-Hong Yang
Format: Others
Language: zh-TW
Published: 2008
Online Access: http://ndltd.ncl.edu.tw/handle/82460874457489513073
Description
Summary: Master's === National Kaohsiung University of Applied Sciences === Graduate Institute of Electronic and Information Engineering, Master's Program === 96 === Single nucleotide polymorphisms (SNPs) are a type of genetic variation in deoxyribonucleic acid (DNA) molecules. They are widely used to search for genes related to specific diseases and to characterize particular human traits. Known disease-causing SNPs, and their effects on treatment, currently make up only a small part of the SNPs identified so far. For a comprehensive disease analysis, the SNPs are often filtered in a first step to select a set of relevant and informative tag SNPs that represent the entire SNP set. Linkage disequilibrium (LD) can be used effectively to select tag SNPs that characterize an SNP set and to remove SNPs irrelevant to the disease. With this filtering method, the required computation time can be decreased significantly and classification accuracy can be improved. In this thesis, linkage disequilibrium was measured by haplotype inference. Hardy-Weinberg equilibrium was used to judge whether the inferred values were in a state of linkage disequilibrium, and these LD values were then calculated by an expectation-maximization (EM) algorithm. The tag SNP selection itself was controlled by a genetic algorithm. The entire tag SNP selection process is thus divided into two parts: the LD calculation and the subsequent calculations performed by the genetic algorithm. The results obtained in this thesis show that the number of relevant tag SNPs needed to define an SNP set could be lowered significantly. SNP identification could thereby be improved, and the selection of similar or identical SNPs avoided.
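
The summary does not give the thesis's exact formulation, so the following is only a minimal sketch of the general technique it names: an EM estimate of two-locus haplotype frequencies from unphased genotypes, followed by the usual pairwise LD measures (D, D', r²). The genotype coding (0, 1, or 2 copies of allele "1" at each SNP), the function names `em_haplotype_freqs` and `ld_measures`, and the iteration settings are assumptions made for illustration. The E-step assumes Hardy-Weinberg equilibrium, i.e. random pairing of haplotypes.

```python
from collections import Counter

HAPS = [(0, 0), (0, 1), (1, 0), (1, 1)]  # (allele at SNP A, allele at SNP B)

def em_haplotype_freqs(genotypes, n_iter=100, tol=1e-10):
    """Estimate two-locus haplotype frequencies by EM.

    genotypes: list of (gA, gB) pairs, each value 0, 1, or 2 copies of
    allele 1 (hypothetical coding chosen for this sketch).
    """
    if not genotypes:
        raise ValueError("need at least one genotype")
    base = Counter()  # haplotypes whose phase is unambiguous
    n_dhet = 0        # double heterozygotes: phase is unknown
    for ga, gb in genotypes:
        if ga == 1 and gb == 1:
            n_dhet += 1
            continue
        # At least one locus is homozygous, so the pairing is determined.
        a = (int(ga >= 1), int(ga == 2))
        b = (int(gb >= 1), int(gb == 2))
        base[(a[0], b[0])] += 1
        base[(a[1], b[1])] += 1
    total = 2 * len(genotypes)
    freqs = {h: 0.25 for h in HAPS}
    for _ in range(n_iter):
        # E-step: probability that a double heterozygote is 00/11
        # rather than 01/10, under random pairing of haplotypes.
        cis = freqs[(0, 0)] * freqs[(1, 1)]
        trans = freqs[(0, 1)] * freqs[(1, 0)]
        w = cis / (cis + trans) if cis + trans > 0 else 0.5
        # M-step: expected haplotype counts turned into frequencies.
        counts = {h: float(base[h]) for h in HAPS}
        counts[(0, 0)] += n_dhet * w
        counts[(1, 1)] += n_dhet * w
        counts[(0, 1)] += n_dhet * (1 - w)
        counts[(1, 0)] += n_dhet * (1 - w)
        new = {h: counts[h] / total for h in HAPS}
        delta = max(abs(new[h] - freqs[h]) for h in HAPS)
        freqs = new
        if delta < tol:
            break
    return freqs

def ld_measures(freqs):
    """Return (D, D', r^2) from haplotype frequencies."""
    p_a = freqs[(1, 0)] + freqs[(1, 1)]  # frequency of allele 1 at SNP A
    p_b = freqs[(0, 1)] + freqs[(1, 1)]  # frequency of allele 1 at SNP B
    d = freqs[(1, 1)] - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    return d, d_prime, r2
```

A call such as `ld_measures(em_haplotype_freqs(pairs))` gives the LD values for one SNP pair. A tag SNP selection step, whether the genetic algorithm described in the summary or a simpler greedy rule, would then work on the matrix of these pairwise values; that step is not sketched here.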