A Stochastic Grammar of 3' Terminal of Homo Sapiens Genes with Dependency Graphs and Their Expanded Bayesian Networks


Bibliographic Details
Main Authors: Chiung-Wen Ho, 何瓊雯
Other Authors: Chung-Chin Lu
Format: Others
Language: zh-TW
Published: 2005
Online Access: http://ndltd.ncl.edu.tw/handle/22116550613464072301
Description
Summary: Master's === National Tsing Hua University === Department of Electrical Engineering === 93 === In bioinformatics, one of the challenging issues is to determine the specific structure of each gene from the 3 billion base pairs of human DNA sequence. The polyadenylation site is a specific feature at the terminus of a gene; polyadenylation involves the endonucleolytic cleavage of the pre-mRNA followed by the addition of a poly(A) tail, which is found at the 3'-terminal of the majority of mRNAs. Factors involved in cleavage and polyadenylation have to recognize the associated signals, namely the polyadenylation signal (PAS) and the downstream element (DSE). The PAS appears 10 to 30 nucleotides upstream of the cleavage and polyadenylation site and consists of the highly conserved hexamer AAUAAA or its common variant AUUAAA in pre-mRNAs. The DSE lies 20 to 40 nucleotides downstream of the cleavage and polyadenylation site and consists of a much less conserved U- or GU-rich sequence. In this thesis, we construct a stochastic grammar of the 3'-terminal of human genes by establishing the dependency graphs, and their expanded Bayesian networks, of the features in this region. Furthermore, we compare the performance of this stochastic grammar with that of the PAS detector provided by earlier researchers.
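The positional constraints described in the summary can be illustrated with a small sketch. This is not the stochastic grammar or Bayesian network model of the thesis, only a naive motif scan written for illustration. The function names `find_pas_candidates` and `dse_gu_fraction`, the convention of measuring distance from the first base of the hexamer to the cleavage index, and the use of the G/U fraction as a crude proxy for a "U- or GU-rich" region are all assumptions introduced here.

```python
# Illustrative only: a naive scan for the signals named in the summary,
# not the thesis's stochastic grammar.

PAS_HEXAMERS = ("AAUAAA", "AUUAAA")  # canonical hexamer and its common variant


def _to_rna(sequence):
    """Upper-case the sequence and write it in RNA letters (T -> U)."""
    return sequence.upper().replace("T", "U")


def find_pas_candidates(pre_mrna, cleavage_site, min_dist=10, max_dist=30):
    """Return (start, hexamer, distance) for each PAS hexamer whose start lies
    min_dist..max_dist nucleotides upstream of cleavage_site.

    Assumption: distance is counted from the first base of the hexamer
    to the 0-based cleavage index.
    """
    seq = _to_rna(pre_mrna)
    hits = []
    for start in range(len(seq) - 5):
        hexamer = seq[start:start + 6]
        if hexamer in PAS_HEXAMERS:
            distance = cleavage_site - start
            if min_dist <= distance <= max_dist:
                hits.append((start, hexamer, distance))
    return hits


def dse_gu_fraction(pre_mrna, cleavage_site, min_dist=20, max_dist=40):
    """Fraction of G or U bases in the window min_dist..max_dist nucleotides
    downstream of cleavage_site (a crude stand-in for 'U- or GU-rich')."""
    seq = _to_rna(pre_mrna)
    window = seq[cleavage_site + min_dist:cleavage_site + max_dist]
    if not window:
        return 0.0
    return sum(base in "GU" for base in window) / len(window)


# Synthetic example: hexamer 20 nt upstream of a cleavage site at index 40,
# followed 20 nt downstream by a GU-rich stretch.
example = "C" * 20 + "AAUAAA" + "C" * 14 + "C" * 20 + "GU" * 10 + "C" * 5
print(find_pas_candidates(example, 40))  # [(20, 'AAUAAA', 20)]
print(dse_gu_fraction(example, 40))      # 1.0
```

A fixed-window scan like this treats each signal independently; the dependency graphs mentioned in the summary are aimed at modeling how such features relate to one another, which this sketch does not attempt.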