Hidden Markov Model Based DNA-binding Proteins Prediction by Mining on Sequence and Structure Information

Master's === National Cheng Kung University === Institute of Medical Informatics === 96 === In the post-genome era, protein domain structures have been published at a rapid pace. The mechanism of protein-DNA interaction, which is key to understanding cell function, is an important subject in recent bioinformatics research but has not been comprehensively studied....

Bibliographic Details
Main Authors: Wei-Jhih Chen, 陳韋志
Other Authors: Hung-Yu Kao
Format: Others
Language: en_US
Published: 2008
Online Access: http://ndltd.ncl.edu.tw/handle/20716134364431964671
id ndltd-TW-096NCKU5674009
record_format oai_dc
spelling ndltd-TW-096NCKU5674009 2015-11-23T04:03:11Z http://ndltd.ncl.edu.tw/handle/20716134364431964671 Hidden Markov Model Based DNA-binding Proteins Prediction by Mining on Sequence and Structure Information 利用序列與結構資訊之隱藏式馬可夫模型之與去氧核醣核酸結合蛋白質預測 Wei-Jhih Chen 陳韋志 Master's National Cheng Kung University Institute of Medical Informatics 96 In the post-genome era, protein domain structures have been published at a rapid pace. The mechanism of protein-DNA interaction, which is key to understanding cell function, is an important subject in recent bioinformatics research but has not been comprehensively studied. Several machine-learning-based methods have been applied to this problem. Until recently, few studies have succeeded in translating the tertiary-structure characteristics of proteins into features suitable for a learning mechanism that predicts DNA-binding proteins. This work presents a novel machine learning approach that uses HMMs (hidden Markov models) to express the characteristics of DNA-binding proteins in terms of both amino acid sequence and tertiary structure. Several helpful features of DNA-binding proteins are also used in the proposed method, such as residue composition, structure pattern composition, and the accessible surface area of residues. We also develop an SVM (support vector machine) based classifier to predict general DNA-binding proteins and obtain an accuracy of 88.45% under 5-fold cross-validation. Furthermore, a response-element-specific classifier is constructed for predicting response-element-specific DNA-binding proteins; it attains a precision of 96.57% with a recall of 88.83% on average. Finally, this high-accuracy classifier is employed to predict the DNA-binding proteins from MCF-7 that are likely to bind to estrogen response elements. Hung-Yu Kao 高宏宇 2008 學位論文 ; thesis 63 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Master's === National Cheng Kung University === Institute of Medical Informatics === 96 === In the post-genome era, protein domain structures have been published at a rapid pace. The mechanism of protein-DNA interaction, which is key to understanding cell function, is an important subject in recent bioinformatics research but has not been comprehensively studied. Several machine-learning-based methods have been applied to this problem. Until recently, few studies have succeeded in translating the tertiary-structure characteristics of proteins into features suitable for a learning mechanism that predicts DNA-binding proteins. This work presents a novel machine learning approach that uses HMMs (hidden Markov models) to express the characteristics of DNA-binding proteins in terms of both amino acid sequence and tertiary structure. Several helpful features of DNA-binding proteins are also used in the proposed method, such as residue composition, structure pattern composition, and the accessible surface area of residues. We also develop an SVM (support vector machine) based classifier to predict general DNA-binding proteins and obtain an accuracy of 88.45% under 5-fold cross-validation. Furthermore, a response-element-specific classifier is constructed for predicting response-element-specific DNA-binding proteins; it attains a precision of 96.57% with a recall of 88.83% on average. Finally, this high-accuracy classifier is employed to predict the DNA-binding proteins from MCF-7 that are likely to bind to estrogen response elements.
author2 Hung-Yu Kao
author_facet Hung-Yu Kao
Wei-Jhih Chen
陳韋志
author Wei-Jhih Chen
陳韋志
spellingShingle Wei-Jhih Chen
陳韋志
Hidden Markov Model Based DNA-binding Proteins Prediction by Mining on Sequence and Structure Information
author_sort Wei-Jhih Chen
title Hidden Markov Model Based DNA-binding Proteins Prediction by Mining on Sequence and Structure Information
title_short Hidden Markov Model Based DNA-binding Proteins Prediction by Mining on Sequence and Structure Information
title_full Hidden Markov Model Based DNA-binding Proteins Prediction by Mining on Sequence and Structure Information
title_fullStr Hidden Markov Model Based DNA-binding Proteins Prediction by Mining on Sequence and Structure Information
title_full_unstemmed Hidden Markov Model Based DNA-binding Proteins Prediction by Mining on Sequence and Structure Information
title_sort hidden markov model based dna-binding proteins prediction by mining on sequence and structure information
publishDate 2008
url http://ndltd.ncl.edu.tw/handle/20716134364431964671
work_keys_str_mv AT weijhihchen hiddenmarkovmodelbaseddnabindingproteinspredictionbyminingonsequenceandstructureinformation
AT chénwéizhì hiddenmarkovmodelbaseddnabindingproteinspredictionbyminingonsequenceandstructureinformation
AT weijhihchen lìyòngxùlièyǔjiégòuzīxùnzhīyǐncángshìmǎkěfūmóxíngzhīyǔqùyǎnghétánghésuānjiéhédànbáizhìyùcè
AT chénwéizhì lìyòngxùlièyǔjiégòuzīxùnzhīyǐncángshìmǎkěfūmóxíngzhīyǔqùyǎnghétánghésuānjiéhédànbáizhìyùcè
_version_ 1718134368818429952