Hidden Markov Model Based DNA-binding Proteins Prediction by Mining on Sequence and Structure Information
Master's thesis === National Cheng Kung University === Institute of Medical Informatics === 96 === In the post-genome era, protein domain structures have been published at a rapid pace. The mechanism of protein-DNA interaction is central to understanding cell function; it is an important subject in recent bioinformatics research and has not been comprehensively studied....
Main Authors: | Wei-Jhih Chen, 陳韋志 |
---|---|
Other Authors: | Hung-Yu Kao |
Format: | Others |
Language: | en_US |
Published: | 2008 |
Online Access: | http://ndltd.ncl.edu.tw/handle/20716134364431964671 |
id |
ndltd-TW-096NCKU5674009 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-096NCKU5674009 2015-11-23T04:03:11Z http://ndltd.ncl.edu.tw/handle/20716134364431964671 Hidden Markov Model Based DNA-binding Proteins Prediction by Mining on Sequence and Structure Information 利用序列與結構資訊之隱藏式馬可夫模型之與去氧核醣核酸結合蛋白質預測 Wei-Jhih Chen 陳韋志 Master's thesis National Cheng Kung University Institute of Medical Informatics 96 In the post-genome era, protein domain structures have been published at a rapid pace. The mechanism of protein-DNA interaction is central to understanding cell function; it is an important subject in recent bioinformatics research and has not been comprehensively studied. Several machine-learning-based methods have been applied to this problem. Until recently, few studies have succeeded in translating the tertiary-structure characteristics of proteins into features that a learning algorithm can use to predict DNA-binding proteins. This work presents a novel machine learning approach that uses HMMs (hidden Markov models) to express the characteristics of DNA-binding proteins in terms of both amino acid sequence and tertiary structure. The proposed method also uses several additional features of DNA-binding proteins, such as residue composition, structure pattern composition, and the accessible surface area of residues. We also develop an SVM (support vector machine) based classifier to predict general DNA-binding proteins, which achieves an accuracy of 88.45% under 5-fold cross-validation. Furthermore, a response-element-specific classifier is constructed to predict response-element-specific DNA-binding proteins; it achieves a precision of 96.57% and a recall of 88.83% on average. Finally, this high-accuracy classifier is used to predict which DNA-binding proteins from MCF-7 are likely to bind to estrogen response elements. Hung-Yu Kao 高宏宇 2008 degree thesis ; thesis 63 en_US |
collection |
NDLTD |
language |
en_US |
format |
Others |
sources |
NDLTD |
description |
Master's thesis === National Cheng Kung University === Institute of Medical Informatics === 96 === In the post-genome era, protein domain structures have been published at a rapid pace. The mechanism of protein-DNA interaction is central to understanding cell function; it is an important subject in recent bioinformatics research and has not been comprehensively studied. Several machine-learning-based methods have been applied to this problem. Until recently, few studies have succeeded in translating the tertiary-structure characteristics of proteins into features that a learning algorithm can use to predict DNA-binding proteins. This work presents a novel machine learning approach that uses HMMs (hidden Markov models) to express the characteristics of DNA-binding proteins in terms of both amino acid sequence and tertiary structure. The proposed method also uses several additional features of DNA-binding proteins, such as residue composition, structure pattern composition, and the accessible surface area of residues. We also develop an SVM (support vector machine) based classifier to predict general DNA-binding proteins, which achieves an accuracy of 88.45% under 5-fold cross-validation. Furthermore, a response-element-specific classifier is constructed to predict response-element-specific DNA-binding proteins; it achieves a precision of 96.57% and a recall of 88.83% on average. Finally, this high-accuracy classifier is used to predict which DNA-binding proteins from MCF-7 are likely to bind to estrogen response elements.
|
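The evaluation terms used in the abstract (5-fold cross-validation, precision, recall) can be illustrated with a minimal sketch. This is not the thesis's code: the function names `k_fold_indices` and `precision_recall`, the contiguous fold layout, and the toy labels are all assumptions made for illustration, and the record does not say how the folds were actually drawn.

```python
from typing import List, Sequence, Tuple


def k_fold_indices(n_samples: int, k: int = 5) -> List[Tuple[List[int], List[int]]]:
    """Return k (train, test) index pairs over range(n_samples).

    Folds are contiguous and differ in size by at most one sample.
    (Hypothetical helper; real pipelines usually shuffle or stratify first.)
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    folds = []
    start = 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


def precision_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Compute precision and recall for binary labels (1 = DNA-binding)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


if __name__ == "__main__":
    # Toy labels, invented for illustration only.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
    y_pred = [1, 1, 1, 0, 0, 0, 0, 1, 1, 0]
    for train_idx, test_idx in k_fold_indices(len(y_true), k=5):
        print("train:", train_idx, "test:", test_idx)
    print("precision, recall:", precision_recall(y_true, y_pred))
```

In a cross-validated evaluation, a classifier would be trained on each `train` index set and scored on the matching `test` set, and the per-fold metrics would then be averaged.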
author2 |
Hung-Yu Kao |
author_facet |
Hung-Yu Kao Wei-Jhih Chen 陳韋志 |
author |
Wei-Jhih Chen 陳韋志 |
spellingShingle |
Wei-Jhih Chen 陳韋志 Hidden Markov Model Based DNA-binding Proteins Prediction by Mining on Sequence and Structure Information |
author_sort |
Wei-Jhih Chen |
title |
Hidden Markov Model Based DNA-binding Proteins Prediction by Mining on Sequence and Structure Information |
publishDate |
2008 |
url |
http://ndltd.ncl.edu.tw/handle/20716134364431964671 |
_version_ |
1718134368818429952 |