Prediction of Paired Binding Regions in Protein-Protein Interactions by Sequential Pattern Mining
Master's === National Taiwan University === Graduate Institute of Bio-Industrial Mechatronics Engineering === 96 === Abstract Recent advances in fully sequenced genomes have provided a huge amount of accessible sequence information. This raises the great challenge of detecting the interface residues participating in protein-protein interactions directly from the primary structure...
Main Authors: | Chien-Chieh Lin, 林千捷 |
---|---|
Other Authors: | 陳倩瑜 |
Format: | Others |
Language: | en_US |
Published: | 2008 |
Online Access: | http://ndltd.ncl.edu.tw/handle/14615197860443997117 |
id |
ndltd-TW-096NTU05415026 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-096NTU05415026 2015-11-25T04:04:36Z http://ndltd.ncl.edu.tw/handle/14615197860443997117 Prediction of Paired Binding Regions in Protein-Protein Interactions by Sequential Pattern Mining 利用序列特徵探勘預測蛋白質-蛋白質互動鍵結區之配對 Chien-Chieh Lin 林千捷 Master's National Taiwan University Graduate Institute of Bio-Industrial Mechatronics Engineering 96 Abstract Recent advances in fully sequenced genomes have provided a huge amount of accessible sequence information. This raises the great challenge of detecting the interface residues that participate in protein-protein interactions directly from the primary structures, that is, the amino acid sequences. To address this problem, we propose a two-phase pattern mining method that predicts the interacting regions of a pair of proteins known to interact physically, based on the co-occurrence of residues found in a set of concatenated protein homologues. Once valid training data have been prepared, the interacting regions can potentially be recognized from the patterns that span the two proteins. In this thesis, we apply the proposed approach to 41 protein pairs from three different data sets. The performance of the proposed method is evaluated by calculating the distance between the predicted paired interacting regions on different protein chains in existing structural complexes. In summary, we predicted 128 conserved regions in the first phase of mining, of which 60 found potential partners among the patterns derived in the second phase. Thirty-three of the predicted interacting pairs are within 10 Å of each other in available complexes, resulting in an accuracy of 56% (33/60). If we trust only the mining results from protein pairs with similar evolution rates, our method delivers an accuracy of 72% (24/33). This reveals the potential of our method and suggests that incorporating other useful information to refine the current predictions deserves further study. 陳倩瑜 2008 thesis 45 en_US |
collection |
NDLTD |
language |
en_US |
format |
Others |
sources |
NDLTD |
description |
Master's === National Taiwan University === Graduate Institute of Bio-Industrial Mechatronics Engineering === 96 === Abstract
Recent advances in fully sequenced genomes have provided a huge amount of accessible sequence information. This raises the great challenge of detecting the interface residues that participate in protein-protein interactions directly from the primary structures, that is, the amino acid sequences. To address this problem, we propose a two-phase pattern mining method that predicts the interacting regions of a pair of proteins known to interact physically, based on the co-occurrence of residues found in a set of concatenated protein homologues. Once valid training data have been prepared, the interacting regions can potentially be recognized from the patterns that span the two proteins. In this thesis, we apply the proposed approach to 41 protein pairs from three different data sets. The performance of the proposed method is evaluated by calculating the distance between the predicted paired interacting regions on different protein chains in existing structural complexes. In summary, we predicted 128 conserved regions in the first phase of mining, of which 60 found potential partners among the patterns derived in the second phase. Thirty-three of the predicted interacting pairs are within 10 Å of each other in available complexes, resulting in an accuracy of 56% (33/60). If we trust only the mining results from protein pairs with similar evolution rates, our method delivers an accuracy of 72% (24/33). This reveals the potential of our method and suggests that incorporating other useful information to refine the current predictions deserves further study.
|
author2 |
陳倩瑜 |
author_facet |
陳倩瑜 Chien-Chieh Lin 林千捷 |
author |
Chien-Chieh Lin 林千捷 |
author_sort |
Chien-Chieh Lin |
title |
Prediction of Paired Binding Regions in Protein-Protein Interactions by Sequential Pattern Mining |
publishDate |
2008 |
url |
http://ndltd.ncl.edu.tw/handle/14615197860443997117 |