Prediction of Paired Binding Regions in Protein-Protein Interactions by Sequential Pattern Mining

Master's === National Taiwan University === Graduate Institute of Bio-Industrial Mechatronics Engineering === 96 === Abstract Recent advances in genome sequencing have made a huge amount of sequence information accessible. This raises the challenge of detecting the interface residues that participate in protein-protein interactions directly from the primary structure...

Bibliographic Details
Main Authors:	Chien-Chieh Lin, 林千捷
Other Authors:	陳倩瑜
Format:	Others
Language:	en_US
Published:	2008
Online Access:	http://ndltd.ncl.edu.tw/handle/14615197860443997117

id	ndltd-TW-096NTU05415026
record_format	oai_dc
spelling	ndltd-TW-096NTU05415026 2015-11-25T04:04:36Z http://ndltd.ncl.edu.tw/handle/14615197860443997117 Prediction of Paired Binding Regions in Protein-Protein Interactions by Sequential Pattern Mining 利用序列特徵探勘預測蛋白質-蛋白質互動鍵結區之配對 Chien-Chieh Lin 林千捷 Master's, National Taiwan University, Graduate Institute of Bio-Industrial Mechatronics Engineering 96 陳倩瑜 2008 thesis 45 en_US
collection	NDLTD
language	en_US
format	Others
sources	NDLTD
description	Master's === National Taiwan University === Graduate Institute of Bio-Industrial Mechatronics Engineering === 96 === Abstract Recent advances in genome sequencing have made a huge amount of sequence information accessible. This raises the challenge of detecting the interface residues that participate in protein-protein interactions directly from the primary structures, that is, the amino acid sequences. To address this problem, we propose a two-phase pattern mining method that predicts the interacting regions of a pair of proteins known to interact physically, based on the co-occurrence of residues found in a set of concatenated protein homologues. Once valid training data are prepared, the interacting regions can potentially be recognized from the patterns that span the two proteins. In this thesis, we apply the proposed approach to 41 protein pairs from three different data sets. The performance of the method is evaluated by calculating the distance between the predicted paired interacting regions from different protein chains in existing structural complexes. In summary, we predicted 128 conserved regions in the first phase of mining, of which 60 find potential partners among the patterns derived in the second phase. Thirty-three of the predicted interacting pairs are found to be within 10 Å in available complexes, resulting in an accuracy of 56% (33/60). If we trust only the mining results from protein pairs with similar evolution rates, our method delivers an accuracy of 72% (24/33). This reveals the potential of our method and suggests that incorporating other useful information to refine the current predictions deserves further study.
author2	陳倩瑜
author_facet	陳倩瑜 Chien-Chieh Lin 林千捷
author	Chien-Chieh Lin 林千捷
spellingShingle	Chien-Chieh Lin 林千捷 Prediction of Paired Binding Regions in Protein-Protein Interactions by Sequential Pattern Mining
author_sort	Chien-Chieh Lin
title	Prediction of Paired Binding Regions in Protein-Protein Interactions by Sequential Pattern Mining
title_sort	prediction of paired binding regions in protein-protein interactions by sequential pattern mining
publishDate	2008
url	http://ndltd.ncl.edu.tw/handle/14615197860443997117
_version_	1718135312127885312
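
The evaluation step described in the abstract (checking whether a predicted pair of regions lies within 10 Å in a known complex, then reporting hits over predictions) can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the abstract does not say how the distance between two regions is defined, so the sketch assumes the minimum distance between any two representative points (for example, one coordinate per residue). The function names and the toy coordinates are hypothetical.

```python
import math

CUTOFF_ANGSTROM = 10.0  # distance threshold quoted in the abstract


def min_distance(region_a, region_b):
    """Smallest Euclidean distance between any point of region_a and any point of region_b.

    Assumption: a region is a list of (x, y, z) coordinates in Angstroms,
    one per residue. The thesis may define region distance differently.
    """
    return min(math.dist(p, q) for p in region_a for q in region_b)


def is_contact(region_a, region_b, cutoff=CUTOFF_ANGSTROM):
    """True if the two regions come within `cutoff` Angstroms of each other."""
    return min_distance(region_a, region_b) <= cutoff


def accuracy(predicted_pairs, cutoff=CUTOFF_ANGSTROM):
    """Return (hits, total, hits / total) over a list of predicted region pairs."""
    hits = sum(is_contact(a, b, cutoff) for a, b in predicted_pairs)
    total = len(predicted_pairs)
    return hits, total, hits / total


# Toy data (made up for illustration): two predicted region pairs.
pairs = [
    # Closest points are 6 A apart, so this pair counts as a hit.
    ([(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)], [(0.0, 6.0, 0.0)]),
    # Regions are 20 A apart, so this pair counts as a miss.
    ([(0.0, 0.0, 0.0)], [(20.0, 0.0, 0.0)]),
]
hits, total, acc = accuracy(pairs)
print(f"{hits}/{total} predicted pairs within {CUTOFF_ANGSTROM} A (accuracy {acc:.2f})")
```

With real data, the coordinates would come from the chains of a solved structure complex, and the reported figure is simply the ratio of hits to predicted pairs, as in the abstract's 33/60 and 24/33 counts.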
