Prediction of Human Protein-Protein Interactions Using Support Vector Machines

Doctoral === National Taiwan University === Graduate Institute of Computer Science and Information Engineering === 96 === The recent increase in the use of high-throughput two-hybrid analysis has generated a large amount of data on protein interactions. Specifically, the availability of information about experimental protein-protein interactions and other protein features on the Inte...

Bibliographic Details
Main Authors: Tao-Wei Huang, 黃韜維
Other Authors: 高成炎
Format: Others
Language: en_US
Published: 2008
Online Access: http://ndltd.ncl.edu.tw/handle/60898753640594858720
id ndltd-TW-096NTU05392008
record_format oai_dc
spelling ndltd-TW-096NTU05392008 2015-10-13T14:04:51Z http://ndltd.ncl.edu.tw/handle/60898753640594858720 Prediction of Human Protein-Protein Interactions Using Support Vector Machines 應用支援向量機在人類蛋白質交互作用的預測 Tao-Wei Huang 黃韜維 Doctoral, National Taiwan University, Graduate Institute of Computer Science and Information Engineering, 96 高成炎 2008 Degree thesis ; thesis 71 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Doctoral === National Taiwan University === Graduate Institute of Computer Science and Information Engineering === 96 === The recent increase in the use of high-throughput two-hybrid analysis has generated a large amount of data on protein interactions. Specifically, the availability of experimental protein-protein interaction data and other protein features on the Internet enables human protein-protein interactions to be predicted computationally from co-evolution events (interologs). Computational methods must be developed to integrate these heterogeneous biological data so as to maximize the accuracy of human protein interaction prediction. In the knowledge-based part of this study, we propose a relative conservation score, obtained by identifying maximal quasi-cliques in protein interaction networks, and incorporate other interaction features to formulate a scoring method. The scoring method can be used to identify which protein pairs, among multiple candidate pairs, are the most likely to interact. The predicted human protein-protein interactions, each associated with a confidence score, are derived from six eukaryotic organisms: rat, mouse, fly, worm, thale cress, and baker's yeast. Evaluation of the proposed method using functional keyword and Gene Ontology annotations indicates that some confidence is justified in the accuracy of the predicted interactions. Comparisons with existing methods also show that the proposed method predicts human protein-protein interactions more accurately than other interolog-based methods. This study also considers protein interaction features, including interolog, spatial proximity (sub-cellular localization, tissue specificity), temporal synchronicity (the cell-cycle stage), and domain-domain pair combinations. These six protein features, together with combinations of the hydrophobicity, charge, and volume properties of amino acids encoded as three sets of 16-dimensional features, are used to construct committee models of support vector machines (SVMs). The final 5-fold cross-validation on 10 test sets of different sizes showed that a test-set accuracy above 90% can be obtained. Moreover, analytical comparisons suggest that the proposed method has higher accuracy than other SVM-based methods.
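The interolog idea mentioned in the description can be illustrated with a minimal Python sketch: an interaction observed between two proteins in a model organism is transferred to the human proteins that are their orthologs. This is only an illustration under stated assumptions, not the thesis's method. The function names, the input layout, and the protein identifiers are hypothetical, and the ranking simply counts supporting organisms; it is not the relative conservation score described above.

```python
from collections import defaultdict
from itertools import product


def predict_interologs(source_interactions, orthologs):
    """Transfer interactions from model organisms to human via orthology.

    source_interactions: dict organism -> iterable of (protein_a, protein_b)
    orthologs: dict organism -> dict source_protein -> set of human proteins
    Returns a dict mapping a sorted human protein pair (2-tuple) to the set
    of organisms whose observed interactions support that pair.
    (Hypothetical data layout, chosen for illustration only.)
    """
    support = defaultdict(set)
    for organism, pairs in source_interactions.items():
        mapping = orthologs.get(organism, {})
        for a, b in pairs:
            # Every combination of human orthologs of a and b is a candidate.
            for ha, hb in product(mapping.get(a, ()), mapping.get(b, ())):
                support[tuple(sorted((ha, hb)))].add(organism)
    return dict(support)


def rank_by_support(support):
    """Rank candidate pairs by the number of supporting organisms.

    A toy stand-in for a confidence score; ties are broken alphabetically.
    """
    return sorted(
        ((pair, len(organisms)) for pair, organisms in support.items()),
        key=lambda item: (-item[1], item[0]),
    )


if __name__ == "__main__":
    # Made-up identifiers: Y* = yeast, F* = fly, H* = human.
    interactions = {
        "yeast": [("Y1", "Y2")],
        "fly": [("F1", "F2"), ("F3", "F4")],
    }
    ortholog_map = {
        "yeast": {"Y1": {"H1"}, "Y2": {"H2"}},
        "fly": {"F1": {"H2"}, "F2": {"H1"}, "F3": {"H3"}, "F4": {"H4"}},
    }
    for pair, count in rank_by_support(
        predict_interologs(interactions, ortholog_map)
    ):
        print(pair, count)
```

Storing each pair as a sorted tuple treats interactions as undirected, so evidence for (A, B) and (B, A) from different organisms accumulates under one key.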
author2 高成炎
author_facet 高成炎
Tao-Wei Huang
黃韜維
author Tao-Wei Huang
黃韜維
title Prediction of Human Protein-Protein Interactions Using Support Vector Machines
publishDate 2008
url http://ndltd.ncl.edu.tw/handle/60898753640594858720