Clustering Algorithms on Protein Interaction Networks

Bibliographic Details
Main Authors: Ma, Cheng-Yu, 馬誠佑
Other Authors: Tang, Chuan Yi
Format: Others
Language: en_US
Published: 2016
Online Access: http://ndltd.ncl.edu.tw/handle/nvcv2s
id ndltd-TW-104NTHU5392086
record_format oai_dc
spelling ndltd-TW-104NTHU5392086 2019-05-15T23:00:46Z http://ndltd.ncl.edu.tw/handle/nvcv2s Clustering Algorithms on Protein Interaction Networks 分群演算法在蛋白質交互作用網路之應用 Ma, Cheng-Yu 馬誠佑 PhD, National Tsing Hua University, Department of Computer Science, 104 Tang, Chuan Yi Liao, Chung-Shou 唐傳義 廖崇碩 2016 thesis 127 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description PhD === National Tsing Hua University === Department of Computer Science === 104 === Nowadays, thanks to high-throughput sequencing and yeast two-hybrid techniques, more and more biological network data are available, such as protein-protein interaction (PPI) and metabolic network data. Such data can help scientists study biological phenomena at the systems level. However, extracting meaningful information from this huge amount of data is quite a challenge. Thus, developing clustering algorithms is very important in systems biology. In this dissertation, we introduce our studies on clustering algorithms from both the inter-species and the intra-species perspective. Network alignment is one of the most important inter-species clustering techniques in systems biology; it focuses on grouping functionally similar proteins across different species' networks based on not only sequence similarity but also topological similarity. As more and more network alignment algorithms have been published, we develop an efficient network alignment booster that can refine alignment results from any source at low cost. On the other hand, discovering protein complexes in a PPI network concentrates on clustering proteins that are highly connected to each other within a single network. In recent years, many approaches have been developed to solve this problem, but edge loss in PPI networks remains an inherent limitation. To address it, we develop a new algorithm that combines multiple network alignment with a new efficient connectivity measurement, NECC. We also apply a global multiple network alignment algorithm to the metabolic networks of bacteria, reconstruct the phyletic relationships between them, and separate genetically similar species into different groups based on their metabolic behavior. Furthermore, we try to identify the dissimilarities in metabolic pathways between closely related species. All in all, we demonstrate the effectiveness of each of the proposed clustering algorithms, which also reveal important biological findings in systems biology.
author2 Tang, Chuan Yi
author_facet Tang, Chuan Yi
Ma, Cheng-Yu
馬誠佑
author Ma, Cheng-Yu
馬誠佑
title Clustering Algorithms on Protein Interaction Networks
publishDate 2016
url http://ndltd.ncl.edu.tw/handle/nvcv2s