Clustering Algorithms on Protein Interaction Networks

Bibliographic Details
Main Authors: Ma, Cheng-Yu, 馬誠佑
Other Authors: Tang, Chuan Yi
Format: Others
Language: en_US
Published: 2016
Online Access: http://ndltd.ncl.edu.tw/handle/nvcv2s
Description
Summary: Ph.D. === National Tsing Hua University === Department of Computer Science === 104 === Thanks to high-throughput sequencing and yeast two-hybrid techniques, more and more biological network data are now available, such as protein-protein interaction (PPI) and metabolic network data. Such data can help scientists study biological phenomena at the systems level. However, extracting meaningful information from this huge amount of data is a considerable challenge, so developing clustering algorithms is very important in systems biology. In this dissertation, we present our studies on clustering algorithms from both the inter-species and the intra-species perspective. Network alignment is one of the most important inter-species clustering techniques in systems biology; it groups functionally similar proteins from the networks of different species based on both sequence similarity and topological similarity. As more and more network alignment algorithms are published, we develop an efficient network alignment booster that can refine alignment results from any source at low cost. On the other hand, discovering protein complexes in a PPI network concentrates on clustering proteins that are highly connected to each other within a single network. In recent years, many approaches have been developed to solve this problem, but edge loss in PPI networks remains an inherent limitation. To overcome this limitation, we develop a new algorithm that combines multiple network alignment with a new, efficient connectivity measure, NECC. We also apply a global multiple network alignment algorithm to the metabolic networks of bacteria to reconstruct the phyletic relationships among them and to separate genetically similar species into different groups based on their metabolic behavior. Furthermore, we try to identify the differences in metabolic pathways between closely related species.
Overall, we demonstrate the effectiveness of each of the proposed clustering algorithms, which also reveal important biological findings in systems biology.