Comparative proteome interrogation based on protein domain architecture

Doctoral === National Yang-Ming University === Institute of Biomedical Informatics === 99 === Domains are the functional units of protein composition. About 60% of prokaryotic proteins and 80% of eukaryotic proteins contain at least one domain. Proteins with the same domain architecture have been shown to be more likely to be derived fro...


Bibliographic Details
Main Authors: Ting-Wen Chen, 陳亭妏
Other Authors: Wen-Chang Lin
Format: Others
Language: en_US
Published: 2011
Online Access: http://ndltd.ncl.edu.tw/handle/48607053564896464907
id ndltd-TW-099YM005114006
record_format oai_dc
spelling ndltd-TW-099YM005114006 2015-10-13T20:37:07Z http://ndltd.ncl.edu.tw/handle/48607053564896464907 Comparative proteome interrogation based on protein domain architecture 蛋白質模組結構應用在比較蛋白質體之探討 Ting-Wen Chen 陳亭妏 Doctoral National Yang-Ming University Institute of Biomedical Informatics 99 Wen-Chang Lin 林文昌 2011 學位論文 ; thesis 96 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Doctoral === National Yang-Ming University === Institute of Biomedical Informatics === 99 === Domains are the functional units of protein composition. About 60% of prokaryotic proteins and 80% of eukaryotic proteins contain at least one domain. Proteins with the same domain architecture have been shown to be more likely to derive from a common ancestor than to arise by convergent evolution. Compared with primary sequences, domains and domain architectures are more directly related to protein function and are therefore more likely to be conserved through evolution. These properties make domain architecture informative for investigating the evolutionary history of proteins. Here, we exploit domain information to shed light on ortholog detection and protein function prediction. We implement an efficient pipeline named DODO (DOmain based Detection of Ortholog), which uses domain architecture information to cluster proteins into homolog groups and then identifies orthologs within those groups. DODO performs well when tested against several well-known ortholog databases, such as InParanoid and HomoloGene. Aided by domain information, DODO can detect distantly related orthologs even when their sequences have diverged and share low sequence similarity. In addition to ortholog detection, we investigated the distribution of domain architectures and domain usage in other eukaryotes and constructed a protein domain architecture database (proDAD), in which homologous proteins are clustered according to their domain architecture. In the database, these homologous proteins can be aligned, and the resulting alignments are shown to be useful for correcting the start-site annotation of proteins. Finally, we construct a VIrus Protein domain DataBase (VIP DB) in which all domains on virus proteins are identified.
VIP DB aims to provide clues to protein function from protein domains and integrates information from domain GO annotation, domain-domain interaction, and KEGG pathways based on those domains. With the advance of high-throughput sequencing technologies, more and more genomes are being sequenced. Efficient methods to identify the orthologs of newly sequenced proteins and to infer the functions of those proteins would be beneficial. Our work on using domain information to identify proteins with a common ancestor or shared function makes an important contribution to post-sequencing analysis.
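The abstract describes clustering proteins into homolog groups by domain architecture. The following is only a minimal illustrative sketch of that idea, not the DODO pipeline itself: it assumes each protein is already annotated with an ordered list of domain names, and it treats an exact match of that ordered list as "the same architecture". The function name `group_by_architecture` and all protein and domain labels in the toy input are hypothetical.

```python
from collections import defaultdict


def group_by_architecture(protein_domains):
    """Group protein IDs by their ordered tuple of domain names.

    protein_domains: dict mapping a protein ID to a list of domain names,
    ordered from N-terminus to C-terminus (an assumption of this sketch).
    Proteins with no annotated domain are skipped, since they have no
    architecture to compare.
    """
    groups = defaultdict(list)
    for protein_id, domains in protein_domains.items():
        if not domains:
            continue  # no domain annotation, so no architecture key
        groups[tuple(domains)].append(protein_id)
    return dict(groups)


# Hypothetical toy input, not data from the thesis.
example = {
    "human_P1": ["SH3", "SH2", "Pkinase"],
    "mouse_P1": ["SH3", "SH2", "Pkinase"],
    "yeast_P2": ["Pkinase"],
    "fly_P3": [],
}
groups = group_by_architecture(example)
for architecture, members in groups.items():
    print(" + ".join(architecture), "->", members)
```

Each resulting group is only a candidate homolog cluster; identifying orthologs within a group would need a further step that the abstract mentions but does not specify.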
author2 Wen-Chang Lin
author_facet Wen-Chang Lin
Ting-Wen Chen
陳亭妏
author Ting-Wen Chen
陳亭妏
spellingShingle Ting-Wen Chen
陳亭妏
Comparative proteome interrogation based on protein domain architecture
author_sort Ting-Wen Chen
title Comparative proteome interrogation based on protein domain architecture
publishDate 2011
url http://ndltd.ncl.edu.tw/handle/48607053564896464907