Comparative proteome interrogation based on protein domain architecture

Doctoral === National Yang-Ming University === Institute of Biomedical Informatics === 99 === Domains are the functional units of protein composition. About 60% of prokaryotic proteins and 80% of eukaryotic proteins contain at least one domain. Proteins with the same domain architecture have been shown to be more likely to be derived fro...


Bibliographic Details
Main Authors:	Ting-Wen Chen, 陳亭妏
Other Authors:	Wen-Chang Lin
Format:	Others
Language:	en_US
Published:	2011
Online Access:	http://ndltd.ncl.edu.tw/handle/48607053564896464907

id	ndltd-TW-099YM005114006
record_format	oai_dc
spelling	ndltd-TW-099YM005114006 2015-10-13T20:37:07Z http://ndltd.ncl.edu.tw/handle/48607053564896464907 Comparative proteome interrogation based on protein domain architecture 蛋白質模組結構應用在比較蛋白質體之探討 Ting-Wen Chen 陳亭妏 Doctoral National Yang-Ming University Institute of Biomedical Informatics 99 Wen-Chang Lin 林文昌 2011 thesis 96 en_US
collection	NDLTD
language	en_US
format	Others
sources	NDLTD
description	Doctoral === National Yang-Ming University === Institute of Biomedical Informatics === 99 === Domains are the functional units of protein composition. About 60% of prokaryotic proteins and 80% of eukaryotic proteins contain at least one domain. Proteins with the same domain architecture have been shown to be more likely to derive from a common ancestor than to arise through convergent evolution. Compared with primary sequences, domains and domain architectures are more directly related to protein function and are therefore more likely to be conserved through evolution. These properties make domain architecture informative for investigating the evolutionary history of proteins. Here, we exploit domain information to shed light on ortholog detection and protein function prediction. We implement an efficient pipeline named DODO (DOmain based Detection of Ortholog), which uses domain architecture information to cluster proteins into homolog groups and then identifies orthologs within those groups. DODO performs well when tested against several well-known ortholog databases, such as InParanoid and HomoloGene. Aided by domain information, DODO can detect distantly related orthologs even when their sequences have diverged and share low sequence similarity. In addition to ortholog detection, we investigated the distribution of domain architectures and domain usage in other eukaryotes and constructed a protein domain architecture database (proDAD) in which homologous proteins are clustered according to their domain architecture. In the database, these homologous proteins can be aligned with one another, and the resulting alignments are shown to be useful for correcting the start-site annotation of proteins. Finally, we construct a VIrus Protein domain DataBase (VIP DB) in which all domains on virus proteins are identified.
VIP DB aims to provide clues to protein function from protein domains and integrates information from domain GO annotation, domain-domain interactions, and KEGG pathways based on those domains. With advances in high-throughput sequencing technologies, more and more genomes are being sequenced. Efficient methods for identifying the orthologs of newly sequenced proteins and for determining the functions of those proteins would be beneficial. Our work, which uses domain information to identify proteins that share a common ancestor and to predict protein function, makes an important contribution to post-sequencing analysis.
author2	Wen-Chang Lin
author_facet	Wen-Chang Lin Ting-Wen Chen 陳亭妏
author	Ting-Wen Chen 陳亭妏
spellingShingle	Ting-Wen Chen 陳亭妏 Comparative proteome interrogation based on protein domain architecture
author_sort	Ting-Wen Chen
title	Comparative proteome interrogation based on protein domain architecture
title_sort	comparative proteome interrogation based on protein domain architecture
publishDate	2011
url	http://ndltd.ncl.edu.tw/handle/48607053564896464907
work_keys_str_mv	AT tingwenchen comparativeproteomeinterrogationbasedonproteindomainarchitecture AT chéntíngwèn comparativeproteomeinterrogationbasedonproteindomainarchitecture AT tingwenchen dànbáizhìmózǔjiégòuyīngyòngzàibǐjiàodànbáizhìtǐzhītàntǎo AT chéntíngwèn dànbáizhìmózǔjiégòuyīngyòngzàibǐjiàodànbáizhìtǐzhītàntǎo
_version_	1718049089244889088
