Quantitative sequence-function relationships in proteins based on gene ontology

Abstract. Background: The relationship between divergence of amino-acid sequence and divergence of function among homologous proteins is complex. The assumption that homologs share function – the basis of transfer of annotations in databases – must theref...

Bibliographic Details
Main Authors: Lesk Arthur M, Altman Naomi, Blankenberg Daniel J, Sangar Vineet
Format: Article
Language: English
Published: BMC 2007-08-01
Series: BMC Bioinformatics
Online Access: http://www.biomedcentral.com/1471-2105/8/294
id doaj-381c5e6e0adb4ecc979ef59b86c6c05e
record_format Article
spelling doaj-381c5e6e0adb4ecc979ef59b86c6c05e; 2020-11-24T22:57:24Z; eng; BMC; BMC Bioinformatics; 1471-2105; 2007-08-01; 8; 1; 294; 10.1186/1471-2105-8-294; Quantitative sequence-function relationships in proteins based on gene ontology; Lesk Arthur M; Altman Naomi; Blankenberg Daniel J; Sangar Vineet; http://www.biomedcentral.com/1471-2105/8/294
collection DOAJ
language English
format Article
sources DOAJ
author Lesk Arthur M
Altman Naomi
Blankenberg Daniel J
Sangar Vineet
title Quantitative sequence-function relationships in proteins based on gene ontology
publisher BMC
series BMC Bioinformatics
issn 1471-2105
publishDate 2007-08-01
description Abstract
Background: The relationship between divergence of amino-acid sequence and divergence of function among homologous proteins is complex. The assumption that homologs share function – the basis of transfer of annotations in databases – must therefore be regarded with caution. Here, we present a quantitative study of sequence and function divergence, based on the Gene Ontology classification of function. We determined the relationship between sequence divergence and function divergence in 6828 protein families from the PFAM database. Within families there is a broad range of sequence similarity, from very closely related proteins – for instance, orthologs in different mammals – to very distantly related proteins at the limit of reliable recognition of homology.
Results: We correlated the divergence in sequences determined from pairwise alignments, and the divergence in function determined by path lengths in the Gene Ontology graph, taking into account the fact that many proteins have multiple functions. Our results show that, among homologous proteins, the proportion of divergent functions decreases dramatically above a threshold of sequence similarity at about 50% residue identity. For proteins with more than 50% residue identity, transfer of annotation between homologs will lead to an erroneous attribution with a totally dissimilar function in fewer than 6% of cases. This means that for very similar proteins (about 50% identical residues) the chance of completely incorrect annotation is low; however, because of the phenomenon of recruitment, it is still non-zero.
Conclusion: Our results describe general features of the evolution of protein function, and serve as a guide to the reliability of annotation transfer, based on the closeness of the relationship between a new protein and its nearest annotated relative.
url http://www.biomedcentral.com/1471-2105/8/294