Detecting coevolution in and among protein domains.

Bibliographic Details
Main Authors: Chen-Hsiang Yeang, David Haussler
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2007-11-01
Series: PLoS Computational Biology
Online Access: http://europepmc.org/articles/PMC2098842?pdf=render
id doaj-fd8e13845c8847d4a2a73119e630d882
record_format Article
spelling doaj-fd8e13845c8847d4a2a73119e630d882; 2020-11-24T21:49:06Z; eng; Public Library of Science (PLoS); PLoS Computational Biology; 1553-734X; 1553-7358; 2007-11-01; 3; 11; e211; 10.1371/journal.pcbi.0030211; Detecting coevolution in and among protein domains.; Chen-Hsiang Yeang; David Haussler; http://europepmc.org/articles/PMC2098842?pdf=render
collection DOAJ
language English
format Article
sources DOAJ
author Chen-Hsiang Yeang
David Haussler
publisher Public Library of Science (PLoS)
series PLoS Computational Biology
issn 1553-734X
1553-7358
publishDate 2007-11-01
description Correlated changes of nucleic or amino acids have provided strong information about the structures and interactions of molecules. Despite the rich literature in coevolutionary sequence analysis, previous methods often have to trade off between generality, simplicity, phylogenetic information, and specific knowledge about interactions. Furthermore, despite the evidence of coevolution in selected protein families, a comprehensive screening of coevolution among all protein domains is still lacking. We propose an augmented continuous-time Markov process model for sequence coevolution. The model handles different types of interactions, incorporates phylogenetic information and sequence substitution, has only one extra free parameter, and requires no knowledge about interaction rules. We apply this model to large-scale screenings of the entire protein domain database (Pfam). Strikingly, with 0.1 trillion tests executed, the majority of the inferred coevolving protein domains are functionally related, and the coevolving amino acid residues are spatially coupled. Moreover, many of the coevolving positions are located at functionally important sites of proteins/protein complexes, such as the subunit linkers of superoxide dismutase, the tRNA binding sites of ribosomes, the DNA binding region of RNA polymerase, and the active and ligand binding sites of various enzymes. The results suggest that sequence coevolution manifests structural and functional constraints of proteins. The intricate relations between sequence coevolution and various selective constraints are worth pursuing at a deeper level.
url http://europepmc.org/articles/PMC2098842?pdf=render