A configuration space of homologous proteins conserving mutual information and allowing a phylogeny inference based on pair-wise Z-score probabilities

Abstract. Background: Popular methods to reconstruct molecular phylogenies are based on multiple sequence alignments, in which addition or removal of data may change the resulting tree topology. We have sought a representation of homologous proteins that...

Bibliographic Details
Main Authors: Maréchal Eric, Ortet Philippe, Roy Sylvaine, Bastien Olivier
Format: Article
Language: English
Published: BMC 2005-03-01
Series: BMC Bioinformatics
Online Access: http://www.biomedcentral.com/1471-2105/6/49
id doaj-c46367067dc94897b416bff55fb171c4
record_format Article
spelling doaj-c46367067dc94897b416bff55fb171c4 2020-11-25T01:56:12Z eng BMC BMC Bioinformatics 1471-2105 2005-03-01 6 1 49 10.1186/1471-2105-6-49 A configuration space of homologous proteins conserving mutual information and allowing a phylogeny inference based on pair-wise Z-score probabilities Maréchal Eric Ortet Philippe Roy Sylvaine Bastien Olivier http://www.biomedcentral.com/1471-2105/6/49
collection DOAJ
language English
format Article
sources DOAJ
author Maréchal Eric
Ortet Philippe
Roy Sylvaine
Bastien Olivier
title A configuration space of homologous proteins conserving mutual information and allowing a phylogeny inference based on pair-wise Z-score probabilities
publisher BMC
series BMC Bioinformatics
issn 1471-2105
publishDate 2005-03-01
description Background: Popular methods to reconstruct molecular phylogenies are based on multiple sequence alignments, in which addition or removal of data may change the resulting tree topology. We have sought a representation of homologous proteins that would conserve the information of pair-wise sequence alignments, respect probabilistic properties of Z-scores (Monte Carlo methods applied to pair-wise comparisons) and be the basis for a novel method of consistent and stable phylogenetic reconstruction.
Results: We have built up a spatial representation of protein sequences using concepts from particle physics (configuration space) and respecting a frame of constraints deduced from pair-wise alignment score properties in information theory. The obtained configuration space of homologous proteins (CSHP) allows the representation of real and shuffled sequences, and thereupon an expression of the TULIP theorem for Z-score probabilities. Based on the CSHP, we propose a phylogeny reconstruction using Z-scores. Deduced trees, called TULIP trees, are consistent with multiple-alignment based trees. Furthermore, the TULIP tree reconstruction method provides a solution for some previously reported incongruent results, such as the apicomplexan enolase phylogeny.
Conclusion: The CSHP is a unified model that conserves mutual information between proteins in the way physical models conserve energy. Applications include the reconstruction of evolutionarily consistent and robust trees, the topology of which is based on a spatial representation that is not reordered after addition or removal of sequences. The CSHP and its assigned phylogenetic topology provide a powerful and easily updated representation for massive pair-wise genome comparisons based on Z-score computations.
url http://www.biomedcentral.com/1471-2105/6/49