Hierarchical clustering of maximum parsimony reconciliations

Abstract. Background: Maximum parsimony reconciliation in the duplication-transfer-loss model is a widely used method for analyzing the evolutionary histories of pairs of entities such as hosts and parasites, symbiont species, and species and genes. While efficient algorithms are known for finding max...

Bibliographic Details
Main Authors: Ross Mawhorter, Ran Libeskind-Hadas
Format: Article
Language: English
Published: BMC 2019-11-01
Series: BMC Bioinformatics
Subjects: Phylogenetic trees; Maximum parsimony reconciliation; Duplication-transfer-loss model
Online Access: http://link.springer.com/article/10.1186/s12859-019-3223-5
id doaj-ed255015b9404334bbeaaa171573d532
record_format Article
spelling doaj-ed255015b9404334bbeaaa171573d532 2020-11-25T01:37:04Z eng BMC BMC Bioinformatics 1471-2105 2019-11-01 20 1 1 12 10.1186/s12859-019-3223-5 Hierarchical clustering of maximum parsimony reconciliations Ross Mawhorter (Department of Computer Science, Harvey Mudd College) Ran Libeskind-Hadas (Department of Computer Science, Harvey Mudd College) Abstract. Background: Maximum parsimony reconciliation in the duplication-transfer-loss model is a widely used method for analyzing the evolutionary histories of pairs of entities such as hosts and parasites, symbiont species, and species and genes. While efficient algorithms are known for finding maximum parsimony reconciliations, the number of such reconciliations can be exponential in the size of the trees. Since these reconciliations can differ substantially from one another, making inferences from any one reconciliation may lead to conclusions that are not supported, or may even be contradicted, by other maximum parsimony reconciliations. Therefore, there is a need to find small sets of best representative reconciliations when the space of solutions is large and diverse. Results: We provide a general framework for hierarchically clustering the space of maximum parsimony reconciliations. We demonstrate this framework for two specific linkage criteria, one that seeks to maximize the average support of the events found in the reconciliations in each cluster and the other that seeks to minimize the distance between reconciliations in each cluster. We analyze the asymptotic worst-case running times and provide experimental results that demonstrate the viability and utility of this approach. Conclusions: The hierarchical clustering method proposed here provides a new approach to finding a set of representative reconciliations in the potentially vast and diverse space of maximum parsimony reconciliations. http://link.springer.com/article/10.1186/s12859-019-3223-5 Phylogenetic trees; Maximum parsimony reconciliation; Duplication-transfer-loss model
collection DOAJ
language English
format Article
sources DOAJ
author Ross Mawhorter
Ran Libeskind-Hadas
spellingShingle Ross Mawhorter
Ran Libeskind-Hadas
Hierarchical clustering of maximum parsimony reconciliations
BMC Bioinformatics
Phylogenetic trees
Maximum parsimony reconciliation
Duplication-transfer-loss model
author_facet Ross Mawhorter
Ran Libeskind-Hadas
author_sort Ross Mawhorter
title Hierarchical clustering of maximum parsimony reconciliations
title_short Hierarchical clustering of maximum parsimony reconciliations
title_full Hierarchical clustering of maximum parsimony reconciliations
title_fullStr Hierarchical clustering of maximum parsimony reconciliations
title_full_unstemmed Hierarchical clustering of maximum parsimony reconciliations
title_sort hierarchical clustering of maximum parsimony reconciliations
publisher BMC
series BMC Bioinformatics
issn 1471-2105
publishDate 2019-11-01
description Abstract. Background: Maximum parsimony reconciliation in the duplication-transfer-loss model is a widely used method for analyzing the evolutionary histories of pairs of entities such as hosts and parasites, symbiont species, and species and genes. While efficient algorithms are known for finding maximum parsimony reconciliations, the number of such reconciliations can be exponential in the size of the trees. Since these reconciliations can differ substantially from one another, making inferences from any one reconciliation may lead to conclusions that are not supported, or may even be contradicted, by other maximum parsimony reconciliations. Therefore, there is a need to find small sets of best representative reconciliations when the space of solutions is large and diverse. Results: We provide a general framework for hierarchically clustering the space of maximum parsimony reconciliations. We demonstrate this framework for two specific linkage criteria, one that seeks to maximize the average support of the events found in the reconciliations in each cluster and the other that seeks to minimize the distance between reconciliations in each cluster. We analyze the asymptotic worst-case running times and provide experimental results that demonstrate the viability and utility of this approach. Conclusions: The hierarchical clustering method proposed here provides a new approach to finding a set of representative reconciliations in the potentially vast and diverse space of maximum parsimony reconciliations.
topic Phylogenetic trees
Maximum parsimony reconciliation
Duplication-transfer-loss model
url http://link.springer.com/article/10.1186/s12859-019-3223-5
work_keys_str_mv AT rossmawhorter hierarchicalclusteringofmaximumparsimonyreconciliations
AT ranlibeskindhadas hierarchicalclusteringofmaximumparsimonyreconciliations
_version_ 1725059901926932480
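
Illustrative note on the abstract: the distance-based linkage criterion it mentions can be pictured with a toy agglomerative clustering over explicitly listed reconciliations. The sketch below is only an illustration under stated assumptions, not the authors' algorithm: it assumes each reconciliation is a set of hypothetical (gene node, species node, event type) tuples, uses the size of the symmetric set difference as the distance, and enumerates reconciliations one by one, which would not scale to an exponentially large solution space. All function and variable names are invented for this example.

```python
from itertools import combinations


def distance(r1, r2):
    """Number of events found in exactly one of the two reconciliations."""
    return len(r1 ^ r2)


def avg_pairwise_distance(cluster):
    """Mean distance over all pairs in a cluster (0.0 for a singleton)."""
    pairs = list(combinations(cluster, 2))
    if not pairs:
        return 0.0
    return sum(distance(a, b) for a, b in pairs) / len(pairs)


def cluster_reconciliations(recons, k):
    """Greedily merge clusters until only k remain.

    At each step, merge the pair of clusters whose union has the
    smallest average pairwise distance (an assumed linkage criterion).
    """
    clusters = [[r] for r in recons]
    while len(clusters) > k:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: avg_pairwise_distance(
                clusters[ij[0]] + clusters[ij[1]]
            ),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return clusters


# Hypothetical reconciliations: sets of (gene node, species node, event).
r1 = frozenset({("g1", "s1", "cospeciation"), ("g2", "s2", "duplication")})
r2 = frozenset({("g1", "s1", "cospeciation"), ("g2", "s2", "transfer")})
r3 = frozenset({("g1", "s3", "transfer"), ("g2", "s4", "loss")})
r4 = frozenset({("g1", "s3", "transfer"), ("g2", "s4", "duplication")})

clusters = cluster_reconciliations([r1, r2, r3, r4], k=2)
for c in clusters:
    print(len(c), "reconciliations, average distance", avg_pairwise_distance(c))
```

In this toy input, r1 and r2 share one event and r3 and r4 share one event, so asking for two clusters groups them accordingly. The support-based criterion named in the abstract would replace `avg_pairwise_distance` with a different score; it is not sketched here.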