Not seeing the forest for the trees: size of the minimum spanning trees (MSTs) forest and branch significance in MST-based phylogenetic analysis.

Trees, including minimum spanning trees (MSTs), are commonly used in phylogenetic studies. But, for the research community, it may be unclear that the presented tree is just a hypothesis, chosen from among many possible alternatives. In this scenario, it is important to quantify our confidence in bo...

Bibliographic Details
Main Authors: Andreia Sofia Teixeira, Pedro T Monteiro, João A Carriço, Mário Ramirez, Alexandre P Francisco
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2015-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC4370493?pdf=render
id doaj-24e4a165dc124fa2a7c2f848a47f0377
record_format Article
spelling doaj-24e4a165dc124fa2a7c2f848a47f0377; 2020-11-25T00:59:37Z; eng; Public Library of Science (PLoS); PLoS ONE; 1932-6203; 2015-01-01; 10; 3; e0119315; 10.1371/journal.pone.0119315
collection DOAJ
language English
format Article
sources DOAJ
author Andreia Sofia Teixeira
Pedro T Monteiro
João A Carriço
Mário Ramirez
Alexandre P Francisco
title Not seeing the forest for the trees: size of the minimum spanning trees (MSTs) forest and branch significance in MST-based phylogenetic analysis.
publisher Public Library of Science (PLoS)
series PLoS ONE
issn 1932-6203
publishDate 2015-01-01
description Trees, including minimum spanning trees (MSTs), are commonly used in phylogenetic studies. However, it may be unclear to the research community that the presented tree is just a hypothesis, chosen from among many possible alternatives. In this scenario, it is important to quantify our confidence in both the trees and the branches/edges included in such trees. In this paper, we address this problem for MSTs by introducing a new edge betweenness metric for undirected and weighted graphs. This spanning edge betweenness metric is defined as the fraction of equivalent MSTs in which a given edge is present. The metric provides a per-edge statistic that is similar to that of the bootstrap approach frequently used in phylogenetics to support the grouping of taxa. We provide methods for the exact computation of this metric based on the well-known Kirchhoff's matrix tree theorem. Moreover, we implement and make available a module for the PHYLOViZ software and evaluate the proposed metric with respect to both effectiveness and computational performance. Analysis of trees generated using multilocus sequence typing (MLST) data and the goeBURST algorithm revealed that the space of possible MSTs in real data sets is extremely large. Selecting the edge to be represented using bootstrap could lead to unreliable results, since alternative edges are present in the same fraction of equivalent MSTs. The choice of the MST to be presented results from criteria implemented in the algorithm, which must be based on biologically plausible models.
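The definition in the description (the fraction of equivalent MSTs that contain a given edge) can be illustrated with a small Python sketch. It is not the PHYLOViZ module and is not claimed to be the authors' algorithm. It assumes one standard route: process edges weight class by weight class, contract the components joined by lighter edges, and count spanning trees of each contracted multigraph with Kirchhoff's matrix tree theorem. The function names (`det`, `count_spanning_trees`, `spanning_edge_betweenness`) and the `(u, v, weight)` edge tuples are hypothetical, and the input is assumed to be an undirected graph with no repeated edge tuples.

```python
from fractions import Fraction
from itertools import groupby


def det(matrix):
    """Exact determinant by Gaussian elimination over Fractions."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return result


def count_spanning_trees(nodes, pairs):
    """Matrix tree theorem: any cofactor of the Laplacian of a multigraph."""
    index = {v: i for i, v in enumerate(nodes)}
    n = len(nodes)
    lap = [[0] * n for _ in range(n)]
    for u, v in pairs:
        if u == v:
            continue  # self-loops never belong to a spanning tree
        i, j = index[u], index[v]
        lap[i][i] += 1
        lap[j][j] += 1
        lap[i][j] -= 1
        lap[j][i] -= 1
    minor = [row[:-1] for row in lap[:-1]]  # drop last row and column
    return int(det(minor))


def find(parent, x):
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def spanning_edge_betweenness(edges):
    """Map each (u, v, weight) edge to the fraction of MSTs containing it."""
    parent = {}
    for u, v, _ in edges:
        parent.setdefault(u, u)
        parent.setdefault(v, v)
    seb = {}
    by_weight = sorted(edges, key=lambda e: e[2])
    for _, group in groupby(by_weight, key=lambda e: e[2]):
        group = list(group)
        # Endpoints after contracting everything joined by lighter edges.
        contracted = [(find(parent, u), find(parent, v)) for u, v, _ in group]
        for a, b in contracted:
            ra, rb = find(parent, a), find(parent, b)
            if ra != rb:
                parent[ra] = rb
        # Group this weight class by connected component of the contraction.
        components = {}
        for edge, (a, b) in zip(group, contracted):
            if a == b:
                seb[edge] = Fraction(0)  # closes a cycle of lighter edges
            else:
                components.setdefault(find(parent, a), []).append((edge, (a, b)))
        for members in components.values():
            pairs = [p for _, p in members]
            nodes = list(dict.fromkeys(x for p in pairs for x in p))
            total = count_spanning_trees(nodes, pairs)
            for k, (edge, _) in enumerate(members):
                without = count_spanning_trees(nodes, pairs[:k] + pairs[k + 1:])
                seb[edge] = 1 - Fraction(without, total)
    return seb


# Toy example: an equal-weight triangle plus one heavier pendant edge.
example = [("a", "b", 1), ("b", "c", 1), ("a", "c", 1), ("c", "d", 2)]
scores = spanning_edge_betweenness(example)
```

In this toy graph each triangle edge appears in two of the three equivalent MSTs, while the pendant edge appears in all of them. For a connected graph the scores sum to the number of vertices minus one, which gives a quick sanity check. Exact rational arithmetic is used here for clarity; it would not scale to the large MST spaces the description reports for real data sets.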
url http://europepmc.org/articles/PMC4370493?pdf=render