MPBoot: fast phylogenetic maximum parsimony tree inference and bootstrap approximation

Abstract Background The nonparametric bootstrap is widely used to measure the branch support of phylogenetic trees. However, bootstrapping is computationally expensive and remains a bottleneck in phylogenetic analyses. Recently, an ultrafast bootstrap approximation (UFBoot) approach was proposed for...

Bibliographic Details
Main Authors: Diep Thi Hoang, Le Sy Vinh, Tomáš Flouri, Alexandros Stamatakis, Arndt von Haeseler, Bui Quang Minh
Format: Article
Language: English
Published: BMC 2018-02-01
Series: BMC Evolutionary Biology
Subjects: Phylogenetic inference, Nonparametric bootstrap, Maximum parsimony
Online Access: http://link.springer.com/article/10.1186/s12862-018-1131-3
id doaj-d9852376e688479a8dfc04becf7505e6
record_format Article
spelling doaj-d9852376e688479a8dfc04becf7505e6 2021-09-02T05:42:58Z eng BMC BMC Evolutionary Biology 1471-2148 2018-02-01 18 1 1 11 10.1186/s12862-018-1131-3
MPBoot: fast phylogenetic maximum parsimony tree inference and bootstrap approximation
Diep Thi Hoang (0), Le Sy Vinh (1), Tomáš Flouri (2), Alexandros Stamatakis (3), Arndt von Haeseler (4), Bui Quang Minh (5)
0 University of Engineering and Technology, Vietnam National University
1 University of Engineering and Technology, Vietnam National University
2 Department of Genetics, Evolution and Environment, University College London
3 Heidelberg Institute for Theoretical Studies
4 Center for Integrative Bioinformatics Vienna, Max F. Perutz Laboratories, University of Vienna, Medical University Vienna
5 Center for Integrative Bioinformatics Vienna, Max F. Perutz Laboratories, University of Vienna, Medical University Vienna
http://link.springer.com/article/10.1186/s12862-018-1131-3
Phylogenetic inference; Nonparametric bootstrap; Maximum parsimony
collection DOAJ
language English
format Article
sources DOAJ
author Diep Thi Hoang
Le Sy Vinh
Tomáš Flouri
Alexandros Stamatakis
Arndt von Haeseler
Bui Quang Minh
title MPBoot: fast phylogenetic maximum parsimony tree inference and bootstrap approximation
publisher BMC
series BMC Evolutionary Biology
issn 1471-2148
publishDate 2018-02-01
description Abstract. Background: The nonparametric bootstrap is widely used to measure the branch support of phylogenetic trees. However, bootstrapping is computationally expensive and remains a bottleneck in phylogenetic analyses. Recently, an ultrafast bootstrap approximation (UFBoot) approach was proposed for maximum likelihood analyses. However, such an approach is still missing for maximum parsimony. Results: To close this gap, we present MPBoot, an adaptation and extension of UFBoot to compute branch supports under the maximum parsimony principle. MPBoot works for both uniform and non-uniform cost matrices. Our analyses of biological DNA and protein data showed that, under uniform cost matrices, MPBoot runs on average 4.7 times (DNA) to 7 times (protein data) faster (range: 1.2–20.7) than the standard parsimony bootstrap implemented in PAUP*, but 1.6 times (DNA) to 4.1 times (protein data) slower than the standard bootstrap with a fast search routine in TNT (fast-TNT). For non-uniform cost matrices, however, MPBoot is 5 times (DNA) to 13 times (protein data) faster (range: 0.3–63.9) than fast-TNT. MPBoot also achieves better scores more frequently than PAUP* and fast-TNT, although this effect is less pronounced if an intensive but slower search in TNT is invoked. Moreover, experiments on large-scale simulated data show that while both PAUP* and TNT bootstrap estimates are too conservative, MPBoot bootstrap estimates appear less biased. Conclusions: MPBoot provides an efficient alternative to the standard maximum parsimony bootstrap procedure. It shows favorable performance in terms of run time, the capability of finding a maximum parsimony tree, and high bootstrap accuracy on simulated as well as empirical data sets. MPBoot is easy to use, open source, and available at http://www.cibiv.at/software/mpboot.
topic Phylogenetic inference
Nonparametric bootstrap
Maximum parsimony
url http://link.springer.com/article/10.1186/s12862-018-1131-3