PTree: pattern-based, stochastic search for maximum parsimony phylogenies

Phylogenetic reconstruction is vital to analyzing the evolutionary relationship of genes within and across populations of different species. Nowadays, with next generation sequencing technologies producing sets comprising thousands of sequences, robust identification of the tree topology, which is o...

Bibliographic Details
Main Authors: Ivan Gregor, Lars Steinbrück, Alice C. McHardy
Format: Article
Language: English
Published: PeerJ Inc. 2013-06-01
Series: PeerJ
Subjects: Phylogeny reconstruction; Maximum parsimony; Local search; Stochastic search
Online Access: https://peerj.com/articles/89.pdf
id doaj-59342967a1c642fdb1e31feaea44f324
record_format Article
spelling doaj-59342967a1c642fdb1e31feaea44f324
2020-11-24T23:24:22Z
eng
PeerJ Inc.
PeerJ
2167-8359
2013-06-01
1
e89
10.7717/peerj.89
89
PTree: pattern-based, stochastic search for maximum parsimony phylogenies
Ivan Gregor (0)
Lars Steinbrück (1)
Alice C. McHardy (2)
Max-Planck Research Group for Computational Genomics and Epidemiology, Max-Planck Institute for Informatics, Saarbrücken, Germany
Department of Algorithmic Bioinformatics, Heinrich-Heine-University Düsseldorf, Düsseldorf, Germany
Max-Planck Research Group for Computational Genomics and Epidemiology, Max-Planck Institute for Informatics, Saarbrücken, Germany
Phylogenetic reconstruction is vital to analyzing the evolutionary relationship of genes within and across populations of different species. Nowadays, with next generation sequencing technologies producing sets comprising thousands of sequences, robust identification of the tree topology, which is optimal according to standard criteria such as maximum parsimony, maximum likelihood or posterior probability, with phylogenetic inference methods is a computationally very demanding task. Here, we describe a stochastic search method for a maximum parsimony tree, implemented in a software package we named PTree. Our method is based on a new pattern-based technique that enables us to infer intermediate sequences efficiently where the incorporation of these sequences in the current tree topology yields a phylogenetic tree with a lower cost. Evaluation across multiple datasets showed that our method is comparable to the algorithms implemented in PAUP* or TNT, which are widely used by the bioinformatics community, in terms of topological accuracy and runtime. We show that our method can process large-scale datasets of 1,000–8,000 sequences. We believe that our novel pattern-based method enriches the current set of tools and methods for phylogenetic tree inference. The software is available under: http://algbio.cs.uni-duesseldorf.de/webapps/wa-download/.
https://peerj.com/articles/89.pdf
Phylogeny reconstruction
Maximum parsimony
Local search
Stochastic search
collection DOAJ
language English
format Article
sources DOAJ
author Ivan Gregor
Lars Steinbrück
Alice C. McHardy
title PTree: pattern-based, stochastic search for maximum parsimony phylogenies
publisher PeerJ Inc.
series PeerJ
issn 2167-8359
publishDate 2013-06-01
description Phylogenetic reconstruction is vital to analyzing the evolutionary relationship of genes within and across populations of different species. Nowadays, with next generation sequencing technologies producing sets comprising thousands of sequences, robust identification of the tree topology, which is optimal according to standard criteria such as maximum parsimony, maximum likelihood or posterior probability, with phylogenetic inference methods is a computationally very demanding task. Here, we describe a stochastic search method for a maximum parsimony tree, implemented in a software package we named PTree. Our method is based on a new pattern-based technique that enables us to infer intermediate sequences efficiently where the incorporation of these sequences in the current tree topology yields a phylogenetic tree with a lower cost. Evaluation across multiple datasets showed that our method is comparable to the algorithms implemented in PAUP* or TNT, which are widely used by the bioinformatics community, in terms of topological accuracy and runtime. We show that our method can process large-scale datasets of 1,000–8,000 sequences. We believe that our novel pattern-based method enriches the current set of tools and methods for phylogenetic tree inference. The software is available under: http://algbio.cs.uni-duesseldorf.de/webapps/wa-download/.
topic Phylogeny reconstruction
Maximum parsimony
Local search
Stochastic search
url https://peerj.com/articles/89.pdf