TREE2FASTA: a flexible Perl script for batch extraction of FASTA sequences from exploratory phylogenetic trees

Abstract. Objective: The body of DNA sequence data lacking taxonomically informative sequence headers is rapidly growing in user and public databases (e.g. sequences lacking identification and contaminants). In the context of systematics studies, sorting such sequence data for taxonomic curation and/o...

Bibliographic Details
Main Authors:	Thomas Sauvage, Sophie Plouviez, William E. Schmidt, Suzanne Fredericq
Format:	Article
Language:	English
Published:	BMC 2018-03-01
Series:	BMC Research Notes
Subjects:	Barcoding, Biodiversity, Clone, Contaminant, Cryptic, Environmental
Online Access:	http://link.springer.com/article/10.1186/s13104-018-3268-y

id	doaj-4445483b2f914dbeb1159ff6338d131d
record_format	Article
spelling	doaj-4445483b2f914dbeb1159ff6338d131d | 2020-11-25T02:37:14Z | eng | BMC | BMC Research Notes | 1756-0500 | 2018-03-01 | 11116 | 10.1186/s13104-018-3268-y | Thomas Sauvage, Sophie Plouviez, William E. Schmidt, Suzanne Fredericq (all: Department of Biology, University of Louisiana at Lafayette)
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Thomas Sauvage, Sophie Plouviez, William E. Schmidt, Suzanne Fredericq
author_sort	Thomas Sauvage
title	TREE2FASTA: a flexible Perl script for batch extraction of FASTA sequences from exploratory phylogenetic trees
publisher	BMC
series	BMC Research Notes
issn	1756-0500
publishDate	2018-03-01
description	Abstract. Objective: The body of DNA sequence data lacking taxonomically informative sequence headers is rapidly growing in user and public databases (e.g. sequences lacking identification and contaminants). In the context of systematics studies, sorting such sequence data for taxonomic curation and/or molecular diversity characterization (e.g. crypticism) often requires the building of exploratory phylogenetic trees with reference taxa. The subsequent step of segregating DNA sequences of interest based on observed topological relationships can represent a challenging task, especially for large datasets. Results: We have written TREE2FASTA, a Perl script that enables and expedites the sorting of FASTA-formatted sequence data from exploratory phylogenetic trees. TREE2FASTA takes advantage of the interactive, rapid point-and-click color selection and/or annotations of tree leaves in the popular Java tree-viewer FigTree to segregate groups of FASTA sequences of interest to separate files. TREE2FASTA allows for both simple and nested segregation designs to facilitate the simultaneous preparation of multiple data sets that may overlap in sequence content.
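The color-based segregation described in the abstract can be illustrated with a minimal sketch. This is not the authors' Perl code; it is a Python illustration under stated assumptions: that the tree file is a NEXUS file whose `taxlabels` block lists one unquoted leaf name per line, that colored leaves carry an annotation of the form `[&!color=#rrggbb]`, and that FASTA headers match leaf names exactly. The function names (`read_fasta`, `leaf_colors`, `group_by_color`, `to_fasta`) are hypothetical, and annotation-based or nested designs are not covered.

```python
import re
from collections import defaultdict

# Assumed leaf syntax: unquoted name followed by [&...!color=#rrggbb...]
LEAF_COLOR = re.compile(r"^([^\s\[;]+)\[&[^\]]*!color=(#[0-9A-Fa-f]{6})")


def read_fasta(text):
    """Parse FASTA text into {name: sequence}; name is the first word of the header."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {k: "".join(v) for k, v in records.items()}


def leaf_colors(nexus_text):
    """Return {leaf name: color} for colored leaves in the taxlabels block."""
    colors = {}
    in_taxlabels = False
    for line in nexus_text.splitlines():
        stripped = line.strip()
        if stripped.lower().startswith("taxlabels"):
            in_taxlabels = True
            continue
        if in_taxlabels:
            if stripped.startswith(";"):
                break  # end of the taxlabels block
            match = LEAF_COLOR.match(stripped)
            if match:
                colors[match.group(1)] = match.group(2).lower()
    return colors


def group_by_color(nexus_text, fasta_text):
    """Group sequences by leaf color; uncolored or unmatched leaves are skipped."""
    seqs = read_fasta(fasta_text)
    groups = defaultdict(dict)
    for name, color in leaf_colors(nexus_text).items():
        if name in seqs:
            groups[color][name] = seqs[name]
    return dict(groups)


def to_fasta(records):
    """Serialize {name: sequence} back to FASTA text."""
    return "".join(f">{name}\n{seq}\n" for name, seq in records.items())


# Made-up example data for illustration only.
nexus = """#NEXUS
begin taxa;
	dimensions ntax=4;
	taxlabels
	seqA[&!color=#ff0000]
	seqB[&!color=#0000ff]
	seqC[&!color=#ff0000]
	seqD
;
end;
"""
fasta = ">seqA\nACGT\n>seqB\nGGCC\n>seqC\nACGA\n>seqD\nTTTT\n"

groups = group_by_color(nexus, fasta)
```

Each entry of `groups` could then be written to its own file, for example by passing `to_fasta(groups[color])` to a file opened per color, which mirrors the idea of one output file per group of selected leaves.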
topic	Barcoding, Biodiversity, Clone, Contaminant, Cryptic, Environmental
url	http://link.springer.com/article/10.1186/s13104-018-3268-y

Similar Items