Taxallnomy: an extension of NCBI Taxonomy that produces a hierarchically complete taxonomic tree

Bibliographic Details
Main Authors: Ortega, J.M. (Author), Sakamoto, T. (Author)
Format: Article
Language: English
Published: BioMed Central Ltd 2021
Online Access: https://doi.org/10.1186/s12859-021-04304-3
LEADER 03800nam a2200613Ia 4500
001 10.1186-s12859-021-04304-3
008 220427s2021 CNT 000 0 und d
020 |a 14712105 (ISSN) 
245 1 0 |a Taxallnomy: an extension of NCBI Taxonomy that produces a hierarchically complete taxonomic tree 
260 0 |b BioMed Central Ltd  |c 2021 
856 |z View Fulltext in Publisher  |u https://doi.org/10.1186/s12859-021-04304-3 
520 3 |a Background: NCBI Taxonomy is the main taxonomic source for several bioinformatics tools and databases since all organisms with sequence accessions deposited on INSDC are organized in its hierarchical structure. Despite the extensive use and application of this data source, an alternative representation of data as a table would facilitate the use of information for processing bioinformatics data. To do so, since some taxonomic-ranks are missing in some lineages, an algorithm might propose provisional names for all taxonomic-ranks. Results: To address this issue, we developed an algorithm that takes the tree structure from NCBI Taxonomy and generates a hierarchically complete taxonomic table, maintaining its compatibility with the original tree. The procedures performed by the algorithm consist of attempting to assign a taxonomic-rank to an existing clade or “no rank” node when possible, using its name as part of the created taxonomic-rank name (e.g. Ord_Ornithischia) or interpolating parent nodes when needed (e.g. Cla_of_Ornithischia), both examples given for the dinosaur Brachylophosaurus lineage. The new hierarchical structure was named Taxallnomy because it contains names for all taxonomic-ranks, and it contains 41 hierarchical levels corresponding to the 41 taxonomic-ranks currently found in the NCBI Taxonomy database. From Taxallnomy, users can obtain the complete taxonomic lineage with 41 nodes of all taxa available in the NCBI Taxonomy database, without any hazard to the original tree information. In this work, we demonstrate its applicability by embedding taxonomic information of a specified rank into a phylogenetic tree and by producing metagenomics profiles. Conclusion: Taxallnomy applies to any bioinformatics analyses that depend on the information from NCBI Taxonomy. Taxallnomy is updated periodically, but with a distributed Perl script users can generate it locally using NCBI Taxonomy as input. All Taxallnomy resources are available at http://bioinfo.icb.ufmg.br/taxallnomy. © 2021, The Author(s). 
650 0 4 |a algorithm 
650 0 4 |a article 
650 0 4 |a bioinformatics 
650 0 4 |a Bioinformatics 
650 0 4 |a Bioinformatics analysis 
650 0 4 |a Bioinformatics data 
650 0 4 |a Bioinformatics tools 
650 0 4 |a biology 
650 0 4 |a cladistics 
650 0 4 |a Computational Biology 
650 0 4 |a Data handling 
650 0 4 |a Database systems 
650 0 4 |a Databases, Genetic 
650 0 4 |a dinosaur 
650 0 4 |a embedding 
650 0 4 |a genetic database 
650 0 4 |a Hierarchical level 
650 0 4 |a Hierarchical structures 
650 0 4 |a information retrieval 
650 0 4 |a Information Storage and Retrieval 
650 0 4 |a Linnaean system 
650 0 4 |a metagenomics 
650 0 4 |a Metagenomics 
650 0 4 |a NCBI Taxonomy 
650 0 4 |a No rank 
650 0 4 |a nonhuman 
650 0 4 |a phylogenetic tree 
650 0 4 |a Phylogenetic trees 
650 0 4 |a phylogeny 
650 0 4 |a Phylogeny 
650 0 4 |a Tantalum compounds 
650 0 4 |a Taxonomic lineage 
650 0 4 |a taxonomic rank 
650 0 4 |a Taxonomic rank 
650 0 4 |a Taxonomic trees 
650 0 4 |a Taxonomies 
650 0 4 |a Tree structures 
650 0 4 |a Trees (mathematics) 
700 1 |a Ortega, J.M.  |e author 
700 1 |a Sakamoto, T.  |e author 
773 |t BMC Bioinformatics
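
The rank-filling idea summarized in the abstract (promote an unranked node to a missing rank where one fits, otherwise interpolate a placeholder parent) can be illustrated with a minimal sketch. This is not the authors' Perl implementation: it works on a single lineage rather than the whole tree, uses four ranks instead of 41, and the function name `complete_lineage`, the rank subset, the abbreviations for family and genus, and the toy lineage (including the "no rank" label on Ornithischia) are all assumptions made for illustration.

```python
# Simplified rank subset, ordered from highest to lowest (assumption; Taxallnomy uses 41).
RANKS = ["class", "order", "family", "genus"]
ABBR = {"class": "Cla", "order": "Ord", "family": "Fam", "genus": "Gen"}


def complete_lineage(lineage):
    """Fill every rank in RANKS for one root-to-leaf lineage.

    lineage: list of (name, rank) pairs; rank may be "no rank".
    Assumes ranked nodes appear in RANKS order and each rank occurs at most once.
    """
    idx = {rank: i for i, rank in enumerate(RANKS)}
    base = [None] * len(RANKS)   # taxon name backing each rank slot
    label = [None] * len(RANKS)  # name shown in the completed table

    # Pass 1: nodes that already carry a rank keep their name unchanged.
    for name, rank in lineage:
        if rank in idx:
            base[idx[rank]] = label[idx[rank]] = name

    # Pass 2 (leaf to root): promote an unranked node to the lowest free rank
    # lying between its nearest ranked ancestor and the slot filled below it.
    ceil = len(RANKS)
    for pos in range(len(lineage) - 1, -1, -1):
        name, rank = lineage[pos]
        if rank in idx:
            ceil = idx[rank]
            continue
        floor = max((idx[r] for _, r in lineage[:pos] if r in idx), default=-1)
        if ceil - 1 > floor:
            ceil -= 1
            base[ceil] = name
            label[ceil] = f"{ABBR[RANKS[ceil]]}_{name}"

    # Pass 3: interpolate any rank still missing, naming it after the nearest
    # filled rank below it. Ranks with nothing below them are left as None.
    below = None
    for i in range(len(RANKS) - 1, -1, -1):
        if base[i] is not None:
            below = base[i]
        elif below is not None:
            label[i] = f"{ABBR[RANKS[i]]}_of_{below}"

    return dict(zip(RANKS, label))


# Toy lineage for illustration only; not copied from NCBI Taxonomy.
toy = [
    ("Ornithischia", "no rank"),
    ("Hadrosauridae", "family"),
    ("Brachylophosaurus", "genus"),
]
result = complete_lineage(toy)
print(result)
```

With this toy input the sketch yields the two name forms quoted in the abstract, Ord_Ornithischia for the promoted node and Cla_of_Ornithischia for the interpolated one. Choosing the lowest free rank in pass 2 is a design choice of this sketch, made so that the output matches those examples.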