Key2Ann: a tool to process sequence sets by replacing database identifiers with a human-readable annotation

Deducing common properties or degrees of phylogenetic relationship by analyzing a grouping or clustering of sequence sets is a frequently used technique in computational biology. If interpreted by means of visual inspection, the conclusions depend for many of these applications on meaningful names f...

Bibliographic Details
Main Authors: Pürzer Andreas, Grassmann Felix, Birzer Dietmar, Merkl Rainer
Format: Article
Language: English
Published: De Gruyter 2011-03-01
Series: Journal of Integrative Bioinformatics
Online Access: https://doi.org/10.2390/biecoll-jib-2011-153
id doaj-db2acad2f1524cb384e68eae7a560335
record_format Article
spelling doaj-db2acad2f1524cb384e68eae7a560335 2021-09-06T19:40:55Z eng De Gruyter Journal of Integrative Bioinformatics 1613-4516 2011-03-01 813546 10.2390/biecoll-jib-2011-153
Pürzer Andreas 0
Grassmann Felix 1
Birzer Dietmar 2
Merkl Rainer 3
University of Applied Sciences, Department of Computer Science and Mathematics, 93025, Regensburg, Germany
Institute of Biophysics and Physical Biochemistry, University of Regensburg, 93040, Regensburg, Germany
Institute of Biophysics and Physical Biochemistry, University of Regensburg, 93040, Regensburg, Germany
Institute of Biophysics and Physical Biochemistry, University of Regensburg, 93040, Regensburg, Germany
https://doi.org/10.2390/biecoll-jib-2011-153
collection DOAJ
language English
format Article
sources DOAJ
author Pürzer Andreas
Grassmann Felix
Birzer Dietmar
Merkl Rainer
title Key2Ann: a tool to process sequence sets by replacing database identifiers with a human-readable annotation
publisher De Gruyter
series Journal of Integrative Bioinformatics
issn 1613-4516
publishDate 2011-03-01
description Deducing common properties or degrees of phylogenetic relationship by analyzing a grouping or clustering of sequence sets is a frequently used technique in computational biology. If interpreted by means of visual inspection, the conclusions depend for many of these applications on meaningful names for the input data. In accordance with the aim of the analysis, the sequences should be provided with names indicating the function of the genes or gene-products, the phylogenetic position or other properties characterizing the contributing species. However, sequences extracted from databases are most often annotated with identifiers which only implicitly contain the desired information. To solve this problem, we have designed and implemented a tool named Key2Ann, which replaces in multiple fasta files the database keys with short terms indicating the taxonomic position or other features like the gene name or the EC-number. In addition, properties like habitat, growth temperature or the degree of pathogenicity can be coded for microbial species. To allow for highest flexibility, the user can control the composition of the names by means of command line parameters. Key2Ann is written in Java and can be downloaded via http://www-bioinf.uni-regensburg.de/downl/Key2Ann.zip. We demonstrate the usage of Key2Ann by discussing three typical examples of phylogenetic analysis.
url https://doi.org/10.2390/biecoll-jib-2011-153
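
The abstract describes replacing database keys in FASTA headers with short, readable terms. The following is a minimal Python sketch of that general idea only. It is not Key2Ann's implementation (Key2Ann is written in Java and is controlled by command line parameters that this record does not list). The lookup table, the keys `key001` and `key002`, the labels, and the function names are all hypothetical, and the sketch assumes the key is the first whitespace-delimited token of each header line.

```python
import io

# Hypothetical lookup table: database key -> short annotation.
# Keys and labels are invented for illustration; a real tool would
# derive them from taxonomy or gene annotation data.
ANNOTATIONS = {
    "key001": "Ecoli_trpA",
    "key002": "Bsubt_trpA",
}


def relabel_fasta(lines, annotations):
    """Yield FASTA lines with each header key swapped for its annotation.

    Assumption: the key is the first whitespace-delimited token after '>'.
    Headers whose key is not in the table are passed through unchanged.
    """
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            fields = line[1:].split()
            key = fields[0] if fields else ""
            yield ">" + annotations.get(key, line[1:])
        else:
            # Sequence lines are copied as they are.
            yield line


def relabel_text(text, annotations=ANNOTATIONS):
    """Convenience wrapper: relabel a FASTA document held in a string."""
    return "\n".join(relabel_fasta(io.StringIO(text), annotations))
```

Calling `relabel_text(">key001 some description\nMKV\n")` would rename the first record to `>Ecoli_trpA` and leave the sequence line untouched. Leaving unknown headers unchanged is a design choice of this sketch, so that no record loses its original identifier silently.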