Assessing the performance of different approaches for functional and taxonomic annotation of metagenomes

Abstract Background Metagenomes can be analysed using different approaches and tools. One of the most important distinctions is the way to perform taxonomic and functional assignment, choosing between the use of assembly algorithms or the direct analysis of raw sequence reads instead by homology sea...

Bibliographic Details
Main Authors: Javier Tamames, Marta Cobo-Simón, Fernando Puente-Sánchez
Format: Article
Language: English
Published: BMC 2019-12-01
Series: BMC Genomics
Subjects: Metagenomics; Functional annotation; Taxonomic annotation; Assembly
Online Access: https://doi.org/10.1186/s12864-019-6289-6
id doaj-573d2a6595c349849db8278613706e6e
record_format Article
spelling doaj-573d2a6595c349849db8278613706e6e
2020-12-13T12:18:02Z
eng
BMC
BMC Genomics
1471-2164
2019-12-01
20; 1; 1; 16
10.1186/s12864-019-6289-6
Assessing the performance of different approaches for functional and taxonomic annotation of metagenomes
Javier Tamames (Systems Biology Department, Centro Nacional de Biotecnología, CSIC)
Marta Cobo-Simón (Systems Biology Department, Centro Nacional de Biotecnología, CSIC)
Fernando Puente-Sánchez (Systems Biology Department, Centro Nacional de Biotecnología, CSIC)
Abstract. Background: Metagenomes can be analysed using different approaches and tools. One of the most important distinctions is the way taxonomic and functional assignment is performed: either by using assembly algorithms or by directly analysing raw sequence reads through homology searching, k-mer analysis, or detection of marker genes. Many instances of each approach can be found in the literature, but to the best of our knowledge no evaluation of their relative performance has been carried out, and we question whether their results are comparable. Results: We have analysed several real and mock metagenomes using different methodologies and tools, and compared the resulting taxonomic and functional profiles. Our results show that database completeness (the representation of diverse organisms and taxa in it) is the main factor determining the performance of methods relying on direct read assignment, whether by homology, k-mer composition or similarity to marker genes, while methods relying on assembly and assignment of predicted genes are most influenced by metagenome size, which in turn determines the completeness of the assembly (the percentage of reads that were assembled). Conclusions: Although differences exist, taxonomic profiles are rather similar between raw read assignment and assembly assignment methods, while they are more divergent for methods based on k-mers and marker genes. Regarding functional annotation, analysis of raw reads retrieves more functions, but it also makes a substantial number of over-predictions. Assembly methods become more advantageous as the size of the metagenome grows.
https://doi.org/10.1186/s12864-019-6289-6
Metagenomics
Functional annotation
Taxonomic annotation
Assembly
collection DOAJ
language English
format Article
sources DOAJ
author Javier Tamames
Marta Cobo-Simón
Fernando Puente-Sánchez
title Assessing the performance of different approaches for functional and taxonomic annotation of metagenomes
publisher BMC
series BMC Genomics
issn 1471-2164
publishDate 2019-12-01
description Abstract. Background: Metagenomes can be analysed using different approaches and tools. One of the most important distinctions is the way taxonomic and functional assignment is performed: either by using assembly algorithms or by directly analysing raw sequence reads through homology searching, k-mer analysis, or detection of marker genes. Many instances of each approach can be found in the literature, but to the best of our knowledge no evaluation of their relative performance has been carried out, and we question whether their results are comparable. Results: We have analysed several real and mock metagenomes using different methodologies and tools, and compared the resulting taxonomic and functional profiles. Our results show that database completeness (the representation of diverse organisms and taxa in it) is the main factor determining the performance of methods relying on direct read assignment, whether by homology, k-mer composition or similarity to marker genes, while methods relying on assembly and assignment of predicted genes are most influenced by metagenome size, which in turn determines the completeness of the assembly (the percentage of reads that were assembled). Conclusions: Although differences exist, taxonomic profiles are rather similar between raw read assignment and assembly assignment methods, while they are more divergent for methods based on k-mers and marker genes. Regarding functional annotation, analysis of raw reads retrieves more functions, but it also makes a substantial number of over-predictions. Assembly methods become more advantageous as the size of the metagenome grows.
topic Metagenomics
Functional annotation
Taxonomic annotation
Assembly
url https://doi.org/10.1186/s12864-019-6289-6