Optimizing de novo genome assembly from PCR-amplified metagenomes

Background Metagenomics has transformed our understanding of microbial diversity across ecosystems, with recent advances enabling de novo assembly of genomes from metagenomes. These metagenome-assembled genomes are critical to provide ecological, evolutionary, and metabolic context for all the micro...


Bibliographic Details
Main Authors: Simon Roux, Gareth Trubl, Danielle Goudeau, Nandita Nath, Estelle Couradeau, Nathan A. Ahlgren, Yuanchao Zhan, David Marsan, Feng Chen, Jed A. Fuhrman, Trent R. Northen, Matthew B. Sullivan, Virginia I. Rich, Rex R. Malmstrom, Emiley A. Eloe-Fadrosh
Format: Article
Language: English
Published: PeerJ Inc. 2019-05-01
Series: PeerJ
Subjects: Metagenomics; Microbial ecology; Genome assembly; Viral metagenomics
Online Access: https://peerj.com/articles/6902.pdf
id doaj-b412649ebea340ac9ec3c6a97c4b52a3
record_format Article
collection DOAJ
language English
format Article
sources DOAJ
author Simon Roux
Gareth Trubl
Danielle Goudeau
Nandita Nath
Estelle Couradeau
Nathan A. Ahlgren
Yuanchao Zhan
David Marsan
Feng Chen
Jed A. Fuhrman
Trent R. Northen
Matthew B. Sullivan
Virginia I. Rich
Rex R. Malmstrom
Emiley A. Eloe-Fadrosh
title Optimizing de novo genome assembly from PCR-amplified metagenomes
publisher PeerJ Inc.
series PeerJ
issn 2167-8359
publishDate 2019-05-01
description Background: Metagenomics has transformed our understanding of microbial diversity across ecosystems, with recent advances enabling de novo assembly of genomes from metagenomes. These metagenome-assembled genomes are critical to provide ecological, evolutionary, and metabolic context for all the microbes and viruses yet to be cultivated. Metagenomes can now be generated from nanogram to subnanogram amounts of DNA. However, these libraries require several rounds of PCR amplification before sequencing, and recent data suggest these typically yield smaller and more fragmented assemblies than regular metagenomes.
Methods: Here we evaluate de novo assembly methods of 169 PCR-amplified metagenomes, including 25 for which an unamplified counterpart is available, to optimize specific assembly approaches for PCR-amplified libraries. We first evaluated coverage bias by mapping reads from PCR-amplified metagenomes onto reference contigs obtained from unamplified metagenomes of the same samples. Then, we compared different assembly pipelines in terms of assembly size (number of bp in contigs ≥10 kb) and error rates to evaluate which are the best suited for PCR-amplified metagenomes.
Results: Read mapping analyses revealed that the depth of coverage within individual genomes is significantly more uneven in PCR-amplified datasets versus unamplified metagenomes, with regions of high depth of coverage enriched in short inserts. This enrichment scales with the number of PCR cycles performed, and is presumably due to preferential amplification of short inserts. Standard assembly pipelines are confounded by this type of coverage unevenness, so we evaluated other assembly options to mitigate these issues. We found that a pipeline combining read deduplication and an assembly algorithm originally designed to recover genomes from libraries generated after whole genome amplification (single-cell SPAdes) frequently improved assembly of contigs ≥10 kb by 10 to 100-fold for low input metagenomes.
Conclusions: PCR-amplified metagenomes have enabled scientists to explore communities traditionally challenging to describe, including some with extremely low biomass or from which DNA is particularly difficult to extract. Here we show that a modified assembly pipeline can lead to an improved de novo genome assembly from PCR-amplified datasets, and enables a better genome recovery from low input metagenomes.
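The abstract compares pipelines by assembly size, defined as the number of bp in contigs ≥10 kb, and reports improvements as fold changes. The Python sketch below illustrates how such a metric could be computed from a FASTA file of contigs. It is only an illustration under stated assumptions, not the authors' code: the function names (`read_fasta`, `assembly_size`, `fold_improvement`) are hypothetical, and the input is assumed to be a plain, uncompressed FASTA file.

```python
import io
from typing import Iterator, Optional, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (contig_id, sequence) pairs from a plain-text FASTA handle."""
    name: Optional[str] = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            fields = line[1:].split()
            name = fields[0] if fields else ""  # keep only the ID before any description
            chunks = []
        elif name is not None:
            chunks.append(line)  # sequence may be wrapped over several lines
    if name is not None:
        yield name, "".join(chunks)


def assembly_size(handle: TextIO, min_len: int = 10_000) -> int:
    """Total bp in contigs of at least min_len (10 kb by default, as in the abstract)."""
    return sum(len(seq) for _, seq in read_fasta(handle) if len(seq) >= min_len)


def fold_improvement(baseline_bp: int, improved_bp: int) -> Optional[float]:
    """Ratio of improved to baseline assembly size; None when the baseline is 0."""
    if baseline_bp == 0:
        return None  # fold change is undefined without any baseline contigs
    return improved_bp / baseline_bp


# Hypothetical toy input: only the first and third contigs reach 10 kb.
example = (
    ">contig_1\n" + "A" * 6000 + "\n" + "C" * 6000 + "\n"
    ">contig_2\n" + "G" * 9999 + "\n"
    ">contig_3 some description\n" + "T" * 10000 + "\n"
)
print(assembly_size(io.StringIO(example)))
```

For real data, the same functions could be applied to the contig files produced by two pipelines (for example with `open("contigs.fasta")`, a placeholder path) and the two totals passed to `fold_improvement`.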
topic Metagenomics
Microbial ecology
Genome assembly
Viral metagenomics
url https://peerj.com/articles/6902.pdf
spelling doaj-b412649ebea340ac9ec3c6a97c4b52a3
record updated 2020-11-25T01:01:43Z
language eng
publisher PeerJ Inc.
journal PeerJ
issn 2167-8359
date 2019-05-01
volume 7
article e6902
doi 10.7717/peerj.6902
title Optimizing de novo genome assembly from PCR-amplified metagenomes
author affiliations
Simon Roux: Department of Energy Joint Genome Institute, Walnut Creek, CA, United States of America
Gareth Trubl: Department of Microbiology, Ohio State University, Columbus, OH, United States of America
Danielle Goudeau: Department of Energy Joint Genome Institute, Walnut Creek, CA, United States of America
Nandita Nath: Department of Energy Joint Genome Institute, Walnut Creek, CA, United States of America
Estelle Couradeau: Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, United States of America
Nathan A. Ahlgren: Department of Biology, Clark University, Worcester, MA, United States of America
Yuanchao Zhan: Institution of Marine and Environmental Technology, University of Maryland Center for Environmental Science, Cambridge, MD, United States of America
David Marsan: Institution of Marine and Environmental Technology, University of Maryland Center for Environmental Science, Cambridge, MD, United States of America
Feng Chen: Institution of Marine and Environmental Technology, University of Maryland Center for Environmental Science, Cambridge, MD, United States of America
Jed A. Fuhrman: Department of Biological Sciences, University of Southern California, Los Angeles, CA, United States of America
Trent R. Northen: Department of Energy Joint Genome Institute, Walnut Creek, CA, United States of America
Matthew B. Sullivan: Department of Microbiology, Ohio State University, Columbus, OH, United States of America
Virginia I. Rich: Department of Microbiology, Ohio State University, Columbus, OH, United States of America
Rex R. Malmstrom: Department of Energy Joint Genome Institute, Walnut Creek, CA, United States of America
Emiley A. Eloe-Fadrosh: Department of Energy Joint Genome Institute, Walnut Creek, CA, United States of America