Optimizing de novo genome assembly from PCR-amplified metagenomes
Background: Metagenomics has transformed our understanding of microbial diversity across ecosystems, with recent advances enabling de novo assembly of genomes from metagenomes. These metagenome-assembled genomes are critical to provide ecological, evolutionary, and metabolic context for all the micro...
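Main Authors: | Simon Roux, Gareth Trubl, Danielle Goudeau, Nandita Nath, Estelle Couradeau, Nathan A. Ahlgren, Yuanchao Zhan, David Marsan, Feng Chen, Jed A. Fuhrman, Trent R. Northen, Matthew B. Sullivan, Virginia I. Rich, Rex R. Malmstrom, Emiley A. Eloe-Fadrosh |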
---|---|
Format: | Article |
Language: | English |
Published: | PeerJ Inc. 2019-05-01 |
Series: | PeerJ |
Subjects: | Metagenomics, Microbial ecology, Genome assembly, Viral metagenomics |
Online Access: | https://peerj.com/articles/6902.pdf |
id |
doaj-b412649ebea340ac9ec3c6a97c4b52a3 |
---|---|
record_format |
Article |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Simon Roux, Gareth Trubl, Danielle Goudeau, Nandita Nath, Estelle Couradeau, Nathan A. Ahlgren, Yuanchao Zhan, David Marsan, Feng Chen, Jed A. Fuhrman, Trent R. Northen, Matthew B. Sullivan, Virginia I. Rich, Rex R. Malmstrom, Emiley A. Eloe-Fadrosh |
author_sort |
Simon Roux |
title |
Optimizing de novo genome assembly from PCR-amplified metagenomes |
publisher |
PeerJ Inc. |
series |
PeerJ |
issn |
2167-8359 |
publishDate |
2019-05-01 |
description |
Background: Metagenomics has transformed our understanding of microbial diversity across ecosystems, with recent advances enabling de novo assembly of genomes from metagenomes. These metagenome-assembled genomes are critical to provide ecological, evolutionary, and metabolic context for all the microbes and viruses yet to be cultivated. Metagenomes can now be generated from nanogram to subnanogram amounts of DNA. However, these libraries require several rounds of PCR amplification before sequencing, and recent data suggest these typically yield smaller and more fragmented assemblies than regular metagenomes. Methods: Here we evaluate de novo assembly methods of 169 PCR-amplified metagenomes, including 25 for which an unamplified counterpart is available, to optimize specific assembly approaches for PCR-amplified libraries. We first evaluated coverage bias by mapping reads from PCR-amplified metagenomes onto reference contigs obtained from unamplified metagenomes of the same samples. Then, we compared different assembly pipelines in terms of assembly size (number of bp in contigs ≥ 10 kb) and error rates to evaluate which are the best suited for PCR-amplified metagenomes. Results: Read mapping analyses revealed that the depth of coverage within individual genomes is significantly more uneven in PCR-amplified datasets versus unamplified metagenomes, with regions of high depth of coverage enriched in short inserts. This enrichment scales with the number of PCR cycles performed, and is presumably due to preferential amplification of short inserts. Standard assembly pipelines are confounded by this type of coverage unevenness, so we evaluated other assembly options to mitigate these issues. We found that a pipeline combining read deduplication and an assembly algorithm originally designed to recover genomes from libraries generated after whole genome amplification (single-cell SPAdes) frequently improved assembly of contigs ≥ 10 kb by 10 to 100-fold for low input metagenomes. Conclusions: PCR-amplified metagenomes have enabled scientists to explore communities traditionally challenging to describe, including some with extremely low biomass or from which DNA is particularly difficult to extract. Here we show that a modified assembly pipeline can lead to an improved de novo genome assembly from PCR-amplified datasets, and enables a better genome recovery from low input metagenomes. |
topic |
Metagenomics, Microbial ecology, Genome assembly, Viral metagenomics |
url |
https://peerj.com/articles/6902.pdf |
_version_ |
1725207745872789504 |
spelling |
doaj-b412649ebea340ac9ec3c6a97c4b52a3 2020-11-25T01:01:43Z eng PeerJ Inc. PeerJ 2167-8359 2019-05-01 7 e6902 10.7717/peerj.6902 Optimizing de novo genome assembly from PCR-amplified metagenomes
0 Simon Roux: Department of Energy Joint Genome Institute, Walnut Creek, CA, United States of America
1 Gareth Trubl: Department of Microbiology, Ohio State University, Columbus, OH, United States of America
2 Danielle Goudeau: Department of Energy Joint Genome Institute, Walnut Creek, CA, United States of America
3 Nandita Nath: Department of Energy Joint Genome Institute, Walnut Creek, CA, United States of America
4 Estelle Couradeau: Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, United States of America
5 Nathan A. Ahlgren: Department of Biology, Clark University, Worcester, MA, United States of America
6 Yuanchao Zhan: Institution of Marine and Environmental Technology, University of Maryland Center for Environmental Science, Cambridge, MD, United States of America
7 David Marsan: Institution of Marine and Environmental Technology, University of Maryland Center for Environmental Science, Cambridge, MD, United States of America
8 Feng Chen: Institution of Marine and Environmental Technology, University of Maryland Center for Environmental Science, Cambridge, MD, United States of America
9 Jed A. Fuhrman: Department of Biological Sciences, University of Southern California, Los Angeles, CA, United States of America
10 Trent R. Northen: Department of Energy Joint Genome Institute, Walnut Creek, CA, United States of America
11 Matthew B. Sullivan: Department of Microbiology, Ohio State University, Columbus, OH, United States of America
12 Virginia I. Rich: Department of Microbiology, Ohio State University, Columbus, OH, United States of America
13 Rex R. Malmstrom: Department of Energy Joint Genome Institute, Walnut Creek, CA, United States of America
14 Emiley A. Eloe-Fadrosh: Department of Energy Joint Genome Institute, Walnut Creek, CA, United States of America
https://peerj.com/articles/6902.pdf Metagenomics, Microbial ecology, Genome assembly, Viral metagenomics |
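
The abstract names two ideas that a short example may make more concrete: the assembly-size metric (number of bp in contigs ≥ 10 kb) and read deduplication before assembly. The Python sketch below is illustrative only and is not the authors' pipeline. The function names `parse_fasta`, `assembly_size`, and `deduplicate_pairs` are hypothetical, and treating duplicates as exact matches of both reads in a pair is an assumption, since the abstract does not say which deduplication tool or criterion was used.

```python
from typing import Iterable, Iterator, List, Tuple


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (contig_name, sequence) pairs from FASTA-formatted lines."""
    name = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            # Use the first whitespace-delimited token as the contig name.
            name = header[0] if header else ""
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def assembly_size(contigs: Iterable[Tuple[str, str]], min_len: int = 10_000) -> int:
    """Total number of bp in contigs at least `min_len` long.

    The 10 kb default mirrors the threshold quoted in the abstract.
    """
    return sum(len(seq) for _, seq in contigs if len(seq) >= min_len)


def deduplicate_pairs(pairs: Iterable[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Keep the first occurrence of each exact (read1, read2) sequence pair.

    Assumption: a duplicate is a pair whose two sequences both match an
    earlier pair exactly; real tools may tolerate mismatches.
    """
    seen = set()
    kept = []
    for pair in pairs:
        if pair not in seen:
            seen.add(pair)
            kept.append(pair)
    return kept
```

To compare two assemblies of the same sample under these assumptions, one could call `assembly_size(parse_fasta(open(path)))` on each contig file and compare the totals; a larger value means more sequence recovered in long contigs, which is how the abstract frames the reported improvement.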