Evolutionary superscaffolding and chromosome anchoring to improve Anopheles genome assemblies

Bibliographic Details
Main Authors: Robert M. Waterhouse, Sergey Aganezov, Yoann Anselmetti, Jiyoung Lee, Livio Ruzzante, Maarten J. M. F. Reijnders, Romain Feron, Sèverine Bérard, Phillip George, Matthew W. Hahn, Paul I. Howell, Maryam Kamali, Sergey Koren, Daniel Lawson, Gareth Maslen, Ashley Peery, Adam M. Phillippy, Maria V. Sharakhova, Eric Tannier, Maria F. Unger, Simo V. Zhang, Max A. Alekseyev, Nora J. Besansky, Cedric Chauve, Scott J. Emrich, Igor V. Sharakhov
Format: Article
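Language: English
Published: BMC 2020-01-01
Series: BMC Biology
Subjects: Genome assembly, Gene synteny, Comparative genomics, Mosquito genomes, Orthology, Bioinformatics
Online Access: https://doi.org/10.1186/s12915-019-0728-3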
id doaj-0e880bd2992b417b9c80e8cec3c83014
record_format Article
collection DOAJ
language English
format Article
sources DOAJ
author Robert M. Waterhouse
Sergey Aganezov
Yoann Anselmetti
Jiyoung Lee
Livio Ruzzante
Maarten J. M. F. Reijnders
Romain Feron
Sèverine Bérard
Phillip George
Matthew W. Hahn
Paul I. Howell
Maryam Kamali
Sergey Koren
Daniel Lawson
Gareth Maslen
Ashley Peery
Adam M. Phillippy
Maria V. Sharakhova
Eric Tannier
Maria F. Unger
Simo V. Zhang
Max A. Alekseyev
Nora J. Besansky
Cedric Chauve
Scott J. Emrich
Igor V. Sharakhov
title Evolutionary superscaffolding and chromosome anchoring to improve Anopheles genome assemblies
publisher BMC
series BMC Biology
issn 1741-7007
publishDate 2020-01-01
description Abstract
Background: New sequencing technologies have lowered financial barriers to whole genome sequencing, but resulting assemblies are often fragmented and far from ‘finished’. Updating multi-scaffold drafts to chromosome-level status can be achieved through experimental mapping or re-sequencing efforts. Avoiding the costs associated with such approaches, comparative genomic analysis of gene order conservation (synteny) to predict scaffold neighbours (adjacencies) offers a potentially useful complementary method for improving draft assemblies.
Results: We evaluated and employed 3 gene synteny-based methods applied to 21 Anopheles mosquito assemblies to produce consensus sets of scaffold adjacencies. For subsets of the assemblies, we integrated these with additional supporting data to confirm and complement the synteny-based adjacencies: 6 with physical mapping data that anchor scaffolds to chromosome locations, 13 with paired-end RNA sequencing (RNAseq) data, and 3 with new assemblies based on re-scaffolding or long-read data. Our combined analyses produced 20 new superscaffolded assemblies with improved contiguities: 7 for which assignments of non-anchored scaffolds to chromosome arms span more than 75% of the assemblies, and a further 7 with chromosome anchoring including an 88% anchored Anopheles arabiensis assembly and, respectively, 73% and 84% anchored assemblies with comprehensively updated cytogenetic photomaps for Anopheles funestus and Anopheles stephensi.
Conclusions: Experimental data from probe mapping, RNAseq, or long-read technologies, where available, all contribute to successful upgrading of draft assemblies. Our evaluations show that gene synteny-based computational methods represent a valuable alternative or complementary approach. Our improved Anopheles reference assemblies highlight the utility of applying comparative genomics approaches to improve community genomic resources.
topic Genome assembly
Gene synteny
Comparative genomics
Mosquito genomes
Orthology
Bioinformatics
url https://doi.org/10.1186/s12915-019-0728-3
spelling doaj-0e880bd2992b417b9c80e8cec3c83014, 2021-01-03T12:14:01Z, eng, BMC, BMC Biology, 1741-7007, 2020-01-01, volume 18, issue 1, pages 1-20, 10.1186/s12915-019-0728-3
author_affiliation Robert M. Waterhouse: Department of Ecology and Evolution, University of Lausanne, and Swiss Institute of Bioinformatics
Sergey Aganezov: Department of Computer Science, Princeton University
Yoann Anselmetti: ISEM, Univ Montpellier, CNRS, EPHE, IRD
Jiyoung Lee: The Interdisciplinary PhD Program in Genetics, Bioinformatics, and Computational Biology, Virginia Polytechnic Institute and State University
Livio Ruzzante: Department of Ecology and Evolution, University of Lausanne, and Swiss Institute of Bioinformatics
Maarten J. M. F. Reijnders: Department of Ecology and Evolution, University of Lausanne, and Swiss Institute of Bioinformatics
Romain Feron: Department of Ecology and Evolution, University of Lausanne, and Swiss Institute of Bioinformatics
Sèverine Bérard: ISEM, Univ Montpellier, CNRS, EPHE, IRD
Phillip George: Department of Entomology, Virginia Polytechnic Institute and State University
Matthew W. Hahn: Departments of Biology and Computer Science, Indiana University
Paul I. Howell: Centers for Disease Control and Prevention
Maryam Kamali: Department of Entomology, Virginia Polytechnic Institute and State University
Sergey Koren: Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health
Daniel Lawson: European Molecular Biology Laboratory, European Bioinformatics Institute
Gareth Maslen: European Molecular Biology Laboratory, European Bioinformatics Institute
Ashley Peery: Department of Entomology, Virginia Polytechnic Institute and State University
Adam M. Phillippy: Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health
Maria V. Sharakhova: Department of Entomology, Virginia Polytechnic Institute and State University
Eric Tannier: Laboratoire de Biométrie et Biologie Evolutive, Université Lyon 1, Unité Mixte de Recherche 5558 Centre National de la Recherche Scientifique
Maria F. Unger: Eck Institute for Global Health and Department of Biological Sciences, University of Notre Dame
Simo V. Zhang: Departments of Biology and Computer Science, Indiana University
Max A. Alekseyev: Department of Mathematics and Computational Biology Institute, George Washington University
Nora J. Besansky: Eck Institute for Global Health and Department of Biological Sciences, University of Notre Dame
Cedric Chauve: Department of Mathematics, Simon Fraser University
Scott J. Emrich: Department of Electrical Engineering and Computer Science, University of Tennessee
Igor V. Sharakhov: The Interdisciplinary PhD Program in Genetics, Bioinformatics, and Computational Biology, Virginia Polytechnic Institute and State University
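
The abstract mentions combining predictions from 3 gene synteny-based methods into consensus sets of scaffold adjacencies. This record does not describe how the consensus was built, so the following is only a minimal sketch of one plausible voting scheme, not the authors' procedure. The method names, the `min_support` threshold, and the simplification of an adjacency to an unordered pair of scaffold identifiers (ignoring scaffold orientation) are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def consensus_adjacencies(predictions, min_support=2):
    """Return adjacencies predicted by at least `min_support` methods.

    `predictions` maps a (hypothetical) method name to an iterable of
    (scaffold_a, scaffold_b) pairs. Pairs are treated as unordered, and
    each method votes at most once per pair.
    """
    support = Counter()
    for pairs in predictions.values():
        # Normalise pair order, drop self-pairs, deduplicate per method.
        support.update({tuple(sorted(p)) for p in pairs if p[0] != p[1]})

    degree = defaultdict(int)  # accepted neighbours per scaffold
    accepted = []
    # Visit best-supported adjacencies first; ties broken alphabetically.
    for (a, b), votes in sorted(support.items(), key=lambda kv: (-kv[1], kv[0])):
        if votes < min_support:
            continue
        # In a linear superscaffold each scaffold has at most two neighbours.
        if degree[a] >= 2 or degree[b] >= 2:
            continue
        accepted.append((a, b, votes))
        degree[a] += 1
        degree[b] += 1
    return accepted


# Toy input with made-up scaffold identifiers and method names.
example = {
    "method_A": [("s1", "s2"), ("s2", "s3"), ("s4", "s5")],
    "method_B": [("s2", "s1"), ("s2", "s3"), ("s2", "s6")],
    "method_C": [("s1", "s2"), ("s2", "s6"), ("s3", "s2")],
}

for a, b, votes in consensus_adjacencies(example):
    print(f"{a} - {b} (supported by {votes} methods)")
```

In the toy input, the pair s2 and s6 has two votes but is rejected because s2 already has two better-supported neighbours. The degree check alone does not detect circular chains, and a fuller treatment would also track which scaffold ends are joined.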