Dynamic construction of pan-genome subgraphs

Marcus et al. (Bioinformatics 2014) proposed to use a compressed de Bruijn graph as a description of a pan-genome, comprising the genomes of many individuals/strains of the same or closely related species. Subsequent work improved the construction of the compressed de Bruijn graph in terms of run-ti...

Bibliographic Details
Main Authors:	Dede Kadir, Ohlebusch Enno
Format:	Article
Language:	English
Published:	De Gruyter 2020-04-01
Series:	Open Computer Science
Subjects:	compressed de Bruijn graph; Burrows-Wheeler transform; backward search; pan-genome analysis
Online Access:	https://doi.org/10.1515/comp-2020-0018

id	doaj-0159213ca2d640b8ada2d0be7b299629
record_format	Article
spelling	doaj-0159213ca2d640b8ada2d0be7b299629 | 2021-09-06T19:19:43Z | eng | De Gruyter | Open Computer Science | 2299-1093 | 2020-04-01 | 10 | 1 | 82 | 96 | 10.1515/comp-2020-0018 | comp-2020-0018 | Dynamic construction of pan-genome subgraphs | Dede Kadir | 0 | Ohlebusch Enno | 1 | Institute of Theoretical Computer Science, Ulm University, D-89069 Ulm, Germany | Institute of Theoretical Computer Science, Ulm University, D-89069 Ulm, Germany | Marcus et al. (Bioinformatics 2014) proposed to use a compressed de Bruijn graph as a description of a pan-genome, comprising the genomes of many individuals/strains of the same or closely related species. Subsequent work improved the construction of the compressed de Bruijn graph in terms of run-time and memory consumption. According to the Computational Pan-Genomics Consortium (Briefings in Bioinformatics 2016), a pan-genome data structure should support the following functionality: “All information within a data structure should be easily accessible for human eyes by visualization support on different scales.” However, a pan-genome graph can have thousands to millions of nodes and such an amount of information is certainly not easily accessible for human eyes. Thus, the possibility to construct pan-genome subgraphs on demand would be quite valuable. In this article, we use the space-efficient representation of the compressed de Bruijn graph devised by Beller and Ohlebusch (Algorithms for Molecular Biology 2016) to construct pan-genome subgraphs on the fly. The user can specify a region in one of the genomes and the software tool will build a subgraph that contains the path corresponding to that region and all paths that are in the neighborhood of that path. The size of the neighborhood can be controlled by the user. | https://doi.org/10.1515/comp-2020-0018 | compressed de Bruijn graph | Burrows-Wheeler transform | backward search | pan-genome analysis
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Dede Kadir Ohlebusch Enno
spellingShingle	Dede Kadir Ohlebusch Enno Dynamic construction of pan-genome subgraphs Open Computer Science compressed de bruijn graph burrows-wheeler transform backward search pan-genome analysis
author_facet	Dede Kadir Ohlebusch Enno
author_sort	Dede Kadir
title	Dynamic construction of pan-genome subgraphs
title_short	Dynamic construction of pan-genome subgraphs
title_full	Dynamic construction of pan-genome subgraphs
title_fullStr	Dynamic construction of pan-genome subgraphs
title_full_unstemmed	Dynamic construction of pan-genome subgraphs
title_sort	dynamic construction of pan-genome subgraphs
publisher	De Gruyter
series	Open Computer Science
issn	2299-1093
publishDate	2020-04-01
description	Marcus et al. (Bioinformatics 2014) proposed to use a compressed de Bruijn graph as a description of a pan-genome, comprising the genomes of many individuals/strains of the same or closely related species. Subsequent work improved the construction of the compressed de Bruijn graph in terms of run-time and memory consumption. According to the Computational Pan-Genomics Consortium (Briefings in Bioinformatics 2016), a pan-genome data structure should support the following functionality: “All information within a data structure should be easily accessible for human eyes by visualization support on different scales.” However, a pan-genome graph can have thousands to millions of nodes and such an amount of information is certainly not easily accessible for human eyes. Thus, the possibility to construct pan-genome subgraphs on demand would be quite valuable. In this article, we use the space-efficient representation of the compressed de Bruijn graph devised by Beller and Ohlebusch (Algorithms for Molecular Biology 2016) to construct pan-genome subgraphs on the fly. The user can specify a region in one of the genomes and the software tool will build a subgraph that contains the path corresponding to that region and all paths that are in the neighborhood of that path. The size of the neighborhood can be controlled by the user.
topic	compressed de Bruijn graph; Burrows-Wheeler transform; backward search; pan-genome analysis
url	https://doi.org/10.1515/comp-2020-0018
work_keys_str_mv	AT dedekadir dynamicconstructionofpangenomesubgraphs AT ohlebuschenno dynamicconstructionofpangenomesubgraphs
_version_	1717777904957390848

Similar Items