Getting insight into the pan-genome structure with PangTree

Abstract. Background: The term pan-genome was proposed to denote collections of genomic sequences that are jointly analyzed or used as a reference. The constant growth of genomic data intensifies the development of data structures and algorithms to investigate pan-genomes efficiently. Results: This work focuses...

Bibliographic Details
Main Authors: Paulina Dziadkiewicz, Norbert Dojer
Format: Article
Language: English
Published: BMC 2020-04-01
Series: BMC Genomics
Online Access: http://link.springer.com/article/10.1186/s12864-020-6610-4
Record ID: doaj-a2d0372e8a5645c3b78f349e68de25c2
Source: DOAJ
ISSN: 1471-2164
Volume: 21
Issue: S2
Pages: 1-13
DOI: 10.1186/s12864-020-6610-4
Author Affiliations: Faculty of Mathematics, Informatics and Mechanics, University of Warsaw (both authors)
Description: Abstract
Background: The term pan-genome was proposed to denote collections of genomic sequences that are jointly analyzed or used as a reference. The constant growth of genomic data intensifies the development of data structures and algorithms to investigate pan-genomes efficiently.
Results: This work focuses on providing a tool for discovering and visualizing the relationships between the sequences constituting a pan-genome. A new structure to represent such relationships, called an affinity tree, is proposed. Each node of this tree is assigned a subset of genomes, together with their homogeneity level and an averaged consensus sequence. Moreover, the subsets assigned to sibling nodes form a partition of the genomes assigned to their parent.
Conclusions: The functionality of the affinity tree is demonstrated on simulated data and on the Ebola virus pan-genome. Furthermore, two software packages are provided: PangTreeBuild constructs an affinity tree, while PangTreeVis presents the result.
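The node structure described in the abstract can be illustrated with a small sketch. This is not PangTreeBuild's actual data model or API: the class name `AffinityNode`, its field names, the helper `children_partition_parent`, and the toy genome names, sequences, and homogeneity values are all assumptions made for illustration. It only shows the stated invariant that the subsets of sibling nodes partition the genomes of their parent.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List


@dataclass
class AffinityNode:
    """One node of an affinity tree (hypothetical field names)."""
    genomes: FrozenSet[str]   # identifiers of the genomes assigned to this node
    homogeneity: float        # homogeneity level of that subset (scale assumed)
    consensus: str            # averaged consensus sequence for the subset
    children: List["AffinityNode"] = field(default_factory=list)


def children_partition_parent(node: AffinityNode) -> bool:
    """Check recursively that sibling subsets partition their parent's genomes."""
    if not node.children:
        return True
    seen = set()
    for child in node.children:
        # An empty block, or a genome shared by two siblings, breaks the partition.
        if not child.genomes or seen & child.genomes:
            return False
        seen |= child.genomes
    # The union of the children must be exactly the parent's subset.
    if seen != node.genomes:
        return False
    return all(children_partition_parent(child) for child in node.children)


# Toy example with made-up genome names, sequences, and homogeneity values.
root = AffinityNode(
    genomes=frozenset({"g1", "g2", "g3"}),
    homogeneity=0.80,
    consensus="ACGTAC",
    children=[
        AffinityNode(frozenset({"g1", "g2"}), 0.95, "ACGTAC"),
        AffinityNode(frozenset({"g3"}), 1.00, "ACGAAC"),
    ],
)
print(children_partition_parent(root))
```

Using frozen sets for the genome subsets keeps the partition check to plain set operations; how homogeneity and consensus are actually computed is left to the article itself.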
Subjects: Pan-genome; Multiple genome alignment; Affinity tree