Advantages of using graph databases to explore chromatin conformation capture experiments

Abstract Background High-throughput sequencing Chromosome Conformation Capture (Hi-C) allows the study of DNA interactions and 3D chromosome folding at the genome-wide scale. Usually, these data are represented as matrices describing the binary contacts among the different chromosome regions. On the...

Bibliographic Details
Main Authors: Daniele D’Agostino, Pietro Liò, Marco Aldinucci, Ivan Merelli
Format: Article
Language: English
Published: BMC 2021-04-01
Series: BMC Bioinformatics
Subjects: Hi-C; Chromatin capture; Graph databases; Graph visualisation
Online Access: https://doi.org/10.1186/s12859-020-03937-0
id doaj-06063e5d8faf4783a41b3cf45194d05e
record_format Article
spelling doaj-06063e5d8faf4783a41b3cf45194d05e; 2021-05-02T11:49:29Z; eng; BMC; BMC Bioinformatics; ISSN 1471-2105; 2021-04-01; vol. 22, issue S2, pp. 1-16; https://doi.org/10.1186/s12859-020-03937-0
Advantages of using graph databases to explore chromatin conformation capture experiments
Daniele D’Agostino (Institute of Electronics, Computer and Telecommunication Engineering, National Research Council of Italy)
Pietro Liò (Computer Laboratory, University of Cambridge)
Marco Aldinucci (Computer Science Department, University of Turin)
Ivan Merelli (Institute for Biomedical Technologies, National Research Council of Italy)
Hi-C; Chromatin capture; Graph databases; Graph visualisation
collection DOAJ
language English
format Article
sources DOAJ
author Daniele D’Agostino
Pietro Liò
Marco Aldinucci
Ivan Merelli
author_sort Daniele D’Agostino
title Advantages of using graph databases to explore chromatin conformation capture experiments
publisher BMC
series BMC Bioinformatics
issn 1471-2105
publishDate 2021-04-01
description Abstract Background: High-throughput sequencing Chromosome Conformation Capture (Hi-C) allows the study of DNA interactions and 3D chromosome folding at the genome-wide scale. These data are usually represented as matrices describing the binary contacts among the different chromosome regions. A graph-based representation, however, can be advantageous for describing the complex topology that DNA adopts in the nucleus of eukaryotic cells. Methods: Here we discuss the use of a graph database for storing and analysing data obtained from Hi-C experiments. The main issue is the size of the data produced: in a graph-based representation, a large number of edges (contacts) connecting nodes (genes), which are the sources of information, must be managed adequately. For this reason, currently available graph visualisation tools and libraries fall short with Hi-C data. Graph databases, instead, support both the analysis and the visualisation of the spatial patterns present in Hi-C data, in particular for comparing different experiments or for efficiently re-mapping omics data in a space-aware context. Describing graphs through statistical indicators and, even more, correlating them through statistical distributions makes it possible to highlight similarities and differences among Hi-C experiments across cell conditions or cell types. Results: These concepts have been implemented in NeoHiC, an open-source and user-friendly web application for the progressive visualisation and analysis of Hi-C networks, based on the Neo4j graph database (version 3.5). Conclusion: As more experiments accumulate, the tool will provide valuable support for comparing the neighbours of genes across experiments and conditions, helping to highlight changes in functional domains and to identify new co-organised genomic compartments.
topic Hi-C
Chromatin capture
Graph databases
Graph visualisation
url https://doi.org/10.1186/s12859-020-03937-0
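
The description above contrasts the usual matrix representation of Hi-C contacts with a graph of nodes (genes) and edges (contacts), summarised through statistical indicators. The following is a minimal Python sketch of that idea, not the NeoHiC implementation: the gene labels, the toy matrix, and the function names `matrix_to_edges`, `degree_table` and `density` are all hypothetical, and node degree and graph density are chosen here only as two simple examples of such indicators.

```python
def matrix_to_edges(labels, matrix):
    """Turn a symmetric binary contact matrix into an undirected edge list."""
    edges = []
    n = len(labels)
    for i in range(n):
        # Scan the upper triangle only, so each contact is recorded once.
        for j in range(i + 1, n):
            if matrix[i][j]:
                edges.append((labels[i], labels[j]))
    return edges


def degree_table(labels, edges):
    """Count how many contacts each node takes part in."""
    degrees = {label: 0 for label in labels}
    for a, b in edges:
        degrees[a] += 1
        degrees[b] += 1
    return degrees


def density(n_nodes, n_edges):
    """Fraction of all possible undirected edges that are present."""
    possible = n_nodes * (n_nodes - 1) // 2
    return n_edges / possible if possible else 0.0


# Toy data (assumed for illustration, not taken from any experiment).
labels = ["geneA", "geneB", "geneC", "geneD"]
contacts = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]

edges = matrix_to_edges(labels, contacts)
degrees = degree_table(labels, edges)
graph_density = density(len(labels), len(edges))
print(edges)
print(degrees)
print(graph_density)
```

An edge list of this kind is the shape of data a graph database stores natively, which is why comparing the neighbours of a gene across experiments becomes a lookup over edges rather than a scan over matrix rows.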