Integrative prediction of gene expression with chromatin accessibility and conformation data

Abstract Background Enhancers play a fundamental role in orchestrating cell state and development. Although several methods have been developed to identify enhancers, linking them to their target genes is still an open problem. Several theories have been proposed on the functional mechanisms of enha...

Bibliographic Details
Main Authors: Florian Schmidt, Fabian Kern, Marcel H. Schulz
Format: Article
Language: English
Published: BMC 2020-02-01
Series: Epigenetics & Chromatin
Subjects:
HiC
Online Access: https://doi.org/10.1186/s13072-020-0327-0
id doaj-6af9954ace1f4ca99906d7c4ef964481
record_format Article
spelling doaj-6af9954ace1f4ca99906d7c4ef964481; 2021-02-07T12:25:08Z; eng; BMC; Epigenetics & Chromatin; 1756-8935; 2020-02-01; 131117; 10.1186/s13072-020-0327-0; Integrative prediction of gene expression with chromatin accessibility and conformation data; Florian Schmidt, Fabian Kern, Marcel H. Schulz (all: High-throughput Genomics & Systems Biology, Cluster of Excellence on Multimodal Computing and Interaction); https://doi.org/10.1186/s13072-020-0327-0; Machine learning; Chromatin accessibility; DNase1-seq; Chromatin conformation; Gene regulation; HiC
collection DOAJ
language English
format Article
sources DOAJ
author Florian Schmidt
Fabian Kern
Marcel H. Schulz
title Integrative prediction of gene expression with chromatin accessibility and conformation data
publisher BMC
series Epigenetics & Chromatin
issn 1756-8935
publishDate 2020-02-01
description Abstract. Background: Enhancers play a fundamental role in orchestrating cell state and development. Although several methods have been developed to identify enhancers, linking them to their target genes is still an open problem. Several theories have been proposed on the functional mechanisms of enhancers, which triggered the development of various methods to infer promoter–enhancer interactions (PEIs). The advancement of high-throughput techniques describing the three-dimensional organization of the chromatin paved the way to pinpoint long-range PEIs. Here we investigated whether including PEIs in computational models for the prediction of gene expression improves performance and interpretability. Results: We have extended our TEPIC framework to include DNA contacts deduced from chromatin conformation capture experiments and compared various methods to determine PEIs using predictive modelling of gene expression from chromatin accessibility data and predicted transcription factor (TF) motif data. We designed a novel machine learning approach that allows the prioritization of TFs binding to distal loop and promoter regions with respect to their importance for gene expression regulation. Our analysis revealed a set of core TFs that are part of enhancer–promoter loops involving YY1 in different cell lines. Conclusion: We present a novel approach that can be used to prioritize TFs involved in distal and promoter-proximal regulatory events by integrating chromatin accessibility, conformation, and gene expression data. We show that the integration of chromatin conformation data can improve gene expression prediction and aid model interpretability.
topic Machine learning
Chromatin accessibility
DNase1-seq
Chromatin conformation
Gene regulation
HiC
url https://doi.org/10.1186/s13072-020-0327-0