Spectral methods for the detection and characterization of Topologically Associated Domains

The three-dimensional (3D) structure of the genome plays a crucial role in gene expression regulation. Chromatin conformation capture technologies (Hi-C) have revealed that the genome is organized in a hierarchy of topologically associated domains (TADs), sub-TADs, and chromatin loops which is relat...

Bibliographic Details
Main Author: Cresswell, Kellen Garrison
Format: Others
Published: VCU Scholars Compass 2019
Subjects: Genomics; Spectral; Genetics; Biostatistics; Statistical; Clustering
Online Access: https://scholarscompass.vcu.edu/etd/6100
https://scholarscompass.vcu.edu/cgi/viewcontent.cgi?article=7201&context=etd
id ndltd-vcu.edu-oai-scholarscompass.vcu.edu-etd-7201
record_format oai_dc
spelling ndltd-vcu.edu-oai-scholarscompass.vcu.edu-etd-7201 2019-12-17T03:42:17Z Spectral methods for the detection and characterization of Topologically Associated Domains Cresswell, Kellen Garrison 2019-01-01T08:00:00Z text application/pdf https://scholarscompass.vcu.edu/etd/6100 https://scholarscompass.vcu.edu/cgi/viewcontent.cgi?article=7201&context=etd © The Author Theses and Dissertations VCU Scholars Compass Genomics Spectral Genetics Biostatistics Statistical Clustering Biostatistics
collection NDLTD
format Others
sources NDLTD
topic Genomics
Spectral
Genetics
Biostatistics
Statistical
Clustering
Biostatistics
description The three-dimensional (3D) structure of the genome plays a crucial role in gene expression regulation. Chromatin conformation capture technologies (Hi-C) have revealed that the genome is organized in a hierarchy of topologically associated domains (TADs), sub-TADs, and chromatin loops that is relatively stable across cell lines and even across species. These TADs dynamically reorganize during development of disease and exhibit cell- and condition-specific differences. Identifying such hierarchical structures and how they change between conditions is a critical step in understanding genome regulation and disease development. Despite their importance, there are relatively few tools for identification of TADs and even fewer for identification of hierarchies. Additionally, there are no publicly available tools for comparison of TADs across datasets. These tools are necessary to conduct large-scale genome-wide analysis and comparison of 3D structure. To address the challenge of TAD identification, we developed a novel sliding window-based spectral clustering framework that uses gaps between consecutive eigenvectors for TAD boundary identification. Our method, implemented in an R package, SpectralTAD, has automatic parameter selection, is robust to sequencing depth, resolution, and sparsity of Hi-C data, and detects hierarchical, biologically relevant TADs. SpectralTAD outperforms four state-of-the-art TAD callers in simulated and experimental settings. We demonstrate that TAD boundaries shared among multiple levels of the TAD hierarchy were more enriched in classical boundary marks and more conserved across cell lines and tissues. SpectralTAD is available at http://bioconductor.org/packages/SpectralTAD/. To address the problem of TAD comparison, we developed TADCompare. TADCompare is based on a spectral clustering-derived measure called the eigenvector gap, which enables a locus-by-locus comparison of TAD boundary differences between datasets. Using this measure, we introduce methods for identifying differential and consensus TAD boundaries and tracking TAD boundary changes over time. We further propose a novel framework for the systematic classification of TAD boundary changes. Colocalization and gene enrichment analyses of different types of TAD boundary changes revealed distinct biological functionality associated with them. TADCompare is available at https://github.com/dozmorovlab/TADCompare.
author Cresswell, Kellen Garrison
title Spectral methods for the detection and characterization of Topologically Associated Domains
publisher VCU Scholars Compass
publishDate 2019