DMRfinder: efficiently identifying differentially methylated regions from MethylC-seq data

Abstract Background DNA methylation is an epigenetic modification that is studied at a single-base resolution with bisulfite treatment followed by high-throughput sequencing. After alignment of the sequence reads to a reference genome, methylation counts are analyzed to determine genomic regions tha...

Bibliographic Details
Main Authors: John M. Gaspar, Ronald P. Hart
Format: Article
Language: English
Published: BMC 2017-11-01
Series: BMC Bioinformatics
Subjects: DNA methylation; Bisulfite sequencing; CpG islands; Single-linkage clustering
Online Access: http://link.springer.com/article/10.1186/s12859-017-1909-0
id doaj-ad1aaf9050ae4c2da715e1e285e9a804
record_format Article
spelling doaj-ad1aaf9050ae4c2da715e1e285e9a804; 2020-11-24T20:55:58Z; eng; BMC; BMC Bioinformatics; 1471-2105; 2017-11-01; 18118; 10.1186/s12859-017-1909-0; DMRfinder: efficiently identifying differentially methylated regions from MethylC-seq data; John M. Gaspar (0), Department of Pharmaceutics, Rutgers University; Ronald P. Hart (1), Department of Cell Biology and Neuroscience, Rutgers University; http://link.springer.com/article/10.1186/s12859-017-1909-0; DNA methylation; Bisulfite sequencing; CpG islands; Single-linkage clustering
collection DOAJ
language English
format Article
sources DOAJ
author John M. Gaspar
Ronald P. Hart
title DMRfinder: efficiently identifying differentially methylated regions from MethylC-seq data
publisher BMC
series BMC Bioinformatics
issn 1471-2105
publishDate 2017-11-01
description Abstract
Background: DNA methylation is an epigenetic modification that is studied at single-base resolution with bisulfite treatment followed by high-throughput sequencing. After alignment of the sequence reads to a reference genome, methylation counts are analyzed to determine genomic regions that are differentially methylated between two or more biological conditions. Although a variety of software packages are available for different aspects of the bioinformatics analysis, they often produce biased results or demand excessive computational resources.
Results: DMRfinder is a novel computational pipeline that identifies differentially methylated regions efficiently. Following alignment, DMRfinder extracts methylation counts and performs a modified single-linkage clustering of methylation sites into genomic regions. It then compares methylation levels using beta-binomial hierarchical modeling and Wald tests. Among its innovative attributes are the analyses of novel methylation sites and methylation linkage, as well as the simultaneous statistical analysis of multiple sample groups. To demonstrate its efficiency, DMRfinder is benchmarked against other computational approaches using a large published dataset. Contrasting two replicates of the same sample yielded very few genomic regions with DMRfinder, whereas two alternative software packages reported a substantial number of false positives. Further analyses of biological samples revealed fundamental differences between DMRfinder and another software package, even though both rest on the same underlying statistical basis. For each step, DMRfinder completed the analysis in a fraction of the time required by other software.
Conclusions: Among the computational approaches for identifying differentially methylated regions from high-throughput bisulfite sequencing datasets, DMRfinder is the first that integrates all the post-alignment steps in a single package. Compared to other software, DMRfinder is extremely efficient and unbiased in this process. DMRfinder is free and open-source software, available on GitHub (github.com/jsh58/DMRfinder); it is written in Python and R, and is supported on Linux.
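The clustering step named in the abstract can be illustrated with a minimal sketch. This is plain one-dimensional single-linkage clustering, not DMRfinder's modified variant, whose details this record does not give. The function name cluster_sites and the parameters max_gap and min_sites are hypothetical, as are their default values. For positions on a single chromosome, single-linkage clustering with a distance cutoff reduces to sorting the sites and splitting wherever two neighbors are farther apart than the cutoff.

```python
from typing import List, Tuple


def cluster_sites(
    positions: List[int],
    max_gap: int = 100,   # hypothetical cutoff: largest allowed distance between neighboring sites
    min_sites: int = 3,   # hypothetical filter: discard regions with fewer sites than this
) -> List[Tuple[int, int, int]]:
    """Group methylation-site positions from one chromosome into regions.

    Returns a list of (start, end, n_sites) tuples. A site joins the current
    region when it lies within max_gap of the previous site (single linkage).
    """
    regions: List[Tuple[int, int, int]] = []
    cluster: List[int] = []
    for pos in sorted(set(positions)):
        # A gap larger than the cutoff closes the current region.
        if cluster and pos - cluster[-1] > max_gap:
            if len(cluster) >= min_sites:
                regions.append((cluster[0], cluster[-1], len(cluster)))
            cluster = []
        cluster.append(pos)
    # Emit the final region, if it is large enough.
    if len(cluster) >= min_sites:
        regions.append((cluster[0], cluster[-1], len(cluster)))
    return regions


if __name__ == "__main__":
    sites = [100, 150, 180, 500, 520, 900, 950, 990, 1040]
    print(cluster_sites(sites, max_gap=100, min_sites=3))
```

With the example input, the two sites at 500 and 520 form a cluster that falls below min_sites and is dropped, so only the first and last groups are reported as regions. The statistical comparison of methylation levels within each region (beta-binomial modeling and Wald tests in the abstract) is a separate step and is not sketched here.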
topic DNA methylation
Bisulfite sequencing
CpG islands
Single-linkage clustering
url http://link.springer.com/article/10.1186/s12859-017-1909-0