Detecting methylation signatures in neurodegenerative disease by density-based clustering of applications with reducing noise
Abstract Numerous genetic and epigenetic datasets have been generated for the study of complex diseases, including neurodegenerative disease. However, analysis of such data is often hampered by outlier samples, which subsequently affects the extraction of the true biological...
Main Authors: | Saurav Mallik, Zhongming Zhao |
---|---|
Format: | Article |
Language: | English |
Published: | Nature Publishing Group 2020-12-01 |
Series: | Scientific Reports |
Online Access: | https://doi.org/10.1038/s41598-020-78463-3 |
id |
doaj-cac7be4a73e04b479ecf1dc9a2db5591 |
---|---|
record_format |
Article |
spelling |
doaj-cac7be4a73e04b479ecf1dc9a2db5591; 2020-12-20T12:31:48Z; eng; Nature Publishing Group; Scientific Reports; 2045-2322; 2020-12-01; 10; 1; 1; 14; 10.1038/s41598-020-78463-3; Detecting methylation signatures in neurodegenerative disease by density-based clustering of applications with reducing noise; Saurav Mallik 0; Zhongming Zhao 1; Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston; Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston; Abstract Numerous genetic and epigenetic datasets have been generated for the study of complex diseases, including neurodegenerative disease. However, analysis of such data is often hampered by outlier samples, which subsequently affects the extraction of the true biological signals involved in the disease. To address this critical issue, we developed a novel framework for identifying methylation signatures using consecutive adaptation of a well-known outlier detection algorithm, density-based clustering of applications with reducing noise (DBSCAN), followed by hierarchical clustering. We applied the framework to two representative neurodegenerative diseases, Alzheimer’s disease (AD) and Down syndrome (DS), using DNA methylation datasets from public sources (Gene Expression Omnibus, GEO accession ID: GSE74486). We first applied the DBSCAN algorithm to eliminate outliers, and then used the Limma statistical method to determine differentially methylated genes. Next, a hierarchical clustering technique was applied to detect gene modules. Our analysis identified a methylation signature comprising 21 genes for AD and a methylation signature comprising 89 genes for DS. Our evaluation indicated that these two signatures could lead to high classification accuracy values (92% and 70%) for these two diseases. In summary, this framework will be useful to better detect outlier-free genetic and epigenetic signatures in various complex diseases and their developmental stages.; https://doi.org/10.1038/s41598-020-78463-3 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Saurav Mallik Zhongming Zhao |
title |
Detecting methylation signatures in neurodegenerative disease by density-based clustering of applications with reducing noise |
publisher |
Nature Publishing Group |
series |
Scientific Reports |
issn |
2045-2322 |
publishDate |
2020-12-01 |
description |
Abstract Numerous genetic and epigenetic datasets have been generated for the study of complex diseases, including neurodegenerative disease. However, analysis of such data is often hampered by outlier samples, which subsequently affects the extraction of the true biological signals involved in the disease. To address this critical issue, we developed a novel framework for identifying methylation signatures using consecutive adaptation of a well-known outlier detection algorithm, density-based clustering of applications with reducing noise (DBSCAN), followed by hierarchical clustering. We applied the framework to two representative neurodegenerative diseases, Alzheimer’s disease (AD) and Down syndrome (DS), using DNA methylation datasets from public sources (Gene Expression Omnibus, GEO accession ID: GSE74486). We first applied the DBSCAN algorithm to eliminate outliers, and then used the Limma statistical method to determine differentially methylated genes. Next, a hierarchical clustering technique was applied to detect gene modules. Our analysis identified a methylation signature comprising 21 genes for AD and a methylation signature comprising 89 genes for DS. Our evaluation indicated that these two signatures could lead to high classification accuracy values (92% and 70%) for these two diseases. In summary, this framework will be useful to better detect outlier-free genetic and epigenetic signatures in various complex diseases and their developmental stages. |
url |
https://doi.org/10.1038/s41598-020-78463-3 |
_version_ |
1724376529072291840 |
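
The abstract's first step, removing outlier samples with DBSCAN before any differential analysis, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `dbscan_noise`, the parameter values `eps` and `min_samples`, and the toy two-feature samples are all assumptions made for illustration. The sketch applies only the standard DBSCAN noise rule (a point is noise if it is neither a core point nor within `eps` of one) and omits the later Limma and hierarchical clustering steps.

```python
from math import dist


def dbscan_noise(points, eps, min_samples):
    """Return indices of points that the DBSCAN noise rule would flag."""
    n = len(points)
    # Each neighborhood includes the point itself, as in the usual definition.
    neighbors = [
        [j for j in range(n) if dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    # Core points have at least min_samples points within eps.
    core = {i for i in range(n) if len(neighbors[i]) >= min_samples}
    # A non-core point near a core point is a border point, not noise.
    return [
        i for i in range(n)
        if i not in core and not any(j in core for j in neighbors[i])
    ]


# Hypothetical toy data: each tuple is one sample described by two
# methylation-like values in [0, 1]; the last sample is far from the rest.
samples = [
    (0.10, 0.20),
    (0.12, 0.21),
    (0.11, 0.19),
    (0.09, 0.22),
    (0.90, 0.95),
]

noise = dbscan_noise(samples, eps=0.05, min_samples=3)
kept = [s for i, s in enumerate(samples) if i not in noise]
print("noise indices:", noise)
print("samples kept:", len(kept))
```

In practice the choice of `eps` and `min_samples` determines which samples are discarded, so they would need to be tuned to the dataset rather than taken from this toy example.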