Prediction model construction of mouse stem cell pluripotency using CpG and non-CpG DNA methylation markers

Bibliographic Details
Published in: BMC Bioinformatics
Main Authors: Soobok Joe, Hojung Nam
Format: Article
Language: English
Published: BMC 2020-05-01
Online Access: http://link.springer.com/article/10.1186/s12859-020-3448-3
Abstract

Background: Genome-wide studies of DNA methylation across the epigenetic landscape provide insights into the heterogeneity of pluripotent embryonic stem cells (ESCs). As they differentiate into embryonic somatic and germ cells, ESCs exhibit varying degrees of pluripotency, and the epigenetic changes that occur during this process have emerged as important factors in explaining stem cell pluripotency.

Results: Using paired scBS-seq and scRNA-seq data from mice, we constructed a machine learning model that predicts the degree of pluripotency of mouse ESCs. Because the biological activities of non-CpG markers have yet to be clarified, we tested the predictive power of CpG markers, non-CpG markers, and a combination of the two in the model. Through performance evaluation with both internal and external validation, we found that the model using both CpG and non-CpG markers predicted the pluripotency of ESCs with the highest performance (AUC 0.956 on the external test). The prediction model consisted of 16 CpG and 33 non-CpG markers. The CpG markers and most of the non-CpG markers targeted depletions of methylation and were indicative of cell pluripotency, whereas only a few non-CpG markers reflected accumulations of methylation. Additionally, we confirmed that pluripotency differs between individual developmental stages, such as E3.5 and E6.5, as well as between induced mouse pluripotent stem cells (iPSCs) and somatic cells.

Conclusions: In this study, we investigated CpG and non-CpG methylation in relation to mouse stem cell pluripotency and developed a model based on these markers that successfully predicts the pluripotency of mouse ESCs.
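The abstract reports model performance as AUC (area under the ROC curve). As a reader aid only, the following minimal sketch shows how an AUC value can be computed from binary labels and predicted scores using the pairwise-ranking definition. It is not the authors' code; the function name `roc_auc` and the toy labels and scores are hypothetical and are not data from the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive sample receives the higher score; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = pluripotent, 0 = not pluripotent.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 1.0 means every positive sample is ranked above every negative one, while 0.5 corresponds to random ranking.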
ISSN: 1471-2105
Source: Directory of Open Access Journals (DOAJ)
Record ID: doaj-art-17c79bdbbdf843bdb152eb2145dee049
Author affiliations: Soobok Joe and Hojung Nam, School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST)
Citation: BMC Bioinformatics, vol. 21, no. 1, pp. 1-12, 2020-05-01
DOI: 10.1186/s12859-020-3448-3
Subjects: DNA-methylation; Stem cell pluripotency; Non-CpG methylation