Machine learning selected smoking-associated DNA methylation signatures that predict HIV prognosis and mortality
Abstract Background The effects of tobacco smoking on epigenome-wide methylation signatures in white blood cells (WBCs) collected from persons living with HIV may have important implications for their immune-related outcomes, including frailty and mortality. The application of a machine learning app...
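Main Authors: | Xinyu Zhang, Ying Hu, Bradley E. Aouizerat, Gang Peng, Vincent C. Marconi, Michael J. Corley, Todd Hulgan, Kendall J. Bryant, Hongyu Zhao, John H. Krystal, Amy C. Justice, Ke Xu |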
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2018-12-01 |
Series: | Clinical Epigenetics |
Subjects: | DNA methylation; Ensemble machine learning; HIV frailty; Mortality; Tobacco smoking |
Online Access: | http://link.springer.com/article/10.1186/s13148-018-0591-z |
id |
doaj-09e62a3489c14179a6378405288c4aed |
---|---|
record_format |
Article |
spelling |
doaj-09e62a3489c14179a6378405288c4aed 2020-11-25T02:32:04Z eng BMC Clinical Epigenetics 1868-7075 1868-7083 2018-12-01 101115 10.1186/s13148-018-0591-z
Machine learning selected smoking-associated DNA methylation signatures that predict HIV prognosis and mortality
Xinyu Zhang [0]; Ying Hu [1]; Bradley E. Aouizerat [2]; Gang Peng [3]; Vincent C. Marconi [4]; Michael J. Corley [5]; Todd Hulgan [6]; Kendall J. Bryant [7]; Hongyu Zhao [8]; John H. Krystal [9]; Amy C. Justice [10]; Ke Xu [11]
[0] Department of Psychiatry, Yale School of Medicine; [1] Center for Biomedical Bioinformatics, National Cancer Institute; [2] Bluestone Center for Clinical Research, New York University; [3] Department of Biostatistics, Yale School of Public Health; [4] Division of Infectious Diseases, Emory University School of Medicine; [5] Department of Native Hawaiian Health, John A. Burns School of Medicine, University of Hawaii; [6] School of Medicine, Vanderbilt University; [7] National Institute on Alcohol Abuse and Alcoholism; [8] Department of Biostatistics, Yale School of Public Health; [9] Department of Psychiatry, Yale School of Medicine; [10] VA Connecticut Healthcare System; [11] Department of Psychiatry, Yale School of Medicine
Abstract. Background: The effects of tobacco smoking on epigenome-wide methylation signatures in white blood cells (WBCs) collected from persons living with HIV may have important implications for their immune-related outcomes, including frailty and mortality. Applying a machine learning approach to the analysis of CpG methylation in the epigenome enables the selection of phenotypically relevant features from high-dimensional data. Using this approach, we report that a set of smoking-associated DNA-methylated CpGs predicts HIV prognosis and mortality in an HIV-positive veteran population. Results: We first identified 137 epigenome-wide significant CpGs for smoking in WBCs from 1137 HIV-positive individuals (p < 1.70E−07). To examine whether smoking-associated CpGs were predictive of HIV frailty and mortality, we applied ensemble-based machine learning to build a model in a training sample using 408,583 CpGs. A set of 698 CpGs was selected and was predictive of high HIV frailty in a testing sample (area under the curve [AUC] = 0.73, 95% CI 0.63 to 0.83); this result was replicated in an independent sample (AUC = 0.78, 95% CI 0.73 to 0.83). We further found that a DNA methylation index constructed from the 698 CpGs was associated with 5-year survival (HR = 1.46, 95% CI 1.06 to 2.02, p = 0.02). The 698 CpGs, located on 445 genes, were enriched in the integrin signaling pathway (p = 9.55E−05, false discovery rate = 0.036), which is responsible for the regulation of the cell cycle, differentiation, and adhesion. Conclusion: We demonstrated that smoking-associated DNA methylation features in white blood cells predict HIV infection-related clinical outcomes in a population living with HIV.
http://link.springer.com/article/10.1186/s13148-018-0591-z
DNA methylation; Ensemble machine learning; HIV frailty; Mortality; Tobacco smoking |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Xinyu Zhang, Ying Hu, Bradley E. Aouizerat, Gang Peng, Vincent C. Marconi, Michael J. Corley, Todd Hulgan, Kendall J. Bryant, Hongyu Zhao, John H. Krystal, Amy C. Justice, Ke Xu |
title |
Machine learning selected smoking-associated DNA methylation signatures that predict HIV prognosis and mortality |
publisher |
BMC |
series |
Clinical Epigenetics |
issn |
1868-7075 1868-7083 |
publishDate |
2018-12-01 |
description |
Abstract. Background: The effects of tobacco smoking on epigenome-wide methylation signatures in white blood cells (WBCs) collected from persons living with HIV may have important implications for their immune-related outcomes, including frailty and mortality. Applying a machine learning approach to the analysis of CpG methylation in the epigenome enables the selection of phenotypically relevant features from high-dimensional data. Using this approach, we report that a set of smoking-associated DNA-methylated CpGs predicts HIV prognosis and mortality in an HIV-positive veteran population. Results: We first identified 137 epigenome-wide significant CpGs for smoking in WBCs from 1137 HIV-positive individuals (p < 1.70E−07). To examine whether smoking-associated CpGs were predictive of HIV frailty and mortality, we applied ensemble-based machine learning to build a model in a training sample using 408,583 CpGs. A set of 698 CpGs was selected and was predictive of high HIV frailty in a testing sample (area under the curve [AUC] = 0.73, 95% CI 0.63 to 0.83); this result was replicated in an independent sample (AUC = 0.78, 95% CI 0.73 to 0.83). We further found that a DNA methylation index constructed from the 698 CpGs was associated with 5-year survival (HR = 1.46, 95% CI 1.06 to 2.02, p = 0.02). The 698 CpGs, located on 445 genes, were enriched in the integrin signaling pathway (p = 9.55E−05, false discovery rate = 0.036), which is responsible for the regulation of the cell cycle, differentiation, and adhesion. Conclusion: We demonstrated that smoking-associated DNA methylation features in white blood cells predict HIV infection-related clinical outcomes in a population living with HIV. |
topic |
DNA methylation; Ensemble machine learning; HIV frailty; Mortality; Tobacco smoking |
url |
http://link.springer.com/article/10.1186/s13148-018-0591-z |