Machine learning selected smoking-associated DNA methylation signatures that predict HIV prognosis and mortality

Abstract Background The effects of tobacco smoking on epigenome-wide methylation signatures in white blood cells (WBCs) collected from persons living with HIV may have important implications for their immune-related outcomes, including frailty and mortality. The application of a machine learning app...

Bibliographic Details
Main Authors: Xinyu Zhang, Ying Hu, Bradley E. Aouizerat, Gang Peng, Vincent C. Marconi, Michael J. Corley, Todd Hulgan, Kendall J. Bryant, Hongyu Zhao, John H. Krystal, Amy C. Justice, Ke Xu
Format: Article
Language: English
Published: BMC 2018-12-01
Series: Clinical Epigenetics
Subjects: DNA methylation; Ensemble machine learning; HIV frailty; Mortality; Tobacco smoking
Online Access: http://link.springer.com/article/10.1186/s13148-018-0591-z
id doaj-09e62a3489c14179a6378405288c4aed
record_format Article
spelling doaj-09e62a3489c14179a6378405288c4aed
2020-11-25T02:32:04Z
eng
BMC
Clinical Epigenetics
1868-7075
1868-7083
2018-12-01
10
1
1
15
10.1186/s13148-018-0591-z
Machine learning selected smoking-associated DNA methylation signatures that predict HIV prognosis and mortality
Xinyu Zhang (0): Department of Psychiatry, Yale School of Medicine
Ying Hu (1): Center for Biomedical Bioinformatics, National Cancer Institute
Bradley E. Aouizerat (2): Bluestone Center for Clinical Research, New York University
Gang Peng (3): Department of Biostatistics, Yale School of Public Health
Vincent C. Marconi (4): Division of Infectious Diseases, Emory University School of Medicine
Michael J. Corley (5): Department of Native Hawaiian Health, John A. Burns School of Medicine, University of Hawaii
Todd Hulgan (6): School of Medicine, Vanderbilt University
Kendall J. Bryant (7): National Institute on Alcohol Abuse and Alcoholism
Hongyu Zhao (8): Department of Biostatistics, Yale School of Public Health
John H. Krystal (9): Department of Psychiatry, Yale School of Medicine
Amy C. Justice (10): VA Connecticut Healthcare System
Ke Xu (11): Department of Psychiatry, Yale School of Medicine
http://link.springer.com/article/10.1186/s13148-018-0591-z
DNA methylation
Ensemble machine learning
HIV frailty
Mortality
Tobacco smoking
collection DOAJ
language English
format Article
sources DOAJ
author Xinyu Zhang
Ying Hu
Bradley E. Aouizerat
Gang Peng
Vincent C. Marconi
Michael J. Corley
Todd Hulgan
Kendall J. Bryant
Hongyu Zhao
John H. Krystal
Amy C. Justice
Ke Xu
author_sort Xinyu Zhang
title Machine learning selected smoking-associated DNA methylation signatures that predict HIV prognosis and mortality
publisher BMC
series Clinical Epigenetics
issn 1868-7075
1868-7083
publishDate 2018-12-01
description Abstract. Background: The effects of tobacco smoking on epigenome-wide methylation signatures in white blood cells (WBCs) collected from persons living with HIV may have important implications for their immune-related outcomes, including frailty and mortality. Applying machine learning to the analysis of CpG methylation across the epigenome enables the selection of phenotypically relevant features from high-dimensional data. Using this approach, we report that a set of smoking-associated methylated CpGs predicts HIV prognosis and mortality in an HIV-positive veteran population. Results: We first identified 137 epigenome-wide significant CpGs for smoking in WBCs from 1137 HIV-positive individuals (p < 1.70E−07). To examine whether smoking-associated CpGs were predictive of HIV frailty and mortality, we applied ensemble-based machine learning to build a model in a training sample using 408,583 CpGs. A set of 698 CpGs was selected; it predicted high HIV frailty in a testing sample (area under the curve (AUC) = 0.73, 95% CI 0.63 to 0.83), and this result was replicated in an independent sample (AUC = 0.78, 95% CI 0.73 to 0.83). We further found that a DNA methylation index constructed from the 698 CpGs was associated with 5-year survival (HR = 1.46, 95% CI 1.06 to 2.02, p = 0.02). Interestingly, the 698 CpGs, located on 445 genes, were enriched in the integrin signaling pathway (p = 9.55E−05, false discovery rate = 0.036), which is responsible for the regulation of the cell cycle, differentiation, and adhesion. Conclusion: We demonstrated that smoking-associated DNA methylation features in white blood cells predict HIV infection-related clinical outcomes in a population living with HIV.
topic DNA methylation
Ensemble machine learning
HIV frailty
Mortality
Tobacco smoking
url http://link.springer.com/article/10.1186/s13148-018-0591-z