Predictive utilities of lipid traits, lipoprotein subfractions and other risk factors for incident diabetes: a machine learning approach in the Diabetes Prevention Program

Introduction Although various lipid and non-lipid analytes measured by nuclear magnetic resonance (NMR) spectroscopy have been associated with type 2 diabetes, a structured comparison of the ability of NMR-derived biomarkers and standard lipids to predict individual diabetes risk has not been undert...


Bibliographic Details
Main Authors: Samuel Dagogo-Jack, Carlos Lorenzo, Kieren J Mather, Marinella Temprosa, Tibor V Varga, Jinxi Liu, Ronald B Goldberg, Guannan Chen, Xavier Pi-Sunyer
Format: Article
Language: English
Published: BMJ Publishing Group 2021-08-01
Series: BMJ Open Diabetes Research & Care
Online Access: https://drc.bmj.com/content/9/1/e001953.full
id doaj-6e1d050c124d4a529a5724c96d670a0f
record_format Article
spelling doaj-6e1d050c124d4a529a5724c96d670a0f
Timestamp: 2021-08-10T10:31:05Z
Language: eng
Publisher: BMJ Publishing Group
Journal: BMJ Open Diabetes Research & Care
ISSN: 2052-4897
Published: 2021-08-01
Volume: 9
Issue: 1
DOI: 10.1136/bmjdrc-2020-001953
Authors and affiliations:
0 Samuel Dagogo-Jack: University of Tennessee Health Science Center, Memphis, Tennessee, USA
1 Carlos Lorenzo: The University of Texas Health Science Center at San Antonio, San Antonio, Texas, USA
2 Kieren J Mather: Indiana University School of Medicine, Indianapolis, Indiana, USA
3 Marinella Temprosa: Biostatistics Center and Department of Biostatistics and Bioinformatics, Milken Institute School of Public Health, George Washington University, Rockville, Maryland, USA
4 Tibor V Varga: Section of Epidemiology, Department of Public Health, University of Copenhagen, Copenhagen, Denmark
5 Jinxi Liu: Biostatistics Center and Department of Biostatistics and Bioinformatics, Milken Institute School of Public Health, George Washington University, Rockville, Maryland, USA
6 Ronald B Goldberg: Department of Medicine, University of Miami, Miami, Florida, USA
7 Guannan Chen: Biostatistics Center and Department of Biostatistics and Bioinformatics, Milken Institute School of Public Health, George Washington University, Rockville, Maryland, USA
8 Xavier Pi-Sunyer: Columbia University Medical Center, New York City, New York, USA
collection DOAJ
language English
format Article
sources DOAJ
issn 2052-4897
description
Introduction: Although various lipid and non-lipid analytes measured by nuclear magnetic resonance (NMR) spectroscopy have been associated with type 2 diabetes, a structured comparison of the ability of NMR-derived biomarkers and standard lipids to predict individual diabetes risk has not been undertaken in larger studies nor among individuals at high risk of diabetes.
Research design and methods: Cumulative discriminative utilities of various groups of biomarkers including NMR lipoproteins, related non-lipid biomarkers, standard lipids, and demographic and glycemic traits were compared for short-term (3.2 years) and long-term (15 years) diabetes development in the Diabetes Prevention Program, a multiethnic, placebo-controlled, randomized controlled trial of individuals with pre-diabetes in the USA (N=2590). Logistic regression, Cox proportional hazards model and six different hyperparameter-tuned machine learning algorithms were compared. The Matthews Correlation Coefficient (MCC) was used as the primary measure of discriminative utility.
Results: Models with baseline NMR analytes and their changes did not improve the discriminative utility of simpler models including standard lipids or demographic and glycemic traits. Across all algorithms, models with baseline 2-hour glucose performed the best (max MCC=0.36). Sophisticated machine learning algorithms performed similarly to logistic regression in this study.
Conclusions: NMR lipoproteins and related non-lipid biomarkers were associated but did not augment discrimination of diabetes risk beyond traditional diabetes risk factors except for 2-hour glucose. Machine learning algorithms provided no meaningful improvement for discrimination compared with logistic regression, which suggests a lack of influential latent interactions among the analytes assessed in this study.
Trial registration number: Diabetes Prevention Program: NCT00004992; Diabetes Prevention Program Outcomes Study: NCT00038727.
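For readers unfamiliar with the primary metric named in the abstract, the following is a minimal sketch of how a Matthews Correlation Coefficient can be computed from a binary confusion matrix. It uses the standard textbook definition; the function name `mcc` and the counts in the usage line are hypothetical illustrations and are not taken from the study or its code.

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews Correlation Coefficient from binary confusion-matrix counts.

    Ranges from -1 (total disagreement) through 0 (no better than chance)
    to +1 (perfect prediction).
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Common convention: report 0 when any marginal total is empty.
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Hypothetical counts, for illustration only (not study data).
example = mcc(tp=6, tn=3, fp=1, fn=2)
```

Because the numerator uses all four cells of the confusion matrix, the MCC stays informative when the outcome classes are imbalanced. On this scale, the abstract's maximum reported value of 0.36 lies between chance-level (0) and perfect (1) discrimination.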