Random forest approach for determining risk prediction and predictive factors of type 2 diabetes: large-scale health check-up data in Japan

Introduction Early intervention in type 2 diabetes can prevent exacerbation of insulin resistance. More effective interventions can be implemented by early and precise prediction of the change in glycated haemoglobin A1c (HbA1c). Artificial intelligence (AI), which has been introduced into various m...

Bibliographic Details
Main Authors: Zentaro Yamagata, Hiroshi Yokomichi, Tadao Ooka, Hisashi Johno, Kazunori Nakamoto, Yoshioki Yoda
Format: Article
Language: English
Published: BMJ Publishing Group
Series: BMJ Nutrition, Prevention & Health
Online Access: https://nutrition.bmj.com/content/early/2021/03/09/bmjnph-2020-000200.full
id doaj-8bdcf9a8cf0e462ca7fc347cfbefe919
record_format Article
spelling doaj-8bdcf9a8cf0e462ca7fc347cfbefe919; 2021-03-12T15:30:03Z; eng; BMJ Publishing Group; BMJ Nutrition, Prevention & Health; ISSN 2516-5542; DOI 10.1136/bmjnph-2020-000200
affiliations Zentaro Yamagata: Department of Health Sciences, University of Yamanashi, Chuo, Yamanashi, Japan
Hiroshi Yokomichi: Department of Health Sciences, University of Yamanashi, Chuo, Yamanashi, Japan
Tadao Ooka: Department of Health Sciences, University of Yamanashi, Chuo, Yamanashi, Japan
Hisashi Johno: Department of Radiology, University of Yamanashi, Chuo, Yamanashi, Japan
Kazunori Nakamoto: Center for Medical Education and Sciences, University of Yamanashi, Chuo, Yamanashi, Japan
Yoshioki Yoda: Yamanashi Koseiren Health Care Center, Kofu, Yamanashi, Japan
collection DOAJ
sources DOAJ
issn 2516-5542
description Introduction: Early intervention in type 2 diabetes can prevent exacerbation of insulin resistance. More effective interventions can be implemented by early and precise prediction of the change in glycated haemoglobin A1c (HbA1c). Artificial intelligence (AI), which has been introduced into various medical fields, may be useful in predicting changes in HbA1c. However, the inability to explain the predictive factors has been a problem in the use of deep learning, the leading AI technology. Therefore, we applied a highly interpretable AI method, random forest (RF), to large-scale health check-up data and examined whether there was an advantage over a conventional prediction model.

Research design and methods: This study included a cumulative total of 42 908 subjects not receiving treatment for diabetes with an HbA1c <6.5%. The objective variable was the change in HbA1c in the next year. Each prediction model was created with 51 health-check items and part of their change values from the previous year. We used two analytical methods to compare the predictive powers: RF as a new model and multivariate logistic regression (MLR) as a conventional model. We also created models excluding the change values to determine whether they positively affected the predictions. In addition, variable importance was calculated in the RF analysis, and standard regression coefficients were calculated in the MLR analysis to identify the predictors.

Results: The RF model showed a higher predictive power for the change in HbA1c than MLR in all models. The RF model including change values showed the highest predictive power. In the RF prediction model, HbA1c, fasting blood glucose, body weight, alkaline phosphatase and platelet count were factors with high predictive power.

Conclusions: Correct use of the RF method may enable highly accurate risk prediction for the change in HbA1c and may allow the identification of new diabetes risk predictors.
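
The data layout implied by the methods can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the record layout, the field names (`subject_id`, `year`, `on_treatment`, `hba1c`, `weight`) and the `build_rows` helper are hypothetical, only two stand-in items are shown instead of the 51 check-up items, and reading "cumulative total" as one row per subject-year is an interpretation. Each eligible subject-year becomes one row whose predictors are the current values, optionally with their change from the previous year, and whose target is the change in HbA1c by the next year.

```python
from collections import defaultdict

HBA1C_CUTOFF = 6.5  # percent; eligibility threshold stated in the abstract
ITEMS = ("hba1c", "weight")  # hypothetical stand-ins for the 51 check-up items


def build_rows(records, use_change_values=True):
    """Turn yearly check-up records into one model row per eligible subject-year.

    A row for year t needs a record at t + 1 (for the target) and, when
    change values are used, a record at t - 1 as well.
    """
    by_subject = defaultdict(dict)
    for rec in records:
        by_subject[rec["subject_id"]][rec["year"]] = rec

    rows = []
    for subject_id, years in sorted(by_subject.items()):
        for year in sorted(years):
            cur = years[year]
            prev = years.get(year - 1)
            nxt = years.get(year + 1)
            if nxt is None or (use_change_values and prev is None):
                continue
            # Assumption: eligibility is judged at the baseline year t.
            if cur["on_treatment"] or cur["hba1c"] >= HBA1C_CUTOFF:
                continue
            features = {item: cur[item] for item in ITEMS}
            if use_change_values:
                for item in ITEMS:
                    features[f"delta_{item}"] = round(cur[item] - prev[item], 2)
            # Objective variable: change in HbA1c from year t to year t + 1.
            target = round(nxt["hba1c"] - cur["hba1c"], 2)
            rows.append({"subject_id": subject_id, "year": year,
                         "features": features, "target": target})
    return rows


# Made-up records for illustration only; these are not study data.
records = [
    {"subject_id": 1, "year": 2015, "on_treatment": False, "hba1c": 5.5, "weight": 70.0},
    {"subject_id": 1, "year": 2016, "on_treatment": False, "hba1c": 5.7, "weight": 72.0},
    {"subject_id": 1, "year": 2017, "on_treatment": False, "hba1c": 6.0, "weight": 73.0},
    {"subject_id": 2, "year": 2016, "on_treatment": False, "hba1c": 6.8, "weight": 80.0},
    {"subject_id": 2, "year": 2017, "on_treatment": False, "hba1c": 6.9, "weight": 81.0},
]

rows_with_change = build_rows(records, use_change_values=True)
rows_without_change = build_rows(records, use_change_values=False)
```

Building the rows once with and once without the change values mirrors the comparison described in the abstract; either set of rows could then be passed to a random forest or a regression model of the reader's choice.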