Application of Machine Learning Techniques for Clinical Predictive Modeling: A Cross-Sectional Study on Nonalcoholic Fatty Liver Disease in China

Background. Nonalcoholic fatty liver disease (NAFLD) is one of the most common chronic liver diseases. Machine learning techniques were introduced to evaluate the optimal predictive clinical model of NAFLD. Methods. A cross-sectional study was performed with subjects who attended a health examinatio...

Bibliographic Details
Main Authors:	Han Ma, Cheng-fu Xu, Zhe Shen, Chao-hui Yu, You-ming Li
Format:	Article
Language:	English
Published:	Hindawi Limited 2018-01-01
Series:	BioMed Research International
Online Access:	http://dx.doi.org/10.1155/2018/4304376

id	doaj-9cb23270aba24c8fa01ac0795000a9ec
record_format	Article
spelling	doaj-9cb23270aba24c8fa01ac0795000a9ec; 2020-11-24T20:57:13Z; eng; Hindawi Limited; BioMed Research International; ISSN 2314-6133, 2314-6141; 2018-01-01; 2018; 10.1155/2018/4304376; 4304376. Authors: Han Ma, Cheng-fu Xu, Zhe Shen, Chao-hui Yu, You-ming Li. Affiliation (all authors): Department of Gastroenterology, The First Affiliated Hospital, College of Medicine, Zhejiang University, Hangzhou 310003, Zhejiang Province, China.
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Han Ma Cheng-fu Xu Zhe Shen Chao-hui Yu You-ming Li
title	Application of Machine Learning Techniques for Clinical Predictive Modeling: A Cross-Sectional Study on Nonalcoholic Fatty Liver Disease in China
publisher	Hindawi Limited
series	BioMed Research International
issn	2314-6133 2314-6141
publishDate	2018-01-01
description	Background. Nonalcoholic fatty liver disease (NAFLD) is one of the most common chronic liver diseases. Machine learning techniques were applied to identify the optimal clinical predictive model for NAFLD. Methods. A cross-sectional study was performed with subjects who attended a health examination at the First Affiliated Hospital, Zhejiang University. Questionnaires, laboratory tests, physical examinations, and liver ultrasonography were employed. Machine learning techniques were then implemented using the open-source software Weka. The tasks included feature selection and classification. Feature selection was used to build a screening model by removing redundant features. Classification was used to build a prediction model, which was evaluated by the F-measure. Eleven state-of-the-art machine learning techniques were investigated. Results. Among the 10,508 enrolled subjects, 2,522 (24%) met the diagnostic criteria for NAFLD. Based on a set of statistical tests, BMI, triglycerides, gamma-glutamyl transpeptidase (γGT), serum alanine aminotransferase (ALT), and uric acid were the top 5 features contributing to NAFLD. Classification used 10-fold cross-validation. Of the 11 techniques, the Bayesian network model performed best, achieving accuracy, specificity, sensitivity, and F-measure scores of up to 83%, 0.878, 0.675, and 0.655, respectively. Compared with logistic regression, the Bayesian network model improved the F-measure score by 9.17%. Conclusion. Novel machine learning techniques may have screening and predictive value for NAFLD.
url	http://dx.doi.org/10.1155/2018/4304376
_version_	1716788426359963648
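
The description above reports accuracy, specificity, sensitivity, and F-measure for the best-performing classifier. As a minimal sketch of how these four scores relate to a binary confusion matrix, the Python below computes them from raw counts. The names `BinaryMetrics` and `binary_metrics` and the counts in the usage example are hypothetical, chosen only for illustration; they are not the study's data, and the study itself used Weka rather than this code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinaryMetrics:
    accuracy: float
    sensitivity: float  # recall on the positive (disease) class
    specificity: float  # recall on the negative class
    precision: float
    f_measure: float    # harmonic mean of precision and sensitivity


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> BinaryMetrics:
    """Compute standard screening metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")

    def ratio(num: float, den: float) -> float:
        # Return 0.0 instead of dividing by zero when a class is absent.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)
    specificity = ratio(tn, tn + fp)
    precision = ratio(tp, tp + fp)
    f_measure = ratio(2 * precision * sensitivity, precision + sensitivity)
    return BinaryMetrics(
        accuracy=(tp + tn) / total,
        sensitivity=sensitivity,
        specificity=specificity,
        precision=precision,
        f_measure=f_measure,
    )


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not taken from the study).
    print(binary_metrics(tp=60, fp=20, tn=180, fn=40))
```

Note that the F-measure does not depend on the true-negative count, so on a dataset where the positive class is a minority, accuracy and specificity can be high while the F-measure stays moderate.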

Similar Items