Risk Prediction of Chronic Disease Using Machine Learning and Rebalancing Methods

Bibliographic Details
Main Author: Li, Cheng (Author)
Other Authors: Mirza, Farhaan (Contributor)
Format: Others
Published: Auckland University of Technology, 2021-08-22T22:40:32Z.
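Subjects: Machine learning; Chronic disease; Rebalancing methods; Class imbalance
Online Access: Get fulltext (http://hdl.handle.net/10292/14429)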
LEADER 02040 am a22002053u 4500
001 14429
042 |a dc 
100 1 0 |a Li, Cheng  |e author 
100 1 0 |a Mirza, Farhaan  |e contributor 
245 0 0 |a Risk Prediction of Chronic Disease Using Machine Learning and Rebalancing Methods 
260 |b Auckland University of Technology,   |c 2021-08-22T22:40:32Z. 
520 |a Chronic diseases damage vital organs such as the brain, heart, and liver. They can easily cause disability and reduce the ability to work and quality of life, and their extremely high medical costs increase the economic burden on society and families. An effective response is to build predictive models that assess the risk of chronic disease. Researchers have conducted several such projects, but challenges remain. A key challenge is class imbalance in chronic disease data. When trained on imbalanced chronic disease data, classification algorithms concentrate on the majority class (non-disease), while the minority class (disease) is largely neglected. To accurately identify both disease and non-disease individuals, this research proposes a multi-combination method for chronic disease data sets with imbalanced classes. The researcher conducted an in-depth analysis of the impact of three rebalancing methods, Synthetic Minority Oversampling Technique (SMOTE), Resampling, and SpreadSubsampling, on classifier performance, using six classifiers and four data sets. Experimental results show that Random Forest (RF) combined with the Resample rebalancing method (RF-RESAMPLE) is the best classifier for the selected data sets, achieving 94.8770%. The method can help doctors identify chronic diseases and then diagnose and treat patients early, increasing their chances of survival. 
540 |a OpenAccess 
546 |a en 
650 0 4 |a Machine learning 
650 0 4 |a Chronic disease 
650 0 4 |a Rebalancing methods 
650 0 4 |a Class imbalance 
655 7 |a Thesis 
856 |z Get fulltext  |u http://hdl.handle.net/10292/14429
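
Illustrative note on the abstract: the class-imbalance problem and the idea of rebalancing can be sketched with a small example. The code below is a minimal, generic sketch of random oversampling with replacement, written with the Python standard library only. It is not the thesis's implementation of SMOTE, Resample, or SpreadSubsample, and the function name `random_oversample`, the toy records, and the 8-to-2 class split are all hypothetical.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows (sampled with replacement) until
    every class has as many rows as the largest class.

    Hypothetical helper for illustration; not taken from the thesis.
    """
    rng = random.Random(seed)

    # Group the rows by their class label.
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)

    # Every class is topped up to the size of the largest class.
    target = max(len(group) for group in by_class.values())

    out_rows, out_labels = [], []
    for label, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for row in group + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical toy data: 8 "non-disease" records and 2 "disease" records.
rows = [[i] for i in range(10)]
labels = ["non-disease"] * 8 + ["disease"] * 2

balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(labels))           # class counts before rebalancing
print(Counter(balanced_labels))  # class counts after rebalancing
```

In practice, a rebalancing step like this is normally applied to the training split only, so that duplicated minority records do not leak into the data used for evaluation.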