Classification and Clustering Algorithm Application for Prediction of Tablet Numbers: Case Study Diabetes Disease
Introduction: With diabetes increasingly prevalent, predicting the daily number of tablets such as glibenclamide and metformin helps doctors determine how many tablets a patient needs and prevents the side effects of drug misuse. Diabetic patients' need for medication is also critical. So, in this paper we...
Main Authors: | Maryam Ashoori, Vajihe NajiMoghadam, Somayeh Alizadeh, Mahsa Safi |
---|---|
Format: | Article |
Language: | fas |
Published: | Vesnu Publications, 2013-10-01 |
Series: | مدیریت اطلاعات سلامت (Health Information Management) |
Subjects: | Diabetes; Decision Tree; Classification; Clustering; Dunn Index |
Online Access: | http://him.mui.ac.ir/index.php/him/article/view/783 |
id |
doaj-399845d76ea14448a0b0b211d6ca0c19 |
---|---|
record_format |
Article |
spelling |
doaj-399845d76ea14448a0b0b211d6ca0c19; 2020-11-25T00:25:36Z; fas; Vesnu Publications; مدیریت اطلاعات سلامت; ISSN 1735-7853, 1735-9813; 2013-10-01; 105739749541
Classification and Clustering Algorithm Application for Prediction of Tablet Numbers: Case Study Diabetes Disease
Maryam Ashoori (MSc student, Information Technology Engineering, K. N. Toosi University of Technology, Tehran, Iran); Vajihe NajiMoghadam (MSc student, Information Technology Engineering, K. N. Toosi University of Technology, Tehran, Iran); Somayeh Alizadeh (Assistant Professor, Industrial Engineering, K. N. Toosi University of Technology, Tehran, Iran); Mahsa Safi (MSc student, Industrial Engineering, K. N. Toosi University of Technology, Tehran, Iran)
Keywords: Diabetes; Decision Tree; Classification; Clustering; Dunn Index
http://him.mui.ac.ir/index.php/him/article/view/783 |
collection |
DOAJ |
language |
fas |
format |
Article |
sources |
DOAJ |
author |
Maryam Ashoori Vajihe NajiMoghadam Somayeh Alizadeh Mahsa Safi |
spellingShingle |
Maryam Ashoori Vajihe NajiMoghadam Somayeh Alizadeh Mahsa Safi Classification and Clustering Algorithm Application for Prediction of Tablet Numbers: Case Study Diabetes Disease مدیریت اطلاعات سلامت Diabetes Decision Tree Classification Clustering Dunn Index |
author_facet |
Maryam Ashoori Vajihe NajiMoghadam Somayeh Alizadeh Mahsa Safi |
author_sort |
Maryam Ashoori |
title |
Classification and Clustering Algorithm Application for Prediction of Tablet Numbers: Case Study Diabetes Disease |
title_short |
Classification and Clustering Algorithm Application for Prediction of Tablet Numbers: Case Study Diabetes Disease |
title_full |
Classification and Clustering Algorithm Application for Prediction of Tablet Numbers: Case Study Diabetes Disease |
title_fullStr |
Classification and Clustering Algorithm Application for Prediction of Tablet Numbers: Case Study Diabetes Disease |
title_full_unstemmed |
Classification and Clustering Algorithm Application for Prediction of Tablet Numbers: Case Study Diabetes Disease |
title_sort |
classification and clustering algorithm application for prediction of tablet numbers: case study diabetes disease |
publisher |
Vesnu Publications |
series |
مدیریت اطلاعات سلامت (Health Information Management) |
issn |
1735-7853 1735-9813 |
publishDate |
2013-10-01 |
description |
Introduction: With diabetes increasingly prevalent, predicting the daily number of tablets such as glibenclamide and metformin helps doctors determine how many tablets a patient needs and prevents the side effects of drug misuse. Diabetic patients' need for medication is also critical. In this paper, we therefore used data mining techniques to predict the daily number of tablets taken for diabetes. In the evaluation step, the algorithm that produces the better results is chosen.
Methods: This was a descriptive cross-sectional study. Census sampling was used, covering all 2783 patients recorded from March 2008 to May 2012. The study population consisted of data from the Yazd Diabetes Research Center, affiliated with Shahid Sadoughi University of Medical Sciences, Yazd; the center confirmed the contents of the records. In the data preprocessing step, records with missing values in some fields were removed on the experts' advice, reducing the number of patients to 740. The data were obtained directly from the Yazd Diabetes Research Center, and the validity of the data-gathering method was confirmed by the supervisor and specialists. Reliability was assessed by comparing the two algorithms' accuracy on the test dataset. Clementine 12.0 was used for data analysis and for applying the data mining algorithms. Two algorithms, CHAID and C5.0, were applied to the data, and the accuracy of the generated models was measured. Finally, a clustering method was used to confirm the accuracy.
Results: Running the C5.0 and CHAID algorithms on the dataset produced models with accuracies of 45.52 and 28.38 percent, respectively. The higher accuracy of the C5.0 model indicates that this algorithm performs better at predicting the number of tablets. On the other hand, the low overall accuracy of the C5.0 model shows that some values were not classified into their correct class; comparing the actual and predicted numbers of tablets during model generation reveals the reasons for each model's low accuracy. The cause lay in predicted values that had low accuracy and confidence. Clustering the results of the C5.0 run placed tablet counts of 3, 5, 6 and 7, with predicted-value accuracies of 46.83, 36.36, 55.71 and 15 percent, respectively, in one cluster, because cases with low accuracy or few samples fall into the same cluster. Clustering the results of the CHAID run likewise placed a tablet count of 5, with a predicted-value accuracy of 20.93 percent, in a cluster.
Conclusion: This paper is the product of research by the Data Mining group of K. N. Toosi University of Technology and was completed through teamwork. A diabetes center needs an organized approach to predicting the daily number of tablets and to preventing the side effects of misjudging that number. To prevent the dangerous effects of diabetes, it is advisable to develop novel approaches with the help of expert consultants and with computerized technologies, the internet, and analytical software.
Keywords: Diabetes; Decision Tree; Classification; Clustering; Dunn Index |
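The preprocessing and evaluation steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' Clementine 12.0 workflow: the function names `drop_incomplete` and `accuracy_report`, the field names, and the toy values are all hypothetical. "Per-class accuracy" is taken here to mean the share of cases with a given actual tablet count that were predicted correctly, which is an assumption, since the abstract does not define the measure.

```python
from collections import Counter


def drop_incomplete(records, required):
    """Keep only records with a non-missing value in every required field."""
    return [
        r for r in records
        if all(r.get(field) not in (None, "") for field in required)
    ]


def accuracy_report(actual, predicted):
    """Return (overall accuracy, per-class accuracy), both in percent.

    Per-class accuracy is computed per actual class (an assumption; the
    abstract does not state how its per-value accuracies were defined).
    """
    if not actual or len(actual) != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equal in length")
    totals = Counter(actual)
    hits = Counter(a for a, p in zip(actual, predicted) if a == p)
    overall = 100.0 * sum(hits.values()) / len(actual)
    per_class = {c: 100.0 * hits[c] / n for c, n in totals.items()}
    return overall, per_class


# Hypothetical toy data, not taken from the study.
patients = [
    {"age": 50, "tablets": 2},
    {"age": None, "tablets": 3},  # missing value: removed in preprocessing
    {"tablets": 1},               # missing field: removed in preprocessing
]
complete = drop_incomplete(patients, required=("age", "tablets"))

actual_tablets = [1, 1, 2, 2, 2, 3]
predicted_tablets = [1, 2, 2, 2, 3, 3]
overall, per_class = accuracy_report(actual_tablets, predicted_tablets)
```

Tablet counts whose per-class accuracy is low, or that have few samples, are the ones the abstract reports as ending up in the same cluster.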
topic |
Diabetes Decision Tree Classification Clustering Dunn Index |
url |
http://him.mui.ac.ir/index.php/him/article/view/783 |
work_keys_str_mv |
AT maryamashoori classificationandclusteringalgorithmapplicationforpredictionoftabletnumberscasestudydiabetesdisease AT vajihenajimoghadam classificationandclusteringalgorithmapplicationforpredictionoftabletnumberscasestudydiabetesdisease AT somayehalizadeh classificationandclusteringalgorithmapplicationforpredictionoftabletnumberscasestudydiabetesdisease AT mahsasafi classificationandclusteringalgorithmapplicationforpredictionoftabletnumberscasestudydiabetesdisease |
_version_ |
1725348032165183488 |