Construction genetic algorithm prediction model in breast cancer / liver cancer

Doctoral === National Yang-Ming University === Institute of Public Health === 96 === In recent years, data mining has attracted great interest from the information industry, mainly because large amounts of existing data are widely available and there is an urgent need to turn these data into useful information and knowledge....

Bibliographic Details
Main Authors: Wei-Pin Chang, 張偉斌
Other Authors: Der-Ming Liou
Format: Others
Language: en_US
Published: 2008
Online Access: http://ndltd.ncl.edu.tw/handle/39719888564623775416
id ndltd-TW-096YM005058003
record_format oai_dc
spelling ndltd-TW-096YM005058003 2015-10-13T13:51:29Z http://ndltd.ncl.edu.tw/handle/39719888564623775416 Construction genetic algorithm prediction model in breast cancer / liver cancer 建構基因演算法預測模式於乳癌/肝癌之研究 Wei-Pin Chang 張偉斌 Doctoral National Yang-Ming University Institute of Public Health 96 Der-Ming Liou 劉德明 2008 thesis 97 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Doctoral === National Yang-Ming University === Institute of Public Health === 96 === In recent years, data mining has attracted great interest from the information industry, mainly because large amounts of existing data are widely available and there is an urgent need to turn these data into useful information and knowledge. The information and knowledge obtained can be used to improve efficiency in many fields, and applications of data mining in the medical field are gradually increasing. According to records from the Department of Health, breast cancer and liver cancer were among the ten leading causes of death in the Taiwanese population. The two diseases share some characteristics: risk increases with age, and both draw on the same pool of risk factors in the living environment. The central role of data mining is to use artificial intelligence and statistical methods to extract meaningful information from a maze of variables and data. The present study investigated the application of artificial intelligence and data mining techniques to prediction models for breast cancer and liver cancer. An artificial neural network, a decision tree, logistic regression, and a genetic algorithm were compared, with the accuracy and positive predictive value of each algorithm used as the evaluation indicators. The data comprised 699 records from breast cancer patients and 729 records from liver cancer patients. For the breast cancer data, 9 predictor variables and 1 outcome variable were included in the analysis, followed by 10-fold cross-validation.
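The 10-fold cross-validation mentioned above can be illustrated with a minimal sketch. This is not the thesis's own procedure: the function name `k_fold_indices`, the shuffling, and the seed are assumptions made for illustration, and only the record count (699) comes from the abstract.

```python
import random

def k_fold_indices(n_records, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation.

    Hypothetical helper: the shuffle and seed are illustrative choices.
    """
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    # Striding by k spreads the remainder, so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

# 699 is the number of breast cancer records reported in the abstract.
splits = list(k_fold_indices(699, k=10))
fold_sizes = [len(test) for _, test in splits]
```

Each record appears in exactly one test fold, so every record is used once for evaluation and nine times for training.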
The results showed an accuracy of 0.9637 for the logistic regression model (sensitivity 0.9716, specificity 0.9482), 0.9435 for the decision tree model (sensitivity 0.9615, specificity 0.9105), 0.9502 for the neural network model (sensitivity 0.9628, specificity 0.9273), and 0.9878 for the genetic algorithm model (sensitivity 1, specificity 0.9802). The accuracy of the genetic algorithm was significantly higher than the average predicted accuracy of 0.9612. The logistic regression model scored higher than the neural network model, but the difference was not significant. The decision tree model had the lowest average predicted accuracy of the 4 predictive models, 0.9435. The standard deviation of the 10-fold cross-validation was rather unreliable. The liver cancer data, on the other hand, included 12 predictor variables and 1 outcome variable, again analyzed with 10-fold cross-validation. The results showed an accuracy of 0.7658 for the logistic regression model (sensitivity 0.7682, specificity 0.7630), 0.7636 for the decision tree model (sensitivity 0.7497, specificity 0.7793), 0.7760 for the neural network model (sensitivity 0.7875, specificity 0.7679), and 0.8072 for the genetic algorithm model (sensitivity 0.8444, specificity 0.0.763 [sic]). The accuracy of the genetic algorithm was significantly higher than the average predicted accuracy of 0.7684. The neural network model scored higher than the logistic regression and decision tree models, but the differences were not significant. The present study indicated that the genetic algorithm model yielded better results than the other data mining models for both the breast cancer and liver cancer patient data, in terms of overall classification accuracy and the expression and complexity of the classification rules.
The results showed that the genetic algorithm described in the present study produced accurate classifications of the breast cancer and liver cancer data, and that the classification rules it identified were more acceptable and comprehensible.
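The evaluation indicators named in the abstract (accuracy, sensitivity, specificity, positive predictive value) have standard definitions in terms of a binary confusion matrix. The sketch below applies those textbook formulas; the function name `classification_metrics` and the counts are hypothetical and are not taken from the thesis.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification indicators from confusion-matrix counts.

    Assumes every denominator is non-zero (at least one actual positive,
    one actual negative, and one predicted positive).
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,     # share of all records classified correctly
        "sensitivity": tp / (tp + fn),     # share of actual positives detected
        "specificity": tn / (tn + fp),     # share of actual negatives detected
        "ppv": tp / (tp + fp),             # share of predicted positives that are correct
    }

# Hypothetical counts, chosen only to illustrate the arithmetic.
metrics = classification_metrics(tp=90, fp=10, tn=80, fn=20)
```

With these illustrative counts, accuracy is 170/200 and positive predictive value is 90/100; in a 10-fold setting the indicators would typically be computed per fold and then averaged.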
author2 Der-Ming Liou
author Wei-Pin Chang
張偉斌
title Construction genetic algorithm prediction model in breast cancer / liver cancer
publishDate 2008
url http://ndltd.ncl.edu.tw/handle/39719888564623775416