Optimizing parameter algorithms to improve the performance of cancer and disease classification

Bibliographic Details
Main Authors: Zong-Wei Huang, 黃琮暐
Other Authors: Austin H. Chen
Format: Others
Language: zh-TW
Published: 2010
Online Access: http://ndltd.ncl.edu.tw/handle/26717791368071896815
Description
Summary: Master's thesis === Tzu Chi University === Institute of Medical Informatics === 99 === Owing to changes in lifestyle and environment, many new cancers have emerged over the past ten years. To provide better treatment, it is important to understand the genes relevant to these cancers. Parkinson's disease is overtaking cancer as a leading cause of death as the world's elderly population grows, and the disorder can also affect patients younger than 40. Although the threat of Parkinson's disease is increasing, understanding of the disease remains far from sufficient; we therefore propose several algorithms to better classify cancer and Parkinson's disease patients. In past years, several methods have been developed for classifying cancers and diseases. Most of these methods do not consider the effect of classifier parameters on classifier performance; in other words, they lack a consistent approach to selecting those parameters. We propose a novel approach that automatically searches for the optimal parameters by combining classifiers with parameter-optimization algorithms. In this work we develop four new classifiers by separately combining two parameter-optimization algorithms with two traditional classifiers. We call these four methods Genetic Algorithm-Random Forests (GA-RF), Genetic Algorithm-Support Vector Machine (GA-SVM), Nested-Random Forests (Nested-RF), and Nested-Support Vector Machine (Nested-SVM). Our experiments use five cancer datasets (brain tumor, colon cancer, DLBCL, leukemia, prostate tumor) and one disease dataset (Parkinson's). The results show that GA-SVM has the best classification performance of the four methods. In addition, Nested-SVM often outperforms the traditional SVM. More importantly, both GA-SVM and Nested-SVM significantly increase accuracy, by more than 20%, when applied to the Parkinson's disease dataset.
Accuracy reaches 94% with GA-SVM and 93% with Nested-SVM. Thus, parameter-optimization algorithms can improve the performance of cancer and disease classification, and this approach has the potential to become the standard for setting classifier parameters.
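
The summary does not spell out how the genetic algorithm is coupled to the classifier, so the following is only a minimal sketch of the general idea, not the thesis's implementation. It assumes two real-valued parameters (for an SVM these might be log2 C and log2 gamma, which is an assumption), uniform crossover, Gaussian mutation, tournament selection, and elitism. The names `ga_search` and `toy_fitness` are hypothetical, and `toy_fitness` is a synthetic stand-in for what would in practice be the cross-validated accuracy of the classifier trained with the candidate parameters.

```python
import random


def ga_search(fitness, bounds, pop_size=20, generations=30,
              mutation_rate=0.2, seed=0):
    """Maximise fitness(params) inside box bounds with a simple genetic algorithm."""
    rng = random.Random(seed)

    def random_individual():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    def crossover(a, b):
        # Uniform crossover: each gene is copied from one parent at random.
        return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]

    def mutate(individual):
        # Gaussian perturbation of some genes, clipped back into the bounds.
        child = []
        for value, (lo, hi) in zip(individual, bounds):
            if rng.random() < mutation_rate:
                value += rng.gauss(0.0, 0.1 * (hi - lo))
            child.append(min(hi, max(lo, value)))
        return child

    def tournament(scored):
        # Pick two distinct individuals and keep the fitter one.
        first, second = rng.sample(scored, 2)
        return first[1] if first[0] >= second[0] else second[1]

    population = [random_individual() for _ in range(pop_size)]
    best, best_score = None, float("-inf")
    for generation in range(generations):
        scored = [(fitness(ind), ind) for ind in population]
        for score, ind in scored:
            if score > best_score:
                best, best_score = ind, score
        if generation == generations - 1:
            break
        # Elitism: the best individual so far survives unchanged.
        children = [mutate(crossover(tournament(scored), tournament(scored)))
                    for _ in range(pop_size - 1)]
        population = [best] + children
    return best, best_score


def toy_fitness(params):
    # Synthetic stand-in for cross-validated accuracy (assumption):
    # a smooth surface peaking at log2 C = 3, log2 gamma = -5.
    log2_c, log2_gamma = params
    return 1.0 - 0.01 * ((log2_c - 3.0) ** 2 + (log2_gamma + 5.0) ** 2)


# Hypothetical search ranges for (log2 C, log2 gamma).
BOUNDS = [(-5.0, 15.0), (-15.0, 3.0)]

if __name__ == "__main__":
    best_params, best_value = ga_search(toy_fitness, BOUNDS)
    print(best_params, best_value)
```

To apply the same loop to a real dataset, `toy_fitness` would be replaced by a function that trains the chosen classifier with the candidate parameters and returns its cross-validated accuracy. The search itself does not depend on which classifier is used, which is consistent with the summary pairing one optimizer with both Random Forests and SVM.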