A clustering method with a pre-specified number of clusters for gene selection

Master's === National Cheng Kung University === Institute of Information Management === 96 === Scientists use gene microarray (DNA microarray) data to study the differences among cells and tissues. They usually hope to find the root causes of a specific disease. Due to the special characteristics of microarray data, high dimension and small number of...


Bibliographic Details
Main Author: Che-Jen Chang (張哲仁)
Other Authors: Tzu-Tsung Wong
Format: Others
Language: zh-TW
Online Access: http://ndltd.ncl.edu.tw/handle/49338483517372265759
id ndltd-TW-096NCKU5396015
record_format oai_dc
spelling ndltd-TW-096NCKU5396015 2016-05-09T04:14:17Z http://ndltd.ncl.edu.tw/handle/49338483517372265759 A clustering method with a pre-specified number of clusters for gene selection 應用可自定群數的非監督式學習法於基因選取 (Applying an unsupervised learning method with a user-definable number of clusters to gene selection) Che-Jen Chang 張哲仁 Master's, National Cheng Kung University, Institute of Information Management, 96 Tzu-Tsung Wong 翁慈宗 Thesis (學位論文) 55 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's === National Cheng Kung University === Institute of Information Management === 96 === Scientists use gene microarray (DNA microarray) data to study the differences among cells and tissues, usually hoping to find the root causes of a specific disease. Because microarray data are high-dimensional and have few samples, overfitting generally occurs, and noisy data may be present. Gene selection therefore plays an important role in addressing these problems. Although many gene selection methods have been proposed in recent years, most of them select genes individually and may therefore suffer from gene collinearity and fail to consider gene combinations. To solve those problems, an integration approach that combines individual gene ranking methods with clustering methods has been proposed to select a proper gene subset for classification. However, because the number of clusters cannot be pre-specified as a clustering parameter in that approach, its computational efficiency and stability are problematic. This research proposes an effective clustering method with a pre-specified number of clusters. Tested on five microarray data sets, the method not only improves computational efficiency but also generates stable clustering results. In addition, when the clustering method is applied to gene selection, the resulting accuracies are close to those of the original integration approach.
author2 Tzu-Tsung Wong
author_facet Tzu-Tsung Wong
Che-Jen Chang
張哲仁
author Che-Jen Chang
張哲仁
author_sort Che-Jen Chang
title A clustering method with a pre-specified number of clusters for gene selection
url http://ndltd.ncl.edu.tw/handle/49338483517372265759
_version_ 1718263375354396672
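
The abstract describes combining an individual gene ranking with a clustering step whose number of clusters is fixed in advance. The record does not say which ranking score or clustering algorithm the thesis uses, so the following is only a minimal sketch under stated assumptions: a signal-to-noise ratio stands in for the ranking method, plain k-means stands in for the clustering method, and every function name and the toy data are hypothetical. It groups genes with similar expression profiles into `k` clusters and keeps the top-ranked gene from each cluster, which is one way to avoid picking several collinear genes.

```python
import math
import random


def snr_score(values, labels):
    """Signal-to-noise ratio of one gene across two classes (0 and 1).

    Assumption: this is a stand-in for an individual gene ranking method.
    """
    a = [v for v, y in zip(values, labels) if y == 0]
    b = [v for v, y in zip(values, labels) if y == 1]
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    sd_a = math.sqrt(sum((v - mean_a) ** 2 for v in a) / len(a))
    sd_b = math.sqrt(sum((v - mean_b) ** 2 for v in b) / len(b))
    # Small constant avoids division by zero for constant genes.
    return abs(mean_a - mean_b) / (sd_a + sd_b + 1e-9)


def nearest(row, centers):
    """Index of the center closest to row (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda c: sum((x - m) ** 2 for x, m in zip(row, centers[c])),
    )


def kmeans(rows, k, iterations=50, seed=0):
    """Plain k-means with a pre-specified k; returns one cluster index per row.

    Assumption: a stand-in for the clustering step, not the thesis's method.
    """
    rng = random.Random(seed)
    centers = [list(rows[i]) for i in rng.sample(range(len(rows)), k)]
    assign = None
    for _ in range(iterations):
        new_assign = [nearest(row, centers) for row in rows]
        if new_assign == assign:
            break  # assignments stopped changing
        assign = new_assign
        for c in range(k):
            members = [rows[i] for i, a in enumerate(assign) if a == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def select_genes(expr, labels, k):
    """Cluster genes into k groups and keep the best-scoring gene per group."""
    scores = [snr_score(gene, labels) for gene in expr]
    assign = kmeans(expr, k)
    best = {}
    for gene_index, cluster in enumerate(assign):
        if cluster not in best or scores[gene_index] > scores[best[cluster]]:
            best[cluster] = gene_index
    return sorted(best.values())


# Toy data (hypothetical): rows are genes, columns are samples.
# Genes 0 and 1 are nearly collinear, as are genes 2 and 3.
expr = [
    [1.0, 1.2, 5.0, 5.2],
    [1.1, 1.5, 5.1, 5.3],
    [9.0, 2.0, 9.1, 2.1],
    [9.3, 2.1, 9.0, 2.0],
]
labels = [0, 0, 1, 1]
selected = select_genes(expr, labels, k=2)
print(selected)
```

Because `k` is passed in directly, the size of the selected subset is known before clustering starts, which matches the motivation given in the abstract for a pre-specified number of clusters.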