A gene selection method based on risk gene set using microarray data

Master's === National Cheng Kung University === Institute of Information Management === 96 === Gene selection and clustering analysis are commonly applied to analyze microarray data. Gene selection reduces the time complexity of building classifiers, improves classification accuracy, and identifies genes that are significant for diseases. Clust...

Bibliographic Details
Main Authors: Ding-Qun Chen, 陳丁群
Other Authors: Tzu-Tsung Wong
Format: Others
Language: zh-TW
Published: 2008
Online Access: http://ndltd.ncl.edu.tw/handle/22654151203934568705
id ndltd-TW-096NCKU5396019
record_format oai_dc
spelling ndltd-TW-096NCKU5396019 2016-05-09T04:14:18Z http://ndltd.ncl.edu.tw/handle/22654151203934568705 A gene selection method based on risk gene set using microarray data 以致病基因集為先驗資訊的基因選取方法之研究 Ding-Qun Chen 陳丁群 Master's National Cheng Kung University Institute of Information Management 96 Tzu-Tsung Wong 翁慈宗 2008 thesis 78 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's === National Cheng Kung University === Institute of Information Management === 96 === Gene selection and clustering analysis are commonly applied to analyze microarray data. Gene selection reduces the time complexity of building classifiers, improves classification accuracy, and identifies genes that are significant for diseases. Clustering analysis can discover co-expressed genes, which are likely to share the same biological function. Many gene selection methods have been proposed, but most of them do not consider the risk genes already reported in biological studies. Our proposed method uses the risk gene set as prior information for gene selection and is divided into two stages. In the first stage, we collect the risk genes from biological reports as the initial candidate gene subset and remove the genes that are highly correlated with the risk genes. In the second stage, we apply quality threshold clustering (QT clustering) to the genes remaining after the first stage and add the significant genes selected at every stage of QT clustering to the candidate gene subset. The final candidate gene subset is then fed to two machine learning classifiers, KNN and SVM, to evaluate its performance. The approach is tested on four well-known gene expression data sets for breast cancer and prostate cancer. The experimental results show that, in prediction accuracy, our gene selection method outperforms or performs comparably to the methods proposed in previous research.
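The two-stage procedure in the description can be sketched roughly as follows. This is only an illustrative sketch, not the thesis implementation: the abstract does not state the correlation measure, the thresholds, or how "significant" genes are chosen, so the absolute Pearson correlation, the distance `1 - |r|`, the Welch t-statistic, the one-gene-per-cluster rule, and the names `select_genes`, `qt_clustering`, `corr_threshold`, and `max_diameter` are all assumptions made for this example. It also assumes two classes labeled 0 and 1 and a non-empty risk gene list.

```python
import numpy as np

def abs_corr(X):
    """Absolute Pearson correlation between gene columns of X (samples x genes)."""
    return np.abs(np.corrcoef(X, rowvar=False))

def t_score(X, y):
    """Assumed significance measure: absolute Welch t-statistic per gene."""
    a, b = X[y == 0], X[y == 1]
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    return np.abs(a.mean(axis=0) - b.mean(axis=0)) / (se + 1e-12)

def qt_clustering(dist, genes, max_diameter):
    """Quality threshold clustering over `genes` with a precomputed distance matrix."""
    remaining = list(genes)
    clusters = []
    while remaining:
        best = []
        for seed in remaining:
            cluster = [seed]
            pool = [g for g in remaining if g != seed]
            while pool:
                # Largest distance from each pooled gene to the current members.
                diam = [dist[g, cluster].max() for g in pool]
                i = int(np.argmin(diam))
                if diam[i] > max_diameter:
                    break  # adding any gene would exceed the quality threshold
                cluster.append(pool.pop(i))
            if len(cluster) > len(best):
                best = cluster
        clusters.append(best)  # keep the largest candidate cluster
        remaining = [g for g in remaining if g not in best]
    return clusters

def select_genes(X, y, risk_idx, corr_threshold=0.8, max_diameter=0.5):
    """Stage 1: start from risk genes, drop genes highly correlated with them.
    Stage 2: QT-cluster the rest and add the top-scoring gene of each cluster."""
    corr = abs_corr(X)
    risk = list(risk_idx)
    candidates = list(risk)
    remaining = [g for g in range(X.shape[1])
                 if g not in risk and corr[g, risk].max() < corr_threshold]
    scores = t_score(X, y)
    for cluster in qt_clustering(1.0 - corr, remaining, max_diameter):
        candidates.append(max(cluster, key=lambda g: scores[g]))
    return candidates

# Toy data (hypothetical): 40 samples, 6 genes, gene 0 is the known risk gene.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 20)
X = rng.normal(size=(40, 6))
X[:, 1] = X[:, 0] + 0.01 * rng.normal(size=40)  # near-copy of the risk gene
X[:, 3] = 2.0 * X[:, 2] + 1.0                   # co-expressed with gene 2
selected = select_genes(X, y, risk_idx=[0])
print(selected)
```

The resulting `selected` column indices could then be used to subset the expression matrix before training a KNN or SVM classifier, which is the evaluation step the description mentions.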
author2 Tzu-Tsung Wong
author_facet Tzu-Tsung Wong
Ding-Qun Chen
陳丁群
author Ding-Qun Chen
陳丁群
title A gene selection method based on risk gene set using microarray data
publishDate 2008
url http://ndltd.ncl.edu.tw/handle/22654151203934568705