A gene selection method based on risk gene set using microarray data
Master's === National Cheng Kung University === Institute of Information Management === 96 === Microarray data are usually analyzed with gene selection and clustering analysis. Gene selection reduces the time complexity of building classifiers, improves classification accuracy, and identifies genes that are significant for diseases. Clust...
Main Authors: | Ding-Qun Chen, 陳丁群 |
---|---|
Other Authors: | Tzu-Tsung Wong |
Format: | Others |
Language: | zh-TW |
Published: | 2008 |
Online Access: | http://ndltd.ncl.edu.tw/handle/22654151203934568705 |
id |
ndltd-TW-096NCKU5396019 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-096NCKU5396019 2016-05-09T04:14:18Z http://ndltd.ncl.edu.tw/handle/22654151203934568705 A gene selection method based on risk gene set using microarray data 以致病基因集為先驗資訊的基因選取方法之研究 Ding-Qun Chen 陳丁群 Master's National Cheng Kung University Institute of Information Management 96 Microarray data are usually analyzed with gene selection and clustering analysis. Gene selection reduces the time complexity of building classifiers, improves classification accuracy, and identifies genes that are significant for diseases. Clustering analysis can discover co-expressed genes, which are likely to share the same biological function. Many gene selection methods have been proposed, but most of them do not consider the risk genes already reported in biological studies. Our proposed method uses the risk gene set as prior information for gene selection and proceeds in two stages. In the first stage, we collect the risk genes from biological reports as the initial candidate gene subset and remove the genes that are highly correlated with the risk genes. In the second stage, we apply quality threshold clustering (QT clustering) to the genes remaining after the first stage and add the significant genes found at every stage of QT clustering to the candidate gene subset. The final candidate gene subset is then fed to two machine learning classifiers, KNN and SVM, to evaluate its performance. The approach is tested on four well-known gene expression data sets for breast cancer and prostate cancer. The experimental results show that our gene selection method matches or outperforms the methods proposed in previous research in prediction accuracy. Tzu-Tsung Wong 翁慈宗 2008 Degree thesis ; thesis 78 zh-TW |
collection |
NDLTD |
language |
zh-TW |
format |
Others |
sources |
NDLTD |
description |
Master's === National Cheng Kung University === Institute of Information Management === 96 === Microarray data are usually analyzed with gene selection and clustering analysis. Gene selection reduces the time complexity of building classifiers, improves classification accuracy, and identifies genes that are significant for diseases. Clustering analysis can discover co-expressed genes, which are likely to share the same biological function. Many gene selection methods have been proposed, but most of them do not consider the risk genes already reported in biological studies. Our proposed method uses the risk gene set as prior information for gene selection and proceeds in two stages. In the first stage, we collect the risk genes from biological reports as the initial candidate gene subset and remove the genes that are highly correlated with the risk genes. In the second stage, we apply quality threshold clustering (QT clustering) to the genes remaining after the first stage and add the significant genes found at every stage of QT clustering to the candidate gene subset. The final candidate gene subset is then fed to two machine learning classifiers, KNN and SVM, to evaluate its performance. The approach is tested on four well-known gene expression data sets for breast cancer and prostate cancer. The experimental results show that our gene selection method matches or outperforms the methods proposed in previous research in prediction accuracy. |
author2 |
Tzu-Tsung Wong |
author_facet |
Tzu-Tsung Wong Ding-Qun Chen 陳丁群 |
author |
Ding-Qun Chen 陳丁群 |
author_sort |
Ding-Qun Chen |
title |
A gene selection method based on risk gene set using microarray data |
publishDate |
2008 |
url |
http://ndltd.ncl.edu.tw/handle/22654151203934568705 |
_version_ |
1718263376769974272 |
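
The two-stage selection summarized in the description can be sketched roughly as follows. This is an illustrative outline under stated assumptions, not the thesis's implementation: the record does not give the correlation measure, the thresholds, or the criterion for a "significant" gene, so the sketch assumes absolute Pearson correlation, a distance of 1 - |r| for QT clustering, and a signal-to-noise score for picking one gene per cluster. All function names and parameters (`corr_threshold`, `max_diameter`, `min_size`) are hypothetical.

```python
import numpy as np


def abs_corr(X):
    """Absolute Pearson correlation between genes (columns of X)."""
    return np.abs(np.corrcoef(X, rowvar=False))


def drop_correlated(corr, risk, threshold):
    """Stage 1: keep non-risk genes that are not highly correlated with any risk gene."""
    risk_set = set(risk)
    return [g for g in range(corr.shape[0])
            if g not in risk_set and corr[g, risk].max() < threshold]


def qt_clusters(dist, genes, max_diameter):
    """Stage 2: greedy QT clustering of `genes` with a cluster diameter limit."""
    remaining, clusters = list(genes), []
    while remaining:
        best = []
        for seed in remaining:
            cluster = [seed]
            candidates = [g for g in remaining if g != seed]
            while candidates:
                # Farthest member distance = the diameter contribution of adding g.
                reach = [dist[g, cluster].max() for g in candidates]
                i = int(np.argmin(reach))
                if reach[i] > max_diameter:
                    break
                cluster.append(candidates.pop(i))
            if len(cluster) > len(best):
                best = cluster
        clusters.append(best)
        remaining = [g for g in remaining if g not in best]
    return clusters


def snr(X, y):
    """Signal-to-noise score per gene for binary labels (assumed significance criterion)."""
    a, b = X[y == 0], X[y == 1]
    return np.abs(a.mean(0) - b.mean(0)) / (a.std(0) + b.std(0) + 1e-12)


def select_genes(X, y, risk, corr_threshold=0.8, max_diameter=0.3, min_size=2):
    """Return sorted gene indices: the risk genes plus one top-scoring gene per cluster."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    corr = abs_corr(X)
    kept = drop_correlated(corr, list(risk), corr_threshold)
    score = snr(X, y)
    clusters = qt_clusters(1.0 - corr, kept, max_diameter)
    # Assumption: clusters smaller than min_size contribute no gene.
    picked = [max(c, key=lambda g: score[g]) for c in clusters if len(c) >= min_size]
    return sorted(set(risk) | set(picked))
```

The sketch expects `X` as a samples-by-genes array without constant columns, `y` as 0/1 class labels, and a non-empty list of risk gene column indices. The resulting column subset could then be passed to any KNN or SVM implementation for the accuracy evaluation the description mentions.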