Support Vector Machines: Classification with Coding and Regression for Gene Selection
Ph.D. === National Taiwan University === Graduate Institute of Epidemiology === 96 === This thesis contains two major themes. One is multiclass support vector machines and the other is support vector regression for gene selection. In the first part, we propose a regression approach for multiclass support vector classification. We introduce so...
Main Authors: | Pei-Chun Chen, 陳佩君 |
---|---|
Other Authors: | 陳素雲 |
Format: | Others |
Language: | en_US |
Published: | 2008 |
Online Access: | http://ndltd.ncl.edu.tw/handle/00778695534101224969 |
id |
ndltd-TW-096NTU05544002 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-096NTU05544002 2015-10-13T11:31:39Z http://ndltd.ncl.edu.tw/handle/00778695534101224969 Support Vector Machines: Classification with Coding and Regression for Gene Selection 支撐向量機制:以編碼處理分類問題並利用迴歸模式進行基因選取 Pei-Chun Chen 陳佩君 Ph.D. National Taiwan University Graduate Institute of Epidemiology 96 This thesis contains two major themes. One is multiclass support vector machines and the other is support vector regression for gene selection. In the first part, we propose a regression approach for multiclass support vector classification. We introduce some existing coding schemes into support vector classification by coding the class labels into multivariate responses. Regression of these multivariate responses on kernelized input data is used to extract a low-dimensional feature subspace for discrimination. We unify these coding schemes by showing that they are equivalent in the sense of leading to the same low-dimensional discriminant feature subspace. Classification is then carried out in this low-dimensional subspace using a linear discriminant algorithm, which can be any reasonable choice. The regression approach for extracting a low-dimensional discriminant subspace, combined with a user-specified linear algorithm, forms a simple yet powerful toolkit for multiclass support vector classification. Issues of encoding and decoding and the notion of equivalence of codes are discussed. Experimental results, including prediction ability and CPU time, show that our approach is a competent alternative for the multiclass support vector machine problem. In the second part, we propose a support vector regression approach for gene selection and use the selected genes for disease classification. Current gene selection methods based on microarray data give every subject equal weight with respect to the disease of interest. However, tissues collected from different patients may come from different disease stages and may differ in strength of association with the disease. To reflect this, our proposed method accounts for subject variation by assigning different weights to subjects. The weights are calculated via support vector regression. Significant genes are then selected based on the cumulative sum of weighted expressions. The proposed gene selection procedure is illustrated and evaluated using the acute leukemia and colon cancer data sets. The results and performance are compared with four other approaches in terms of classification accuracy. 陳素雲 蕭朱杏 2008 學位論文 ; thesis 84 en_US |
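The first part of the abstract (code the class labels as multivariate responses, regress them on kernelized inputs, then classify in the resulting low-dimensional space with a linear rule) can be illustrated with a rough sketch. Everything specific here is an assumption and not taken from the thesis: indicator (one-hot) coding stands in for the coding schemes, ridge-type regularized least squares stands in for the support vector regression step, a nearest-centroid rule is the user-chosen linear algorithm, and the function names, `gamma`, and `lam` are hypothetical.

```python
import math

def rbf_kernel(a, b, gamma=1.0):
    """Gaussian kernel between two points given as lists of floats."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))

def one_hot(labels, classes):
    """Indicator coding (an assumed scheme): class k maps to the k-th unit vector."""
    return [[1.0 if y == c else 0.0 for c in classes] for y in labels]

def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [v - f * p for v, p in zip(M[r], M[col])]
    return [[M[i][n + j] / M[i][i] for j in range(len(B[0]))] for i in range(n)]

def project(X, coef, x, gamma):
    """Map a point x to its fitted multivariate response (the discriminant features)."""
    k = [rbf_kernel(xi, x, gamma) for xi in X]
    return [sum(ki * row[j] for ki, row in zip(k, coef)) for j in range(len(coef[0]))]

def fit(X, labels, lam=0.1, gamma=1.0):
    """Regress coded labels on kernelized inputs, then store class centroids."""
    classes = sorted(set(labels))
    Y = one_hot(labels, classes)
    n = len(X)
    # Regularized least squares stand-in (not SVR): solve (K + lam * I) coef = Y.
    K = [[rbf_kernel(X[i], X[j], gamma) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    coef = solve(K, Y)
    feats = [project(X, coef, x, gamma) for x in X]
    centroids = {}
    for c in classes:
        rows = [f for f, y in zip(feats, labels) if y == c]
        centroids[c] = [sum(col) / len(rows) for col in zip(*rows)]
    return {"X": X, "coef": coef, "gamma": gamma, "centroids": centroids}

def predict(model, x):
    """Nearest-centroid decision in the fitted response space."""
    f = project(model["X"], model["coef"], x, model["gamma"])
    return min(model["centroids"],
               key=lambda c: sum((a - b) ** 2 for a, b in zip(f, model["centroids"][c])))

# Toy data: three well-separated clusters (made up for illustration).
X = [[0.0, 0.0], [0.2, 0.1], [0.1, 0.3],
     [3.0, 3.0], [3.1, 2.8], [2.9, 3.2],
     [0.0, 3.0], [0.2, 3.1], [0.1, 2.8]]
labels = ["a"] * 3 + ["b"] * 3 + ["c"] * 3
model = fit(X, labels)
```

Calling `predict(model, x)` on a new point returns one of the class labels. Swapping in another coding or another linear classifier only changes `one_hot` or `predict`, which is the kind of modularity the abstract describes.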
collection |
NDLTD |
language |
en_US |
format |
Others |
sources |
NDLTD |
description |
Ph.D. === National Taiwan University === Graduate Institute of Epidemiology === 96 === This thesis contains two major themes. One is multiclass support vector machines and the other is support vector regression for gene selection. In the first part, we propose a regression approach for multiclass support vector classification. We introduce some existing coding schemes into support vector classification by coding the class labels into multivariate responses. Regression of these multivariate responses on kernelized input data is used to extract a low-dimensional feature subspace for discrimination. We unify these coding schemes by showing that they are equivalent in the sense of leading to the same low-dimensional discriminant feature subspace. Classification is then carried out in this low-dimensional subspace using a linear discriminant algorithm, which can be any reasonable choice. The regression approach for extracting a low-dimensional discriminant subspace, combined with a user-specified linear algorithm, forms a simple yet powerful toolkit for multiclass support vector classification. Issues of encoding and decoding and the notion of equivalence of codes are discussed. Experimental results, including prediction ability and CPU time, show that our approach is a competent alternative for the multiclass support vector machine problem.
In the second part, we propose a support vector regression approach for gene selection and use the selected genes for disease classification. Current gene selection methods based on microarray data give every subject equal weight with respect to the disease of interest. However, tissues collected from different patients may come from different disease stages and may differ in strength of association with the disease. To reflect this, our proposed method accounts for subject variation by assigning different weights to subjects. The weights are calculated via support vector regression. Significant genes are then selected based on the cumulative sum of weighted expressions. The proposed gene selection procedure is illustrated and evaluated using the acute leukemia and colon cancer data sets. The results and performance are compared with four other approaches in terms of classification accuracy.
|
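The gene scoring step in the second part of the abstract can be sketched as follows. The abstract does not give the exact formula, so this is only one plausible reading: in a linear support vector regression the fitted weight vector is a sum of training vectors weighted by their dual coefficients, so a per-gene sum of subject-weighted expressions is a natural candidate score. The subject weights below are made-up numbers standing in for the output of an SVR fit, and `gene_scores`, `select_genes`, and the toy expression matrix are hypothetical.

```python
def gene_scores(expr, subject_weights):
    """Per-gene sum over subjects of (subject weight * expression).

    expr[i][g] is the expression of gene g in subject i.
    """
    n_genes = len(expr[0])
    return [sum(w * row[g] for w, row in zip(subject_weights, expr))
            for g in range(n_genes)]

def select_genes(expr, subject_weights, gene_names, top=2):
    """Rank genes by absolute weighted sum and keep the `top` highest."""
    scores = gene_scores(expr, subject_weights)
    ranked = sorted(zip(gene_names, scores), key=lambda t: abs(t[1]), reverse=True)
    return ranked[:top]

# Toy data (assumed): 4 subjects x 3 genes; the first two subjects are cases.
expr = [[5.0, 1.0, 2.0],
        [4.0, 1.2, 2.1],
        [1.0, 1.1, 2.0],
        [0.5, 0.9, 1.9]]
# Hypothetical signed subject weights, e.g. what an SVR fit might assign.
weights = [0.8, 0.3, -0.5, -0.6]
selected = select_genes(expr, weights, ["G1", "G2", "G3"], top=1)
```

With equal-magnitude weights this reduces to an unweighted contrast between cases and controls. Unequal weights let subjects with a stronger association to the disease contribute more to each gene's score, which is the motivation stated in the abstract.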
author2 |
陳素雲 |
author_facet |
陳素雲 Pei-Chun Chen 陳佩君 |
author |
Pei-Chun Chen 陳佩君 |
spellingShingle |
Pei-Chun Chen 陳佩君 Support Vector Machines: Classification with Coding and Regression for Gene Selection |
author_sort |
Pei-Chun Chen |
title |
Support Vector Machines: Classification with Coding and Regression for Gene Selection |
title_short |
Support Vector Machines: Classification with Coding and Regression for Gene Selection |
title_full |
Support Vector Machines: Classification with Coding and Regression for Gene Selection |
title_fullStr |
Support Vector Machines: Classification with Coding and Regression for Gene Selection |
title_full_unstemmed |
Support Vector Machines: Classification with Coding and Regression for Gene Selection |
title_sort |
support vector machines: classification with coding and regression for gene selection |
publishDate |
2008 |
url |
http://ndltd.ncl.edu.tw/handle/00778695534101224969 |
work_keys_str_mv |
AT peichunchen supportvectormachinesclassificationwithcodingandregressionforgeneselection AT chénpèijūn supportvectormachinesclassificationwithcodingandregressionforgeneselection AT peichunchen zhīchēngxiàngliàngjīzhìyǐbiānmǎchùlǐfēnlèiwèntíbìnglìyònghuíguīmóshìjìnxíngjīyīnxuǎnqǔ AT chénpèijūn zhīchēngxiàngliàngjīzhìyǐbiānmǎchùlǐfēnlèiwèntíbìnglìyònghuíguīmóshìjìnxíngjīyīnxuǎnqǔ |
_version_ |
1716845654446178304 |