Support Vector Machines: Classification with Coding and Regression for Gene Selection

Bibliographic Details
Main Authors: Pei-Chun Chen, 陳佩君
Other Authors: 陳素雲
Format: Others
Language: en_US
Published: 2008
Online Access: http://ndltd.ncl.edu.tw/handle/00778695534101224969
id ndltd-TW-096NTU05544002
record_format oai_dc
spelling ndltd-TW-096NTU05544002 2015-10-13T11:31:39Z http://ndltd.ncl.edu.tw/handle/00778695534101224969 Support Vector Machines: Classification with Coding and Regression for Gene Selection 支撐向量機制:以編碼處理分類問題並利用迴歸模式進行基因選取 Pei-Chun Chen 陳佩君 PhD, National Taiwan University, Graduate Institute of Epidemiology, 96 陳素雲 蕭朱杏 2008 degree thesis ; thesis 84 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description PhD === National Taiwan University === Graduate Institute of Epidemiology === 96 === This thesis has two major themes: multiclass support vector machines and support vector regression for gene selection. In the first part, we propose a regression approach to multiclass support vector classification. We bring several existing coding schemes into support vector classification by coding the class labels as multivariate responses. Regressing these multivariate responses on kernelized input data extracts a low-dimensional feature subspace for discrimination. We unify the coding schemes by showing that they are equivalent, in the sense that they lead to the same low-dimensional discriminant feature subspace. Classification is then carried out in this subspace using a linear discriminant algorithm, which can be any reasonable choice. Combining the regression-based extraction of a low-dimensional discriminant subspace with a user-specified linear algorithm yields a simple yet powerful toolkit for multiclass support vector classification. Issues of encoding and decoding and the notion of equivalence of codes are discussed. Experimental results, covering prediction ability and CPU time, show that our approach is a competitive alternative for the multiclass support vector machine problem. In the second part, we propose a support vector regression approach to gene selection and use the selected genes for disease classification. Current gene selection methods based on microarray data give every subject equal weight with respect to the disease of interest. However, tissues collected from different patients can come from different disease stages and may differ in their strength of association with the disease. To reflect this, the proposed method accounts for subject variation by assigning different weights to subjects. The weights are calculated via support vector regression. Significant genes are then selected based on the cumulative sum of weighted expressions. The proposed gene selection procedure is illustrated and evaluated on acute leukemia and colon cancer data. Its results and performance are compared with those of four other approaches in terms of classification accuracy.
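The first part of the abstract (code the labels, regress on kernelized inputs, classify linearly in the fitted-response space) can be illustrated with a minimal sketch. It is not the thesis's implementation. Several things here are assumptions made for illustration: ridge-penalized kernel least squares stands in for the support vector regression step, the indicator (one-hot) code is just one possible coding scheme, nearest-centroid is one possible linear discriminant rule, and the function names, toy data, and the `gamma` and `lam` values are hypothetical.

```python
import math

def rbf(a, b, gamma):
    """Gaussian kernel between two points given as sequences of floats."""
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(a, b)))

def solve(A, B):
    """Solve A X = B (B may have several columns) by Gauss-Jordan elimination."""
    n = len(A)
    M = [list(ra) + list(rb) for ra, rb in zip(A, B)]
    for col in range(n):
        # Partial pivoting for numerical stability.
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        d = M[col][col]
        M[col] = [v / d for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [v - f * w for v, w in zip(M[r], M[col])]
    return [row[n:] for row in M]

def project(x, X, C, gamma):
    """Map x to its fitted multivariate response (the discriminant features)."""
    k = [rbf(x, xi, gamma) for xi in X]
    return [sum(ki * ci[j] for ki, ci in zip(k, C)) for j in range(len(C[0]))]

def fit(X, labels, gamma=1.0, lam=1e-2):
    """Assumed stand-in: kernel ridge regression of coded labels, then centroids."""
    classes = sorted(set(labels))
    # Encoding: indicator code, one response column per class.
    Y = [[1.0 if lab == c else 0.0 for c in classes] for lab in labels]
    n = len(X)
    # Regularized kernel matrix K + lam * I.
    A = [[rbf(X[i], X[j], gamma) + (lam if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    C = solve(A, Y)  # regression coefficients, one column per code column
    # Class centroids in the extracted low-dimensional feature space.
    feats = [project(x, X, C, gamma) for x in X]
    centroids = {}
    for c in classes:
        rows = [f for f, lab in zip(feats, labels) if lab == c]
        centroids[c] = [sum(col) / len(rows) for col in zip(*rows)]
    return {"X": X, "C": C, "gamma": gamma, "centroids": centroids}

def predict(model, x):
    """Decoding: assign x to the class with the nearest centroid."""
    f = project(x, model["X"], model["C"], model["gamma"])
    def dist2(c):
        return sum((u - v) ** 2 for u, v in zip(f, model["centroids"][c]))
    return min(model["centroids"], key=dist2)

# Toy data (hypothetical): three well-separated clusters in the plane.
X_train = [(0.0, 0.0), (0.2, 0.1), (-0.1, 0.2),
           (5.0, 5.0), (5.1, 4.8), (4.9, 5.2),
           (0.0, 5.0), (0.2, 5.1), (-0.2, 4.9)]
y_train = ["a"] * 3 + ["b"] * 3 + ["c"] * 3
model = fit(X_train, y_train)
print(predict(model, (0.1, 0.1)))
```

The regression step reduces each input to as many features as there are code columns, so the final classifier only ever works in that small space. Any other linear rule could replace the nearest-centroid step without touching `fit`'s regression part.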
author2 陳素雲
author_facet 陳素雲
Pei-Chun Chen
陳佩君
author Pei-Chun Chen
陳佩君
spellingShingle Pei-Chun Chen
陳佩君
Support Vector Machines: Classification with Coding and Regression for Gene Selection
author_sort Pei-Chun Chen
title Support Vector Machines: Classification with Coding and Regression for Gene Selection
publishDate 2008
url http://ndltd.ncl.edu.tw/handle/00778695534101224969