Comparison of Two Output-Coding Strategies for Multi-Class Tumor Classification Using Gene Expression Data and Latent Variable Model as Binary Classifier

Multi-class cancer classification based on microarray data is described. A generalized output-coding scheme based on One Versus One (OVO) combined with a Latent Variable Model (LVM) is used. Results from the proposed OVO output-coding strategy are compared with the results obtained fr...

Bibliographic Details
Main Authors: Sandeep J. Joseph, Kelly R. Robbins, Wensheng Zhang, Romdhane Rekaya
Format: Article
Language: English
Published: SAGE Publishing 2010-03-01
Series: Cancer Informatics
Online Access: http://la-press.com/comparison-of-two-output-coding-strategies-for-multi-class-tumor-class-a1909
id doaj-878d5741922d4900babe600691097ae6
record_format Article
collection DOAJ
language English
format Article
sources DOAJ
publisher SAGE Publishing
series Cancer Informatics
issn 1176-9351
publishDate 2010-03-01
description Multi-class cancer classification based on microarray data is described. A generalized output-coding scheme based on One Versus One (OVO) combined with a Latent Variable Model (LVM) is used. Results from the proposed OVO output-coding strategy are compared with those obtained from the generalized One Versus All (OVA) method, and the efficiency of each for multi-class tumor classification is studied. The comparison used two microarray gene expression datasets: the Global Cancer Map (GCM) dataset and a brain cancer (BC) dataset. Primary feature selection was based on fold change and penalized t-statistics. Evaluation was conducted with varying numbers of features. The OVO coding strategy worked quite well with the BC data, while OVO and OVA results appeared similar for the GCM data. The choice of output-coding method for combining binary classifiers in multi-class tumor classification depends on the number of tumor types considered, the discrepancies between the tumor samples used for training, and the heterogeneity of expression within the cancer subtypes used as training data.
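The two output-coding strategies named in the description can be illustrated with a small sketch. It is not the authors' implementation: the binary learner here is a nearest-centroid rule standing in for the Latent Variable Model, and the function names (`train_binary`, `fit_ovo`, `fit_ova`, and so on) and the toy two-feature data are hypothetical. For k classes, OVO trains k(k-1)/2 pairwise classifiers and decodes by majority vote, while OVA trains k class-versus-rest classifiers and picks the class with the highest score.

```python
from collections import Counter
from itertools import combinations


def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_binary(pos, neg):
    """Stand-in binary classifier (nearest centroid), NOT the paper's LVM.

    Returns a scoring function; a positive score favours the 'pos' side.
    """
    cp, cn = centroid(pos), centroid(neg)
    return lambda x: sq_dist(x, cn) - sq_dist(x, cp)


def fit_ovo(X, y):
    """One Versus One: one binary model per unordered pair of classes."""
    models = {}
    for a, b in combinations(sorted(set(y)), 2):
        pos = [x for x, lab in zip(X, y) if lab == a]
        neg = [x for x, lab in zip(X, y) if lab == b]
        models[(a, b)] = train_binary(pos, neg)
    return models


def predict_ovo(models, x):
    """Decode OVO outputs by majority vote over all pairwise models."""
    votes = Counter(a if score(x) > 0 else b for (a, b), score in models.items())
    return votes.most_common(1)[0][0]


def fit_ova(X, y):
    """One Versus All: one binary model per class against all other classes."""
    models = {}
    for c in sorted(set(y)):
        pos = [x for x, lab in zip(X, y) if lab == c]
        neg = [x for x, lab in zip(X, y) if lab != c]
        models[c] = train_binary(pos, neg)
    return models


def predict_ova(models, x):
    """Decode OVA outputs by taking the class with the largest score."""
    return max(models, key=lambda c: models[c](x))


# Toy data: three made-up "tumor types" described by two made-up features.
X = [[0, 0], [0, 1], [5, 5], [5, 6], [10, 0], [10, 1]]
y = ["A", "A", "B", "B", "C", "C"]

ovo = fit_ovo(X, y)
ova = fit_ova(X, y)
print("OVO models:", len(ovo), "OVA models:", len(ova))
print("OVO prediction:", predict_ovo(ovo, [0.2, 0.5]))
print("OVA prediction:", predict_ova(ova, [0.2, 0.5]))
```

With three classes both schemes train three models, but the counts diverge quickly as the number of tumor types grows, which is one reason the choice of coding scheme can depend on how many classes are considered.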