Use of Random Subspace Ensembles on Gene Expression Profiles in Survival Prediction for Colon Cancer Patients

Cancer is a disease process that emerges out of a series of genetic mutations that cause seemingly uncontrolled multiplication of cells. The molecular genetics of cells indicates that different combinations of genetic events or alternative pathways in cells may lead to cancer. A study of the gene ex...


Bibliographic Details
Main Author: Kamath, Vidya
Format: Others
Published: Scholar Commons 2005
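Subjects: Microarray; Bioinformatics; Data mining; Feature selection; Classifiers; American Studies; Arts and Humanities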
Online Access: https://scholarcommons.usf.edu/etd/715
https://scholarcommons.usf.edu/cgi/viewcontent.cgi?article=1714&context=etd
id ndltd-USF-oai-scholarcommons.usf.edu-etd-1714
record_format oai_dc
spelling ndltd-USF-oai-scholarcommons.usf.edu-etd-1714 2019-10-04T05:20:38Z Use of Random Subspace Ensembles on Gene Expression Profiles in Survival Prediction for Colon Cancer Patients Kamath, Vidya 2005-11-04T08:00:00Z text application/pdf https://scholarcommons.usf.edu/etd/715 https://scholarcommons.usf.edu/cgi/viewcontent.cgi?article=1714&context=etd default Graduate Theses and Dissertations Scholar Commons Microarray Bioinformatics Data mining Feature selection Classifiers American Studies Arts and Humanities
collection NDLTD
format Others
sources NDLTD
topic Microarray
Bioinformatics
Data mining
Feature selection
Classifiers
American Studies
Arts and Humanities
description Cancer is a disease process that emerges out of a series of genetic mutations that cause seemingly uncontrolled multiplication of cells. The molecular genetics of cells indicates that different combinations of genetic events or alternative pathways in cells may lead to cancer. A study of the gene expressions of cancer cells, in combination with the external influential factors, can greatly aid in cancer management such as understanding the initiation and etiology of cancer, as well as detection, assessment and prediction of the progression of cancer. Gene expression analysis of cells yields a very large number of features that can be used to describe the condition of the cell. Feature selection methods are explored to choose the best of these features that are most relevant to the problem at hand. Random subspace ensembles created using these selected features perform poorly in predicting the 36-month survival for colon cancer patients. A modification to the random subspace scheme is proposed to enhance the accuracy of prediction. The method first applies random subspace ensembles with decision trees to select predictive features. Then, support vector machines are used to analyze the selected gene expression profiles in cancer tissue to predict the survival outcome for a patient. The proposed method is shown to achieve a weighted accuracy of 58.96%, with 40.54% sensitivity and 77.38% specificity in predicting 36-month survival for new and unknown colon cancer patients. The prediction accuracy of the method is comparable to the baseline classifiers and significantly better than random subspace ensembles on gene expression profiles of colon cancer.
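The three figures reported in the description are consistent with the weighted accuracy being the arithmetic mean of sensitivity and specificity: (40.54% + 77.38%) / 2 = 58.96%. The description does not state that definition, so the short Python sketch below treats it as an assumption. The function names and the confusion-matrix counts in the second half are hypothetical and are not taken from the thesis.

```python
def weighted_accuracy(sensitivity: float, specificity: float) -> float:
    """Mean of sensitivity and specificity.

    Assumed definition: the record does not spell it out, but it
    reproduces the reported 58.96% from the other two figures.
    """
    return (sensitivity + specificity) / 2.0


def rates_from_counts(tp: int, fn: int, tn: int, fp: int) -> tuple[float, float]:
    """Sensitivity and specificity from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # fraction of positive cases detected
    specificity = tn / (tn + fp)  # fraction of negative cases detected
    return sensitivity, specificity


# Check against the percentages quoted in the description.
reported = weighted_accuracy(40.54, 77.38)
print(f"weighted accuracy from reported rates: {reported:.2f}%")

# Hypothetical counts, for illustration only (not data from the thesis).
sens, spec = rates_from_counts(tp=3, fn=1, tn=1, fp=1)
print(f"hypothetical example: {weighted_accuracy(sens, spec):.3f}")
```

Averaging the two rates keeps the larger class from dominating the score, which is one reason such a measure is often preferred over plain accuracy when the two outcome groups differ in size.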
author Kamath, Vidya
title Use of Random Subspace Ensembles on Gene Expression Profiles in Survival Prediction for Colon Cancer Patients
publisher Scholar Commons
publishDate 2005
url https://scholarcommons.usf.edu/etd/715
https://scholarcommons.usf.edu/cgi/viewcontent.cgi?article=1714&context=etd