Feature selection for high dimensional microarray gene expression data via weighted signal to noise ratio

Feature selection in high dimensional gene expression datasets not only reduces the dimension of the data, but also the execution time and computational cost of the underlying classifier. The current study introduces a novel feature selection method called weighted signal to noise ratio (WSNR) by ex...


Bibliographic Details
Published in: PLoS ONE
Main Authors: Muhammad Hamraz, Amjad Ali, Wali Khan Mashwani, Saeed Aldahmani, Zardad Khan
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2023-01-01
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10128961/?tool=EBI
author Muhammad Hamraz
Amjad Ali
Wali Khan Mashwani
Saeed Aldahmani
Zardad Khan
collection DOAJ
container_title PLoS ONE
description Feature selection in high dimensional gene expression datasets reduces not only the dimension of the data but also the execution time and computational cost of the underlying classifier. The current study introduces a novel feature selection method called weighted signal to noise ratio (WSNR), which exploits feature weights based on support vectors and on the signal to noise ratio, with the objective of identifying the most informative genes in high dimensional classification problems. Combining these two state-of-the-art procedures enables the extraction of the most informative genes. The weights produced by the two procedures are multiplied for each feature, and the features are then arranged in decreasing order of the product. A larger weight indicates greater discriminatory power of a feature in assigning tissue samples to their true classes. The method is validated on eight gene expression datasets, and its results are compared with those of four well known feature selection methods. We found that WSNR outperforms the competing methods on 6 of the 8 datasets. Box plots and bar plots of the results of the proposed method and all the other methods are also constructed. The proposed method is further assessed on simulated data. The simulation analysis reveals that WSNR outperforms all the other methods included in the study.
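The ranking step described above (multiply the two per-gene weights, then sort in decreasing order) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give formulas, so the signal to noise ratio is assumed here to take the common two-class form |mean_1 - mean_0| / (sd_1 + sd_0), and the support-vector-based weights are assumed to be supplied already (for instance as absolute coefficients of a linear SVM). The names `snr_scores`, `wsnr_ranking`, and `svm_weights`, and the toy data, are hypothetical.

```python
from statistics import mean, stdev


def snr_scores(X, y):
    """Per-feature signal to noise ratio for a two-class problem.

    Assumed form (not taken from the record): |mean_1 - mean_0| / (sd_1 + sd_0).
    X is a list of samples (rows), y holds class labels 0 or 1.
    Each class needs at least two samples so that stdev is defined.
    """
    n_features = len(X[0])
    scores = []
    for j in range(n_features):
        class0 = [row[j] for row, label in zip(X, y) if label == 0]
        class1 = [row[j] for row, label in zip(X, y) if label == 1]
        denom = stdev(class0) + stdev(class1)
        # Guard against constant features in both classes.
        scores.append(abs(mean(class1) - mean(class0)) / denom if denom > 0 else 0.0)
    return scores


def wsnr_ranking(X, y, svm_weights):
    """Multiply the two weights per feature and rank in decreasing order.

    svm_weights is a hypothetical, precomputed list with one weight per feature.
    Returns (feature indices from best to worst, combined weights).
    """
    snr = snr_scores(X, y)
    combined = [abs(w) * s for w, s in zip(svm_weights, snr)]
    order = sorted(range(len(combined)), key=lambda j: combined[j], reverse=True)
    return order, combined


# Toy data: 6 samples, 3 "genes"; only gene 0 separates the classes.
X = [
    [1.0, 5.0, 0.2],
    [1.2, 4.0, 0.1],
    [0.9, 6.0, 0.3],
    [3.0, 5.5, 0.2],
    [3.2, 4.5, 0.1],
    [2.9, 5.0, 0.3],
]
y = [0, 0, 0, 1, 1, 1]
svm_weights = [0.8, 0.1, 0.05]  # hypothetical values for illustration only

order, combined = wsnr_ranking(X, y, svm_weights)
print(order[0])  # index of the top-ranked gene
```

In practice the top k indices of `order` would be passed to the classifier; how k is chosen, and how the support-vector weights are derived, is not stated in this record.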
format Article
id doaj-art-20e756cce5904c2f8fa2eeeb53acfffe
institution Directory of Open Access Journals
issn 1932-6203
spelling doaj-art-20e756cce5904c2f8fa2eeeb53acfffe 2025-08-19T21:32:06Z eng Public Library of Science (PLoS) PLoS ONE 1932-6203 2023-01-01 vol. 18, no. 4
title Feature selection for high dimensional microarray gene expression data via weighted signal to noise ratio