GAWA–A Feature Selection Method for Hybrid Sentiment Classification

Sentiment analysis or opinion mining is the key to natural language processing for the extraction of useful information from the text documents of numerous sources. Several different techniques, i.e., simple rule-based to lexicon-based and more sophisticated machine learning algorithms, have been wi...

Bibliographic Details
Main Authors: Abdur Rasool, Ran Tao, Marjan Kamyab, Shoaib Hayat
Format: Article
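Language: English
Published: IEEE 2020-01-01
Series: IEEE Access
Subjects: Feature selection; genetic algorithm; hybrid sentiment classification; machine learning algorithms; wrapper approach
Online Access: https://ieeexplore.ieee.org/document/9222172/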
id doaj-3cd6630fa7cf44a38e22c64be2a010ef
record_format Article
spelling doaj-3cd6630fa7cf44a38e22c64be2a010ef
2021-03-30T03:26:54Z
eng
IEEE
IEEE Access
ISSN 2169-3536
2020-01-01
Volume 8, pages 191850-191861
DOI 10.1109/ACCESS.2020.3030642
Article 9222172
GAWA–A Feature Selection Method for Hybrid Sentiment Classification
Abdur Rasool (https://orcid.org/0000-0001-5334-9001)
Ran Tao (https://orcid.org/0000-0002-7343-6388)
Marjan Kamyab (https://orcid.org/0000-0001-8152-9392)
Shoaib Hayat (https://orcid.org/0000-0002-2847-3147)
Affiliation (all authors): School of Computer Science and Technology, Donghua University, Shanghai, China
https://ieeexplore.ieee.org/document/9222172/
collection DOAJ
language English
format Article
sources DOAJ
author Abdur Rasool
Ran Tao
Marjan Kamyab
Shoaib Hayat
author_sort Abdur Rasool
title GAWA–A Feature Selection Method for Hybrid Sentiment Classification
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2020-01-01
description Sentiment analysis, or opinion mining, is a key natural language processing task for extracting useful information from text documents drawn from numerous sources. A range of techniques, from simple rule-based and lexicon-based methods to more sophisticated machine learning algorithms, has been widely used with different classifiers to analyze sentiment. However, lexicon-based sentiment classification still suffers from low accuracy, mainly because of a lack of competitive domain-oriented dictionaries. Similarly, machine learning-based sentiment classification faces accuracy constraints because of feature ambiguity in social data. One of the best ways to address the accuracy issue is to select the best feature set and reduce the number of features. This paper proposes a feature selection method, GAWA, that uses Wrapper Approaches (WA) to select the premier features and a Genetic Algorithm (GA) to reduce the size of the premier feature set. The novelty of this work is a modified fitness function for the heuristic GA that computes the optimal features by reducing redundancy for better accuracy. This work aims to present a comprehensive model of hybrid sentiment classification using the proposed method, GAWA, and should be of value in developing new approaches to feature-set selection with better accuracy. The experiments revealed that these techniques could reduce the feature set by up to 61.95% without compromising accuracy. The new optimal feature sets enhanced the efficiency of the Naïve Bayes algorithm up to 92%. Compared with conventional feature selection methods, the proposed method achieved 11% better accuracy than PCA and 8% better than PSO. Furthermore, a comparison with results reported in the literature found that the proposed method outperformed previous research.
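The description above combines a wrapper approach (scoring feature subsets by the accuracy of a classifier trained on them) with a genetic algorithm that shrinks the subset. The following is a minimal sketch of that general idea only, not the authors' GAWA implementation: this record does not state the paper's modified fitness function, so the penalty-based fitness, the synthetic data, the small Bernoulli Naive Bayes, and every parameter value below are assumptions made for illustration.

```python
import math
import random

def make_data(n_docs, n_feats, n_informative, rng):
    """Synthetic binary term-presence data (assumed); only the first
    n_informative columns carry any class signal."""
    X, y = [], []
    for _ in range(n_docs):
        label = rng.randint(0, 1)
        row = []
        for j in range(n_feats):
            p = (0.8 if label else 0.2) if j < n_informative else 0.5
            row.append(1 if rng.random() < p else 0)
        X.append(row)
        y.append(label)
    return X, y

def nb_accuracy(mask, X_tr, y_tr, X_te, y_te):
    """Wrapper step: train Bernoulli Naive Bayes on the selected columns
    and return accuracy on the held-out split."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0
    prior, log_p, log_q = {}, {}, {}
    for c in (0, 1):
        rows = [x for x, lab in zip(X_tr, y_tr) if lab == c]
        prior[c] = math.log((len(rows) + 1) / (len(X_tr) + 2))
        # Laplace-smoothed P(term present | class)
        probs = [(sum(r[j] for r in rows) + 1) / (len(rows) + 2) for j in cols]
        log_p[c] = [math.log(p) for p in probs]
        log_q[c] = [math.log(1 - p) for p in probs]
    correct = 0
    for x, lab in zip(X_te, y_te):
        score = {}
        for c in (0, 1):
            score[c] = prior[c] + sum(
                log_p[c][k] if x[j] else log_q[c][k] for k, j in enumerate(cols)
            )
        correct += int((1 if score[1] > score[0] else 0) == lab)
    return correct / len(y_te)

def fitness(mask, data, penalty=0.1):
    """Assumed fitness: accuracy minus a penalty on the fraction of
    features kept (a stand-in for the paper's unspecified function)."""
    return nb_accuracy(mask, *data) - penalty * sum(mask) / len(mask)

def ga_select(data, n_feats, rng, pop_size=20, generations=15, mut_rate=0.05):
    """Elitist GA over bit masks with one-point crossover and bit-flip mutation."""
    pop = [[rng.randint(0, 1) for _ in range(n_feats)] for _ in range(pop_size - 1)]
    pop.append([1] * n_feats)  # seed with the full feature set as a baseline
    for _ in range(generations):
        pop.sort(key=lambda m: fitness(m, data), reverse=True)
        elite = pop[: pop_size // 2]  # the best masks survive unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n_feats)
            child = a[:cut] + b[cut:]
            children.append([bit ^ 1 if rng.random() < mut_rate else bit for bit in child])
        pop = elite + children
    return max(pop, key=lambda m: fitness(m, data))

rng = random.Random(0)
X, y = make_data(300, 20, 5, rng)
data = (X[:200], y[:200], X[200:], y[200:])  # train / held-out split
best = ga_select(data, 20, rng)
full = [1] * 20
print("features kept:", sum(best), "of", len(best))
print("accuracy (selected):", nb_accuracy(best, *data))
print("accuracy (all features):", nb_accuracy(full, *data))
```

Because the initial population is seeded with the all-features mask and the best masks are carried over unchanged each generation, the returned subset can never score lower on this fitness than using every feature. A real experiment would also score the final subset on data not used during the search, since the wrapper accuracy here is measured on the same held-out split that guides selection.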
topic Feature selection
genetic algorithm
hybrid sentiment classification
machine learning algorithms
wrapper approach
url https://ieeexplore.ieee.org/document/9222172/