An Optimized Feature Selection Technique in Diversified Natural Scene Text for Classification Using Genetic Algorithm
Natural scene text classification is considered a challenging task because of the diverse image content, the presence of degradations such as noise and low contrast or resolution, and the random appearance of foreground (font, style, size, and orientation) and background properties. Above all,...
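Main Authors: | Ghulam Jillani Ansari, Jamal Hussain Shah, Mylene C. Q. Farias, Muhammad Sharif, Nauman Qadeer, Habib Ullah Khan |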
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2021-01-01 |
Series: | IEEE Access |
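Subjects: | Genetic algorithm; natural scene text; optimal feature selection; SFS; feature fusion; feature space dimensionality reduction |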
Online Access: | https://ieeexplore.ieee.org/document/9395435/ |
id |
doaj-5cdaf8b3d31044ce92e25ed4a9014c6b |
---|---|
record_format |
Article |
spelling |
doaj-5cdaf8b3d31044ce92e25ed4a9014c6b; 2021-04-13T23:01:03Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2021-01-01; vol. 9, pp. 54923-54937; DOI 10.1109/ACCESS.2021.3071169; document 9395435
Title: An Optimized Feature Selection Technique in Diversified Natural Scene Text for Classification Using Genetic Algorithm
Authors: Ghulam Jillani Ansari [0]; Jamal Hussain Shah [1], https://orcid.org/0000-0002-7903-9391; Mylene C. Q. Farias [2], https://orcid.org/0000-0002-1957-9943; Muhammad Sharif [3], https://orcid.org/0000-0002-1355-2168; Nauman Qadeer [4], https://orcid.org/0000-0002-2178-0435; Habib Ullah Khan [5], https://orcid.org/0000-0001-8373-2781
Affiliations: [0] Department of Information Sciences, University of Education at Multan, Lahore, Pakistan; [1] Department of Computer Science, COMSATS University Islamabad at Wah, Islamabad, Pakistan; [2] Department of Electrical Engineering, University of Brasília, Brasilia, Brazil; [3] Department of Information Sciences, University of Education at Multan, Lahore, Pakistan; [4] Department of Computer Science, Federal Urdu University of Arts, Science and Technology at Islamabad, Islamabad, Pakistan; [5] Department of Accounting and Information Systems, College of Business and Economics, Qatar University, Doha, Qatar
URL: https://ieeexplore.ieee.org/document/9395435/
Keywords: Genetic algorithm; natural scene text; optimal feature selection; SFS; feature fusion; feature space dimensionality reduction |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Ghulam Jillani Ansari, Jamal Hussain Shah, Mylene C. Q. Farias, Muhammad Sharif, Nauman Qadeer, Habib Ullah Khan |
author_sort |
Ghulam Jillani Ansari |
title |
An Optimized Feature Selection Technique in Diversified Natural Scene Text for Classification Using Genetic Algorithm |
publisher |
IEEE |
series |
IEEE Access |
issn |
2169-3536 |
publishDate |
2021-01-01 |
description |
Natural scene text classification is considered a challenging task because of the diverse image content, the presence of degradations such as noise and low contrast or resolution, and the random appearance of foreground (font, style, size, and orientation) and background properties. Above all, the high dimensionality of the input image’s feature space is another major problem in such tasks. This work aims to tackle these problems by removing redundant and irrelevant features to improve the generalization properties of the classifier. In other words, it selects a high-quality, discriminative set of features, reducing dimensionality to support successful pattern classification. In this work, we use a biologically inspired genetic algorithm (GA) because the crossover employed in such algorithms significantly improves the quality of the multimodal discriminative feature set and hence improves the classification accuracy for diversified natural scene text images. The Support Vector Machine (SVM) algorithm is used for classification, and the average F-score is used as the fitness function and target condition. First, after preprocessing the input images, the whole feature space (population) is built using a multimodal feature representation technique. Second, a feature-level fusion approach is used to combine the features. Third, to improve the average F-score of the classifier, we apply a meta-heuristic optimization technique using a GA for feature selection. The proposed algorithm is tested on five publicly available datasets, and the results are compared with various state-of-the-art methods. The results show that the proposed algorithm performs well, classifying textual and non-textual regions with better accuracy than benchmark state-of-the-art algorithms. |
topic |
Genetic algorithm; natural scene text; optimal feature selection; SFS; feature fusion; feature space dimensionality reduction |
url |
https://ieeexplore.ieee.org/document/9395435/ |
_version_ |
1721528454088753152 |
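
The selection step outlined in the description (a GA searching over feature subsets, with the classifier's average F-score as fitness and target condition) can be illustrated with a small sketch. This is not the authors' implementation: the record gives no parameters, so the population size, mutation rate, single-point crossover, truncation selection, and elitism below are all assumptions. To stay dependency-free, a nearest-centroid classifier stands in for the SVM, "average F-score" is assumed to mean the mean of the per-class F-scores for text (1) and non-text (0), and the data is synthetic. All function names are hypothetical.

```python
import random

def f1(y_true, y_pred, positive):
    """F-score for one class, treating `positive` as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

def average_f1(y_true, y_pred):
    """Assumed reading of 'average F-score': mean over text (1) and non-text (0)."""
    return (f1(y_true, y_pred, 1) + f1(y_true, y_pred, 0)) / 2

def predict(train_X, train_y, test_X, mask):
    """Nearest-centroid stand-in for the SVM, using only features where mask is 1."""
    cols = [j for j, bit in enumerate(mask) if bit]
    centroids = {}
    for label in sorted(set(train_y)):
        rows = [x for x, y in zip(train_X, train_y) if y == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    def nearest(x):
        return min(centroids, key=lambda c: sum(
            (x[j] - m) ** 2 for j, m in zip(cols, centroids[c])))
    return [nearest(x) for x in test_X]

def fitness(mask, train, valid):
    """Average F-score on a held-out split; an empty feature subset scores 0."""
    if not any(mask):
        return 0.0
    return average_f1(valid[1], predict(train[0], train[1], valid[0], mask))

def select_features(train, valid, n_features, pop_size=20, generations=30,
                    mutation_rate=0.05, target=1.0, seed=0):
    """Evolve binary feature masks; stop early once the target F-score is reached."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    best_score, best_mask = -1.0, None
    for _ in range(generations):
        scored = sorted(((fitness(m, train, valid), m) for m in population),
                        key=lambda sm: sm[0], reverse=True)
        if scored[0][0] > best_score:
            best_score, best_mask = scored[0]
        if best_score >= target:
            break
        parents = [m for _, m in scored[: pop_size // 2]]  # truncation selection
        children = [best_mask]                             # elitism
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)             # single-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation: each bit flips with probability mutation_rate.
            children.append([bit ^ (rng.random() < mutation_rate) for bit in child])
        population = children
    return best_mask, best_score

def make_data(n, rng, informative=2, noise=6):
    """Synthetic fused feature vectors: a few informative columns, the rest noise."""
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        row = [2.0 * label + rng.gauss(0, 0.5) for _ in range(informative)]
        row += [rng.gauss(0, 3.0) for _ in range(noise)]
        X.append(row)
        y.append(label)
    return X, y

data_rng = random.Random(42)
train, valid = make_data(80, data_rng), make_data(40, data_rng)
mask, score = select_features(train, valid, n_features=8)
print("selected mask:", mask, "average F-score:", round(score, 3))
```

Each individual is a binary mask over the fused feature vector, so crossover recombines partial feature subsets from two parents. Scoring on a held-out split rather than the training data is a design choice of this sketch, intended to favor subsets that generalize.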