An Optimized Feature Selection Technique in Diversified Natural Scene Text for Classification Using Genetic Algorithm

Natural scene text classification is considered a challenging task because of the diverse image content, the presence of degradations such as noise and low contrast or resolution, and the random appearance of foreground (font, style, size, and orientation) and background properties. Above all,...

Bibliographic Details
Main Authors: Ghulam Jillani Ansari, Jamal Hussain Shah, Mylene C. Q. Farias, Muhammad Sharif, Nauman Qadeer, Habib Ullah Khan
Format: Article
Language: English
Published: IEEE 2021-01-01
Series: IEEE Access
Subjects:
SFS
Online Access: https://ieeexplore.ieee.org/document/9395435/
id doaj-5cdaf8b3d31044ce92e25ed4a9014c6b
record_format Article
spelling doaj-5cdaf8b3d31044ce92e25ed4a9014c6b
2021-04-13T23:01:03Z
eng
IEEE
IEEE Access (ISSN 2169-3536)
2021-01-01, vol. 9, pp. 54923-54937
DOI 10.1109/ACCESS.2021.3071169
9395435
An Optimized Feature Selection Technique in Diversified Natural Scene Text for Classification Using Genetic Algorithm
[0] Ghulam Jillani Ansari
[1] Jamal Hussain Shah, https://orcid.org/0000-0002-7903-9391
[2] Mylene C. Q. Farias, https://orcid.org/0000-0002-1957-9943
[3] Muhammad Sharif, https://orcid.org/0000-0002-1355-2168
[4] Nauman Qadeer, https://orcid.org/0000-0002-2178-0435
[5] Habib Ullah Khan, https://orcid.org/0000-0001-8373-2781
[0] Department of Information Sciences, University of Education at Multan, Lahore, Pakistan
[1] Department of Computer Science, COMSATS University Islamabad at Wah, Islamabad, Pakistan
[2] Department of Electrical Engineering, University of Brasília, Brasilia, Brazil
[3] Department of Information Sciences, University of Education at Multan, Lahore, Pakistan
[4] Department of Computer Science, Federal Urdu University of Arts, Science and Technology at Islamabad, Islamabad, Pakistan
[5] Department of Accounting and Information Systems, College of Business and Economics, Qatar University, Doha, Qatar
collection DOAJ
language English
format Article
sources DOAJ
author Ghulam Jillani Ansari
Jamal Hussain Shah
Mylene C. Q. Farias
Muhammad Sharif
Nauman Qadeer
Habib Ullah Khan
author_sort Ghulam Jillani Ansari
title An Optimized Feature Selection Technique in Diversified Natural Scene Text for Classification Using Genetic Algorithm
title_sort optimized feature selection technique in diversified natural scene text for classification using genetic algorithm
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2021-01-01
description Natural scene text classification is considered a challenging task because of the diverse image content, the presence of degradations such as noise and low contrast or resolution, and the random appearance of foreground (font, style, size, and orientation) and background properties. In addition, the high dimensionality of the input image's feature space is a major problem in such tasks. This work aims to tackle these problems by removing redundant and irrelevant features to improve the generalization properties of the classifier. In other words, it selects a high-quality, discriminative set of features, reducing dimensionality to help achieve successful pattern classification. We use a biologically inspired genetic algorithm (GA) because the crossover employed in such algorithms significantly improves the quality of the multimodal discriminative feature set and hence improves classification accuracy for diverse natural scene text images. A Support Vector Machine (SVM) is used for classification, and the average F-score is used as the fitness function and target condition. First, after preprocessing the input images, the whole feature space (population) is built using a multimodal feature representation technique. Second, a feature-level fusion approach is used to combine the features. Third, to improve the average F-score of the classifier, we apply a meta-heuristic optimization technique using a GA for feature selection. The proposed algorithm is tested on five publicly available datasets, and the results are compared with various state-of-the-art methods. The results show that the proposed algorithm classifies textual and non-textual regions with better accuracy than benchmark state-of-the-art algorithms.
topic Genetic algorithm
natural scene text
optimal feature selection
SFS
feature fusion
feature space dimensionality reduction
url https://ieeexplore.ieee.org/document/9395435/
_version_ 1721528454088753152
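
The pipeline outlined in the description (feature-level fusion, then GA-based feature selection with the average F-score as the fitness function) might be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the record gives no GA parameters, so the population size, mutation rate, one-point crossover, and elitism are all assumed; the data are synthetic; and a nearest-centroid classifier stands in for the SVM so that the sketch needs only the Python standard library. All function names are hypothetical.

```python
import random

def f_score(y_true, y_pred, positive):
    """F-score (harmonic mean of precision and recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def average_f_score(y_true, y_pred):
    """Unweighted mean of the per-class F-scores."""
    classes = sorted(set(y_true))
    return sum(f_score(y_true, y_pred, c) for c in classes) / len(classes)

def centroid_predict(train_X, train_y, test_X, idx):
    """Stand-in classifier (NOT an SVM): nearest class centroid on columns idx."""
    centroids = {}
    for label in sorted(set(train_y)):
        rows = [x for x, y in zip(train_X, train_y) if y == label]
        centroids[label] = [sum(r[i] for r in rows) / len(rows) for i in idx]
    def dist(x, label):
        return sum((x[i] - m) ** 2 for i, m in zip(idx, centroids[label]))
    return [min(centroids, key=lambda label: dist(x, label)) for x in test_X]

def fitness(mask, train, val):
    """Average F-score on the validation split using only the selected features."""
    idx = [i for i, bit in enumerate(mask) if bit]
    if not idx:
        return 0.0  # an empty feature subset cannot classify anything
    (train_X, train_y), (val_X, val_y) = train, val
    return average_f_score(val_y, centroid_predict(train_X, train_y, val_X, idx))

def ga_select(train, val, n_features, pop_size=20, generations=30,
              mutation_rate=0.05, seed=0):
    """Evolve binary feature masks; all hyperparameters here are assumptions."""
    rng = random.Random(seed)
    # Seed the population with the full feature set plus random masks.
    population = [[1] * n_features]
    population += [[rng.randint(0, 1) for _ in range(n_features)]
                   for _ in range(pop_size - 1)]
    for _ in range(generations):
        ranked = sorted(population, key=lambda m: fitness(m, train, val),
                        reverse=True)
        parents = ranked[:pop_size // 2]   # truncation selection
        children = ranked[:2]              # elitism: keep the two best masks
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)   # one-point crossover
            child = a[:cut] + b[cut:]
            children.append([bit ^ 1 if rng.random() < mutation_rate else bit
                             for bit in child])  # bit-flip mutation
        population = children
    best = max(population, key=lambda m: fitness(m, train, val))
    return best, fitness(best, train, val)

def make_samples(n, rng):
    """Synthetic data: 3 informative features fused with 7 noise features."""
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)  # 1 = text region, 0 = non-text region
        informative = [rng.gauss(2.0 * label, 1.0) for _ in range(3)]
        noise = [rng.gauss(0.0, 5.0) for _ in range(7)]
        X.append(informative + noise)  # feature-level fusion by concatenation
        y.append(label)
    return X, y

if __name__ == "__main__":
    rng = random.Random(42)
    train, val = make_samples(200, rng), make_samples(100, rng)
    mask, score = ga_select(train, val, n_features=10)
    print("selected feature indices:", [i for i, bit in enumerate(mask) if bit])
    print("average F-score:", round(score, 3))
```

Because the initial population includes the all-ones mask and the two best masks are carried over unchanged each generation, the returned subset never scores below the full feature set on the validation split. With scikit-learn available, swapping `centroid_predict` for an SVM would bring the sketch closer to the classifier named in the description.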