Development of a Two-Stage Segmentation-Based Word Searching Method for Handwritten Document Images

Word searching, or keyword spotting, is an important research problem in document image processing. Solving it for handwritten documents is more challenging than for printed ones. This work introduces a two-stage word searching scheme. In the first stage, all...


Bibliographic Details
Main Authors:	Malakar Samir, Ghosh Manosij, Sarkar Ram, Nasipuri Mita
Format:	Article
Language:	English
Published:	De Gruyter 2018-07-01
Series:	Journal of Intelligent Systems
Subjects:	word searching; HOG feature; topological feature; holistic word recognition; handwritten documents; QUWI database
Online Access:	https://doi.org/10.1515/jisys-2017-0384

id	doaj-52a343250ac94164b35ebf0d20614615
record_format	Article
spelling	doaj-52a343250ac94164b35ebf0d20614615; 2021-09-06T19:40:38Z; eng; De Gruyter; Journal of Intelligent Systems; 0334-1860; 2191-026X; 2018-07-01; 29; 1; 719; 735; 10.1515/jisys-2017-0384; Development of a Two-Stage Segmentation-Based Word Searching Method for Handwritten Document Images; Malakar Samir 0; Ghosh Manosij 1; Sarkar Ram 2; Nasipuri Mita 3; Department of Computer Science, Asutosh College, Kolkata, India; Department of Computer Science and Engineering, Jadavpur University, Kolkata, India; Department of Computer Science and Engineering, Jadavpur University, Kolkata, India; Department of Computer Science and Engineering, Jadavpur University, Kolkata, India; Word searching, or keyword spotting, is an important research problem in document image processing. Solving it for handwritten documents is more challenging than for printed ones. This work introduces a two-stage word searching scheme. In the first stage, all words irrelevant to the search word are filtered out of the document page image. This is done using a zonal feature vector, called the pre-selection feature vector, together with a rule-based binary classification method. In the second stage, a holistic word recognition paradigm is used to confirm whether a pre-selected word is the search word. To accomplish this, a modified histogram of oriented gradients (HOG)-based feature descriptor is combined with a topological feature vector. The method is evaluated on the QUWI English database, which is freely available through the International Conference on Document Analysis and Recognition 2015 competition entitled “Writer Identification and Gender Classification.” The technique provides good retrieval performance in terms of recall, precision, and F-measure scores, and it also outperforms some state-of-the-art methods.; https://doi.org/10.1515/jisys-2017-0384; word searching; HOG feature; topological feature; holistic word recognition; handwritten documents; QUWI database
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Malakar Samir; Ghosh Manosij; Sarkar Ram; Nasipuri Mita
spellingShingle	Malakar Samir; Ghosh Manosij; Sarkar Ram; Nasipuri Mita; Development of a Two-Stage Segmentation-Based Word Searching Method for Handwritten Document Images; Journal of Intelligent Systems; word searching; HOG feature; topological feature; holistic word recognition; handwritten documents; QUWI database
author_facet	Malakar Samir; Ghosh Manosij; Sarkar Ram; Nasipuri Mita
author_sort	Malakar Samir
title	Development of a Two-Stage Segmentation-Based Word Searching Method for Handwritten Document Images
title_short	Development of a Two-Stage Segmentation-Based Word Searching Method for Handwritten Document Images
title_full	Development of a Two-Stage Segmentation-Based Word Searching Method for Handwritten Document Images
title_fullStr	Development of a Two-Stage Segmentation-Based Word Searching Method for Handwritten Document Images
title_full_unstemmed	Development of a Two-Stage Segmentation-Based Word Searching Method for Handwritten Document Images
title_sort	development of a two-stage segmentation-based word searching method for handwritten document images
publisher	De Gruyter
series	Journal of Intelligent Systems
issn	0334-1860 2191-026X
publishDate	2018-07-01
description	Word searching, or keyword spotting, is an important research problem in document image processing. Solving it for handwritten documents is more challenging than for printed ones. This work introduces a two-stage word searching scheme. In the first stage, all words irrelevant to the search word are filtered out of the document page image. This is done using a zonal feature vector, called the pre-selection feature vector, together with a rule-based binary classification method. In the second stage, a holistic word recognition paradigm is used to confirm whether a pre-selected word is the search word. To accomplish this, a modified histogram of oriented gradients (HOG)-based feature descriptor is combined with a topological feature vector. The method is evaluated on the QUWI English database, which is freely available through the International Conference on Document Analysis and Recognition 2015 competition entitled “Writer Identification and Gender Classification.” The technique provides good retrieval performance in terms of recall, precision, and F-measure scores, and it also outperforms some state-of-the-art methods.
topic	word searching; HOG feature; topological feature; holistic word recognition; handwritten documents; QUWI database
url	https://doi.org/10.1515/jisys-2017-0384

Similar Items