End-To-End Deep-Learning-Based Tamil Handwritten Document Recognition and Classification Model

Overview: Handwriting recognition (HR) involves converting handwritten text into machine-readable text. Tamil handwritten document recognition remains a challenging process in various real-world applications owing to the differences in the sizes, styles and orientation angles of Tamil alphabets. Pri...


Bibliographic Details
Main Authors: Lakshmana Pandian, S. (Author), Vinotheni, C. (Author)
Format: Article
Language: English
Published: Institute of Electrical and Electronics Engineers Inc. 2023
Subjects:
Online Access: View Fulltext in Publisher
View in Scopus
LEADER 02972nam a2200313Ia 4500
001 10.1109-ACCESS.2023.3270895
008 230529s2023 CNT 000 0 und d
020 |a 21693536 (ISSN) 
245 1 0 |a End-To-End Deep-Learning-Based Tamil Handwritten Document Recognition and Classification Model 
260 0 |b Institute of Electrical and Electronics Engineers Inc.  |c 2023 
300 |a 1 
856 |z View Fulltext in Publisher  |u https://doi.org/10.1109/ACCESS.2023.3270895 
856 |z View in Scopus  |u https://www.scopus.com/inward/record.uri?eid=2-s2.0-85159698843&doi=10.1109%2fACCESS.2023.3270895&partnerID=40&md5=49c9d0c8896a9914aec941efedf6cd6d 
520 3 |a Overview: Handwriting recognition (HR) involves converting handwritten text into machine-readable text. Tamil handwritten document recognition remains a challenging process in various real-world applications owing to the differences in the sizes, styles and orientation angles of Tamil alphabets. Prior studies concentrated only on character-level segmentation, and each character was subsequently classified. The recently developed machine learning (ML) and deep learning (DL) approaches can be utilized for Tamil handwritten character recognition (HCR). Objective: This paper attempts to present an end-to-end DL-based Tamil handwritten document recognition (ETEDL-THDR) model. Methods: Segmentation is used, first at the word level and then at the line level. ETEDL-THDR text recognition can be accomplished using two modules: line segmentation and line recognition. Initially, the ETEDL-THDR model targets improving input image quality using the median filtering (MF) technique. To create meaningful regions, more line and character segmentation activities are performed. A deep convolutional neural network (DCNN) based MobileNet approach is also applied to derive feature vectors. Finally, the water strider optimization (WSO) algorithm with a bidirectional gated recurrent unit (BiGRU) model is used to identify the Tamil characters. Results: Extensive experimental analyses of the ETEDL-THDR model have been carried out, and the results show that the ETEDL-THDR model performs better than more recent methodologies, with a maximum accuracy of 98.48%, precision of 98.38%, sensitivity of 97.98%, specificity of 98.27% and F-measure of 98.35%. Conclusion: The comparison results show that the proposed model can recognize Tamil handwritten documents in real time. 
650 0 4 |a Character recognition 
650 0 4 |a Convolutional neural networks 
650 0 4 |a deep learning 
650 0 4 |a Feature extraction 
650 0 4 |a Handwriting recognition 
650 0 4 |a Handwritten character recognition 
650 0 4 |a Hidden Markov models 
650 0 4 |a Image recognition 
650 0 4 |a Image segmentation 
650 0 4 |a machine learning 
650 0 4 |a segmentation 
650 0 4 |a Tamil language 
700 1 0 |a Lakshmana Pandian, S.  |e author 
700 1 0 |a Vinotheni, C.  |e author 
773 |t IEEE Access