Stroke Extraction for Offline Handwritten Mathematical Expression Recognition

Offline handwritten mathematical expression recognition is often considered much harder than its online counterpart due to the absence of temporal information. In order to take advantage of the more mature methods for online recognition and save resources, an oversegmentation approach is proposed to...


Bibliographic Details
Main Author: Chungkwong Chan
Format: Article
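Language: English
Published: IEEE 2020-01-01
Series: IEEE Access
Subjects: Character recognition; feature extraction; offline handwritten mathematical expression recognition; optical character recognition software; stroke extraction
Online Access: https://ieeexplore.ieee.org/document/9051736/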
id doaj-b03c8f6814324bc995b0cf737c4cdf77
record_format Article
spelling doaj-b03c8f6814324bc995b0cf737c4cdf77; 2021-03-30T01:30:07Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2020-01-01; vol. 8; pp. 61565-61575; doi:10.1109/ACCESS.2020.2984627; document 9051736; Stroke Extraction for Offline Handwritten Mathematical Expression Recognition; Chungkwong Chan (ORCID https://orcid.org/0000-0002-2242-0351), School of Mathematics, Sun Yat-Sen University, Guangzhou, China; https://ieeexplore.ieee.org/document/9051736/; Character recognition; feature extraction; offline handwritten mathematical expression recognition; optical character recognition software; stroke extraction
collection DOAJ
language English
format Article
sources DOAJ
author Chungkwong Chan
title Stroke Extraction for Offline Handwritten Mathematical Expression Recognition
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2020-01-01
description Offline handwritten mathematical expression recognition is often considered much harder than its online counterpart because temporal information is absent. To take advantage of the more mature methods for online recognition and to save resources, an oversegmentation approach is proposed to recover strokes from textual bitmap images automatically. The proposed algorithm first breaks down the skeleton of a binarized image into junctions and segments, then merges segments to form strokes, and finally normalizes stroke order using recursive projection and topological sort. Good offline accuracy was obtained in combination with ordinary online recognizers that were not specially designed for extracted strokes. Given a ready-made state-of-the-art online handwritten mathematical expression recognizer, the proposed procedure correctly recognized 58.22%, 65.65%, and 65.22% of the offline formulas rendered from the datasets of the Competitions on Recognition of Online Handwritten Mathematical Expressions (CROHME) in 2014, 2016, and 2019, respectively. Furthermore, given a trainable online recognition system, retraining it with extracted strokes resulted in an offline recognizer with the same level of accuracy. In addition, the entire pipeline was fast enough to support on-device recognition on mobile phones with limited resources. To conclude, stroke extraction provides an attractive way to build optical character recognition software.
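The first step named in the description, breaking a skeleton into junctions and segments, can be illustrated with a minimal sketch. It assumes a common heuristic (counting 8-connected neighbours of each skeleton pixel) rather than the rule actually used in the article, which this record does not specify. The names `classify_skeleton` and `X_SHAPE` are hypothetical and chosen only for illustration.

```python
# Hypothetical sketch: classify pixels of a one-pixel-wide skeleton by how
# many of their 8 neighbours are also skeleton pixels. This neighbour-count
# rule is an assumption for illustration, not the article's stated method.

NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def classify_skeleton(skeleton):
    """Return (endpoints, junctions, segment_pixels) as sets of (row, col)."""
    rows = len(skeleton)
    cols = len(skeleton[0]) if rows else 0
    endpoints, junctions, segment_pixels = set(), set(), set()
    for r in range(rows):
        for c in range(cols):
            if not skeleton[r][c]:
                continue
            # Count foreground neighbours that lie inside the image.
            degree = sum(
                1
                for dr, dc in NEIGHBOURS
                if 0 <= r + dr < rows
                and 0 <= c + dc < cols
                and skeleton[r + dr][c + dc]
            )
            if degree >= 3:
                junctions.add((r, c))       # branches meet here
            elif degree <= 1:
                endpoints.add((r, c))       # a stroke candidate starts or ends here
            else:
                segment_pixels.add((r, c))  # interior of a segment
    return endpoints, junctions, segment_pixels


# Skeleton of a handwritten "x": two diagonals crossing in the centre.
X_SHAPE = [
    [1, 0, 0, 0, 1],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [1, 0, 0, 0, 1],
]

endpoints, junctions, segment_pixels = classify_skeleton(X_SHAPE)
```

For this toy input the centre pixel is the only junction and the four corners are endpoints. With horizontal and vertical crossings, the same rule marks a cluster of adjacent pixels as junctions, so a practical implementation would likely need to merge such clusters before tracing segments.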
topic Character recognition
feature extraction
offline handwritten mathematical expression recognition
optical character recognition software
stroke extraction
url https://ieeexplore.ieee.org/document/9051736/