A Study of Handwritten Chinese Text Recognition

Bibliographic Details
Main Authors: Cheng-Huang Tung, 董呈煌
Other Authors: Hsi-Jian Lee
Format: Others
Language: en_US
Published: 1994
Online Access: http://ndltd.ncl.edu.tw/handle/76462947641190721731
Description
Summary: Doctoral === National Chiao Tung University === Institute of Computer Science and Information Engineering === 82 === In this thesis, we propose a Chinese text processing system for handwritten Chinese character recognition, contextual postprocessing, and maintenance of the dictionary. A handwritten Chinese character generator is created to produce handwritten Chinese character images. Using the generated character images, we measure the performance of different methods for segmenting an image into a number of meshes, and derive the feature extraction for an image mesh. After the performance measurement, a character recognition system consisting of a candidate selection module and a matching module is established. Because character recognition is still time-consuming, we propose a multi-stage candidate pre-selection module to reduce the execution time. In each stage, we use a single feature computed from the input character image to eliminate impossible character categories. The features used in candidate pre-selection are ordered according to the reduction rates evaluated from a set of training characters. We also propose a method for organizing the character database as a classification tree. The experimental results show that the proposed model significantly reduces the total execution time without decreasing the precision of character recognition. We present a new approach for detecting and correcting characters erroneously identified by the matching module. Two matching modules are applied at the recognition stage to recognize each input character image simultaneously. If the two modules return different results for a character image, the image is rejected at the recognition stage. We construct the second recognition module so that it maximizes the accuracy on the accepted training characters.
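
The multi-stage candidate pre-selection described in the summary can be illustrated with a minimal Python sketch. The summary does not say how a single feature eliminates a category, so the per-category [min, max] range test, the definition of reduction rate as the average fraction of categories eliminated, and all function names here are assumptions made for illustration, not the thesis's actual method.

```python
from typing import Callable, Dict, List, Sequence, Set, Tuple

Image = Sequence[float]                 # hypothetical stand-in for a character image
Feature = Callable[[Image], float]      # one scalar feature per stage
Ranges = Dict[str, Tuple[float, float]]


def learn_ranges(feature: Feature, training: Dict[str, List[Image]]) -> Ranges:
    """Record, per category, the [min, max] of the feature over its training images."""
    ranges: Ranges = {}
    for category, images in training.items():
        values = [feature(img) for img in images]
        ranges[category] = (min(values), max(values))
    return ranges


def reduction_rate(feature: Feature, ranges: Ranges,
                   training: Dict[str, List[Image]]) -> float:
    """Average fraction of categories this feature eliminates per training image."""
    total, count = 0.0, 0
    for images in training.values():
        for img in images:
            v = feature(img)
            kept = sum(1 for lo, hi in ranges.values() if lo <= v <= hi)
            total += 1.0 - kept / len(ranges)
            count += 1
    return total / count


def build_stages(features: List[Feature],
                 training: Dict[str, List[Image]]) -> List[Tuple[Feature, Ranges]]:
    """Order the features by decreasing reduction rate (assumed ordering rule)."""
    scored = []
    for f in features:
        r = learn_ranges(f, training)
        scored.append((reduction_rate(f, r, training), f, r))
    scored.sort(key=lambda s: s[0], reverse=True)
    return [(f, r) for _, f, r in scored]


def preselect(image: Image, stages: List[Tuple[Feature, Ranges]]) -> Set[str]:
    """Run the stages in order, dropping categories whose range excludes the input."""
    if not stages:
        return set()
    candidates = set(stages[0][1])      # start with every known category
    for feature, ranges in stages:
        v = feature(image)
        candidates = {c for c in candidates
                      if ranges[c][0] <= v <= ranges[c][1]}
    return candidates
```

Only the categories that survive every stage would be passed on to the more expensive matching module; a real system would likely widen each range by a tolerance so that unseen writing styles are not eliminated too early.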
Because the recognition stage recognizes most of the input characters correctly and outputs only a small number of candidates for each rejected character, a character bigram Markov language model can be applied to choose among the candidates with a high recognition rate.
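
To make the contextual postprocessing step concrete, here is a minimal Python sketch of choosing candidates with a character bigram model. The add-alpha smoothing, the Viterbi-style search, and the names `train_bigram` and `decode` are assumptions for illustration; the summary only states that a character bigram Markov language model is used. Accepted characters appear as single-candidate positions, and rejected characters as positions with several candidates.

```python
import math
from collections import Counter
from typing import Callable, Iterable, List


def train_bigram(corpus: Iterable[str], alpha: float = 1.0) -> Callable[[str, str], float]:
    """Estimate smoothed log P(b | a) from a corpus of character strings."""
    context = Counter()   # how often each character appears as a left context
    bigrams = Counter()
    vocab = set()
    for sentence in corpus:
        vocab.update(sentence)
        for a, b in zip(sentence, sentence[1:]):
            context[a] += 1
            bigrams[(a, b)] += 1
    v = max(len(vocab), 1)

    def logp(a: str, b: str) -> float:
        return math.log((bigrams[(a, b)] + alpha) / (context[a] + alpha * v))

    return logp


def decode(lattice: List[List[str]], logp: Callable[[str, str], float]) -> str:
    """Pick one candidate per position so the bigram log-probability is maximal."""
    if not lattice:
        return ""
    # best[c] = (score, text) of the best path ending in candidate c
    best = {c: (0.0, c) for c in lattice[0]}
    for candidates in lattice[1:]:
        step = {}
        for c in candidates:
            score, text = max(
                ((s + logp(prev, c), t) for prev, (s, t) in best.items()),
                key=lambda item: item[0],
            )
            step[c] = (score, text + c)
        best = step
    return max(best.values(), key=lambda item: item[0])[1]
```

For example, `decode([["a"], ["b", "d"], ["c"]], train_bigram(["abc", "abc", "abd"]))` treats the middle character as rejected with two candidates and lets the bigram statistics decide between them.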