Summary: | Ph.D. === National Chiao Tung University (國立交通大學) === Institute of Computer Science and Information Engineering (資訊工程研究所) === 82 === In this thesis, we propose a Chinese text processing system for
handwritten Chinese character recognition, contextual
postprocessing, and dictionary maintenance. A handwritten
Chinese character generator is created to produce
handwritten Chinese character images. Using the generated
character images, we measure the performance of different
methods for segmenting an image into a number of meshes, and derive
the feature extraction method for an image mesh. After the performance
measurement, a character recognition system consisting of a
candidate selection module and a matching module is
established. Because character recognition is still
time-consuming, we propose a multi-stage candidate pre-selection
module to reduce the execution time. In each stage, we use a
single feature computed from the input character image to
eliminate impossible character categories. The features used in
candidate pre-selection are ordered according to the reduction
rates evaluated from a set of training characters. We also
propose a method for organizing the character database as a
classification tree. The experimental results show that the
proposed model can reduce the total execution time
significantly without decreasing the precision of character
recognition. We present a new approach for detecting and
correcting characters erroneously identified by the matching
module. Two matching modules are applied at the recognition
stage to recognize an input character image simultaneously. If
the matching results of the two modules for a character image
are not the same, the character image is rejected at the
recognition stage. Here, we construct the second recognition
module by maximizing the accuracy of the accepted training
characters. Because the recognition stage recognizes most of
the input characters correctly and outputs a small number of
candidates for each rejected character, a character bigram
Markov language model can be applied to select the final
candidate, achieving a high recognition rate.
|
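
The multi-stage candidate pre-selection described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation: the per-category reference vectors (`templates`), the fixed tolerance `tol`, the elimination rule, and every function name are assumptions made for the example. Each stage tests a single feature, and the stages are ordered by the reduction rate measured on training samples.

```python
from typing import Dict, List, Sequence

Features = Sequence[float]


def survivors(x: Features, templates: Dict[str, Features], feature: int,
              tol: float, candidates: List[str]) -> List[str]:
    """Keep candidates whose reference value for `feature` is within `tol` of the input's."""
    return [c for c in candidates
            if abs(templates[c][feature] - x[feature]) <= tol]


def reduction_rate(feature: int, templates: Dict[str, Features],
                   training: List[Features], tol: float) -> float:
    """Average fraction of categories eliminated by `feature` alone on the training samples."""
    all_categories = list(templates)
    removed = [1 - len(survivors(x, templates, feature, tol, all_categories))
               / len(all_categories)
               for x in training]
    return sum(removed) / len(removed)


def order_features(templates: Dict[str, Features],
                   training: List[Features], tol: float) -> List[int]:
    """Sort feature indices so that the most selective feature is applied first."""
    n_features = len(next(iter(templates.values())))
    return sorted(range(n_features),
                  key=lambda f: reduction_rate(f, templates, training, tol),
                  reverse=True)


def preselect(x: Features, templates: Dict[str, Features],
              stages: List[int], tol: float) -> List[str]:
    """Run the stages in order; each stage uses one feature to prune candidates."""
    candidates = list(templates)
    for feature in stages:
        candidates = survivors(x, templates, feature, tol, candidates)
    return candidates


# Toy data (hypothetical): three categories, two features each.
templates = {"A": (0.0, 0.0), "B": (0.1, 5.0), "C": (0.2, 9.0)}
training = [(0.0, 0.1), (0.1, 5.2), (0.2, 8.9)]

stages = order_features(templates, training, tol=0.5)
remaining = preselect((0.1, 5.1), templates, stages, tol=0.5)
```

In this toy data the second feature separates the categories while the first does not, so it is placed first; the more expensive matching module would then only need to examine the categories left in `remaining`.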
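
The contextual postprocessing step can be illustrated with a small Viterbi search over per-position candidate lists. This is a sketch under stated assumptions rather than the method of the thesis: accepted characters are represented as single-candidate lists, rejected characters as short lists, the bigram probabilities in `bigram` are invented toy values, unseen pairs receive a hypothetical `floor` probability, and no prior is applied to the first character.

```python
import math
from typing import Dict, List, Tuple


def best_sequence(candidates: List[List[str]],
                  bigram: Dict[Tuple[str, str], float],
                  floor: float = 1e-6) -> str:
    """Pick one candidate per position so that the bigram log-probability is maximal."""
    # score[c]: best log-probability of any path ending in candidate c
    score = {c: 0.0 for c in candidates[0]}
    back: List[Dict[str, str]] = []
    for options in candidates[1:]:
        new_score: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for c in options:
            # Best predecessor for c among the previous position's candidates.
            prev = max(score,
                       key=lambda p: score[p] + math.log(bigram.get((p, c), floor)))
            new_score[c] = score[prev] + math.log(bigram.get((prev, c), floor))
            pointers[c] = prev
        score = new_score
        back.append(pointers)
    # Trace back from the best final candidate.
    path = [max(score, key=score.get)]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return "".join(reversed(path))


# Toy example (hypothetical values): the middle character was rejected
# and has two candidates; the other two were accepted.
candidates = [["t"], ["h", "b"], ["e"]]
bigram = {("t", "h"): 0.3, ("h", "e"): 0.4,
          ("t", "b"): 0.001, ("b", "e"): 0.2}
decoded = best_sequence(candidates, bigram)
```

Because most positions carry a single candidate, the search only branches at the rejected characters, which keeps the decoding cheap.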