Summary: | Master's === National Taiwan University of Science and Technology === Department of Electronic Engineering === 99 === This thesis addresses the recovery and character segmentation of warped Chinese document images. The work consists of four parts:
The first part covers the software design of a binarization algorithm for digital document images.
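As an illustration of the binarization step, here is a minimal sketch using Otsu's global threshold. The abstract does not say which binarization method the thesis uses, so Otsu's method is an assumption chosen only because it is a common baseline. The function names `otsu_threshold` and `binarize` are hypothetical.

```python
def otsu_threshold(pixels):
    """Return the gray level (0-255) that maximizes between-class variance.

    Assumption: Otsu's method stands in for the thesis's unspecified algorithm.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0      # number of pixels at or below the candidate threshold
    sum_bg = 0    # sum of gray levels of those pixels
    for t in range(256):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue
        w_fg = total - w_bg
        if w_fg == 0:
            break
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance, up to a constant factor of 1 / total**2.
        var = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(image):
    """Map a grayscale image (rows of 0-255 ints) to 1 = ink, 0 = background.

    Assumes dark text on a light page.
    """
    flat = [p for row in image for p in row]
    t = otsu_threshold(flat)
    return [[1 if p <= t else 0 for p in row] for row in image]
```

A global threshold like this tends to struggle with uneven lighting, which is common in photographed warped pages, so a locally adaptive method may be closer to what such a system needs.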
The second part covers the software design for recovering warped document images. To restore distorted or skewed Chinese document images, the software combines text line and word detection, word baseline estimation, and skew correction.
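The baseline estimation and skew correction steps can be sketched as follows, assuming the baseline is modeled as a least-squares straight line through the bottom points of the detected words. The abstract does not describe the thesis's actual estimator, so this model and the names `fit_baseline`, `skew_angle`, and `deskew` are hypothetical.

```python
import math


def fit_baseline(points):
    """Least-squares line y = a*x + b through (x, y) baseline points.

    Needs at least two points with distinct x coordinates.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def skew_angle(points):
    """Skew angle in radians, taken from the slope of the fitted baseline."""
    a, _ = fit_baseline(points)
    return math.atan(a)


def deskew(points, angle):
    """Rotate points by -angle about the origin so the baseline is horizontal."""
    c, s = math.cos(-angle), math.sin(-angle)
    return [(x * c - y * s, x * s + y * c) for x, y in points]
```

A single straight line only removes a global skew. A warped page usually needs a separate baseline for each text line or word, which is presumably why the thesis detects text lines and words first.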
The third part covers the software design for segmenting Chinese document images. Characters are segmented with a connected component labeling algorithm, which simplifies subsequent character normalization and document image analysis. Finally, a coordinate file listing the individual characters is generated.
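Connected component labeling on a binarized page can be sketched as below. This is an illustrative version assuming 8-connectivity and a breadth-first flood fill; the abstract does not state the thesis's connectivity rule, labeling variant, or coordinate file format, and the name `label_components` is hypothetical.

```python
from collections import deque


def label_components(binary):
    """Return bounding boxes (left, top, right, bottom) of 8-connected ink regions.

    `binary` is a list of rows with 1 = ink and 0 = background.
    """
    height = len(binary)
    width = len(binary[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if binary[y][x] != 1 or seen[y][x]:
                continue
            # Start a new component and flood-fill it breadth-first.
            seen[y][x] = True
            queue = deque([(x, y)])
            left, top, right, bottom = x, y, x, y
            while queue:
                cx, cy = queue.popleft()
                left, right = min(left, cx), max(right, cx)
                top, bottom = min(top, cy), max(bottom, cy)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = cx + dx, cy + dy
                        if (0 <= nx < width and 0 <= ny < height
                                and binary[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            boxes.append((left, top, right, bottom))
    return boxes
```

Many Chinese characters consist of several disconnected strokes, so the boxes from this sketch would still need to be merged into character-level boxes before they could be written out as per-character coordinates. That merging step is not shown here.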
The fourth part evaluates the reliability and run-time performance of the recovery and character segmentation software. Experiments on various kinds of warped Chinese document images show that both recovery and character segmentation perform well.
Overall, this thesis completes the algorithms and software design for the recovery and character segmentation of warped Chinese document images. Verified on various kinds of document images, the developed algorithm performs very well in skew correction and can improve the recognition rate of a downstream document image analysis system.
|