Character segmentation and form recognition for table-form documents

Bibliographic Details
Main Authors: Wu, Yu-Fang, 吳昱芳
Other Authors: Din-Chang Tseng
Format: Others
Language: zh-TW
Published: 1997
Online Access: http://ndltd.ncl.edu.tw/handle/22161336545561017860
Description
Summary: Master's thesis === National Central University === Institute of Computer Science and Information Engineering === 85 === We propose a table-form document analysis system consisting of four modules; the work described in this thesis covers the fourth module. Algorithms for character segmentation and table-form recognition are proposed. First, we generate connected components as the basic units for character segmentation. Because many Chinese characters consist of more than one radical, we group isolated radicals into complete characters using several heuristic rules. We also propose a projection-profile method to solve the touching-character problem. During character segmentation, connected components are combined into complete, meaningful character components. We then classify the processed components into text and graphics and extract field attributes. Finally, a hierarchical recognition procedure determines whether an input form document matches a document in the database, based on the extracted structural features and field attributes. The performance of the proposed algorithms is evaluated on a large number of table-form images.
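
The projection-profile idea mentioned in the summary can be illustrated with a minimal sketch. This is not the thesis's algorithm, whose details the record does not give. It assumes a binary image stored as rows of 0 (background) and 1 (ink), and it assumes the cut is placed at the column with the fewest ink pixels. The names `vertical_profile`, `split_touching`, and `margin` are hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]  # rows of pixels: 0 = background, 1 = ink


def vertical_profile(img: Image) -> List[int]:
    """Count the ink pixels in each column of a binary image."""
    if not img:
        return []
    return [sum(column) for column in zip(*img)]


def split_touching(img: Image, margin: int = 1) -> Tuple[Image, Image]:
    """Split a component into left and right parts at its thinnest column.

    Assumption: two touching characters are joined by a narrow bridge, so
    the column with the smallest ink count is a plausible cut position.
    `margin` (assumed >= 1) columns at each edge are never chosen as the cut.
    """
    profile = vertical_profile(img)
    width = len(profile)
    if width <= 2 * margin:
        raise ValueError("component too narrow to split")
    # Search only the interior columns; ties resolve to the leftmost column.
    cut = min(range(margin, width - margin), key=lambda x: profile[x])
    left = [row[:cut] for row in img]
    right = [row[cut:] for row in img]  # the cut column goes to the right part
    return left, right


# Two box-like glyphs joined by a one-pixel-high bridge in the third row.
sample: Image = [
    [1, 1, 1, 0, 0, 1, 1, 1],
    [1, 0, 1, 0, 0, 1, 0, 1],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, 0, 1, 0, 0, 1, 0, 1],
]
left_part, right_part = split_touching(sample)
```

A practical system would likely restrict the search to a window around the expected character width and verify the two parts afterwards; the sketch omits both steps.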