Character segmentation and form recognition for table-form documents

Master's === National Central University === Graduate Institute of Computer Science and Information Engineering === 85 === We have proposed a table-form document analysis system. The system has four modules; the work in this thesis covers the fourth module. In this paper, algorithms for character segmentation and table-...

Bibliographic Details
Main Authors: Wu, Yu-Fang, 吳昱芳
Other Authors: Din-Chang Tseng
Format: Others
Language: zh-TW
Published: 1997
Online Access: http://ndltd.ncl.edu.tw/handle/22161336545561017860
id ndltd-TW-085NCU00392020
record_format oai_dc
spelling ndltd-TW-085NCU00392020 2015-10-13T17:59:41Z http://ndltd.ncl.edu.tw/handle/22161336545561017860 Character segmentation and form recognition for table-form documents 表格文件的文字分割與表格辨識 Wu, Yu-Fang 吳昱芳 Master's, National Central University, Graduate Institute of Computer Science and Information Engineering, 85 Din-Chang Tseng 曾定章 1997 thesis 87 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's === National Central University === Graduate Institute of Computer Science and Information Engineering === 85 === We have proposed a table-form document analysis system consisting of four modules; the work in this thesis covers the fourth module. Algorithms for character segmentation and table-form recognition are proposed. First, we generate connected components as the basic units for character segmentation. Because many Chinese characters consist of more than one radical, we group isolated radicals into complete Chinese characters using several heuristic rules. We also propose a projection-profile method to solve the touching-character problem. During character segmentation, connected components are thereby merged into complete, meaningful character components. We then classify the processed components into text and graphics and extract field attributes. Finally, a hierarchical recognition scheme determines whether an input form document matches a document in the database, based on the extracted structural features and field attributes. The performance of the proposed algorithms is evaluated on a large number of table-form images.
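The abstract names two building blocks, connected-component generation and a projection-profile method for touching characters, without giving their details. The following is a minimal generic sketch of both ideas, not the thesis's actual algorithm. The function names (`connected_components`, `vertical_profile`, `split_touching`), the 8-connectivity choice, the `max_width` threshold, and the rule of cutting at the lowest-profile column in the central half of the box are all assumptions made for illustration. The image is assumed to be a binary grid (list of lists) with 1 for ink.

```python
from collections import deque


def connected_components(img):
    """Return bounding boxes (top, left, bottom, right), inclusive,
    of the 8-connected foreground (value 1) regions of a binary image."""
    rows, cols = len(img), len(img[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if img[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and img[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def vertical_profile(img, box):
    """Count foreground pixels in each column of the box."""
    top, left, bottom, right = box
    return [sum(img[y][x] for y in range(top, bottom + 1))
            for x in range(left, right + 1)]


def split_touching(img, box, max_width):
    """Split a box wider than max_width (hypothetical threshold) at the
    column with the fewest ink pixels in the central half of the box.
    The cut column is assigned to the right-hand piece."""
    top, left, bottom, right = box
    width = right - left + 1
    if width <= max_width:
        return [box]
    profile = vertical_profile(img, box)
    lo = max(1, width // 4)          # keep the left piece non-empty
    hi = width - width // 4
    cut = min(range(lo, hi), key=lambda i: profile[i])
    return [(top, left, bottom, left + cut - 1),
            (top, left + cut, bottom, right)]
```

As a usage example, two 3x3 blobs joined by a one-pixel-high bridge form a single connected component; calling `split_touching` on its box with `max_width=4` cuts at the start of the bridge, where the column profile drops. Grouping isolated radicals into one character would run in the opposite direction (merging nearby boxes), and the abstract does not state the heuristic rules used for that step.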
author2 Din-Chang Tseng
author_facet Din-Chang Tseng
Wu, Yu-Fang
吳昱芳
author Wu, Yu-Fang
吳昱芳
spellingShingle Wu, Yu-Fang
吳昱芳
Character segmentation and form recognition for table-form documents
author_sort Wu, Yu-Fang
title Character segmentation and form recognition for table-form documents
publishDate 1997
url http://ndltd.ncl.edu.tw/handle/22161336545561017860