Document Image Content Analysis Based on Image Segmentation Results

Bibliographic Details
Main Authors: Yia-Wen Lee, 李雅雯
Other Authors: Wen-Hsiang Tsai
Format: Others
Language: en_US
Published: 2000
Online Access: http://ndltd.ncl.edu.tw/handle/62904826118100209726
Description
Summary: Master's === National Chiao Tung University === Department of Computer and Information Science === 88 === This study proposes a system for document image analysis that covers skew correction, segmentation, classification, understanding, and display. In the skew correction phase, we propose a data reduction method for fast skew estimation using the Hough transform. In the segmentation phase, a bottom-up method for color document segmentation is adopted to obtain the segmented blocks of the document image, including text blocks, text lines, and graphic blocks. In the classification phase, several features are used to extract titles, tables, and small enclosed articles from the segmented blocks. After block classification, titles are recognized by an existing OCR system, and a user interface designed in this study displays the document conveniently together with its classified blocks. In the thumbnail creation phase, we propose a novel method that creates a thumbnail image with better visual effects by preserving edge information in graphic and table blocks and by rendering titles as ASCII characters. Experimental results demonstrate the feasibility of the proposed approach.
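
Illustrative sketch (not taken from the thesis): the summary mentions Hough-transform skew estimation preceded by a data reduction step, but does not describe either in detail. The Python code below shows one common way such a pipeline can be arranged, under stated assumptions. The reduction rule (keep only the bottom pixel of each vertical run of foreground pixels), the function names `reduce_points` and `estimate_skew`, the angle range, and the sum-of-squares peak score are all hypothetical choices made for illustration and should not be read as the author's method.

```python
import math
from collections import Counter


def reduce_points(image):
    """Assumed data reduction: keep only the bottom pixel of each vertical
    run of foreground pixels, so the accumulator sees far fewer points.
    `image` is a list of rows; truthy values are foreground."""
    height = len(image)
    points = []
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if value and (y + 1 == height or not image[y + 1][x]):
                points.append((x, y))
    return points


def estimate_skew(points, max_deg=10.0, step_deg=0.5):
    """Hough-style vote over near-horizontal lines.

    For each candidate angle, every point votes for the offset
    rho = y*cos(theta) - x*sin(theta). When theta matches the skew,
    points on the same text line share one rho bin, so the sum of
    squared bin counts peaks. Returns the best angle in degrees,
    measured in image coordinates (y grows downward)."""
    best_angle, best_score = 0.0, -1
    steps = int(round(2 * max_deg / step_deg))
    for i in range(steps + 1):
        deg = -max_deg + i * step_deg
        theta = math.radians(deg)
        sin_t, cos_t = math.sin(theta), math.cos(theta)
        votes = Counter(round(y * cos_t - x * sin_t) for x, y in points)
        score = sum(count * count for count in votes.values())
        if score > best_score:
            best_angle, best_score = deg, score
    return best_angle
```

In a typical use, a binarized page would first be passed through `reduce_points`, and the resulting point list handed to `estimate_skew`; the page would then be rotated by the negative of the returned angle. Restricting the search to a small range around horizontal keeps the accumulator small, which is in the same spirit as the speed-up the summary describes, although the thesis may achieve it differently.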