Document Image Content Analysis Based on Image Segmentation Results
Master's === National Chiao Tung University === Department of Computer and Information Science === 88 === In this study, a system for document analysis, including skew correction, segmentation, classification, understanding, and display, is proposed. In the skew correction phase, we propose a data reduction method for fast skew estimation using the Hough tran...
Main Authors: | Yia-Wen Lee, 李雅雯 |
---|---|
Other Authors: | Wen-Hsiang Tsai |
Format: | Others |
Language: | en_US |
Published: | 2000 |
Online Access: | http://ndltd.ncl.edu.tw/handle/62904826118100209726 |
id |
ndltd-TW-088NCTU0394039 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-088NCTU0394039 2015-10-13T10:59:52Z http://ndltd.ncl.edu.tw/handle/62904826118100209726 Document Image Content Analysis Based on Image Segmentation Results 基於影像切割結果對文件影像內容作分析 Yia-Wen Lee 李雅雯 Master's National Chiao Tung University Department of Computer and Information Science 88 In this study, a system for document analysis, including skew correction, segmentation, classification, understanding, and display, is proposed. In the skew correction phase, we propose a data reduction method for fast skew estimation using the Hough transform. In the segmentation phase, a bottom-up method for color document segmentation is adopted to obtain segmented blocks of the document image, including text blocks, text lines, and graphic blocks. In the classification phase, several features are used to extract titles, tables, and small enclosed articles from the segmented blocks. After block classification, titles are recognized by an adopted OCR system, and with a user interface designed in this study, the document can be displayed conveniently with its classified blocks. In the thumbnail creation phase, we propose a novel method to create a thumbnail image with better visual effects by keeping edge information in graphic and table blocks and by showing ASCII characters in titles. Experimental results demonstrate the feasibility of the proposed approach. Wen-Hsiang Tsai 蔡文祥 2000 thesis 73 en_US |
collection |
NDLTD |
language |
en_US |
format |
Others |
sources |
NDLTD |
description |
Master's === National Chiao Tung University === Department of Computer and Information Science === 88 === In this study, a system for document analysis, including skew correction, segmentation, classification, understanding, and display, is proposed. In the skew correction phase, we propose a data reduction method for fast skew estimation using the Hough transform. In the segmentation phase, a bottom-up method for color document segmentation is adopted to obtain segmented blocks of the document image, including text blocks, text lines, and graphic blocks. In the classification phase, several features are used to extract titles, tables, and small enclosed articles from the segmented blocks. After block classification, titles are recognized by an adopted OCR system, and with a user interface designed in this study, the document can be displayed conveniently with its classified blocks. In the thumbnail creation phase, we propose a novel method to create a thumbnail image with better visual effects by keeping edge information in graphic and table blocks and by showing ASCII characters in titles. Experimental results demonstrate the feasibility of the proposed approach. |
author2 |
Wen-Hsiang Tsai |
author_facet |
Wen-Hsiang Tsai Yia-Wen Lee 李雅雯 |
author |
Yia-Wen Lee 李雅雯 |
author_sort |
Yia-Wen Lee |
title |
Document Image Content Analysis Based on Image Segmentation Results |
publishDate |
2000 |
url |
http://ndltd.ncl.edu.tw/handle/62904826118100209726 |