Special Typeface Identification in Chinese Document Images

Master's === Da-Yeh University === Master's Program, Department of Information Management === 93 === Optical character recognition (OCR) has been a well-known research subject for the past twenty years. Digitizing paper documents with OCR techniques reduces the space needed to store them. The resulting document images can be classified and retrieved conveniently....

Bibliographic Details
Main Authors:	Lin Yu-Yuan, 林裕淵
Other Authors:	Tseng Yi-Hong
Format:	Others
Language:	zh-TW
Published:	2005
Online Access:	http://ndltd.ncl.edu.tw/handle/40694443264057230071

id	ndltd-TW-093DYU00396018
record_format	oai_dc
spelling	ndltd-TW-093DYU00396018 2015-10-13T11:39:44Z http://ndltd.ncl.edu.tw/handle/40694443264057230071 Special Typeface Identification in Chinese Document Images 中文文件影像中之特殊字體偵測 Lin Yu-Yuan 林裕淵 Master's, Da-Yeh University, Master's Program, Department of Information Management, 93 Optical character recognition (OCR) has been a well-known research subject for the past twenty years. Digitizing paper documents with OCR techniques reduces the space needed to store them, and the resulting document images can be classified and retrieved conveniently. At present, commercial OCR products are purported to provide satisfactory results, with recognition accuracy over 90%. That accuracy is generally measured on printed characters in normal typefaces. For special typefaces such as italic, underlined, hollow, and boldface, however, commercial OCR systems achieve poor recognition accuracy. Because the number of Chinese characters is large, recognition with a multi-engine OCR system is slow. This paper proposes an approach to detect all characters in special typefaces. In the proposed typeface identification system, text lines and character components are extracted by analyzing the projection profiles of text block images. Then several characteristics, such as component sizes, gaps between adjacent components, stroke widths, and black run lengths, are computed and analyzed to identify the typeface of each character. Afterward, a specific recognition engine is applied to each unknown character according to its typeface identification result. Tseng Yi-Hong 曾逸鴻 2005 degree thesis ; thesis 51 zh-TW
collection	NDLTD
language	zh-TW
format	Others
sources	NDLTD
description	Master's === Da-Yeh University === Master's Program, Department of Information Management === 93 === Optical character recognition (OCR) has been a well-known research subject for the past twenty years. Digitizing paper documents with OCR techniques reduces the space needed to store them, and the resulting document images can be classified and retrieved conveniently. At present, commercial OCR products are purported to provide satisfactory results, with recognition accuracy over 90%. That accuracy is generally measured on printed characters in normal typefaces. For special typefaces such as italic, underlined, hollow, and boldface, however, commercial OCR systems achieve poor recognition accuracy. Because the number of Chinese characters is large, recognition with a multi-engine OCR system is slow. This paper proposes an approach to detect all characters in special typefaces. In the proposed typeface identification system, text lines and character components are extracted by analyzing the projection profiles of text block images. Then several characteristics, such as component sizes, gaps between adjacent components, stroke widths, and black run lengths, are computed and analyzed to identify the typeface of each character. Afterward, a specific recognition engine is applied to each unknown character according to its typeface identification result.
author2	Tseng Yi-Hong
author_facet	Tseng Yi-Hong Lin Yu-Yuan 林裕淵
author	Lin Yu-Yuan 林裕淵
spellingShingle	Lin Yu-Yuan 林裕淵 Special Typeface Identification in Chinese Document Images
author_sort	Lin Yu-Yuan
title	Special Typeface Identification in Chinese Document Images
title_short	Special Typeface Identification in Chinese Document Images
title_full	Special Typeface Identification in Chinese Document Images
title_fullStr	Special Typeface Identification in Chinese Document Images
title_full_unstemmed	Special Typeface Identification in Chinese Document Images
title_sort	special typeface identification in chinese document images
publishDate	2005
url	http://ndltd.ncl.edu.tw/handle/40694443264057230071
work_keys_str_mv	AT linyuyuan specialtypefaceidentificationinchinesedocumentimages AT línyùyuān specialtypefaceidentificationinchinesedocumentimages AT linyuyuan zhōngwénwénjiànyǐngxiàngzhōngzhītèshūzìtǐzhēncè AT línyùyuān zhōngwénwénjiànyǐngxiàngzhōngzhītèshūzìtǐzhēncè
_version_	1716847054309818368
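
The abstract mentions two measurements that are easy to illustrate: projection profiles for extracting text lines and character components, and black run lengths as a typeface feature. The following is a minimal sketch under stated assumptions, not the thesis's implementation. It assumes a binarized image stored as a list of rows with 1 for a black pixel and 0 for white, and all function names (`projection_profile`, `nonzero_runs`, `segment_components`, `black_run_lengths`) are hypothetical. The abstract gives no thresholds or decision rules for classifying typefaces, so the sketch stops at feature extraction.

```python
from typing import List, Tuple

# Assumed representation: a binarized image as rows of 0/1 values,
# where 1 is a black (ink) pixel and 0 is white background.
Image = List[List[int]]
Span = Tuple[int, int]  # half-open (start, end) index range


def projection_profile(image: Image, axis: str) -> List[int]:
    """Count black pixels per row (axis='rows') or per column (axis='cols')."""
    if axis == "rows":
        return [sum(row) for row in image]
    if axis == "cols":
        return [sum(col) for col in zip(*image)]
    raise ValueError("axis must be 'rows' or 'cols'")


def nonzero_runs(profile: List[int]) -> List[Span]:
    """Return the half-open spans where the profile is non-zero."""
    spans: List[Span] = []
    start = None
    for i, value in enumerate(profile):
        if value > 0 and start is None:
            start = i                      # a run of ink begins
        elif value == 0 and start is not None:
            spans.append((start, i))       # the run ended at a blank gap
            start = None
    if start is not None:
        spans.append((start, len(profile)))
    return spans


def segment_components(image: Image) -> List[Tuple[Span, List[Span]]]:
    """Split a text block into lines, then each line into components.

    Lines come from gaps in the row profile; components come from gaps
    in the column profile of each line.
    """
    result = []
    for top, bottom in nonzero_runs(projection_profile(image, "rows")):
        line = image[top:bottom]
        columns = nonzero_runs(projection_profile(line, "cols"))
        result.append(((top, bottom), columns))
    return result


def black_run_lengths(image: Image) -> List[int]:
    """Lengths of horizontal runs of consecutive black pixels, row by row."""
    lengths: List[int] = []
    for row in image:
        lengths.extend(end - start for start, end in nonzero_runs(row))
    return lengths


# Toy text block: one line with two components, then a second line.
BLOCK = [
    [0, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 1, 1, 1],
    [1, 1, 0, 0, 1, 0, 1],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0],
]

if __name__ == "__main__":
    print(segment_components(BLOCK))
    print(black_run_lengths(BLOCK))
```

Component widths and the gaps between adjacent components follow directly from the column spans. Stroke width could plausibly be estimated from the distribution of run lengths, but how the thesis actually derives it is not stated in the abstract.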

Similar Items