Article Analysis and Title Character Segmentation in Chinese Newspaper

Bibliographic Details
Main Authors: Shou-Lun Yin, 尹守綸
Other Authors: Hsi-Jian Lee
Format: Others
Language: zh-TW
Published: 2000
Online Access: http://ndltd.ncl.edu.tw/handle/61666020476583266398
Description
Summary: Master's thesis === National Chiao Tung University === Department of Computer Science and Information Engineering === 88 === In this thesis, we present an automatic system that efficiently segments title characters in newspapers. The system contains two modules: article analysis and character segmentation. In the article analysis module, we first perform image reduction and connected-component extraction. Large connected components are then classified as picture blocks, table blocks, graph blocks, or frame blocks, and small components are classified as text components or title components. After the large blocks are classified, we merge all text components into text blocks and all title components into title blocks. Each newspaper article is then extracted by performing six relation tests. In the character segmentation module, we extract bi-lines from the title blocks and then segment the Chinese characters, English letters, and numerals in the title lines. Touching characters are separated according to the average character size. In our experiments, the character segmentation rate is about 98.9% and the correct block classification rate is about 97%, which shows the effectiveness of the proposed system.
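The summary says that touching characters are separated according to the average size but does not give the exact rule. The following is only a minimal sketch of one way such a step could work, not the thesis's method. It assumes each candidate character in a horizontal title line is described by a hypothetical `(left, right)` pixel extent, uses the median width as the "average size", and cuts any box wider than a hypothetical `tolerance` multiple of that width into equal parts.

```python
from statistics import median


def split_touching(boxes, tolerance=1.5):
    """Split over-wide boxes in a title line into equal-width parts.

    boxes: list of (left, right) horizontal extents, one per candidate
    character. The median width stands in for the "average size"
    (an assumption; the source does not specify the estimator).
    """
    if not boxes:
        return []
    widths = [right - left for left, right in boxes]
    typical = median(widths)
    if typical <= 0:
        # Degenerate input: nothing sensible to compare against.
        return list(boxes)

    result = []
    for left, right in boxes:
        width = right - left
        if width > tolerance * typical:
            # Estimate how many characters are merged in this box.
            parts = max(1, round(width / typical))
        else:
            parts = 1
        step = width / parts
        for i in range(parts):
            result.append((round(left + i * step), round(left + (i + 1) * step)))
    return result


if __name__ == "__main__":
    line = [(0, 20), (22, 42), (44, 86), (88, 108)]
    print(split_touching(line))
```

In this example the third box is about twice the typical width, so it is cut into two halves while the other boxes are left unchanged. A real system would likely refine the cut positions using the projection profile of the image rather than cutting at equal intervals.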