Developing a sense-tagged corpus of Chinese


Bibliographic Details
Main Authors: Shih-yin Liu, 劉詩音
Other Authors: S. J. Ker
Format: Others
Language: zh-TW
Published: 2008
Online Access: http://ndltd.ncl.edu.tw/handle/xr497g
Description
Summary: Master's thesis === Soochow University === Department of Information Science === 96 === Word sense disambiguation (WSD) is the process of tagging the specific meaning of a polysemous word in a given sentence. Many scholars have devoted themselves to WSD research, and WSD plays an important role in natural language processing. Sense-tagged corpora are very important to natural language processing, but few Chinese sense-tagged corpora exist at present. We therefore built an all-words Chinese sense-tagged corpus containing more than 110 thousand words. We selected 56 articles from the Sinica Corpus and tagged the polysemous words automatically by combining an n-gram method with a probability method; to achieve high accuracy, we then checked the tagged senses manually.