Using contextual information in clustering Chinese word senses



Bibliographic Details
Main Authors: Chou, Tzu Hao, 周子皓
Other Authors: Liu, Chao Lin
Format: Others
Language: zh-TW
Published: 2019
Online Access: http://ndltd.ncl.edu.tw/handle/h84ctr
Description
Summary: Master's === National Chengchi University === Department of Computer Science === 108 === Lexical ambiguity is a common language phenomenon. In English, the word "bank" can refer to the institution where we save our money or to a river bank. In Chinese, the term 黃牛 (literally "cattle") can refer to either cattle or a scalper. At present, people learn the senses of an ambiguous term from either a dictionary or a search system, but neither is always sufficient. Dictionaries follow a standard procedure for including content and cannot be updated frequently once published, so they can fail to include new definitions or colloquial usages. Search systems, taking Academia Sinica's database as an example, require users to read through all the related sentences to understand the related meanings. Current research on lexical ambiguity likewise requires researchers to examine sentences, extract term meanings, and cluster them one by one. In this study, the best clustering model and variables are selected based on purity values computed against references provided by the user. The selected clustering model is then used to find more terms and references with similar meanings in the database. Finally, the terms are clustered according to the selected meanings. This study also observes whether different types of lexical ambiguity affect the results of clustering and embedding. It therefore takes homonyms such as amazon and apple and polysemous words such as departure and pressure as research subjects. The aim is to reduce the time researchers need to examine sentences, extract term meanings, and cluster them one by one in lexical ambiguity research.
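The purity-based selection step mentioned in the summary can be illustrated with a minimal sketch. It uses the standard definition of purity (the share of items that carry the majority reference label of their cluster) and picks the candidate clustering with the highest value. The function names `purity` and `select_best`, the candidate names `model_a` and `model_b`, and the toy sense labels are all hypothetical; the record does not say which clustering models or variables the thesis actually compared.

```python
from collections import Counter, defaultdict


def purity(cluster_ids, reference_labels):
    """Share of items that carry the majority reference label of their cluster."""
    if len(cluster_ids) != len(reference_labels):
        raise ValueError("inputs must have the same length")
    if not cluster_ids:
        raise ValueError("inputs must be non-empty")
    # Count reference labels inside each cluster.
    by_cluster = defaultdict(Counter)
    for cluster_id, label in zip(cluster_ids, reference_labels):
        by_cluster[cluster_id][label] += 1
    # Each cluster contributes the size of its largest label group.
    majority_total = sum(max(counts.values()) for counts in by_cluster.values())
    return majority_total / len(cluster_ids)


def select_best(candidates, reference_labels):
    """Return (name, purity) for the candidate clustering with the highest purity."""
    scores = {
        name: purity(cluster_ids, reference_labels)
        for name, cluster_ids in candidates.items()
    }
    best_name = max(scores, key=scores.get)
    return best_name, scores[best_name]


# Hypothetical user-provided sense labels for six sentences containing "apple".
reference = ["fruit", "fruit", "company", "company", "company", "fruit"]

# Hypothetical cluster assignments produced by two candidate models.
candidates = {
    "model_a": [0, 0, 1, 1, 1, 0],
    "model_b": [0, 1, 1, 1, 0, 0],
}

best_name, best_score = select_best(candidates, reference)
print(best_name, best_score)
```

Purity rises as the number of clusters grows and reaches 1.0 when every item sits in its own cluster, so a comparison like this is most meaningful when the candidates use a similar number of clusters.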