WORD SENSE DISAMBIGUATION - A CONCEPTUAL ADAPTIVE APPROACH

Doctoral === National Tsing Hua University === Department of Computer Science === 86 === Abstract: Word sense disambiguation for unrestricted text is one of the most difficult tasks in the field of computational linguistics. The crux of the problem is to discover a model that relates the intended sense of a...

Full description

Bibliographic Details
Main Authors: CHEN, JEN-NAN, 陳振南
Other Authors: CHANG JASON S.
Format: Others
Language: zh-TW
Published: 1998
Online Access: http://ndltd.ncl.edu.tw/handle/08124057900214650552
id ndltd-TW-086NTHU0392063
record_format oai_dc
spelling ndltd-TW-086NTHU0392063 2016-06-29T04:13:31Z http://ndltd.ncl.edu.tw/handle/08124057900214650552 WORD SENSE DISAMBIGUATION - A CONCEPTUAL ADAPTIVE APPROACH 詞彙語意歧義解析-適應性的概念式作法 CHEN, JEN-NAN 陳振南 Doctoral National Tsing Hua University Department of Computer Science 86 CHANG JASON S. 張俊盛 1998 thesis 150 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Doctoral === National Tsing Hua University === Department of Computer Science === 86 === Abstract: Word sense disambiguation for unrestricted text is one of the most difficult tasks in the field of computational linguistics. The crux of the problem is to discover a model that relates the intended sense of a word to its context. Such relations support other computational semantics tasks, such as noun sequence interpretation and prepositional phrase attachment disambiguation. Building a knowledge base of semantic relations has two aspects. First, we need a representational scheme to divide and codify the word senses in a context. Both word-based and class-based representations of word senses and context have been used in the literature. Second, based on the sense division, an algorithm is used to identify semantic relations in a lexical resource, a corpus or a machine-readable dictionary. Until recently, most knowledge bases were created manually by language experts. The rise of the statistical approach has provided an alternative: acquiring a knowledge base from corpora. In recent years, attention has shifted from word-based towards class-based representation, in the hope of providing broader coverage for unrestricted text.
The goal of this dissertation is to make significant progress on constructing a class-based sense model from existing resources that is effective in representing semantic relations and resolving sense ambiguity. The model should have broad coverage of senses, making efficient representation of semantic relations possible. In particular, we describe a series of algorithms based on information retrieval techniques that cluster machine-readable dictionary (MRD) senses to provide a complete and appropriate sense division for word sense disambiguation (WSD). One algorithm exploits the topical sense clusters available in Longman's Lexicon of Contemporary English. Another algorithm identifies the topics related to terms in the definition of a dictionary headword. In other words, senses are classified according to either their topics or their disambiguated genus. Our main tool for building up a sense division, as well as semantic relations, is therefore topical analysis of dictionary definitions.
We also describe a general framework for adaptive conceptual word sense disambiguation. The learning process begins with an initial disambiguation step that uses knowledge derived from MRDs. An adaptation step follows, combining the initial knowledge base with knowledge gleaned from the partially disambiguated text. Once the knowledge base is adjusted to suit the text at hand, it is applied to the text again to finalize the disambiguation result. Definitions and example sentences from the Longman Dictionary of Contemporary English (LDOCE) are employed as training materials for WSD, while passages from the Brown corpus and the Wall Street Journal are used for testing. Finally, we report on several experiments illustrating the effectiveness of the adaptive approach.
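The three-step process described in the abstract (initial disambiguation, adaptation, final disambiguation) can be illustrated with a minimal sketch. This is not the dissertation's algorithm: the topic labels, the toy knowledge base `INITIAL_KB`, the sense inventory `SENSES`, the sample sentences, and the simple word-overlap scoring are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical knowledge base: topic -> weighted words, standing in for
# knowledge derived from dictionary definitions and example sentences.
INITIAL_KB = {
    "FINANCE": Counter({"money": 3, "loan": 2, "account": 2, "deposit": 2}),
    "RIVER": Counter({"water": 3, "shore": 2, "stream": 2, "slope": 1}),
}

# Hypothetical sense inventory: ambiguous word -> candidate topics.
SENSES = {"bank": ["FINANCE", "RIVER"]}

# Hypothetical tokenized text to disambiguate.
SENTENCES = [
    ["bank", "approved", "loan", "for", "merger"],
    ["bank", "announced", "merger"],
    ["bank", "of", "the", "stream"],
]


def disambiguate(kb, sentences):
    """Label each ambiguous word with its best topic, or None if undecided."""
    results = []
    for sent in sentences:
        labels = {}
        for word in sent:
            if word not in SENSES:
                continue
            context = [w for w in sent if w != word]
            scores = {t: sum(kb[t][w] for w in context) for t in SENSES[word]}
            ranked = sorted(scores.values(), reverse=True)
            # Commit only when one topic has a strictly highest, positive score.
            decided = ranked[0] > 0 and (len(ranked) == 1 or ranked[0] > ranked[1])
            labels[word] = max(scores, key=scores.get) if decided else None
        results.append(labels)
    return results


def adapt(kb, sentences, labels):
    """Return a copy of kb extended with context words of confidently labeled words."""
    adapted = {topic: Counter(words) for topic, words in kb.items()}
    for sent, sent_labels in zip(sentences, labels):
        for word, topic in sent_labels.items():
            if topic is None:
                continue
            for w in sent:
                if w != word:
                    adapted[topic][w] += 1
    return adapted


def adaptive_wsd(kb, sentences):
    """Initial pass, adaptation to the text at hand, then a final pass."""
    first_pass = disambiguate(kb, sentences)
    adapted_kb = adapt(kb, sentences, first_pass)
    return first_pass, disambiguate(adapted_kb, sentences)
```

In this toy setup, the second sentence shares no words with the initial knowledge base, so the first pass leaves "bank" undecided. The adaptation step learns "merger" as a FINANCE context word from the first sentence, which lets the final pass resolve it.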
author2 CHANG JASON S.
author_facet CHANG JASON S.
CHEN, JEN-NAN
陳振南
author CHEN, JEN-NAN
陳振南
spellingShingle CHEN, JEN-NAN
陳振南
WORD SENSE DISAMBIGUATION - A CONCEPTUAL ADAPTIVE APPROACH
author_sort CHEN, JEN-NAN
title WORD SENSE DISAMBIGUATION - A CONCEPTUAL ADAPTIVE APPROACH
title_short WORD SENSE DISAMBIGUATION - A CONCEPTUAL ADAPTIVE APPROACH
title_full WORD SENSE DISAMBIGUATION - A CONCEPTUAL ADAPTIVE APPROACH
title_fullStr WORD SENSE DISAMBIGUATION - A CONCEPTUAL ADAPTIVE APPROACH
title_full_unstemmed WORD SENSE DISAMBIGUATION - A CONCEPTUAL ADAPTIVE APPROACH
title_sort word sense disambiguation - a conceptual adaptive approach
publishDate 1998
url http://ndltd.ncl.edu.tw/handle/08124057900214650552
work_keys_str_mv AT chenjennan wordsensedisambiguationaconceptualadaptiveapproach
AT chénzhènnán wordsensedisambiguationaconceptualadaptiveapproach
AT chenjennan cíhuìyǔyìqíyìjiěxīshìyīngxìngdegàiniànshìzuòfǎ
AT chénzhènnán cíhuìyǔyìqíyìjiěxīshìyīngxìngdegàiniànshìzuòfǎ
_version_ 1718326109353803776