WORD SENSE DISAMBIGUATION - A CONCEPTUAL ADAPTIVE APPROACH
Ph.D. === National Tsing Hua University === Department of Computer Science === 86 === Abstract: Word sense disambiguation for unrestricted text is one of the most difficult tasks in the field of computational linguistics. The crux of the problem is to discover a model that relates the intended sense of a...
Main Authors: | CHEN, JEN-NAN 陳振南 |
---|---|
Other Authors: | CHANG JASON S. |
Format: | Others |
Language: | zh-TW |
Published: | 1998 |
Online Access: | http://ndltd.ncl.edu.tw/handle/08124057900214650552 |
id |
ndltd-TW-086NTHU0392063 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-086NTHU0392063 2016-06-29T04:13:31Z http://ndltd.ncl.edu.tw/handle/08124057900214650552 WORD SENSE DISAMBIGUATION - A CONCEPTUAL ADAPTIVE APPROACH 詞彙語意歧義解析-適應性的概念式作法 CHEN, JEN-NAN 陳振南 Ph.D. National Tsing Hua University Department of Computer Science 86 Abstract: Word sense disambiguation for unrestricted text is one of the most difficult tasks in the field of computational linguistics. The crux of the problem is to discover a model that relates the intended sense of a word to its context. Such relations also support other computational semantics tasks, such as noun sequence interpretation and prepositional phrase attachment disambiguation. Building a knowledge base of semantic relations has two aspects. First, we need a representational scheme to divide and codify the word senses in a context. Both word-based and class-based representations of word senses and context have been used in the literature. Second, based upon the sense division, an algorithm is used to identify semantic relations in a lexical resource, a corpus or a machine-readable dictionary. Until recently, most knowledge bases were created manually by language experts. The rise of the statistical approach has provided an alternative: acquiring a knowledge base from corpora. In recent years, attention has shifted from word-based toward class-based representation, in the hope of providing broader coverage for unrestricted text. The goal of this dissertation is to make significant progress on constructing a class-based sense model from existing resources that is effective in representing semantic relations and resolving sense ambiguity. The model should have broad coverage of senses, making efficient representation of semantic relations possible. In particular, we describe a series of algorithms based on information retrieval techniques that cluster machine-readable dictionary (MRD) senses to provide a complete and appropriate sense division for word sense disambiguation (WSD).
One algorithm exploits the topical sense clusters available in Longman's Lexicon of Contemporary English. Another algorithm identifies the topics related to terms in the definition of a dictionary headword. In other words, senses are classified according to either their topics or their disambiguated genus terms. Therefore, our main tool for building both a sense division and semantic relations is the topical analysis of dictionary definitions. We also describe a general framework for adaptive conceptual word sense disambiguation. The learning process described here begins with an initial disambiguation step that uses knowledge derived from MRDs. An adaptation step follows to combine the initial knowledge base with knowledge gleaned from the partially disambiguated text. Once the knowledge base is adjusted to suit the text at hand, it is applied to the text again to finalize the disambiguation result. Definitions and example sentences from the Longman Dictionary of Contemporary English (LDOCE) are employed as training materials for WSD, while passages from the Brown corpus and the Wall Street Journal are used for testing. Finally, we report on several experiments illustrating the effectiveness of the adaptive approach. CHANG JASON S. 張俊盛 1998 thesis 150 zh-TW |
collection |
NDLTD |
language |
zh-TW |
format |
Others |
sources |
NDLTD |
description |
Ph.D. === National Tsing Hua University === Department of Computer Science === 86 === Abstract: Word sense disambiguation for unrestricted text is one
of the most difficult tasks in the field of computational
linguistics. The crux of the problem is to discover a model
that relates the intended sense of a word to its context.
Such relations also support other computational semantics tasks,
such as noun sequence interpretation and prepositional phrase
attachment disambiguation. Building a knowledge base of
semantic relations has two aspects. First, we need a
representational scheme to divide and codify the word senses in
a context. Both word-based and class-based representations of
word senses and context have been used in the literature.
Second, based upon the sense division, an algorithm is used to
identify semantic relations in a lexical resource, a corpus or a
machine-readable dictionary. Until recently, most knowledge
bases were created manually by language experts. The rise
of the statistical approach has provided an alternative:
acquiring a knowledge base from corpora. In recent years,
attention has shifted from word-based toward class-based
representation, in the hope of providing broader coverage for
unrestricted text. The goal of this dissertation is to make
significant progress on constructing a class-based sense model
from existing resources that is effective in representing
semantic relations and resolving sense ambiguity. The model
should have broad coverage of senses, making efficient
representation of semantic relations possible. In particular,
we describe a series of algorithms based on information
retrieval techniques that cluster machine-readable dictionary
(MRD) senses to provide a complete and appropriate sense
division for word sense disambiguation (WSD). One algorithm
exploits the topical sense clusters available in Longman's
Lexicon of Contemporary English. Another algorithm identifies
the topics related to terms in the definition of a dictionary
headword. In other words, senses are classified according to
either their topics or their disambiguated genus terms.
Therefore, our main tool for building both a sense division and
semantic relations is the topical analysis of dictionary
definitions. We also describe a general framework for adaptive
conceptual word sense disambiguation. The learning process
described here begins with an initial disambiguation step that
uses knowledge derived from MRDs. An adaptation step follows to
combine the initial knowledge base with knowledge gleaned from
the partially disambiguated text. Once the knowledge base is
adjusted to suit the text at hand, it is applied to the text
again to finalize the disambiguation result. Definitions and
example sentences from the Longman Dictionary of Contemporary
English (LDOCE) are employed as training materials for WSD,
while passages from the Brown corpus and the Wall Street Journal
are used for testing. Finally, we report on several experiments
illustrating the effectiveness of the adaptive approach.
|
author2 |
CHANG JASON S. |
author_facet |
CHANG JASON S. CHEN, JEN-NAN 陳振南 |
author |
CHEN, JEN-NAN 陳振南 |
spellingShingle |
CHEN, JEN-NAN 陳振南 WORD SENSE DISAMBIGUATION - A CONCEPTUAL ADAPTIVE APPROACH |
author_sort |
CHEN, JEN-NAN |
title |
WORD SENSE DISAMBIGUATION - A CONCEPTUAL ADAPTIVE APPROACH |
title_short |
WORD SENSE DISAMBIGUATION - A CONCEPTUAL ADAPTIVE APPROACH |
title_full |
WORD SENSE DISAMBIGUATION - A CONCEPTUAL ADAPTIVE APPROACH |
title_fullStr |
WORD SENSE DISAMBIGUATION - A CONCEPTUAL ADAPTIVE APPROACH |
title_full_unstemmed |
WORD SENSE DISAMBIGUATION - A CONCEPTUAL ADAPTIVE APPROACH |
title_sort |
word sense disambiguation - a conceptual adaptive approach |
publishDate |
1998 |
url |
http://ndltd.ncl.edu.tw/handle/08124057900214650552 |
work_keys_str_mv |
AT chenjennan wordsensedisambiguationaconceptualadaptiveapproach AT chénzhènnán wordsensedisambiguationaconceptualadaptiveapproach AT chenjennan cíhuìyǔyìqíyìjiěxīshìyīngxìngdegàiniànshìzuòfǎ AT chénzhènnán cíhuìyǔyìqíyìjiěxīshìyīngxìngdegàiniànshìzuòfǎ |
_version_ |
1718326109353803776 |
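
The three-step adaptive procedure outlined in the abstract (initial disambiguation from dictionary-derived knowledge, adaptation on the partially disambiguated text, then a final pass) can be illustrated with a minimal sketch. This is not the dissertation's algorithm: the overlap-count scoring, the confidence margin, the function names, and the toy topic classes and contexts below are all assumptions made for illustration only.

```python
from collections import Counter

def score(topic, context, kb):
    """Sum the weights of the context words associated with a topic class."""
    return sum(kb[topic][w] for w in context)

def disambiguate(word, context, senses, kb):
    """Return (best_topic, margin over the runner-up) for one occurrence."""
    ranked = sorted(((score(t, context, kb), t) for t in senses[word]),
                    reverse=True)
    best_score, best_topic = ranked[0]
    runner_up = ranked[1][0] if len(ranked) > 1 else 0
    return best_topic, best_score - runner_up

def adapt(instances, senses, kb, min_margin=1):
    """Copy the knowledge base and add the context words of instances
    that the initial knowledge base resolves with a clear margin."""
    adapted = {t: Counter(c) for t, c in kb.items()}
    for word, context in instances:
        topic, margin = disambiguate(word, context, senses, kb)
        if margin >= min_margin:
            adapted[topic].update(context)
    return adapted

def adaptive_wsd(instances, senses, kb, min_margin=1):
    """Initial pass + adaptation, then a final pass with the adapted KB."""
    adapted = adapt(instances, senses, kb, min_margin)
    return [disambiguate(w, ctx, senses, adapted)[0] for w, ctx in instances]

# Hypothetical toy data: two topic classes for "bank", with word weights
# standing in for knowledge extracted from dictionary definitions.
senses = {"bank": ["FINANCE", "GEOGRAPHY"]}
kb = {
    "FINANCE": Counter({"money": 1, "loan": 1}),
    "GEOGRAPHY": Counter({"river": 1, "water": 1}),
}
instances = [
    ("bank", ["loan", "money", "teller"]),
    ("bank", ["river", "water", "shore"]),
    ("bank", ["teller", "queue"]),  # no overlap with the initial KB
]

result = adaptive_wsd(instances, senses, kb)
print(result)
```

In this toy run the third context shares no words with the initial knowledge base, so the first pass cannot separate the two classes. After adaptation, "teller" has been learned from the first, confidently resolved context, which lets the final pass assign the third occurrence to the same class.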