Corpus-Based Coherence Relation Tagging in Chinese Discourse

Master's === National Chiao Tung University === Institute of Computer Science and Engineering === 94 === Discourse analysis plays an important role in document understanding and is crucial for clarifying the propositions and logical structure of a document. This thesis therefore aims to build an automated Chinese discourse tagging system by collecting and ex...

Bibliographic Details
Main Authors: Shou-Yi Cheng, 鄭守益
Other Authors: Tyne Liang
Format: Others
Language: zh-TW
Published: 2006
Online Access: http://ndltd.ncl.edu.tw/handle/96375712768665040446
id ndltd-TW-094NCTU5394061
record_format oai_dc
spelling ndltd-TW-094NCTU53940612016-05-27T04:18:35Z http://ndltd.ncl.edu.tw/handle/96375712768665040446 Corpus-Based Coherence Relation Tagging in Chinese Discourse 以語料為基礎的中文語篇連貫關係自動標記 Shou-Yi Cheng 鄭守益 Master's National Chiao Tung University Institute of Computer Science and Engineering 94 Discourse analysis plays an important role in document understanding and is crucial for clarifying the propositions and logical structure of a document. This thesis therefore aims to build an automated Chinese discourse tagging system by collecting and expanding the coherence features of discourse based on a corpus study and by designing the corresponding rules. We used the written documents from the Sinica Balanced Corpus 3.0 as our mining corpus. It includes 7265 articles covering news, biographies, essays, letters, commentary and illustration manuals. We separately mine cue terms, consecutive POS tags and specific punctuation marks for nine types of rhetorical relations in Chinese discourse: Coordinate, Continue, Option, Forward, Disjunctive, Cause and Effect, Conditions, Elaboration and Goal. In our experiment, we used 100 news editorial articles, each containing around 1500 words (1424 to 1558), as the testing corpus. The precision, recall and filtration precision of intra-sentence tagging reach 91%, 95% and 98%, respectively. For inter-sentence tagging, the precision, recall and filtration precision reach 86%, 93% and 95%, respectively. Tyne Liang 梁婷 2006 Degree thesis ; thesis 62 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's === National Chiao Tung University === Institute of Computer Science and Engineering === 94 === Discourse analysis plays an important role in document understanding and is crucial for clarifying the propositions and logical structure of a document. This thesis therefore aims to build an automated Chinese discourse tagging system by collecting and expanding the coherence features of discourse based on a corpus study and by designing the corresponding rules. We used the written documents from the Sinica Balanced Corpus 3.0 as our mining corpus. It includes 7265 articles covering news, biographies, essays, letters, commentary and illustration manuals. We separately mine cue terms, consecutive POS tags and specific punctuation marks for nine types of rhetorical relations in Chinese discourse: Coordinate, Continue, Option, Forward, Disjunctive, Cause and Effect, Conditions, Elaboration and Goal. In our experiment, we used 100 news editorial articles, each containing around 1500 words (1424 to 1558), as the testing corpus. The precision, recall and filtration precision of intra-sentence tagging reach 91%, 95% and 98%, respectively. For inter-sentence tagging, the precision, recall and filtration precision reach 86%, 93% and 95%, respectively.
author2 Tyne Liang
author_facet Tyne Liang
Shou-Yi Cheng
鄭守益
author Shou-Yi Cheng
鄭守益
spellingShingle Shou-Yi Cheng
鄭守益
Corpus-Based Coherence Relation Tagging in Chinese Discourse
author_sort Shou-Yi Cheng
title Corpus-Based Coherence Relation Tagging in Chinese Discourse
title_short Corpus-Based Coherence Relation Tagging in Chinese Discourse
title_full Corpus-Based Coherence Relation Tagging in Chinese Discourse
title_fullStr Corpus-Based Coherence Relation Tagging in Chinese Discourse
title_full_unstemmed Corpus-Based Coherence Relation Tagging in Chinese Discourse
title_sort corpus-based coherence relation tagging in chinese discourse
publishDate 2006
url http://ndltd.ncl.edu.tw/handle/96375712768665040446
work_keys_str_mv AT shouyicheng corpusbasedcoherencerelationtagginginchinesediscourse
AT zhèngshǒuyì corpusbasedcoherencerelationtagginginchinesediscourse
AT shouyicheng yǐyǔliàowèijīchǔdezhōngwényǔpiānliánguànguānxìzìdòngbiāojì
AT zhèngshǒuyì yǐyǔliàowèijīchǔdezhōngwényǔpiānliánguànguānxìzìdòngbiāojì
_version_ 1718282640884236288
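
Illustrative note: the abstract describes tagging rhetorical relations from mined cue terms and reporting precision and recall. The following is a minimal sketch of that general idea, not the thesis's actual system. The cue-term lexicon, the assignment of cue terms to relation names, the function names tag_relations and precision_recall, and the micro-averaged scoring are all assumptions made for illustration; the thesis also uses POS tags and punctuation marks, which are omitted here, and "filtration precision" is not defined in this record, so it is not computed.

```python
# Hypothetical cue-term lexicon. These entries are illustrative assumptions,
# NOT the cue terms mined in the thesis.
CUE_TERMS = {
    "Cause and Effect": ["因為", "所以", "因此"],
    "Conditions": ["如果", "只要"],
    "Option": ["或者"],
    "Goal": ["為了"],
}


def tag_relations(clause):
    """Return the set of relation labels whose cue terms occur in the clause."""
    return {
        relation
        for relation, cues in CUE_TERMS.items()
        if any(cue in clause for cue in cues)
    }


def precision_recall(predicted, gold):
    """Micro-averaged precision and recall over parallel lists of label sets."""
    true_positives = sum(len(p & g) for p, g in zip(predicted, gold))
    n_predicted = sum(len(p) for p in predicted)
    n_gold = sum(len(g) for g in gold)
    precision = true_positives / n_predicted if n_predicted else 0.0
    recall = true_positives / n_gold if n_gold else 0.0
    return precision, recall


if __name__ == "__main__":
    # Toy clauses with hand-assigned gold labels (made up for this sketch).
    clauses = ["因為下雨，所以比賽取消", "如果明天放晴，我們就出發", "為了健康，他每天運動"]
    gold = [{"Cause and Effect"}, {"Conditions"}, {"Goal"}]
    predicted = [tag_relations(c) for c in clauses]
    print(predicted)
    print(precision_recall(predicted, gold))
```

Plain substring matching is used only to keep the sketch short; a real tagger would need word segmentation and disambiguation, since many cue terms are ambiguous in context.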