Automatic topic detection from news stories.

Bibliographic Details
Other Authors: Hui, Kin.
Format: Others
Language: English, Chinese
Published: 2001
Online Access: http://library.cuhk.edu.hk/record=b5890594
http://repository.lib.cuhk.edu.hk/en/item/cuhk-323424
Description
Summary: Hui Kin.
Thesis (M.Phil.)--Chinese University of Hong Kong, 2001.
Includes bibliographical references (leaves 115-120).
Abstracts in English and Chinese.
Contents:
Chapter 1 --- Introduction --- p.1
Chapter 1.1 --- Topic Detection Problem --- p.2
Chapter 1.1.1 --- What is a Topic? --- p.2
Chapter 1.1.2 --- Topic Detection --- p.3
Chapter 1.2 --- Our Contributions --- p.5
Chapter 1.2.1 --- Thesis Organization --- p.6
Chapter 2 --- Literature Review --- p.7
Chapter 2.1 --- Dragon Systems --- p.7
Chapter 2.2 --- University of Massachusetts (UMass) --- p.9
Chapter 2.3 --- Carnegie Mellon University (CMU) --- p.10
Chapter 2.4 --- BBN Technologies --- p.11
Chapter 2.5 --- IBM T. J. Watson Research Center --- p.12
Chapter 2.6 --- National Taiwan University (NTU) --- p.13
Chapter 2.7 --- Drawbacks of Existing Approaches --- p.14
Chapter 3 --- System Overview --- p.16
Chapter 3.1 --- News Sources --- p.17
Chapter 3.2 --- Story Preprocessing --- p.21
Chapter 3.3 --- Named Entity Extraction --- p.22
Chapter 3.4 --- Gross Translation --- p.22
Chapter 3.5 --- Unsupervised Learning Module --- p.24
Chapter 4 --- Term Extraction and Story Representation --- p.27
Chapter 4.1 --- IBM Intelligent Miner For Text --- p.28
Chapter 4.2 --- Transformation-based Error-driven Learning --- p.31
Chapter 4.2.1 --- Learning Stage --- p.32
Chapter 4.2.2 --- Design of New Tags --- p.33
Chapter 4.2.3 --- Lexical Rules Learning --- p.35
Chapter 4.2.4 --- Contextual Rules Learning --- p.39
Chapter 4.3 --- Extracting Named Entities Using Learned Rules --- p.42
Chapter 4.4 --- Story Representation --- p.46
Chapter 4.4.1 --- Basic Representation --- p.46
Chapter 4.4.2 --- Enhanced Representation --- p.47
Chapter 5 --- Gross Translation --- p.52
Chapter 5.1 --- Basic Translation --- p.52
Chapter 5.2 --- Enhanced Translation --- p.60
Chapter 5.2.1 --- Parallel Corpus Alignment Approach --- p.60
Chapter 5.2.2 --- Enhanced Translation Approach --- p.62
Chapter 6 --- Unsupervised Learning Module --- p.68
Chapter 6.1 --- Overview of the Discovery Algorithm --- p.68
Chapter 6.2 --- Topic Representation --- p.70
Chapter 6.3 --- Similarity Calculation --- p.72
Chapter 6.3.1 --- Similarity Score Calculation --- p.72
Chapter 6.3.2 --- Time Adjustment Scheme --- p.74
Chapter 6.3.3 --- Language Normalization Scheme --- p.75
Chapter 6.4 --- Related Elements Combination --- p.78
Chapter 7 --- Experimental Results and Analysis --- p.84
Chapter 7.1 --- TDT corpora --- p.84
Chapter 7.2 --- Evaluation Methodology --- p.85
Chapter 7.3 --- Experimental Results on Various Parameter Settings --- p.88
Chapter 7.4 --- Experiments Results on Various Named Entity Extraction Approaches --- p.89
Chapter 7.5 --- Experiments Results on Various Story Representation Approaches --- p.100
Chapter 7.6 --- Experiments Results on Various Translation Approaches --- p.104
Chapter 7.7 --- Experiments Results on the Effect of Language Normalization Scheme on Detection Approaches --- p.106
Chapter 7.8 --- TDT2000 Topic Detection Result --- p.110
Chapter 8 --- Conclusions and Future Works --- p.112
Chapter 8.1 --- Conclusions --- p.112
Chapter 8.2 --- Future Work --- p.114
Bibliography --- p.115
Chapter A --- List of Topics annotated for TDT2 Corpus --- p.121
Chapter B --- Significant Test Results --- p.124