Research on Query Expansion: Using Topic Analysis and Keyword Extraction

Bibliographic Details
Main Authors: Hsiang-chi Hsieh, 謝祥綺
Other Authors: 張俊盛
Format: Others
Language: zh-TW
Published: 2002
Online Access: http://ndltd.ncl.edu.tw/handle/94151038082463936568
Description
Summary: Master's === National Tsing Hua University === Department of Computer Science === 90 === With the progress of information technology, how to retrieve the desired information precisely has become a pressing problem. The study of “Information Retrieval” addresses efficient methods for storing and retrieving information, and “Query Expansion” is an important technique in this field. To increase the effectiveness of query expansion, this paper presents methods of topic analysis for documents and keywords. Furthermore, using these methods, we try to construct thesauri automatically and extract query expansion keywords. We observe that the topic of a document is determined by its content words, and the topic of a keyword is determined by the documents in which it appears. The analysis is therefore done by repeatedly computing the topics of documents and those of keywords. The experiments showed that topic analysis of documents can filter out 90% of the documents that are not relevant to the query, and that the topic similarity between two keywords is a good indicator of how relevant one keyword is to the other. For keyword extraction, we introduce “Predecessor and Successor Variety”, which, combined with part-of-speech rules, is very effective for extracting Chinese noun phrases from a corpus. Finally, we apply these methods to automatic thesaurus construction and query expansion.
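
The alternating computation described in the summary (document topics from content words, keyword topics from the documents containing them) can be illustrated with a small sketch. The record does not give the thesis's actual formulas, so everything here is an assumption made for illustration: the function names `analyze_topics` and `topic_similarity`, the use of a few hand-labelled seed keywords to start the iteration, the simple sum-and-normalize update, and cosine similarity as the topic similarity measure.

```python
from collections import defaultdict

def normalize(vec):
    """Scale a topic-weight dict so its values sum to 1 (unchanged if all zero)."""
    total = sum(vec.values())
    return {t: w / total for t, w in vec.items()} if total else dict(vec)

def analyze_topics(docs, seed_topics, rounds=10):
    """docs: {doc_id: [keyword, ...]}; seed_topics: {keyword: {topic: weight}}.

    Hypothetical scheme (not taken from the thesis): seed keywords keep their
    given topics; every other keyword is re-estimated in each round.
    """
    kw_topics = {k: normalize(v) for k, v in seed_topics.items()}
    doc_topics = {}
    for _ in range(rounds):
        # Step 1: a document's topics are determined by its content words.
        for doc_id, words in docs.items():
            acc = defaultdict(float)
            for w in words:
                for t, wt in kw_topics.get(w, {}).items():
                    acc[t] += wt
            doc_topics[doc_id] = normalize(acc)
        # Step 2: a keyword's topics are determined by the documents it appears in.
        new_kw = defaultdict(lambda: defaultdict(float))
        for doc_id, words in docs.items():
            for w in set(words):
                for t, wt in doc_topics[doc_id].items():
                    new_kw[w][t] += wt
        for w, acc in new_kw.items():
            if w not in seed_topics:  # seeds stay fixed (assumption of this sketch)
                kw_topics[w] = normalize(acc)
    return doc_topics, kw_topics

def topic_similarity(a, b):
    """Cosine similarity between two topic-weight dicts (assumed measure)."""
    dot = sum(a.get(t, 0.0) * b.get(t, 0.0) for t in set(a) | set(b))
    na = sum(v * v for v in a.values()) ** 0.5
    nb = sum(v * v for v in b.values()) ** 0.5
    return dot / (na * nb) if na and nb else 0.0
```

With toy input such as `docs = {"d1": ["stock", "market", "dividend"], "d2": ["goal", "league", "striker"], "d3": ["market", "dividend"]}` and seeds `{"stock": {"finance": 1.0}, "goal": {"sports": 1.0}}`, document `d3` contains no seed keyword, yet it picks up the finance topic in the second round through `market` and `dividend`. This is why the computation has to be repeated rather than done once.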
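
The “Predecessor and Successor Variety” idea can likewise be sketched in a few lines. In the usual sense of the term, the successor variety of a string is the number of distinct characters that follow it in the corpus, and the predecessor variety is the number of distinct characters that precede it. The sketch below assumes that definition; the function name `variety`, the `max_len` parameter, and the `^` / `$` boundary markers are hypothetical, and the part-of-speech rules mentioned in the summary are not modelled.

```python
from collections import defaultdict

def variety(corpus, max_len=4):
    """Return {candidate: (predecessor_variety, successor_variety)}.

    corpus is a list of sentences (strings). Candidates are all character
    n-grams of length 1..max_len. A sentence boundary counts as one distinct
    neighbour, which is an assumption of this sketch.
    """
    preds = defaultdict(set)
    succs = defaultdict(set)
    for sent in corpus:
        # Hypothetical boundary markers; assumes "^" and "$" do not occur in the text.
        padded = "^" + sent + "$"
        for n in range(1, max_len + 1):
            for i in range(1, len(padded) - n):
                gram = padded[i:i + n]
                preds[gram].add(padded[i - 1])   # character just before the candidate
                succs[gram].add(padded[i + n])   # character just after the candidate
    return {g: (len(preds[g]), len(succs[g])) for g in preds}
```

For the toy corpus `["資訊檢索系統", "研究資訊檢索", "資訊檢索很重要"]`, the candidate `資訊檢索` gets varieties `(2, 3)`, while the fragment `資訊檢` gets `(2, 1)` because it is always followed by `索`. A low variety on either side suggests the candidate is an incomplete piece of a longer phrase.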