Mining Relevant Syntactic Patterns for Chinese Text Extraction



Bibliographic Details
Main Authors: Dong-shun Wu, 吳東軒
Other Authors: Chia-Hui Chang
Format: Others
Language: zh-TW
Published: 2002
Online Access: http://ndltd.ncl.edu.tw/handle/30689959598695641672
Description
Summary: Master's === National Central University === Graduate Institute of Computer Science and Information Engineering === 90 === Information extraction (IE) is a research topic related to TREC (Text Retrieval Conference) and MUC (Message Understanding Conference). The goal of IE is to extract specific types of information from text. IE systems for free text written in English differ from those for Chinese. In this paper we propose a simple method for extracting information from free text written in Chinese. We take training examples and encode them together with their corresponding targets. We then find the repeated substrings within the encoded text. In our IE system for Chinese, these repeated substrings play a role similar to that of the sentence analyzers in some IE systems for English free text. In the extraction phase, we first encode the test data and then extract the targets of interest using the repeated substrings found previously.
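The summary outlines three steps: encode the training examples with their targets, find repeated substrings in the encoded text, and use those substrings to extract targets from encoded test data. The following is a minimal Python sketch of that idea, not the thesis's actual implementation. The symbol set (`N`, `V`, `P`, `O`), the `TARGET` marker, the pre-segmented token lists, the toy lexicon, and all function names are assumptions made for illustration; the abstract does not specify the encoding scheme or how repeats are found.

```python
from collections import Counter

TARGET = "T"  # hypothetical symbol marking the slot to extract


def encode(tokens, lexicon, targets=frozenset()):
    """Map each token to a class symbol; annotated targets become TARGET."""
    return tuple(
        TARGET if tok in targets else lexicon.get(tok, "O") for tok in tokens
    )


def repeated_patterns(encoded_examples, min_len=2, max_len=4, min_count=2):
    """Collect substrings that contain TARGET and occur at least min_count times."""
    counts = Counter()
    for seq in encoded_examples:
        for n in range(min_len, max_len + 1):
            for i in range(len(seq) - n + 1):
                gram = seq[i:i + n]
                if TARGET in gram:
                    counts[gram] += 1
    return [gram for gram, c in counts.items() if c >= min_count]


def extract(tokens, lexicon, patterns):
    """Return tokens aligned with TARGET wherever a pattern matches the encoded text."""
    codes = encode(tokens, lexicon)
    found = []
    for pat in patterns:
        n = len(pat)
        for i in range(len(codes) - n + 1):
            window = codes[i:i + n]
            # TARGET acts as a wildcard; every other symbol must match exactly.
            if all(p == TARGET or p == c for p, c in zip(pat, window)):
                for j, p in enumerate(pat):
                    if p == TARGET and tokens[i + j] not in found:
                        found.append(tokens[i + j])
    return found


# Toy data (assumed): pre-segmented sentences with one annotated target each.
lexicon = {"公司": "N", "董事會": "N", "學校": "N", "經理": "N", "總裁": "N",
           "校長": "N", "任命": "V", "為": "P"}
training = [
    (["公司", "任命", "王小明", "為", "經理"], {"王小明"}),
    (["董事會", "任命", "李大華", "為", "總裁"], {"李大華"}),
]
patterns = repeated_patterns([encode(t, lexicon, tg) for t, tg in training])
print(extract(["學校", "任命", "陳志明", "為", "校長"], lexicon, patterns))
```

Treating the target symbol as a wildcard at extraction time is one simple design choice; a real system would likely also rank or filter the patterns, which this sketch does not attempt.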