A Study on Automatic Chinese Keyword Extraction Based on Search Engines and Internet Encyclopedias

Bibliographic Details
Main Authors: ZENG,YU-HONG, 曾郁閎
Other Authors: HUANG,CHIN-FA
Format: Others
Language: zh-TW
Published: 2015
Online Access: http://ndltd.ncl.edu.tw/handle/75916377647627372221
Description
Summary: Master's thesis === National Yunlin University of Science and Technology === Department of Information Management === 103 === Keywords are a subset of the words or phrases in a document that describe its meaning. The major methods for Chinese keyword extraction include keyword-lexicon approaches, statistical approaches, and linguistic approaches. Among these, keyword-lexicon approaches offer high precision and efficiency, but building a keyword lexicon is time-consuming and maintaining it is a manual task. This research presents a Chinese keyword extraction system built on the CKIP Chinese word segmentation system. The system recombines segmented words in two ways: by part-of-speech (POS) combination, and by automatic word combination using a search engine (Google Search) and an internet encyclopedia (Wikipedia). It also builds a keyword lexicon that updates its keywords automatically, which addresses the drawbacks of keyword-lexicon approaches. Experimental results show that combining the CKIP Chinese word segmentation system, POS combination, and automatic word combination yields higher precision, and that the number of documents does not affect the performance of the keyword extraction system. Keywords: Keyword Extraction, Keyword Lexicon, Search Engine, Internet Encyclopedia
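
The recombination step described in the summary could look roughly like the following Python sketch. It is an illustration under stated assumptions, not the thesis's implementation: the simplified POS tags, the `POS_PATTERNS` set, the `recombine` function, and the `is_known_term` callback (a stand-in for a search engine or encyclopedia lookup) are all hypothetical names introduced here, and the segmenter output is hard-coded rather than obtained from CKIP.

```python
from typing import Callable, List, Tuple

Token = Tuple[str, str]  # (word, simplified POS tag)

# Assumed POS patterns (hypothetical, not taken from the thesis): pairs of
# adjacent tags that may be merged into one candidate term.
# "N" = noun, "A" = adjective, "V" = verb.
POS_PATTERNS = {("N", "N"), ("A", "N")}


def recombine(tokens: List[Token],
              is_known_term: Callable[[str], bool]) -> List[str]:
    """Greedily merge adjacent tokens when their POS pair matches a pattern
    and the merged string is confirmed by an external check."""
    result: List[str] = []
    i = 0
    while i < len(tokens):
        word, pos = tokens[i]
        # Keep extending the current candidate while the next token merges.
        while i + 1 < len(tokens):
            next_word, next_pos = tokens[i + 1]
            merged = word + next_word
            if (pos, next_pos) in POS_PATTERNS and is_known_term(merged):
                word, pos = merged, next_pos
                i += 1
            else:
                break
        result.append(word)
        i += 1
    return result


# Hard-coded stand-ins: a segmenter that split one term too finely, and a
# local set of titles playing the role of the encyclopedia lookup.
segmented = [("關鍵", "N"), ("詞", "N"), ("擷取", "V"), ("系統", "N")]
known_titles = {"關鍵詞"}

print(recombine(segmented, lambda term: term in known_titles))
# prints ['關鍵詞', '擷取', '系統']
```

In a real system the `is_known_term` callback would query an external source instead of a local set, and confirmed terms could be written back to the keyword lexicon so that it grows without manual maintenance.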