Multi-Document Summarization Based on Keyword Clustering

Master's === National Taiwan University of Science and Technology === Department of Information Management === 90 === With the rapid growth of the World Wide Web, more and more information is accessible online. This explosion of information has resulted in an information overload problem. However, people have no time to read everything and have to decide which information is av...


Bibliographic Details
Main Author: 黃思萱
Other Authors: 徐 俊 傑
Format: Others
Language: zh-TW
Published: 2002
Online Access: http://ndltd.ncl.edu.tw/handle/53800112370126276087
id ndltd-TW-090NTUST396014
record_format oai_dc
spelling ndltd-TW-090NTUST396014 2015-10-13T14:41:23Z http://ndltd.ncl.edu.tw/handle/53800112370126276087 Multi-Document Summarization Based on Keyword Clustering 以關鍵詞分群為基礎的多文件摘要 黃思萱 Master's National Taiwan University of Science and Technology Department of Information Management 90 徐 俊 傑 2002 Degree thesis ; thesis 74 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's === National Taiwan University of Science and Technology === Department of Information Management === 90 === With the rapid growth of the World Wide Web, more and more information is accessible online. This explosion of information has resulted in an information overload problem: people do not have time to read everything and must decide which of the available information to read. Automatic text summarization is indispensable for dealing with this problem. Text summarization is the process of distilling the most important information from a source to produce an abridged version for a particular user and task. Recent research on multi-document summarization is based on document clustering. We propose a method of multi-document summarization that is based on keyword clustering, and we develop three keyword clustering methods to produce multi-document summaries. We extract representative keywords from all documents and then cluster the keywords using connected components, weighted cliques, or a hybrid of both. The purpose of keyword clustering is to gather information that discusses the same topic or event. Within each cluster, our system computes a weight for each sentence and ranks all sentences by weight. The sentences with the largest weights are chosen as the summary of the documents. Our experiments show that stricter keyword clustering methods produce better summaries. The system we developed can help people save time and read important new documents.
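The abstract names connected components as one of the three keyword clustering strategies but gives no implementation details. The following is a minimal sketch of that one strategy only, under stated assumptions: keywords are linked when they co-occur in a sentence, and a sentence's weight is the number of cluster keywords it contains. The function names, the `min_cooccurrence` parameter, the co-occurrence criterion, the weighting scheme, and the sample data are all hypothetical and are not taken from the thesis.

```python
from collections import defaultdict
from itertools import combinations


def cluster_keywords(sentences, keywords, min_cooccurrence=1):
    """Group keywords into connected components of a co-occurrence graph.

    Assumption: two keywords are linked if they appear together in at
    least `min_cooccurrence` sentences. The thesis may define links differently.
    """
    # Count how often each keyword pair appears in the same sentence.
    pair_counts = defaultdict(int)
    for tokens in sentences:
        present = sorted(set(tokens) & keywords)
        for a, b in combinations(present, 2):
            pair_counts[(a, b)] += 1

    # Build an undirected graph, keeping only sufficiently frequent pairs.
    graph = {k: set() for k in keywords}
    for (a, b), count in pair_counts.items():
        if count >= min_cooccurrence:
            graph[a].add(b)
            graph[b].add(a)

    # Depth-first search collects each connected component as one cluster.
    seen, clusters = set(), []
    for start in sorted(keywords):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(graph[node] - seen)
        clusters.append(component)
    return clusters


def top_sentences(sentences, cluster, n=1):
    """Return indices of the n sentences with the largest weight.

    Assumption: weight = number of distinct cluster keywords in the sentence.
    Ties are broken by original sentence order.
    """
    scored = [(len(set(tokens) & cluster), i) for i, tokens in enumerate(sentences)]
    scored = [(w, i) for w, i in scored if w > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in scored[:n]]


# Hypothetical, pre-tokenized sentences drawn from several documents.
sentences = [
    ["earthquake", "hits", "city"],
    ["city", "rescue", "teams", "arrive"],
    ["stock", "market", "falls"],
    ["market", "investors", "worry"],
]
keywords = {"earthquake", "city", "rescue", "stock", "market", "investors"}

clusters = cluster_keywords(sentences, keywords)
summary = [top_sentences(sentences, c, n=1) for c in clusters]
```

In this sketch, raising `min_cooccurrence` yields smaller, tighter clusters. That is only one illustrative way to make clustering stricter; the weighted clique and hybrid methods mentioned in the abstract are not implemented here.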
author2 徐 俊 傑
author_facet 徐 俊 傑
黃思萱
author 黃思萱
spellingShingle 黃思萱
Multi-Document Summarization Based on Keyword Clustering
author_sort 黃思萱
title Multi-Document Summarization Based on Keyword Clustering
title_short Multi-Document Summarization Based on Keyword Clustering
title_full Multi-Document Summarization Based on Keyword Clustering
title_fullStr Multi-Document Summarization Based on Keyword Clustering
title_full_unstemmed Multi-Document Summarization Based on Keyword Clustering
title_sort multi-document summarization based on keyword clustering
publishDate 2002
url http://ndltd.ncl.edu.tw/handle/53800112370126276087
work_keys_str_mv AT huángsīxuān multidocumentsummarizationbasedonkeywordclustering
AT huángsīxuān yǐguānjiàncífēnqúnwèijīchǔdeduōwénjiànzhāiyào
_version_ 1717756288481361920