Multi-Document Summarization Based on Keyword Clustering
Master's === National Taiwan University of Science and Technology === Department of Information Management === 90 === With the rapid growth of the World Wide Web, more and more information is accessible online. This explosion of information has resulted in an information overload problem. However, people have no time to read everything and have to decide which information is av...
Main Author: | 黃思萱 |
---|---|
Other Authors: | 徐 俊 傑 |
Format: | Others |
Language: | zh-TW |
Published: | 2002 |
Online Access: | http://ndltd.ncl.edu.tw/handle/53800112370126276087 |
id |
ndltd-TW-090NTUST396014 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-090NTUST396014 2015-10-13T14:41:23Z http://ndltd.ncl.edu.tw/handle/53800112370126276087 Multi-Document Summarization Based on Keyword Clustering 以關鍵詞分群為基礎的多文件摘要 黃思萱 Master's National Taiwan University of Science and Technology Department of Information Management 90 With the rapid growth of the World Wide Web, more and more information is accessible online. This explosion of information has created an information overload problem: people do not have time to read everything and must decide which of the available information to read. Automatic text summarization is indispensable for dealing with this problem. Text summarization is the process of distilling the most important information from a source to produce an abridged version for a particular user and task. Recent research on multi-document summarization is based on document clustering. We propose a multi-document summarization method based on keyword clustering, and we develop three keyword clustering methods to produce multi-document summaries. We extract representative keywords from all documents and then cluster them using connected components, weighted cliques, or a hybrid of the two. The purpose of keyword clustering is to gather information that discusses the same topic or event. Within each cluster, our system computes a weight for each sentence and ranks the sentences by weight. The highest-weighted sentences are chosen as the summary of the documents. Our experiments show that stricter keyword clustering methods produce better summaries. The system we developed can help people save time and read important new documents. 徐 俊 傑 2002 學位論文 ; thesis 74 zh-TW |
collection |
NDLTD |
language |
zh-TW |
format |
Others |
sources |
NDLTD |
description |
Master's === National Taiwan University of Science and Technology === Department of Information Management === 90 === With the rapid growth of the World Wide Web, more and more information is accessible online. This explosion of information has created an information overload problem: people do not have time to read everything and must decide which of the available information to read. Automatic text summarization is indispensable for dealing with this problem. Text summarization is the process of distilling the most important information from a source to produce an abridged version for a particular user and task.
Recent research on multi-document summarization is based on document clustering. We propose a multi-document summarization method based on keyword clustering, and we develop three keyword clustering methods to produce multi-document summaries. We extract representative keywords from all documents and then cluster them using connected components, weighted cliques, or a hybrid of the two. The purpose of keyword clustering is to gather information that discusses the same topic or event. Within each cluster, our system computes a weight for each sentence and ranks the sentences by weight. The highest-weighted sentences are chosen as the summary of the documents. Our experiments show that stricter keyword clustering methods produce better summaries. The system we developed can help people save time and read important new documents.
|
author2 |
徐 俊 傑 |
author_facet |
徐 俊 傑 黃思萱 |
author |
黃思萱 |
spellingShingle |
黃思萱 Multi-Document Summarization Based on Keyword Clustering |
author_sort |
黃思萱 |
title |
Multi-Document Summarization Based on Keyword Clustering |
title_short |
Multi-Document Summarization Based on Keyword Clustering |
title_full |
Multi-Document Summarization Based on Keyword Clustering |
title_fullStr |
Multi-Document Summarization Based on Keyword Clustering |
title_full_unstemmed |
Multi-Document Summarization Based on Keyword Clustering |
title_sort |
multi-document summarization based on keyword clustering |
publishDate |
2002 |
url |
http://ndltd.ncl.edu.tw/handle/53800112370126276087 |
work_keys_str_mv |
AT huángsīxuān multidocumentsummarizationbasedonkeywordclustering AT huángsīxuān yǐguānjiàncífēnqúnwèijīchǔdeduōwénjiànzhāiyào |
_version_ |
1717756288481361920 |
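
The abstract in this record describes a pipeline of keyword extraction, keyword clustering, and per-cluster sentence ranking. The following is a minimal Python sketch of the connected-component variant only. It is an illustration under stated assumptions, not the thesis's implementation: the record does not say how the keyword graph is built or how sentences are weighted, so the sentence-level co-occurrence graph, the `min_cooccur` threshold, the keyword-count sentence weight, and the names `cluster_keywords` and `summarize` are all hypothetical. Keywords are assumed to be supplied rather than extracted.

```python
from collections import defaultdict
from itertools import combinations


def cluster_keywords(sentences, keywords, min_cooccur=1):
    """Group keywords into connected components of a co-occurrence graph.

    Assumption: two keywords are linked when they appear together in at
    least `min_cooccur` sentences. The source record does not define this.
    """
    keyword_set = set(keywords)

    # Count how often each keyword pair appears in the same sentence.
    pair_counts = defaultdict(int)
    for tokens in sentences:
        present = sorted(set(tokens) & keyword_set)
        for a, b in combinations(present, 2):
            pair_counts[(a, b)] += 1

    # Build an undirected graph, keeping only sufficiently frequent pairs.
    graph = {k: set() for k in keyword_set}
    for (a, b), count in pair_counts.items():
        if count >= min_cooccur:
            graph[a].add(b)
            graph[b].add(a)

    # Iterative depth-first search collects each connected component.
    seen, clusters = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(graph[node] - seen)
        clusters.append(component)
    return clusters


def summarize(sentences, keywords, top_n=1, min_cooccur=1):
    """Return the top_n highest-weighted sentences for each keyword cluster.

    Assumption: a sentence's weight is the number of distinct cluster
    keywords it contains; ties are broken by original sentence order.
    """
    summary = []
    for cluster in cluster_keywords(sentences, keywords, min_cooccur):
        scored = [(len(set(tokens) & cluster), i)
                  for i, tokens in enumerate(sentences)]
        ranked = sorted((s for s in scored if s[0] > 0),
                        key=lambda s: (-s[0], s[1]))
        summary.append([sentences[i] for _, i in ranked[:top_n]])
    return summary
```

Sentences are passed as lists of tokens, for example `summarize([["flood", "rain"], ["vote", "election"]], {"flood", "rain", "vote", "election"})`. In this sketch, raising `min_cooccur` drops weaker edges and therefore yields smaller, stricter clusters.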