Multi-Document Summarization System Based on Mutual Reinforcement Principle

Master's thesis === National Chiao Tung University === Institute of Multimedia Engineering === 98 === According to a research report, the rapid development of the Internet is causing the volume of digital documents, video, and other data to double every year. To find the information in these electronic files efficiently, this thesis d...

Bibliographic Details
Main Authors: Yang, Ruin-Min, 楊瑞敏
Other Authors: Lee, Chia-Hoang
Format: Others
Language: zh-TW
Published: 2010
Online Access: http://ndltd.ncl.edu.tw/handle/59722834005586953270
id ndltd-TW-098NCTU5641035
record_format oai_dc
spelling ndltd-TW-098NCTU5641035 2016-04-18T04:21:48Z http://ndltd.ncl.edu.tw/handle/59722834005586953270 Multi-Document Summarization System Based on Mutual Reinforcement Principle 多文件摘要系統基於Mutual Reinforcement原理 Yang, Ruin-Min 楊瑞敏 Master's thesis National Chiao Tung University Institute of Multimedia Engineering 98 Lee, Chia-Hoang 李嘉晃 2010 thesis 50 zh-TW
collection NDLTD
description Master's thesis === National Chiao Tung University === Institute of Multimedia Engineering === 98 === According to a research report, the rapid development of the Internet is causing the volume of digital documents, video, and other data to double every year. To find the information in these electronic files efficiently, this thesis develops an automatic summarization system that filters out the uninformative content of digital documents, so that users can grasp the content efficiently without losing the meaning of the original documents. The proposed system scores sentences from three aspects: first, the relationship between words and sentences; second, the relationship between titles and sentences; third, the relationship between sentences. Before sentence scoring, the system uses an alignment algorithm and the Mutual Reinforcement Principle to remove low-information sentences from the original dataset, so that they cannot be selected as part of the summary. The HITS algorithm, cosine similarity, and the PageRank algorithm are employed for these three aspects, respectively. The dataset used in this thesis is the DUC dataset, which consists of English news articles. Evaluation with the ROUGE toolkit shows that the summaries generated by this system perform well.
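
The three scoring aspects named in the abstract (HITS for words and sentences, cosine similarity for titles and sentences, PageRank for sentences and sentences) can be illustrated with a small Python sketch. This is not the thesis's implementation: the tokenizer, the bag-of-words weighting, the equal weights used to combine the three scores, and all function names are assumptions made for illustration, and the alignment-based filtering step is omitted.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphabetic tokens (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def hits_sentence_scores(bags, iterations=50):
    """HITS-style mutual reinforcement on the bipartite word/sentence graph:
    a sentence scores high if it contains high-scoring words, and vice versa."""
    vocab = sorted({t for bag in bags for t in bag})
    word = {t: 1.0 for t in vocab}
    sent = [1.0] * len(bags)
    for _ in range(iterations):
        sent = [sum(bag[t] * word[t] for t in bag) for bag in bags]
        norm = math.sqrt(sum(s * s for s in sent)) or 1.0
        sent = [s / norm for s in sent]
        word = {t: sum(bag[t] * s for bag, s in zip(bags, sent)) for t in vocab}
        norm = math.sqrt(sum(w * w for w in word.values())) or 1.0
        word = {t: w / norm for t, w in word.items()}
    return sent


def pagerank_sentence_scores(bags, damping=0.85, iterations=50):
    """PageRank over a sentence graph whose edge weights are cosine similarities.
    Sentences with no similar neighbour keep only the teleport share (a simplification)."""
    n = len(bags)
    sim = [[cosine(bags[i], bags[j]) if i != j else 0.0 for j in range(n)]
           for i in range(n)]
    out = [sum(row) for row in sim]
    rank = [1.0 / n] * n
    for _ in range(iterations):
        rank = [
            (1 - damping) / n
            + damping * sum(rank[j] * sim[j][i] / out[j] for j in range(n) if out[j] > 0)
            for i in range(n)
        ]
    return rank


def normalize(xs):
    """Scale scores so the largest is 1 (all zeros stay zero)."""
    m = max(xs) if xs else 0.0
    return [x / m if m > 0 else 0.0 for x in xs]


def rank_sentences(title, sentences, weights=(1.0, 1.0, 1.0)):
    """Return sentence indices, best first. Equal weights are an assumption."""
    bags = [Counter(tokenize(s)) for s in sentences]
    title_bag = Counter(tokenize(title))
    word_sent = normalize(hits_sentence_scores(bags))
    title_sent = normalize([cosine(title_bag, bag) for bag in bags])
    sent_sent = normalize(pagerank_sentence_scores(bags))
    scores = [
        weights[0] * a + weights[1] * b + weights[2] * c
        for a, b, c in zip(word_sent, title_sent, sent_sent)
    ]
    return sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)


if __name__ == "__main__":
    title = "Flood hits river town"
    sentences = [
        "The flood hit the river town on Monday.",
        "Officials said the flood damaged the town bridge over the river.",
        "A local bakery won a prize for bread.",
    ]
    for i in rank_sentences(title, sentences):
        print(sentences[i])
```

A summary would then be built by taking the top-ranked sentences up to a length limit. In this toy input the bakery sentence shares no words with the title or with the other sentences, so all three aspects push it to the bottom of the ranking.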