SECNet: Unsupervised Text Summarization Using Salience Extractor with a Consistent Network

Master's === National Cheng Kung University === Institute of Computer and Communication Engineering === 107 === Automated extractive document summarization is an important aspect of natural language processing. Most existing unsupervised summarization models use a graph-based ranking algorithm to evaluate the salience of sentences based on similarity. However, similari...


Bibliographic Details
Main Authors: Hong-Jun Yan, 顏宏峻
Other Authors: Jen-Wei Huang
Format: Others
Language: en_US
Published: 2019
Online Access: http://ndltd.ncl.edu.tw/handle/3n6548
id ndltd-TW-107NCKU5652089
record_format oai_dc
spelling ndltd-TW-107NCKU5652089 2019-10-26T06:24:19Z http://ndltd.ncl.edu.tw/handle/3n6548 SECNet: Unsupervised Text Summarization Using Salience Extractor with a Consistent Network 基於顯著性提取器和一致網路之非監督式文本摘要模型 Hong-Jun Yan 顏宏峻 Master's National Cheng Kung University Institute of Computer and Communication Engineering 107 Jen-Wei Huang 黃仁暐 2019 thesis 46 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Master's === National Cheng Kung University === Institute of Computer and Communication Engineering === 107 === Automated extractive document summarization is an important aspect of natural language processing. Most existing unsupervised summarization models use a graph-based ranking algorithm to evaluate the salience of sentences based on similarity. However, similarity captures only the surface relationships between sentences. In this study, we addressed this issue by exploiting recent developments in the field of unsupervised image generation and attention-based models for image captioning. Our model includes two Seq2Seq models, a salience extractor, and a consistent network. The salience extractor model derives information from documents using an encoder, and generates latent summaries using a decoder. The attention score of latent summaries is used to calculate the importance of each sentence in the document. The use of a consistent network ensures that the latent summaries contain the necessary document information. The proposed model was evaluated using the English summarization benchmark datasets, DUC 2001 and DUC 2002, as well as the large Chinese summarization dataset, LCSTS. We also created a finance news dataset in Chinese and had a group of experts in bank auditing label a summary for use as a reference.
author2 Jen-Wei Huang
author_facet Jen-Wei Huang
Hong-Jun Yan
顏宏峻
author Hong-Jun Yan
顏宏峻
spellingShingle Hong-Jun Yan
顏宏峻
SECNet: Unsupervised Text Summarization Using Salience Extractor with a Consistent Network
author_sort Hong-Jun Yan
title SECNet: Unsupervised Text Summarization Using Salience Extractor with a Consistent Network
title_sort secnet: unsupervised text summarization using salience extractor with a consistent network
publishDate 2019
url http://ndltd.ncl.edu.tw/handle/3n6548
work_keys_str_mv AT hongjunyan secnetunsupervisedtextsummarizationusingsalienceextractorwithaconsistentnetwork
AT yánhóngjùn secnetunsupervisedtextsummarizationusingsalienceextractorwithaconsistentnetwork
AT hongjunyan jīyúxiǎnzhexìngtíqǔqìhéyīzhìwǎnglùzhīfēijiāndūshìwénběnzhāiyàomóxíng
AT yánhóngjùn jīyúxiǎnzhexìngtíqǔqìhéyīzhìwǎnglùzhīfēijiāndūshìwénběnzhāiyàomóxíng
_version_ 1719279727251816448
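
The abstract states that the attention scores of the latent summaries are used to calculate the importance of each sentence, but this record does not say how. The following is a minimal sketch of one plausible reading, not the thesis's actual method: it assumes a decoder attention matrix over document tokens is already available, sums the attention mass that falls on each sentence, and extracts the top-scoring sentences. The function names, the `sentence_ids` token-to-sentence mapping, and the toy numbers are all hypothetical.

```python
from typing import List, Sequence


def sentence_salience(
    attention: Sequence[Sequence[float]],
    sentence_ids: Sequence[int],
    num_sentences: int,
) -> List[float]:
    """Aggregate token-level attention into one score per sentence.

    attention[t][j] is the (assumed) weight that decoding step t of the
    latent summary places on document token j; sentence_ids[j] is the
    index of the sentence that contains token j.
    """
    scores = [0.0] * num_sentences
    for step_weights in attention:
        for j, weight in enumerate(step_weights):
            scores[sentence_ids[j]] += weight
    total = sum(scores)
    # Normalize so the scores sum to 1; leave all-zero input unchanged.
    return [s / total for s in scores] if total > 0 else scores


def extract_top_k(scores: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k highest-scoring sentences in document order."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


# Toy example: 2 decoding steps over 4 document tokens in 3 sentences.
attention = [
    [0.1, 0.1, 0.6, 0.2],
    [0.2, 0.2, 0.5, 0.1],
]
sentence_ids = [0, 0, 1, 2]  # tokens 0-1 -> sentence 0, token 2 -> 1, token 3 -> 2

scores = sentence_salience(attention, sentence_ids, num_sentences=3)
selected = extract_top_k(scores, k=2)
print(scores, selected)
```

Summing raw attention favors longer sentences, since they contain more tokens; averaging per token instead would be an equally reasonable choice under these assumptions.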