Extractive Document Summarization Based on Dynamic Feature Space Mapping

The exponential growth of Web documents has created a need for automatic document summarization. In this context, extractive document summarization, i.e., the task of extracting the most relevant information, removing redundancy and presenting the remaining data in a coherent and cohesive...

Bibliographic Details
Main Authors: Samira Ghodratnama, Amin Beheshti, Mehrdad Zakershahrak, Fariborz Sobhanmanesh
Format: Article
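Language: English
Published: IEEE 2020-01-01
Series: IEEE Access
Subjects: Automatic text summarization; extractive summarization; feature weighting; multi-document summarization
Online Access: https://ieeexplore.ieee.org/document/9151114/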
id doaj-ab49a3360ed94326aa65d89eea177ffe
record_format Article
spelling doaj-ab49a3360ed94326aa65d89eea177ffe
2021-03-30T04:27:08Z
eng
IEEE
IEEE Access, ISSN 2169-3536
2020-01-01, vol. 8, pp. 139084-139095
DOI 10.1109/ACCESS.2020.3012539, article 9151114
Samira Ghodratnama (https://orcid.org/0000-0002-4443-8333), Department of Computing, Macquarie University, Sydney, NSW, Australia
Amin Beheshti, Department of Computing, Macquarie University, Sydney, NSW, Australia
Mehrdad Zakershahrak, ASU School of Computing, Informatics, and Decision Systems Engineering (CIDSE), Arizona State University, Tempe, AZ, USA
Fariborz Sobhanmanesh, Department of Computing, Macquarie University, Sydney, NSW, Australia
collection DOAJ
language English
format Article
sources DOAJ
author Samira Ghodratnama
Amin Beheshti
Mehrdad Zakershahrak
Fariborz Sobhanmanesh
title Extractive Document Summarization Based on Dynamic Feature Space Mapping
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2020-01-01
description The exponential growth of Web documents has created a need for automatic document summarization. In this context, extractive document summarization, i.e., the task of extracting the most relevant information, removing redundancy and presenting the remaining data in a coherent and cohesive structure, is a challenging task. In this article, we propose a novel intelligent approach, namely ExDoS, that harvests the benefits of both supervised and unsupervised algorithms simultaneously. To the best of our knowledge, ExDoS is the first approach to combine both supervised and unsupervised algorithms in a single framework and in an interpretable manner for document summarization. ExDoS iteratively minimizes the error rate of the classifier in each cluster with the help of dynamic local feature weighting. Moreover, this approach specifies the contribution of features to discriminating each class, which is a challenging issue in the summarization task. Therefore, in addition to summarizing text, ExDoS is also able to measure the importance of each feature in the summarization process. We evaluate our model both automatically (in terms of ROUGE scores) and empirically (human analysis) on two benchmark datasets: DUC2002 and CNN/DailyMail. Results show that our model obtains higher ROUGE scores compared to most state-of-the-art models. The human evaluation also demonstrates that our model is capable of generating informative and readable summaries.
topic Automatic text summarization
extractive summarization
feature weighting
multi-document summarization
url https://ieeexplore.ieee.org/document/9151114/