Investigating the Extractive Summarization of Literary Novels

Abstract Due to the vast amount of information we are faced with, summarization has become a critical necessity of everyday human life. Given that a large fraction of the electronic documents available online and elsewhere consist of short texts such as Web pages, news articles, scientific repor...

Bibliographic Details
Main Author: Ceylan, Hakan
Other Authors: Mihalcea, Rada, 1974-
Format: Others
Language: English
Published: University of North Texas 2011
Subjects: Text summarization; extractive summarization; summarization of literary novels
Online Access: https://digital.library.unt.edu/ark:/67531/metadc103298/
id ndltd-unt.edu-info-ark-67531-metadc103298
record_format oai_dc
spelling ndltd-unt.edu-info-ark-67531-metadc103298
2020-07-18T05:19:27Z
Investigating the Extractive Summarization of Literary Novels
Ceylan, Hakan
Text summarization
extractive summarization
summarization of literary novels
University of North Texas
Mihalcea, Rada, 1974-
Yuret, Deniz
Swigger, Kathleen M.
Tarau, Paul
2011-12
Thesis or Dissertation
Text
oclc: 812016128
https://digital.library.unt.edu/ark:/67531/metadc103298/
ark: ark:/67531/metadc103298
English
Public
Ceylan, Hakan
Copyright
Copyright is held by the author, unless otherwise noted. All rights Reserved.
collection NDLTD
language English
format Others
sources NDLTD
topic Text summarization
extractive summarization
summarization of literary novels
description Abstract Due to the vast amount of information we are faced with, summarization has become a critical necessity of everyday human life. Given that a large fraction of the electronic documents available online and elsewhere consist of short texts such as Web pages, news articles, scientific reports, and others, the focus of natural language processing techniques to date has been on the automation of methods targeting short documents. We are witnessing however a change: an increasingly larger number of books become available in electronic format. This means that the need for language processing techniques able to handle very large documents such as books is becoming increasingly important. This thesis addresses the problem of summarization of novels, which are long and complex literary narratives. While there is a significant body of research that has been carried out on the task of automatic text summarization, most of this work has been concerned with the summarization of short documents, with a particular focus on news stories. However, novels are different in both length and genre, and consequently different summarization techniques are required. This thesis attempts to close this gap by analyzing a new domain for summarization, and by building unsupervised and supervised systems that effectively take into account the properties of long documents, and outperform the traditional extractive summarization systems typically addressing news genre.
author2 Mihalcea, Rada, 1974-
author_facet Mihalcea, Rada, 1974-
Ceylan, Hakan
author Ceylan, Hakan
author_sort Ceylan, Hakan
title Investigating the Extractive Summarization of Literary Novels
publisher University of North Texas
publishDate 2011
url https://digital.library.unt.edu/ark:/67531/metadc103298/