WATS-SMS: A T5-Based French Wikipedia Abstractive Text Summarizer for SMS

Text summarization remains a challenging task in natural language processing despite its many applications in enterprises and daily life. One common use case is the summarization of web pages, which has the potential to provide an overview of web pages to devices with limited...


Bibliographic Details
Main Authors: Jean Louis Ebongue Kedieng Fendji, Désiré Manuel Taira, Marcellin Atemkeng, Adam Musa Ali
Format: Article
Language: English
Published: MDPI AG 2021-09-01
Series: Future Internet
Subjects:
SMS
Online Access: https://www.mdpi.com/1999-5903/13/9/238
id doaj-394fdcd1adbc4e10b9c6dfc78bb116be
record_format Article
spelling doaj-394fdcd1adbc4e10b9c6dfc78bb116be 2021-09-26T00:11:41Z eng MDPI AG Future Internet 1999-5903 2021-09-01 13 238 238 10.3390/fi13090238
WATS-SMS: A T5-Based French Wikipedia Abstractive Text Summarizer for SMS
Jean Louis Ebongue Kedieng Fendji (0), Désiré Manuel Taira (1), Marcellin Atemkeng (2), Adam Musa Ali (3)
(0) Department of Computer Engineering, University Institute of Technology, University of Ngaoundere, Ngaoundere P.O. Box 454, Cameroon
(1) Department of Mathematics and Computer Science, Faculty of Science, University of Ngaoundere, Ngaoundere P.O. Box 454, Cameroon
(2) Department of Mathematics, Rhodes University, Grahamstown 6140, South Africa
(3) Department of Mathematics and Computer Science, Faculty of Science, University of Ngaoundere, Ngaoundere P.O. Box 454, Cameroon
https://www.mdpi.com/1999-5903/13/9/238
text summarization; fine-tuning; transformers; SMS; gateway; French Wikipedia
collection DOAJ
language English
format Article
sources DOAJ
author Jean Louis Ebongue Kedieng Fendji
Désiré Manuel Taira
Marcellin Atemkeng
Adam Musa Ali
spellingShingle Jean Louis Ebongue Kedieng Fendji
Désiré Manuel Taira
Marcellin Atemkeng
Adam Musa Ali
WATS-SMS: A T5-Based French Wikipedia Abstractive Text Summarizer for SMS
Future Internet
text summarization
fine-tuning
transformers
SMS
gateway
French Wikipedia
author_facet Jean Louis Ebongue Kedieng Fendji
Désiré Manuel Taira
Marcellin Atemkeng
Adam Musa Ali
author_sort Jean Louis Ebongue Kedieng Fendji
title WATS-SMS: A T5-Based French Wikipedia Abstractive Text Summarizer for SMS
title_sort wats-sms: a t5-based french wikipedia abstractive text summarizer for sms
publisher MDPI AG
series Future Internet
issn 1999-5903
publishDate 2021-09-01
description Text summarization remains a challenging task in natural language processing despite its many applications in enterprises and daily life. One common use case is the summarization of web pages, which can give devices with limited features an overview of a page's content. Although the penetration rate of mobile devices in rural areas is increasing, most of those devices offer limited features, and these areas have only limited connectivity such as the GSM network. Summarizing web pages into SMS therefore becomes an important way to deliver information to limited devices. This work introduces WATS-SMS, a T5-based French Wikipedia Abstractive Text Summarizer for SMS. It is built through a transfer learning approach: the English pre-trained T5 model is retrained on 25,000 Wikipedia pages to produce a French text summarization model, which is then compared with different approaches in the literature. The objective is twofold: (1) to check the assumption made in the literature that abstractive models provide better results than extractive ones; and (2) to evaluate the performance of our model against other existing abstractive models. A score based on ROUGE metrics gave a value of 52% for articles of up to 500 characters, against 34.2% for transformer-ED and 12.7% for seq-2seq-attention, and a value of 77% for longer articles, against 37% for transformers-DMCA. Moreover, an architecture including a software SMS gateway has been developed to allow owners of mobile devices with limited features to send requests and receive summaries through the GSM network.
topic text summarization
fine-tuning
transformers
SMS
gateway
French Wikipedia
url https://www.mdpi.com/1999-5903/13/9/238
work_keys_str_mv AT jeanlouisebonguekediengfendji watssmsat5basedfrenchwikipediaabstractivetextsummarizerforsms
AT desiremanueltaira watssmsat5basedfrenchwikipediaabstractivetextsummarizerforsms
AT marcellinatemkeng watssmsat5basedfrenchwikipediaabstractivetextsummarizerforsms
AT adammusaali watssmsat5basedfrenchwikipediaabstractivetextsummarizerforsms
_version_ 1717366745510969344
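
Note on the ROUGE-based score mentioned in the description: the record does not say which ROUGE variant or tooling the authors used, so the following is only a minimal sketch of how a ROUGE-N overlap score is commonly computed. The function name `rouge_n`, the whitespace tokenization, and the French sample sentences are assumptions made for illustration; they are not taken from the article.

```python
from collections import Counter


def rouge_n(candidate: str, reference: str, n: int = 1) -> dict:
    """Illustrative ROUGE-N: n-gram overlap between a candidate and a reference summary.

    Assumption: lowercased whitespace tokenization, no stemming or stop-word removal.
    """

    def ngrams(text: str) -> Counter:
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand, ref = ngrams(candidate), ngrams(reference)
    # Clipped overlap: each n-gram counts at most as often as it appears in both texts.
    overlap = sum((cand & ref).values())
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    total = precision + recall
    f1 = 0.0 if total == 0 else 2 * precision * recall / total
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical example sentences, not data from the article.
scores = rouge_n("le chat dort", "le chat dort sur le lit")
print(scores)
```

In this example every candidate word appears in the reference, so precision is 1.0, while only three of the six reference words are covered, so recall is 0.5.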