The Saudi Novel Corpus: Design and Compilation

Arabic has recently received significant attention from corpus compilers. This situation has led to the creation of many Arabic corpora that cover various genres, most notably the newswire genre. Yet, Arabic novels, and specifically those authored by Saudi writers, lack the sufficient digital datase...

Bibliographic Details
Main Authors: Abdeen, M.A.R. (Author), Alfraidi, T. (Author), Alluhaibi, R. (Author), Al-Thubaity, A. (Author), Yatimi, A. (Author)
Format: Article
Language: English
Published: MDPI 2022
Subjects: Arabic; corpora; corpus linguistics; Saudi novels
Online Access: View Fulltext in Publisher (https://doi.org/10.3390/app12136648)
LEADER 01865nam a2200229Ia 4500
001 10.3390-app12136648
008 220718s2022 CNT 000 0 und d
020 |a 20763417 (ISSN) 
245 1 0 |a The Saudi Novel Corpus: Design and Compilation 
260 0 |b MDPI  |c 2022 
856 |z View Fulltext in Publisher  |u https://doi.org/10.3390/app12136648 
520 3 |a Arabic has recently received significant attention from corpus compilers. This situation has led to the creation of many Arabic corpora that cover various genres, most notably the newswire genre. Yet, Arabic novels, and specifically those authored by Saudi writers, lack the sufficient digital datasets that would enhance corpus linguistic and stylistic studies of these works. Thus, Arabic lags behind English and other European languages in this context. In this paper, we present the Saudi Novels Corpus, built to be a valuable resource for linguistic and stylistic research communities. We specifically present the procedures we followed and the decisions we made in creating the corpus. We describe and clarify the design criteria, data collection methods, process of annotation, and encoding. In addition, we present preliminary results that emerged from the analysis of the corpus content. We consider the work described in this paper as initial steps to bridge the existing gap between corpus linguistics and Arabic literary texts. Further work is planned to improve the quality of the corpus by adding advanced features. © 2022 by the authors. Licensee MDPI, Basel, Switzerland. 
650 0 4 |a Arabic 
650 0 4 |a corpora 
650 0 4 |a corpus linguistics 
650 0 4 |a Saudi novels 
700 1 |a Abdeen, M.A.R.  |e author 
700 1 |a Alfraidi, T.  |e author 
700 1 |a Alluhaibi, R.  |e author 
700 1 |a Al-Thubaity, A.  |e author 
700 1 |a Yatimi, A.  |e author 
773 |t Applied Sciences (Switzerland)