Textometr: an online tool for automated complexity level assessment of texts for Russian language learners

Evaluating text accessibility is an urgent and labor-intensive task when preparing texts for teaching Russian as a foreign language. At the same time, the procedure for assigning a text to one of the levels on the CEFR scale (from A1 to C2) is well formalized and d...


Bibliographic Details
Main Authors: Antonina N. Laposhina, Maria Yu. Lebedeva
Format: Article
Language: English
Published: Peoples’ Friendship University of Russia (RUDN University) 2021-09-01
Series: Russian Language Studies
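Subjects: Russian as a foreign language; educational text; text complexity; reading; text adapting; computational linguodidactics; computer assisted language learning; Russian language learning; web tools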
Online Access: http://journals.rudn.ru/russian-language-studies/article/viewFile/27498/19822
id doaj-97b773a3924b442e91c1104c22f5c084
record_format Article
spelling doaj-97b773a3924b442e91c1104c22f5c084
2021-09-29T08:19:12Z
eng
Peoples’ Friendship University of Russia (RUDN University)
Russian Language Studies
2618-8163
2618-8171
2021-09-01
Volume 19, issue 3, pages 331-345
10.22363/2618-8163-2021-19-3-331-345
20473
Textometr: an online tool for automated complexity level assessment of texts for Russian language learners
Antonina N. Laposhina (Pushkin State Russian Language Institute)
Maria Yu. Lebedeva (Pushkin State Russian Language Institute)
collection DOAJ
language English
format Article
sources DOAJ
author Antonina N. Laposhina
Maria Yu. Lebedeva
title Textometr: an online tool for automated complexity level assessment of texts for Russian language learners
publisher Peoples’ Friendship University of Russia (RUDN University)
series Russian Language Studies
issn 2618-8163
2618-8171
publishDate 2021-09-01
description Evaluating text accessibility is an urgent and labor-intensive task when preparing texts for teaching Russian as a foreign language. At the same time, the procedure for assigning a text to one of the levels on the CEFR scale (from A1 to C2) is well formalized and described in the professional literature, which opens opportunities for its automation. This paper presents Textometr, a new free web-based tool that estimates the CEFR level of any given Russian text and reports other key statistics relevant for adapting it for foreign students. The automated level assessment is based on a regression model trained on a dataset of more than 800 texts from Russian textbooks for foreigners, using several machine learning and natural language processing methods. In addition to the CEFR level, the tool provides information relevant for adapting the text to educational tasks: lists of keywords and of words for a potential vocabulary list, statistics on text coverage by frequency lists and CEFR-graded vocabulary lists (lexical minima), a frequency list of the text, and a forecast of the time needed for reading. The tool's shortcomings at the current stage of development and suggested ways to address them are also discussed. Finally, the results of a test of the tool's quality and directions for its further development are reported. Textometr can provide helpful information not only to teachers and guidance teachers, but also to textbook authors and publishers, who can use it to check that text content complies with the declared level and educational goals.
topic Russian as a foreign language
educational text
text complexity
reading
text adapting
computational linguodidactics
computer assisted language learning
Russian language learning
web tools
url http://journals.rudn.ru/russian-language-studies/article/viewFile/27498/19822