Textometr: an online tool for automated complexity level assessment of texts for Russian language learners
Evaluating text accessibility is an urgent and labor-intensive task in preparing texts for teaching Russian as a foreign language. At the same time, the procedure for assigning a text to one of the levels of the CEFR scale (from A1 to C2) is well formalized and described in the professional literature, which opens opportunities for its automation.
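The description in this record lists, among the tool's outputs, the coverage of a text by CEFR-graded vocabulary lists and a forecast of reading time. The sketch below illustrates what such statistics could look like in the simplest case. It is not Textometr's implementation: the function names, the tiny word list, and the reading rate of 60 words per minute are all hypothetical, and word forms are compared directly, without the lemmatization a real tool for Russian would need.

```python
import re


def tokenize(text):
    # Lowercased runs of Cyrillic letters; no lemmatization in this sketch.
    return re.findall(r"[а-яё]+", text.lower())


def coverage(tokens, vocabulary):
    # Share of running words (tokens) that appear in the vocabulary list.
    if not tokens:
        return 0.0
    known = sum(1 for token in tokens if token in vocabulary)
    return known / len(tokens)


def reading_time_minutes(tokens, words_per_minute=60.0):
    # Naive forecast: token count divided by an assumed reading rate.
    return len(tokens) / words_per_minute


# Hypothetical sample text and hypothetical beginner-level word list.
sample = "Мама читает книгу. Папа читает газету."
level_words = {"мама", "папа", "читает", "книгу"}

tokens = tokenize(sample)
print(f"tokens: {len(tokens)}")
print(f"coverage: {coverage(tokens, level_words):.2f}")
print(f"reading time (min): {reading_time_minutes(tokens):.2f}")
```

Here five of the six running words are in the hypothetical list, so the coverage is 5/6. A real lexical minimum would be matched against lemmas rather than surface forms.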
Main Authors: | Antonina N. Laposhina, Maria Yu. Lebedeva |
---|---|
Format: | Article |
Language: | English |
Published: | Peoples’ Friendship University of Russia (RUDN University), 2021-09-01 |
Series: | Russian Language Studies |
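Subjects: | Russian as a foreign language; educational text; text complexity; reading; text adapting; computational linguodidactics; computer assisted language learning; Russian language learning; web tools |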
Online Access: | http://journals.rudn.ru/russian-language-studies/article/viewFile/27498/19822 |
id |
doaj-97b773a3924b442e91c1104c22f5c084 |
---|---|
record_format |
Article |
spelling |
doaj-97b773a3924b442e91c1104c22f5c084; 2021-09-29T08:19:12Z; eng; Peoples’ Friendship University of Russia (RUDN University); Russian Language Studies; ISSN 2618-8163, 2618-8171; 2021-09-01; vol. 19, no. 3, pp. 331-345; DOI 10.22363/2618-8163-2021-19-3-331-345; 20473; Textometr: an online tool for automated complexity level assessment of texts for Russian language learners; Antonina N. Laposhina (Pushkin State Russian Language Institute), Maria Yu. Lebedeva (Pushkin State Russian Language Institute); http://journals.rudn.ru/russian-language-studies/article/viewFile/27498/19822; russian as a foreign language; educational text; text complexity; reading; text adapting; computational linguodidactics; computer assisted language learning; russian language learning; web tools |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Antonina N. Laposhina, Maria Yu. Lebedeva |
title |
Textometr: an online tool for automated complexity level assessment of texts for Russian language learners |
publisher |
Peoples’ Friendship University of Russia (RUDN University) |
series |
Russian Language Studies |
issn |
2618-8163, 2618-8171 |
publishDate |
2021-09-01 |
description |
Evaluating text accessibility is an urgent and labor-intensive task in preparing texts for teaching Russian as a foreign language. At the same time, the procedure for assigning a text to one of the levels of the CEFR scale (from A1 to C2) is well formalized and described in the professional literature, which opens opportunities for its automation. This paper presents Textometr, a new free web-based tool that estimates the CEFR level of any given Russian text and reports other key statistics relevant for adapting it for foreign students. The automated assessment of the text level is based on a regression model trained on a dataset of more than 800 texts from Russian textbooks for foreigners, using several machine learning and natural language processing methods. In addition to the CEFR level, the tool provides information relevant for adapting the text to educational tasks: lists of keywords and of words for a potential vocabulary list, statistics on the coverage of the text by frequency lists and CEFR-graded vocabulary lists (lexical minima), a frequency list of the text, and a forecast of the time needed for reading. The tool's shortcomings at the current stage of development and suggested ways to address them are also discussed. Finally, the results of a test of the tool's quality and the directions for its further development are reported. Textometr can provide helpful information not only to teachers and guidance teachers, but also to textbook authors and publishers who need to check that a text's content complies with the declared level and educational goals. |
topic |
russian as a foreign language; educational text; text complexity; reading; text adapting; computational linguodidactics; computer assisted language learning; russian language learning; web tools |
url |
http://journals.rudn.ru/russian-language-studies/article/viewFile/27498/19822 |