Grammar Error Correction in Morphologically Rich Languages: The Case of Russian

Until now, most of the research in grammar error correction focused on English, and the problem has hardly been explored for other languages. We address the task of correcting writing mistakes in morphologically rich languages, with a focus on Russian. We present a corrected an...

Bibliographic Details
Main Authors: Rozovskaya, Alla, Roth, Dan
Format: Article
Language: English
Published: The MIT Press 2019-11-01
Series: Transactions of the Association for Computational Linguistics
Online Access: https://www.mitpressjournals.org/doi/abs/10.1162/tacl_a_00251
id doaj-b9865560aeb14ce38ec8f458fe3713fb
record_format Article
spelling doaj-b9865560aeb14ce38ec8f458fe3713fb | 2020-11-25T03:16:37Z | eng | The MIT Press | Transactions of the Association for Computational Linguistics | 2307-387X | 2019-11-01 | 7 | 1 | 17 | 10.1162/tacl_a_00251 | Grammar Error Correction in Morphologically Rich Languages: The Case of Russian | Rozovskaya, Alla | Roth, Dan | Until now, most of the research in grammar error correction focused on English, and the problem has hardly been explored for other languages. We address the task of correcting writing mistakes in morphologically rich languages, with a focus on Russian. We present a corrected and error-tagged corpus of Russian learner writing and develop models that make use of existing state-of-the-art methods that have been well studied for English. Although impressive results have recently been achieved for grammar error correction of non-native English writing, these results are limited to domains where plentiful training data are available. Because annotation is extremely costly, these approaches are not suitable for the majority of domains and languages. We thus focus on methods that use “minimal supervision”; that is, those that do not rely on large amounts of annotated training data, and show how existing minimal-supervision approaches extend to a highly inflectional language such as Russian. The results demonstrate that these methods are particularly useful for correcting mistakes in grammatical phenomena that involve rich morphology. | https://www.mitpressjournals.org/doi/abs/10.1162/tacl_a_00251
collection DOAJ
language English
format Article
sources DOAJ
author Rozovskaya, Alla
Roth, Dan
spellingShingle Rozovskaya, Alla
Roth, Dan
Grammar Error Correction in Morphologically Rich Languages: The Case of Russian
Transactions of the Association for Computational Linguistics
author_facet Rozovskaya, Alla
Roth, Dan
author_sort Rozovskaya, Alla
title Grammar Error Correction in Morphologically Rich Languages: The Case of Russian
title_short Grammar Error Correction in Morphologically Rich Languages: The Case of Russian
title_full Grammar Error Correction in Morphologically Rich Languages: The Case of Russian
title_fullStr Grammar Error Correction in Morphologically Rich Languages: The Case of Russian
title_full_unstemmed Grammar Error Correction in Morphologically Rich Languages: The Case of Russian
title_sort grammar error correction in morphologically rich languages: the case of russian
publisher The MIT Press
series Transactions of the Association for Computational Linguistics
issn 2307-387X
publishDate 2019-11-01
description Until now, most of the research in grammar error correction focused on English, and the problem has hardly been explored for other languages. We address the task of correcting writing mistakes in morphologically rich languages, with a focus on Russian. We present a corrected and error-tagged corpus of Russian learner writing and develop models that make use of existing state-of-the-art methods that have been well studied for English. Although impressive results have recently been achieved for grammar error correction of non-native English writing, these results are limited to domains where plentiful training data are available. Because annotation is extremely costly, these approaches are not suitable for the majority of domains and languages. We thus focus on methods that use “minimal supervision”; that is, those that do not rely on large amounts of annotated training data, and show how existing minimal-supervision approaches extend to a highly inflectional language such as Russian. The results demonstrate that these methods are particularly useful for correcting mistakes in grammatical phenomena that involve rich morphology.
url https://www.mitpressjournals.org/doi/abs/10.1162/tacl_a_00251
work_keys_str_mv AT rozovskayaalla grammarerrorcorrectioninmorphologicallyrichlanguagesthecaseofrussian
AT rothdan grammarerrorcorrectioninmorphologicallyrichlanguagesthecaseofrussian
_version_ 1724635170751905792