Tagging Errors in Non-Native English Language Student-Composed Texts of Different Registers
Research into linguistic features requires part-of-speech (POS) tagging of texts. Existing POS taggers have been trained predominantly on native speakers' texts to enhance their accuracy. Researchers exploring POS tagging of ELL (English language learner) texts distinguish between tagger errors and learner errors and suggest annotation enhancement schemes. However, the frequency and types of CLAWS7 (Constituent Likelihood Automatic Word Tagging System) tagging errors in ELL texts written for different communicative purposes have not been explored sufficiently to suggest annotation enhancement solutions for each particular learner corpus building case. This study investigates CLAWS7-tagged texts composed by non-native English philology BA students (English Studies Department, University of Latvia) to uncover overall precision and the tags with the greatest impact on the error rate, and to provide insight into the errors so as to reveal which texts require annotation enhancement solutions. The material for the analysis has been selected from a corpus of student-composed texts. The results show that tagging precision varies across the text groups. The texts edited by the students show greater tagging precision and therefore would not require specific annotation enhancement procedures before tagging. Tagging precision is lower in interactional texts such as chat messages, which could be addressed by applying an annotation enhancement scheme.
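As an illustration of the tagging-precision measure the abstract refers to, here is a minimal Python sketch (not taken from the article; the helper function, the sample tokens and their CLAWS7 tags are hypothetical) that compares automatically assigned tags with manually corrected ones and reports precision overall and per tag.

```python
from collections import Counter

def per_tag_precision(tagged_tokens):
    """Compare automatic tags against manually corrected (gold) tags.

    tagged_tokens: iterable of (token, auto_tag, gold_tag) triples.
    Returns (overall_precision, {tag: precision}), where a tag's precision
    is the share of tokens the tagger labelled with that tag correctly.
    """
    assigned = Counter()  # how often the tagger used each tag
    correct = Counter()   # how often that assignment matched the gold tag
    for _token, auto_tag, gold_tag in tagged_tokens:
        assigned[auto_tag] += 1
        if auto_tag == gold_tag:
            correct[auto_tag] += 1
    overall = sum(correct.values()) / sum(assigned.values())
    per_tag = {tag: correct[tag] / assigned[tag] for tag in assigned}
    return overall, per_tag

# Hypothetical chat-style learner tokens: (token, CLAWS7 tag assigned
# by the tagger, tag after manual correction).
sample = [
    ("i",    "PPIS1", "PPIS1"),
    ("dont", "VV0",   "VD0"),   # non-standard spelling mis-tagged as a lexical verb
    ("know", "VVI",   "VVI"),
    ("u",    "NN1",   "PPY"),   # 'u' (= you) mis-tagged as a singular noun
]

overall, per_tag = per_tag_precision(sample)
print(f"overall precision: {overall:.2f}")
for tag, precision in sorted(per_tag.items()):
    print(f"  {tag}: {precision:.2f}")
```

Run over a whole text group, such a per-tag breakdown shows which tags drive the error rate, for instance whether non-standard interactional forms like "u" or "dont" account for most mis-taggings.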
| Published in: | Baltic Journal of English Language, Literature and Culture |
|---|---|
| Main Author: | Zigrīda Vinčela |
| Format: | Article |
| Language: | English |
| Published: | University of Latvia Press, 2014-04-01 |
| Subjects: | corpus; annotation; accuracy; tagging error |
| Online Access: | https://journal.lu.lv/bjellc/article/view/337 |
| author | Zigrīda Vinčela |
|---|---|
| collection | DOAJ |
| container_title | Baltic Journal of English Language, Literature and Culture |
| format | Article |
| id | doaj-art-a7c293e8f3df4e2d95c0fdbe6fcedca6 |
| institution | Directory of Open Access Journals |
| issn | 1691-9971; 2501-0395 |
| language | English |
| publishDate | 2014-04-01 |
| publisher | University of Latvia Press |
| record_format | Article |
| title | Tagging Errors in Non-Native English Language Student-Composed Texts of Different Registers |
| topic | corpus; annotation; accuracy; tagging error |
| url | https://journal.lu.lv/bjellc/article/view/337 |
