Tagging Errors in Non-Native English Language Student-Composed Texts of Different Registers

Research on linguistic features requires part-of-speech (POS) tagging of texts. Existing POS taggers have been trained predominantly on native speakers’ texts to enhance their accuracy. Researchers exploring POS tagging of English language learner (ELL) texts distinguish tagger’s and learn...

Bibliographic Details
Published in: Baltic Journal of English Language, Literature and Culture
Main Author: Zigrīda Vinčela
Format: Article
Language: English
Published: University of Latvia Press 2014-04-01
Subjects: corpus, annotation, accuracy, tagging error
Online Access: https://journal.lu.lv/bjellc/article/view/337
collection DOAJ
description Research on linguistic features requires part-of-speech (POS) tagging of texts. Existing POS taggers have been trained predominantly on native speakers’ texts to enhance their accuracy. Researchers exploring POS tagging of English language learner (ELL) texts distinguish between tagger errors and learner errors and suggest annotation enhancement schemes. However, the frequency and types of CLAWS7 (Constituent Likelihood Automatic Word Tagging System) tagging errors in ELL texts of different communicative purposes have not been sufficiently explored to suggest annotation enhancement solutions for each particular learner corpus building case. This study investigates CLAWS7-tagged texts composed by non-native English philology BA students (English Studies Department, University of Latvia) to uncover the overall precision of the tags that have the greatest impact on the error rate and to provide insight into the errors, so as to reveal which texts require annotation enhancement solutions. Material for the analysis was selected from the corpus of student-composed texts. The results show that tagging precision varies across the text groups. Texts edited by the students show greater tagging precision and therefore would not require specific annotation enhancement procedures before tagging. Tagging precision is lower in interactional texts such as chat messages, which could be addressed by applying an annotation enhancement scheme.
id doaj-art-a7c293e8f3df4e2d95c0fdbe6fcedca6
institution Directory of Open Access Journals
issn 1691-9971
2501-0395
topic corpus
annotation
accuracy
tagging error