Parsing costs as predictors of reading difficulty: An evaluation using the Potsdam Sentence Corpus
The surprisal of a word on a probabilistic grammar constitutes a promising complexity metric for human sentence comprehension difficulty. Using two different grammar types, surprisal is shown to have an effect on fixation durations and regression probabilities in a sample of German readers’ eye move...
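The abstract does not spell out how surprisal is computed. Under the standard definition, which is assumed here, a word's surprisal is the negative log of its conditional probability given the preceding words, or equivalently the log ratio of successive prefix probabilities assigned by the grammar. The sketch below illustrates that arithmetic only; the function names, the base-2 logarithm, and the toy probabilities are illustrative assumptions and are not taken from the article.

```python
import math


def surprisal(prob: float) -> float:
    """Surprisal in bits of a word whose conditional probability is `prob`."""
    if not 0.0 < prob <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log2(prob)


def surprisal_from_prefixes(prefix_probs):
    """Per-word surprisal from a list of prefix probabilities.

    prefix_probs[i] is the total probability the grammar assigns to the
    first i words of the sentence, so prefix_probs[0] is 1.0 (empty prefix).
    """
    return [
        math.log2(prev / cur)
        for prev, cur in zip(prefix_probs, prefix_probs[1:])
    ]


# Hypothetical prefix probabilities for a three-word sentence (not real data).
toy_prefix_probs = [1.0, 0.5, 0.125, 0.0625]
word_surprisals = surprisal_from_prefixes(toy_prefix_probs)
```

In this toy case the second word cuts the prefix probability by a factor of four, so it receives the highest surprisal of the three words.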
Main Authors: | Marisa Ferrara Boston, John Hale, Reinhold Kliegl, Umesh Patil, Shravan Vasishth |
---|---|
Format: | Article |
Language: | English |
Published: | Bern Open Publishing, 2008-09-01 |
Series: | Journal of Eye Movement Research |
Subjects: | surprisal, parsing costs, Potsdam Sentence Corpus, parsing difficulty |
Online Access: | https://bop.unibe.ch/JEMR/article/view/2255 |
id |
doaj-f766613505ce45679535606bcea91a22 |
---|---|
record_format |
Article |
spelling |
doaj-f766613505ce45679535606bcea91a22; 2021-05-28T13:34:55Z; eng; Bern Open Publishing; Journal of Eye Movement Research; ISSN 1995-8692; 2008-09-01; vol. 2, no. 1; doi:10.16910/jemr.2.1.1; Parsing costs as predictors of reading difficulty: An evaluation using the Potsdam Sentence Corpus; Marisa Ferrara Boston (Cornell University), John Hale (Cornell University), Reinhold Kliegl (University of Potsdam), Umesh Patil (University of Potsdam), Shravan Vasishth (University of Potsdam); The surprisal of a word on a probabilistic grammar constitutes a promising complexity metric for human sentence comprehension difficulty. Using two different grammar types, surprisal is shown to have an effect on fixation durations and regression probabilities in a sample of German readers’ eye movements, the Potsdam Sentence Corpus. A linear mixed-effects model was used to quantify the effect of surprisal while taking into account unigram frequency and bigram frequency (transitional probability), word length, and empirically-derived word predictability; the so-called “early” and “late” measures of processing difficulty both showed an effect of surprisal. Surprisal is also shown to have a small but statistically non-significant effect on empirically-derived predictability itself. This work thus demonstrates the importance of including parsing costs as a predictor of comprehension difficulty in models of reading, and suggests that a simple identification of syntactic parsing costs with early measures and late measures with durations of post-syntactic events may be difficult to uphold.; https://bop.unibe.ch/JEMR/article/view/2255; surprisal, parsing costs, potsdam sentence corpus, parsing difficulty |
collection |
DOAJ |
author |
Marisa Ferrara Boston John Hale Reinhold Kliegl Umesh Patil Shravan Vasishth |
title |
Parsing costs as predictors of reading difficulty: An evaluation using the Potsdam Sentence Corpus |
publisher |
Bern Open Publishing |
series |
Journal of Eye Movement Research |
issn |
1995-8692 |
publishDate |
2008-09-01 |
description |
The surprisal of a word on a probabilistic grammar constitutes a promising complexity metric for human sentence comprehension difficulty. Using two different grammar types, surprisal is shown to have an effect on fixation durations and regression probabilities in a sample of German readers’ eye movements, the Potsdam Sentence Corpus. A linear mixed-effects model was used to quantify the effect of surprisal while taking into account unigram frequency and bigram frequency (transitional probability), word length, and empirically-derived word predictability; the so-called “early” and “late” measures of processing difficulty both showed an effect of surprisal. Surprisal is also shown to have a small but statistically non-significant effect on empirically-derived predictability itself. This work thus demonstrates the importance of including parsing costs as a predictor of comprehension difficulty in models of reading, and suggests that a simple identification of syntactic parsing costs with early measures and late measures with durations of post-syntactic events may be difficult to uphold. |
topic |
surprisal parsing costs potsdam sentence corpus parsing difficulty |
url |
https://bop.unibe.ch/JEMR/article/view/2255 |