The Influence of Syntactic Frequencies on Human Sentence Processing
Main Author: | van Schijndel, Marten |
---|---|
Language: | English |
Published: | The Ohio State University / OhioLINK, 2017 |
Subjects: | Linguistics; Psychology; Computer Science |
Online Access: | http://rave.ohiolink.edu/etdc/view?acc_num=osu1502452939626929 |
id |
ndltd-OhioLink-oai-etd.ohiolink.edu-osu1502452939626929 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-OhioLink-oai-etd.ohiolink.edu-osu1502452939626929 2021-08-03T07:03:46Z The Influence of Syntactic Frequencies on Human Sentence Processing van Schijndel, Marten Linguistics Psychology Computer Science syntax computational linguistics psycholinguistics frequency effects text complexity prediction Humans are sensitive to the frequency of events, and this sensitivity is reflected in a wide range of behavioral and neural measures. This thesis focuses on the ways in which syntactic co-occurrence frequencies affect human language comprehension. Previous psycholinguistic findings seemed to show that humans are not sensitive to verbal subcategorization frequencies. Instead, this work demonstrates that sensitivity to fine-grained syntactic frequencies provides a confounding explanation for those findings. A left-corner parser is defined that can be used to compute a variety of psycholinguistic complexity metrics in order to better control for such syntactic influences in future studies. One of the strongest and most commonly used psycholinguistic measures output by the parser is surprisal (Hale, 2001; Levy, 2008), which estimates frequency-based comprehension difficulty based on the probability of an observation conditioned on the observations that preceded it. When used to predict reading times, however, this work shows that surprisal is mathematically inconsistent since it conditions on the immediately adjacent lexical material despite the fact that reading proceeds via saccades over non-adjacent material. This mathematical problem with surprisal can be corrected by summing surprisal over each saccade region to enable the measure to account for the probability of each new span of text conditioned on the preceding material that was actually observed.
The corrected version of lexical (n-gram) surprisal, cumulative n-gram surprisal, obtains a better fit to reading times than the uncorrected version, though the correction does not work for surprisal over syntactic (probabilistic context-free; PCFG) structure. In addition to the frequency of observed events, this work explores the influence of frequency in how humans predict upcoming events. In particular, uncertainty about upcoming material (entropy) is shown to influence reading times, corroborating previous results in the literature (Roark et al., 2009; Angele et al., 2015). Unfortunately, the entropy over upcoming material is very expensive to compute, and so can be difficult to control for in psycholinguistic experiments. This work shows that the surprisal (n-gram and PCFG) of upcoming words, which is inexpensive to compute, can approximate the influence of that uncertainty on self-paced reading times. The results in this thesis indicate that humans are sensitive to both lexical sequence frequencies and syntactic frequencies, and this work concludes by providing a proof-of-concept model of syntactic acquisition that links the two types of frequencies. The acquisition model demonstrates how a learner that is sensitive to linear ordering frequencies could end up acquiring long-distance dependencies, typically conceived as a hallmark of hierarchical syntax, in a fashion that replicates the acquisition timeline of children. 2017 English text The Ohio State University / OhioLINK http://rave.ohiolink.edu/etdc/view?acc_num=osu1502452939626929 unrestricted This thesis or dissertation is protected by copyright: all rights reserved. It may not be copied or redistributed beyond the terms of applicable copyright laws. |
collection |
NDLTD |
language |
English |
sources |
NDLTD |
topic |
Linguistics Psychology Computer Science syntax computational linguistics psycholinguistics frequency effects text complexity prediction |
author |
van Schijndel, Marten |
title |
The Influence of Syntactic Frequencies on Human Sentence Processing |
publisher |
The Ohio State University / OhioLINK |
publishDate |
2017 |
url |
http://rave.ohiolink.edu/etdc/view?acc_num=osu1502452939626929 |
_version_ |
1719452830462377984 |
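
The abstract in this record describes surprisal as the negative log probability of a word given the preceding words, and cumulative surprisal as the sum of those values over each saccade region. The following is a minimal illustrative sketch of that idea, not the thesis's implementation: the add-one-smoothed bigram model, the toy corpus, and the names `train_bigram`, `surprisal`, and `cumulative_surprisal` are all assumptions made for the example.

```python
import math
from collections import Counter


def train_bigram(tokens):
    """Collect counts for a toy bigram model (an assumption; not the thesis's model)."""
    contexts = Counter(tokens[:-1])                  # how often each word occurs as a context
    bigrams = Counter(zip(tokens[:-1], tokens[1:]))  # how often each adjacent pair occurs
    vocab = set(tokens)
    return contexts, bigrams, vocab


def surprisal(prev, word, model):
    """Surprisal in bits: -log2 P(word | prev), with add-one smoothing."""
    contexts, bigrams, vocab = model
    p = (bigrams[(prev, word)] + 1) / (contexts[prev] + len(vocab))
    return -math.log2(p)


def cumulative_surprisal(tokens, fixated, model):
    """Sum per-word surprisal over each saccade region.

    tokens[0] is a start symbol; `fixated` is an increasing list of token
    indices (>= 1) that were fixated. The region for a fixation covers every
    word after the previous fixation up to and including the fixated word.
    """
    per_word = [surprisal(tokens[i - 1], tokens[i], model)
                for i in range(1, len(tokens))]      # per_word[k] belongs to tokens[k + 1]
    totals = {}
    last = 0
    for idx in fixated:
        totals[idx] = sum(per_word[last:idx])        # words last+1 .. idx
        last = idx
    return totals


model = train_bigram("<s> the dog saw the cat </s>".split())
sentence = "<s> the cat saw the dog".split()
# Hypothetical fixation sequence in which the reader skips words 2 and 4.
print(cumulative_surprisal(sentence, [1, 3, 5], model))
```

When every word is fixated, the cumulative values reduce to ordinary per-word surprisal; when words are skipped, the surprisal of the skipped words is added to the next fixated word.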