The Influence of Syntactic Frequencies on Human Sentence Processing

Bibliographic Details
Main Author: van Schijndel, Marten
Language: English
Published: The Ohio State University / OhioLINK 2017
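Subjects: Linguistics; Psychology; Computer Science; syntax; computational linguistics; psycholinguistics; frequency effects; text complexity; prediction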
Online Access: http://rave.ohiolink.edu/etdc/view?acc_num=osu1502452939626929
Summary: Humans are sensitive to the frequency of events, and this sensitivity is reflected in a wide range of behavioral and neural measures. This thesis focuses on the ways in which syntactic co-occurrence frequencies affect human language comprehension.

Previous psycholinguistic findings seemed to show that humans are not sensitive to verbal subcategorization frequencies. Instead, this work demonstrates that sensitivity to fine-grained syntactic frequencies provides a confounding explanation for those findings. A left-corner parser is defined that can be used to compute a variety of psycholinguistic complexity metrics in order to better control for such syntactic influences in future studies.

One of the strongest and most commonly used psycholinguistic measures output by the parser is surprisal (Hale, 2001; Levy, 2008), which estimates frequency-based comprehension difficulty based on the probability of an observation conditioned on the observations that preceded it. When used to predict reading times, however, this work shows that surprisal is mathematically inconsistent since it conditions on the immediately adjacent lexical material despite the fact that reading proceeds via saccades over non-adjacent material. This mathematical problem with surprisal can be corrected by summing surprisal over each saccade region to enable the measure to account for the probability of each new span of text conditioned on the preceding material that was actually observed. The corrected version of lexical (n-gram) surprisal, cumulative n-gram surprisal, obtains a better fit to reading times than the uncorrected version, though the correction does not work for surprisal over syntactic (probabilistic context-free; PCFG) structure.

In addition to the frequency of observed events, this work explores the influence of frequency in how humans predict upcoming events. In particular, uncertainty about upcoming material (entropy) is shown to influence reading times, corroborating previous results in the literature (Roark et al., 2009; Angele et al., 2015). Unfortunately, the entropy over upcoming material is very expensive to compute, and so can be difficult to control for in psycholinguistic experiments. This work shows that the surprisal (n-gram and PCFG) of upcoming words, which is inexpensive to compute, can approximate the influence of that uncertainty on self-paced reading times.

The results in this thesis indicate that humans are sensitive to both lexical sequence frequencies and syntactic frequencies, and this work concludes by providing a proof-of-concept model of syntactic acquisition that links the two types of frequencies. The acquisition model demonstrates how a learner that is sensitive to linear ordering frequencies could end up acquiring long-distance dependencies, typically conceived as a hallmark of hierarchical syntax, in a fashion that replicates the acquisition timeline of children.

Access: unrestricted
Rights: This thesis or dissertation is protected by copyright: all rights reserved. It may not be copied or redistributed beyond the terms of applicable copyright laws.
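Illustrative note: the idea of summing surprisal over each saccade region, as described in the summary, might be sketched roughly as follows. This is not the thesis's implementation. The toy corpus, the add-one-smoothed bigram model, the function names, and the assumption that fixations only move forward (no regressions) are all hypothetical choices made for illustration.

```python
import math
from collections import Counter

def train_bigram(corpus):
    """Count unigrams and bigrams over tokenized sentences, using <s> as a start symbol."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in corpus:
        tokens = ["<s>"] + sentence
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def word_surprisals(sentence, unigrams, bigrams):
    """Per-word surprisal in bits: -log2 P(word | previous word), add-one smoothed."""
    vocab_size = len(unigrams)  # toy assumption: vocabulary = observed types
    tokens = ["<s>"] + sentence
    values = []
    for prev, word in zip(tokens, tokens[1:]):
        prob = (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)
        values.append(-math.log2(prob))
    return values

def cumulative_surprisal(surprisals, fixations):
    """Sum surprisal over each saccade region.

    `fixations` holds increasing word indices that were fixated. The region for
    a fixation covers every word after the previous fixation up to and
    including the fixated word, so skipped words still contribute.
    """
    totals, start = [], 0
    for fix in fixations:
        totals.append(sum(surprisals[start:fix + 1]))
        start = fix + 1
    return totals

# Hypothetical toy data: two training sentences and one test sentence.
corpus = [["the", "dog", "barked"], ["the", "cat", "slept"]]
unigrams, bigrams = train_bigram(corpus)
sentence = ["the", "dog", "slept"]
per_word = word_surprisals(sentence, unigrams, bigrams)
# The reader fixates word 0, skips word 1, then fixates word 2.
per_fixation = cumulative_surprisal(per_word, [0, 2])
```

Here the second fixation is assigned the surprisal of both the skipped word and the fixated word, whereas uncorrected surprisal would assign it the surprisal of the fixated word alone.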
collection NDLTD
language English
sources NDLTD
topic Linguistics
Psychology
Computer Science
syntax
computational linguistics
psycholinguistics
frequency effects
text complexity
prediction