The Influence of Syntactic Frequencies on Human Sentence Processing

Bibliographic Details
Main Author:	van Schijndel, Marten
Language:	English
Published:	The Ohio State University / OhioLINK 2017
Subjects:	Linguistics Psychology Computer Science syntax computational linguistics psycholinguistics frequency effects text complexity prediction
Online Access:	http://rave.ohiolink.edu/etdc/view?acc_num=osu1502452939626929

id	ndltd-OhioLink-oai-etd.ohiolink.edu-osu1502452939626929
record_format	oai_dc
spelling	ndltd-OhioLink-oai-etd.ohiolink.edu-osu1502452939626929 2021-08-03T07:03:46Z The Influence of Syntactic Frequencies on Human Sentence Processing van Schijndel, Marten Linguistics Psychology Computer Science syntax computational linguistics psycholinguistics frequency effects text complexity prediction Humans are sensitive to the frequency of events, and this sensitivity is reflected in a wide range of behavioral and neural measures. This thesis focuses on the ways in which syntactic co-occurrence frequencies affect human language comprehension. Previous psycholinguistic findings seemed to show that humans are not sensitive to verbal subcategorization frequencies. Instead, this work demonstrates that sensitivity to fine-grained syntactic frequencies provides a confounding explanation for those findings. A left-corner parser is defined that can be used to compute a variety of psycholinguistic complexity metrics in order to better control for such syntactic influences in future studies. One of the strongest and most commonly used psycholinguistic measures output by the parser is surprisal (Hale, 2001; Levy, 2008), which estimates frequency-based comprehension difficulty based on the probability of an observation conditioned on the observations that preceded it. When used to predict reading times, however, this work shows that surprisal is mathematically inconsistent since it conditions on the immediately adjacent lexical material despite the fact that reading proceeds via saccades over non-adjacent material. This mathematical problem with surprisal can be corrected by summing surprisal over each saccade region to enable the measure to account for the probability of each new span of text conditioned on the preceding material that was actually observed.
The corrected version of lexical (n-gram) surprisal, cumulative n-gram surprisal, obtains a better fit to reading times than the uncorrected version, though the correction does not work for surprisal over syntactic (probabilistic context-free; PCFG) structure. In addition to the frequency of observed events, this work explores the influence of frequency in how humans predict upcoming events. In particular, uncertainty about upcoming material (entropy) is shown to influence reading times, corroborating previous results in the literature (Roark et al., 2009; Angele et al., 2015). Unfortunately, the entropy over upcoming material is very expensive to compute, and so can be difficult to control for in psycholinguistic experiments. This work shows that the surprisal (n-gram and PCFG) of upcoming words, which is inexpensive to compute, can approximate the influence of that uncertainty on self-paced reading times. The results in this thesis indicate that humans are sensitive to both lexical sequence frequencies and syntactic frequencies, and this work concludes by providing a proof-of-concept model of syntactic acquisition that links the two types of frequencies. The acquisition model demonstrates how a learner that is sensitive to linear ordering frequencies could end up acquiring long-distance dependencies, typically conceived as a hallmark of hierarchical syntax, in a fashion that replicates the acquisition timeline of children. 2017 English text The Ohio State University / OhioLINK http://rave.ohiolink.edu/etdc/view?acc_num=osu1502452939626929 http://rave.ohiolink.edu/etdc/view?acc_num=osu1502452939626929 unrestricted This thesis or dissertation is protected by copyright: all rights reserved. It may not be copied or redistributed beyond the terms of applicable copyright laws.
collection	NDLTD
language	English
sources	NDLTD
topic	Linguistics Psychology Computer Science syntax computational linguistics psycholinguistics frequency effects text complexity prediction
spellingShingle	Linguistics Psychology Computer Science syntax computational linguistics psycholinguistics frequency effects text complexity prediction van Schijndel, Marten The Influence of Syntactic Frequencies on Human Sentence Processing
author	van Schijndel, Marten
author_facet	van Schijndel, Marten
author_sort	van Schijndel, Marten
title	The Influence of Syntactic Frequencies on Human Sentence Processing
title_short	The Influence of Syntactic Frequencies on Human Sentence Processing
title_full	The Influence of Syntactic Frequencies on Human Sentence Processing
title_fullStr	The Influence of Syntactic Frequencies on Human Sentence Processing
title_full_unstemmed	The Influence of Syntactic Frequencies on Human Sentence Processing
title_sort	influence of syntactic frequencies on human sentence processing
publisher	The Ohio State University / OhioLINK
publishDate	2017
url	http://rave.ohiolink.edu/etdc/view?acc_num=osu1502452939626929
work_keys_str_mv	AT vanschijndelmarten theinfluenceofsyntacticfrequenciesonhumansentenceprocessing AT vanschijndelmarten influenceofsyntacticfrequenciesonhumansentenceprocessing
_version_	1719452830462377984
