Gesture in automatic discourse processing

Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008.

Includes bibliographical references (p. 145-153).

Computers cannot fully understand spoken language without access to the wide range of modalities that accompany speech. This thesis addresses the particularly expressive modality of hand gesture, and focuses on building structured statistical models at the intersection of speech, vision, and meaning. My approach is distinguished in two key respects. First, gestural patterns are leveraged to discover parallel structures in the meaning of the associated speech. This differs from prior work that attempted to interpret individual gestures directly, an approach that was prone to a lack of generality across speakers. Second, I present novel, structured statistical models for multimodal language processing, which enable learning about gesture in its linguistic context, rather than in the abstract. These ideas find successful application in a variety of language processing tasks: resolving ambiguous noun phrases, segmenting speech into topics, and producing keyframe summaries of spoken language. In all three cases, the addition of gestural features, extracted automatically from video, yields significantly improved performance over a state-of-the-art text-only alternative. This marks the first demonstration that hand gesture improves automatic discourse processing.


Bibliographic Details
Main Author: Eisenstein, Jacob (Jacob Richard)
Other Authors: Regina Barzilay and Randall Davis.
Format: Others
Language:English
Published: Massachusetts Institute of Technology, 2009
Subjects: Electrical Engineering and Computer Science
Online Access:http://hdl.handle.net/1721.1/44401
Other Title: Structured models of gesture for discourse processing
Physical Description: 153 p., application/pdf
OCLC Number: 289020749
Rights: M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. For inquiries about permission, see http://dspace.mit.edu/handle/1721.1/7582.