Toward a Needs-Based Architecture for ‘Intelligent’ Communicative Agents: Speaking with Intention

The past few years have seen considerable progress in the deployment of voice-enabled personal assistants, first on smartphones (such as Apple’s Siri) and most recently as standalone devices in people’s homes (such as Amazon’s Alexa). Such ‘intelligent’ communicative agents are distinguished from the previous generation of speech-based systems in that they claim to offer access to services and information via conversational interaction (rather than simple voice commands). In reality, conversations with such agents have limited depth and, after initial enthusiasm, users typically revert to more traditional ways of getting things done. It is argued here that one source of the problem is that the standard architecture for a contemporary spoken language interface fails to capture the fundamental teleological properties of human spoken language. As a consequence, users have difficulty engaging with such systems, primarily due to a gross mismatch in intentional priors. This paper presents an alternative needs-driven cognitive architecture which models speech-based interaction as an emergent property of coupled hierarchical feedback-control processes in which a speaker has in mind the needs of a listener and a listener has in mind the intentions of a speaker. The implications of this architecture for future spoken language systems are illustrated using results from a new type of ‘intentional speech synthesiser’ that is capable of optimising its pronunciation in unpredictable acoustic environments as a function of its perceived communicative success. It is concluded that such purposeful behavior is essential to the facilitation of meaningful and productive spoken language interaction between human beings and autonomous social agents (such as robots). However, it is also noted that persistent mismatched priors may ultimately impose a fundamental limit on the effectiveness of speech-based human–robot interaction.
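
The abstract characterises speech-based interaction as coupled hierarchical feedback-control processes in which a speaker adapts its output according to its perceived communicative success. As a rough illustration of that control idea only (not the authors’ implementation), the following minimal Python sketch closes a loop between a toy intelligibility estimate and an articulation-effort parameter; the function names, the logistic intelligibility proxy, and the proportional gain are all assumptions introduced for this example.

```python
import math
import random


def estimate_intelligibility(effort: float, noise_level: float) -> float:
    """Toy proxy for perceived communicative success: intelligibility rises
    with articulation effort and falls with ambient noise (both in [0, 1])."""
    return 1.0 / (1.0 + math.exp(-8.0 * (effort - noise_level)))


def intentional_synthesis_loop(target: float = 0.8, steps: int = 20) -> None:
    """Closed-loop control of speaking effort: the simulated speaker tries to
    keep the listener's estimated success near a target despite changing noise."""
    effort = 0.3  # initial, relaxed articulation effort
    gain = 0.5    # proportional feedback gain (an assumption of this sketch)
    for step in range(steps):
        noise = random.uniform(0.0, 0.9)                    # unpredictable acoustic environment
        success = estimate_intelligibility(effort, noise)   # perceived communicative success
        error = target - success                            # shortfall relative to the goal
        effort = min(1.0, max(0.0, effort + gain * error))  # adapt pronunciation effort
        print(f"step {step:2d}  noise={noise:.2f}  success={success:.2f}  effort={effort:.2f}")


if __name__ == "__main__":
    intentional_synthesis_loop()
```

Running the sketch shows effort rising in noisier steps and relaxing in quieter ones, which is the qualitative, goal-directed behaviour the abstract attributes to the ‘intentional speech synthesiser’.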

Bibliographic Details
Main Authors: Roger K. Moore, Mauro Nicolao (Speech and Hearing Research Group, Department of Computer Science, University of Sheffield, Sheffield, United Kingdom)
Format: Article
Language: English
Published: Frontiers Media S.A., 2017-12-01
Series: Frontiers in Robotics and AI
ISSN: 2296-9144
DOI: 10.3389/frobt.2017.00066
Source: DOAJ (Directory of Open Access Journals)
Subjects: communicative agents; spoken language processing; hierarchical control; intentional speech synthesis; autonomous social agents; mismatched priors
Online Access: http://journal.frontiersin.org/article/10.3389/frobt.2017.00066/full