Ecological adaptation in the context of an actor-critic

Bibliographic Details
Main Author: Cos-Aguilera, Ignasi
Published: University of Edinburgh 2006
Online Access: http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.645013
Description
Summary: This thesis addresses the problem of behaviour selection and learning in an ethological manner. It treats the agent’s perception of the environment and of its internal drives as modulatory elements that bias the way behaviours are selected. Furthermore, this happens in a framework where the agent is capable of perceiving changes in the feedback from the environment in terms of reward. A schema to learn object affordances is proposed and integrated in an actor-critic reinforcement learning algorithm. This is proposed to be the core of a motivation and reinforcement framework driving the selection of behaviour and the adaptation of behavioural patterns. Its working principle is the aforementioned capacity of perceiving changes in the environment and in the agent’s internal physiology in terms of reward, and of modifying the behavioural patterns accordingly. These ideas on motivation, behaviour selection, learning and perception have been made explicit in an architecture integrated in a simulated robotic platform. To demonstrate the extent of its validity, extensive simulations have been performed to address the affordance learning paradigm and the extent of the adaptive power offered by the actor-critic framework. To this end, the effect of external and internal perception on the learning and behaviour selection processes has been measured along three different dimensions: the performance in terms of flexibility of adaptation, the physiological stability and the cycles of execution of behaviour at each contingency. In addition, this thesis has begun to frame the integration of behaviours of appetitive and consummatory nature in a single schema. Finally, it also contributes to disambiguating the role of dopamine as a neurotransmitter involved in the process of learning.