Scalable reward learning from demonstration

Reward learning from demonstration is the task of inferring the intents or goals of an agent demonstrating a task. Inverse reinforcement learning methods utilize the Markov decision process (MDP) framework to learn rewards, but typically scale poorly since they rely on the calculation of optimal val...

Bibliographic Details
Main Authors:	Michini, Bernard J. (Contributor), How, Jonathan P. (Contributor), Cutler, Mark Johnson (Contributor)
Other Authors:	Massachusetts Institute of Technology. Aerospace Controls Laboratory (Contributor), Massachusetts Institute of Technology. Department of Aeronautics and Astronautics (Contributor)
Format:	Article
Language:	English
Published:	Institute of Electrical and Electronics Engineers (IEEE), 2015-05-08T18:42:15Z.
Subjects:	Article
Online Access:	Get fulltext

Similar Items

Bayesian nonparametric reward learning from demonstration
by: Michini, Bernard (Bernard J.)
Published: (2014)

Lightweight infrared sensing for relative navigation of quadrotors
by: Cutler, Mark Johnson, et al.
Published: (2013)

Bayesian Nonparametric Inverse Reinforcement Learning
by: How, Jonathan P., et al.
Published: (2013)

Improving the efficiency of Bayesian inverse reinforcement learning
by: How, Jonathan P., et al.
Published: (2013)

Comparison of Fixed and Variable Pitch Actuators for Agile Quadrotors
by: Cutler, Mark Johnson, et al.
Published: (2013)

L1 Adaptive Control for Indoor Autonomous Vehicles: Design Process and Flight Testing
by: How, Jonathan P., et al.
Published: (2010)

A Human-Interactive Course of Action Planner for Aircraft Carrier Deck Operations
by: Michini, Bernard J., et al.
Published: (2013)

Efficient reinforcement learning for robots using informative simulated priors
by: Cutler, Mark Johnson, et al.
Published: (2017)

Reinforcement learning with multi-fidelity simulators
by: Cutler, Mark Johnson, et al.
Published: (2016)

Efficient hindsight reinforcement learning using demonstrations for robotic tasks with sparse rewards
by: Guoyu Zuo, et al.
Published: (2020-01-01)

Predicting optimal value functions by interpolating reward functions in scalarized multi-objective reinforcement learning
by: Kusari, Arpan, et al.
Published: (2021)

Actuator Constrained Trajectory Generation and Control for Variable-Pitch Quadrotors
by: Cutler, Mark Johnson, et al.
Published: (2013)

Analysis and Control of a Variable-Pitch Quadrotor for Agile Flight
by: Cutler, Mark Johnson, et al.
Published: (2017)

Active Reward Learning for Co-Robotic Vision Based Exploration in Bandwidth Limited Environments
by: Jamieson, Stewart Christopher, et al.
Published: (2021)

Automated Battery Swap and Recharge to Enable Persistent UAV Missions
by: Toksoz, Tuna, et al.
Published: (2013)

Rapid transfer of controllers between UAVs using learning-based adaptive control
by: Chowdhary, Girish, et al.
Published: (2015)

Demonstration of a Scalable, Multiplexed Ion Trap for Quantum Information Processing
by: Leibrandt, David Ray, et al.
Published: (2014)

Modeling and adaptive control of indoor unmanned aerial vehicles
by: Michini, Bernard (Bernard J.)
Published: (2010)

Scalable, High-Sensitivity X-Band Rectenna Array for the Demonstration of Space-to-Earth Power Beaming
by: Brian B. Tierney, et al.
Published: (2021-01-01)

Demonstration of a Flexible Bandwidth Optical Transmitter/Receiver System Scalable to Terahertz Bandwidths
by: David J. Geisler, et al.
Published: (2011-01-01)

Demonstrated Internal-External Reward Expectancies as a Variable in Group Counseling
by: Lamb, Donald Wayne
Published: (1968)

An SGBM-XVA demonstrator: a scalable Python tool for pricing XVA
by: Ki Wai Chau, et al.
Published: (2020-02-01)

Learning from delayed rewards
by: Watkins, Christopher John Cornish Hellaby
Published: (1989)

Experimental Results of Concurrent Learning Adaptive Controllers
by: Chowdhary, Girish, et al.
Published: (2013)

Learning from Noisy and Delayed Rewards: The Value of Reinforcement Learning to Defense Modeling and Simulation
by: Alt, Jonathan K.
Published: (2012)

Learning from human-generated reward
by: Knox, William Bradley
Published: (2013)

Demonstration of scalable microring weight bank control for large-scale photonic integrated circuits
by: Chaoran Huang, et al.
Published: (2020-04-01)

Learning reward timing in cortex through reward dependent expression of synaptic plasticity
by: Gavornik, Jeffrey, et al.
Published: (2009)

Decoupled multiagent path planning via incremental sequential convex programming
by: Chen, Yu Fan, et al.
Published: (2016)

Reinforcement learning for robots through efficient simulator sampling
by: Cutler, Mark Johnson
Published: (2016)

Demonstrating the power of quantum computers, certification of highly entangled measurements and scalable quantum nonlocality
by: Elisa Bäumer, et al.
Published: (2021-07-01)

Learning reward frequency over reward probability: A tale of two learning rules
by: Cornwall, A.C., et al.
Published: (2019)

Reinforcement Learning from Demonstration
by: Suay, Halit Bener
Published: (2016)

A model of food reward learning with dynamic reward exposure
by: Ross A Hammond, et al.
Published: (2012-10-01)

Effects of Nicotine Withdrawal on Motivation, Reward Sensitivity and Reward-Learning
by: Oliver, Jason A.
Published: (2015)

The nature of reward, and the modification of reward contingencies, in emotion-based learning
by: Bowman, Caroline H.
Published: (2004)

Scalable learning of actions from unlabeled videos
by: O'Hara, Stephen
Published: (2013)

Prior fear conditioning and reward learning interact in fear and reward networks
by: Lisa Bulganin, et al.
Published: (2014-03-01)

Learning motion primitives from demonstration
by: Mingshan Chi, et al.
Published: (2017-12-01)
