|
|
|
|
LEADER |
02809 am a22002653u 4500 |
001 |
107887 |
042 |
|
|
|a dc
|
100 |
1 |
0 |
|a Shah, Julie A
|e author
|
100 |
1 |
0 |
|a Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
|e contributor
|
100 |
1 |
0 |
|a Massachusetts Institute of Technology. Department of Aeronautics and Astronautics
|e contributor
|
100 |
1 |
0 |
|a Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
|e contributor
|
100 |
1 |
0 |
|a Shah, Julie A
|e contributor
|
100 |
1 |
0 |
|a Nikolaidis, Stefanos
|e contributor
|
100 |
1 |
0 |
|a Ramakrishnan, Ramya
|e contributor
|
100 |
1 |
0 |
|a Gu, Keren
|e contributor
|
700 |
1 |
0 |
|a Nikolaidis, Stefanos
|e author
|
700 |
1 |
0 |
|a Ramakrishnan, Ramya
|e author
|
700 |
1 |
0 |
|a Gu, Keren
|e author
|
245 |
0 |
0 |
|a Efficient Model Learning from Joint-Action Demonstrations for Human-Robot Collaborative Tasks
|
260 |
|
|
|b Institute of Electrical and Electronics Engineers (IEEE),
|c 2017-04-05T20:03:20Z.
|
856 |
|
|
|z Get fulltext
|u http://hdl.handle.net/1721.1/107887
|
520 |
|
|
|a We present a framework for automatically learning human user models from joint-action demonstrations that enables a robot to compute a robust policy for a collaborative task with a human. First, the demonstrated action sequences are clustered into different human types using an unsupervised learning algorithm. A reward function is then learned for each type using an inverse reinforcement learning algorithm. The learned model is then incorporated into a mixed-observability Markov decision process (MOMDP) formulation, wherein the human type is a partially observable variable. With this framework, we can infer online the human type of a new user who was not included in the training set, and can compute a policy for the robot that is aligned with this user's preference. In a human subject experiment (n=30), participants agreed more strongly that the robot anticipated their actions when working with a robot incorporating the proposed framework (p<0.01) than when manually annotating robot actions. In trials where participants faced difficulty annotating the robot actions to complete the task, the proposed framework significantly improved team efficiency (p<0.01). The robot incorporating the framework was also found to be more responsive to human actions than policies computed using a reward function hand-coded by a domain expert (p<0.01). These results indicate that learning human user models from joint-action demonstrations and encoding them in a MOMDP formalism can support effective teaming in human-robot collaborative tasks.
|
546 |
|
|
|a en_US
|
655 |
7 |
|
|a Article
|
773 |
|
|
|t Proceedings of the Tenth Annual ACM/IEEE International Conference on Human-Robot Interaction - HRI '15
|