Episodic Self-Imitation Learning with Hindsight

Episodic self-imitation learning, a novel self-imitation algorithm with a trajectory selection module and an adaptive loss function, is proposed to speed up reinforcement learning. Compared to the original self-imitation learning algorithm, which samples good state–action pairs from the experience r...

Bibliographic Details
Main Authors: Tianhong Dai, Hengyan Liu, Anil Anthony Bharath
Format: Article
Language: English
Published: MDPI AG 2020-10-01
Series: Electronics
Online Access: https://www.mdpi.com/2079-9292/9/10/1742