Enhanced Off-Policy Reinforcement Learning With Focused Experience Replay
Using the experience tuples collected in the replay buffer (RB) is the primary way off-policy reinforcement learning (RL) algorithms exploit experience; the sampling scheme for those tuples can therefore be critical for experience utilization. In this p...
Main Authors: | , , |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2021-01-01 |
Series: | IEEE Access |
Subjects: | |
Online Access: | https://ieeexplore.ieee.org/document/9444458/ |