Distributional Policy Optimization for Multi-Goal Reinforcement Learning
Master's thesis === National Chiao Tung University === Department of Electrical Engineering === 107 === Reinforcement learning generally learns through interaction with an environment. During learning, the agent interacts with the environment, acquires observations from it, decides on actions, and then receives rewards. Reinforcement...
Main Authors: | , |
---|---|
Other Authors: | |
Format: | Others |
Language: | en_US |
Published: | 2018 |
Online Access: | http://ndltd.ncl.edu.tw/handle/43w23x |