M-Learning: Heuristic Approach for Delayed Rewards in Reinforcement Learning

The current design of reinforcement learning methods requires extensive computational resources. Algorithms such as Deep Q-Network (DQN) have obtained outstanding results in advancing the field. However, the need to tune thousands of parameters and run millions of training episodes remains a signifi...


Bibliographic Details
Published in:	Mathematics
Main Authors:	Cesar Andrey Perdomo Charry, Marlon Sneider Mora Cortes, Oscar J. Perdomo
Format:	Article
Language:	English
Published:	MDPI AG 2025-06-01
Subjects:	reinforcement learning; exploration–exploitation dilemma; Q-learning; frozen lake; heuristic approach
Online Access:	https://www.mdpi.com/2227-7390/13/13/2108
