A study of value iteration and policy iteration for Markov decision processes in Deterministic systems
In the context of deterministic discrete-time control systems, we examined the implementation of value iteration (VI) and policy iteration (PI) algorithms in Markov decision processes (MDPs) situated within Borel spaces. The deterministic nature of the system's transfer function plays a pivotal role, as...
| Published in: | AIMS Mathematics |
|---|---|
| Main Authors: | , |
| Format: | Article |
| Language: | English |
| Published: | AIMS Press, 2024-11-01 |
| Subjects: | |
| Online Access: | https://www.aimspress.com/article/doi/10.3934/math.20241613 |
