Basis Function Adaptation Methods for Cost Approximation in MDP
We generalize a basis adaptation method for cost approximation in Markov decision processes (MDP), extending earlier work of Menache, Mannor, and Shimkin. In our context, basis functions are parametrized and their parameters are tuned by minimizing an objective function involving the cost function a...
Main Authors: | Yu, Huizhen (Author), Bertsekas, Dimitri P. (Contributor) |
---|---|
Other Authors: | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science (Contributor), Massachusetts Institute of Technology. Laboratory for Information and Decision Systems (Contributor) |
Format: | Article |
Language: | English |
Published: | Institute of Electrical and Electronics Engineers, 2010-10-13T18:33:03Z. |
Similar Items
- A Unifying Polyhedral Approximation Framework for Convex Optimization
  by: Bertsekas, Dimitri P., et al.
  Published: (2011)
- Pathologies of Temporal Difference Methods in Approximate Dynamic Programming
  by: Bertsekas, Dimitri P.
  Published: (2011)
- Approximate policy iteration: A survey and some new methods
  by: Bertsekas, Dimitri P.
  Published: (2012)
- Convergence Results for Some Temporal Difference Methods Based on Least Squares
  by: Yu, Huizhen, et al.
  Published: (2012)
- Q-learning and policy iteration algorithms for stochastic shortest path problems
  by: Yu, Huizhen, et al.
  Published: (2015)