Least Squares Temporal Difference Methods: An Analysis under General Conditions
We consider approximate policy evaluation for finite state and action Markov decision processes (MDP) with the least squares temporal difference (LSTD) algorithm, LSTD($\lambda$), in an exploration-enhanced learning context, where policy costs are computed from observations of a Markov chain differe...
Main Author: | Yu, Huizhen (Contributor) |
---|---|
Other Authors: | Massachusetts Institute of Technology. Laboratory for Information and Decision Systems (Contributor) |
Format: | Article |
Language: | English |
Published: | Society for Industrial and Applied Mathematics, 2013-03-12T18:09:37Z. |
Similar Items
- Convergence Results for Some Temporal Difference Methods Based on Least Squares
  by: Yu, Huizhen, et al.
  Published: (2012)
- Gauss–Newton–Secant Method for Solving Nonlinear Least Squares Problems under Generalized Lipschitz Conditions
  by: Ioannis K. Argyros, et al.
  Published: (2021-07-01)
- Kernel Recursive Least-Squares Temporal Difference Algorithms with Sparsification and Regularization
  by: Chunyuan Zhang, et al.
  Published: (2016-01-01)
- Inequalities and equalities associated with ordinary least squares and generalized least squares in partitioned linear models
  by: Chu, Ka Lok, 1975-
  Published: (2004)
- Analysis of MIMO Receiver Using Generalized Least Squares Method in Colored Environments
  by: Mohamed Lassaad Ammari, et al.
  Published: (2014-01-01)