Limitations and Extensions of the WoLF-PHC Algorithm

Policy Hill Climbing (PHC) is a reinforcement learning algorithm that extends Q-learning to learn probabilistic policies for multi-agent games. WoLF-PHC extends PHC with the "win or learn fast" principle. A proof that PHC will diverge in self-play when playing Shapley's game is given...
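
The summary above can be made concrete with a minimal sketch of a WoLF-PHC learner for a repeated single-state (matrix) game. It follows the commonly published form of the algorithm and is not taken from the thesis itself. The class name `WoLFPHCAgent`, the parameter names, and the default step sizes are illustrative assumptions. Setting `delta_win` equal to `delta_lose` reduces the sketch to plain PHC.

```python
import random


class WoLFPHCAgent:
    """Hypothetical single-state WoLF-PHC learner (illustrative sketch only)."""

    def __init__(self, n_actions, alpha=0.1, gamma=0.9,
                 delta_win=0.01, delta_lose=0.04, seed=None):
        # All default values are assumptions chosen for illustration.
        if n_actions < 2:
            raise ValueError("need at least two actions")
        self.n = n_actions
        self.alpha = alpha            # Q-learning rate
        self.gamma = gamma            # discount factor
        self.delta_win = delta_win    # small policy step when "winning"
        self.delta_lose = delta_lose  # larger policy step when "losing"
        self.q = [0.0] * n_actions
        self.pi = [1.0 / n_actions] * n_actions      # current mixed policy
        self.avg_pi = [1.0 / n_actions] * n_actions  # running average policy
        self.count = 0
        self.rng = random.Random(seed)

    def act(self):
        # Sample an action from the current mixed policy.
        return self.rng.choices(range(self.n), weights=self.pi)[0]

    def update(self, action, reward):
        # Q-learning step; with a single state the next-state value is max(q).
        target = reward + self.gamma * max(self.q)
        self.q[action] = (1 - self.alpha) * self.q[action] + self.alpha * target

        # Move the average policy toward the current policy.
        self.count += 1
        for a in range(self.n):
            self.avg_pi[a] += (self.pi[a] - self.avg_pi[a]) / self.count

        # "Win or learn fast": winning means the current policy has a higher
        # expected Q-value than the average policy.
        v_pi = sum(p * q for p, q in zip(self.pi, self.q))
        v_avg = sum(p * q for p, q in zip(self.avg_pi, self.q))
        delta = self.delta_win if v_pi > v_avg else self.delta_lose

        # Hill-climb: shift probability mass from the other actions
        # to the greedy action, never taking more than an action has.
        best = max(range(self.n), key=lambda a: self.q[a])
        moved = 0.0
        for a in range(self.n):
            if a != best:
                step = min(self.pi[a], delta / (self.n - 1))
                self.pi[a] -= step
                moved += step
        self.pi[best] += moved
```

In self-play, two such agents would each call `act()`, look up their payoffs in the game matrix, and then call `update()` with their own action and reward. The payoff matrix itself is left to the reader.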

Bibliographic Details
Main Author: Cook, Philip R.
Format: Others
Published: BYU ScholarsArchive 2007
Online Access: https://scholarsarchive.byu.edu/etd/1222
https://scholarsarchive.byu.edu/cgi/viewcontent.cgi?article=2221&context=etd