Limitations and Extensions of the WoLF-PHC Algorithm
Policy Hill Climbing (PHC) is a reinforcement learning algorithm that extends Q-learning to learn probabilistic policies for multi-agent games. WoLF-PHC extends PHC with the "win or learn fast" principle. A proof that PHC will diverge in self-play when playing Shapley's game is given...
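To make the summary above concrete, here is a minimal sketch of one WoLF-PHC learner for a single-state repeated matrix game. It follows the commonly cited formulation of the algorithm: a Q-learning update, a running average policy, and a policy step that is small when "winning" and large when "losing". It is not necessarily the exact variant analysed in this thesis. The class name `WoLFPHC`, the parameter names, and the default values for `alpha`, `gamma`, `delta_win`, and `delta_lose` are all illustrative assumptions.

```python
import random


class WoLFPHC:
    """Illustrative single-state WoLF-PHC learner (assumes n_actions >= 2)."""

    def __init__(self, n_actions, alpha=0.1, gamma=0.0,
                 delta_win=0.01, delta_lose=0.04, seed=0):
        # All default values here are assumptions chosen for illustration.
        self.n = n_actions
        self.alpha = alpha            # Q-learning rate
        self.gamma = gamma            # discount factor
        self.delta_win = delta_win    # small policy step used when winning
        self.delta_lose = delta_lose  # larger policy step used when losing
        self.q = [0.0] * n_actions
        self.pi = [1.0 / n_actions] * n_actions      # current mixed policy
        self.avg_pi = [1.0 / n_actions] * n_actions  # running average policy
        self.count = 0
        self.rng = random.Random(seed)

    def act(self):
        """Sample an action from the current mixed policy."""
        return self.rng.choices(range(self.n), weights=self.pi)[0]

    def update(self, action, reward):
        # 1. Q-learning update for the action that was played.
        target = reward + self.gamma * max(self.q)
        self.q[action] = (1 - self.alpha) * self.q[action] + self.alpha * target

        # 2. Incrementally update the average policy.
        self.count += 1
        for a in range(self.n):
            self.avg_pi[a] += (self.pi[a] - self.avg_pi[a]) / self.count

        # 3. "Win or learn fast": winning means the current policy has a
        #    higher expected value than the average policy under current Q.
        v_cur = sum(p * q for p, q in zip(self.pi, self.q))
        v_avg = sum(p * q for p, q in zip(self.avg_pi, self.q))
        delta = self.delta_win if v_cur > v_avg else self.delta_lose

        # 4. Hill-climbing step: shift probability mass toward the greedy
        #    action without letting any probability drop below zero.
        best = max(range(self.n), key=lambda a: self.q[a])
        moved = 0.0
        for a in range(self.n):
            if a != best:
                step = min(self.pi[a], delta / (self.n - 1))
                self.pi[a] -= step
                moved += step
        self.pi[best] += moved
```

Plain PHC corresponds to using a single step size, which in this sketch means setting `delta_win` equal to `delta_lose`. For self-play experiments, two such learners would each call `act()` and then `update()` with their own payoff from the joint action.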
Main Author: | |
---|---|
Format: | Others |
Published: | BYU ScholarsArchive 2007 |
Subjects: | |
Online Access: | https://scholarsarchive.byu.edu/etd/1222 https://scholarsarchive.byu.edu/cgi/viewcontent.cgi?article=2221&context=etd |