On multi-armed bandits: theory and applications

How would one go about choosing a near-best option from an effectively infinite set in finite time, with imperfect knowledge of the quality of the options? Such problems arise in computer science (e.g. online learning, reinforcement learning, and recommender systems) and beyond. Consider drug testin...

Bibliographic Details
Published:
Online Access: http://hdl.handle.net/2047/D20316241