Fast Two-Stage Computation of an Index Policy for Multi-Armed Bandits with Setup Delays
We consider the multi-armed bandit problem with switching penalties that include setup delays and costs, extending the author's earlier results for the special case with no switching delays. A priority index for projects with setup delays that characterizes, in part, optimal policies was int...
| Published in: | Mathematics |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2020-12-01 |
| Online Access: | https://www.mdpi.com/2227-7390/9/1/52 |
