Batched Bandit Problems
Presenter
May 18, 2015
Keywords:
- Practical arithmetic
MSC:
- 97F90
Abstract
Motivated by practical applications, chiefly clinical trials, we study the regret achievable for stochastic multi-armed bandits under the constraint that the employed policy must split trials into a small number of batches. Our results show that a very small number of batches already yields regret bounds close to minimax optimal, and we also evaluate the number of trials in each batch.