Batched Bandit Problems

Presenter
May 18, 2015
Keywords:
  • Practical arithmetic
MSC:
  • 97F90
Abstract
Motivated by practical applications, chiefly clinical trials, we study the regret achievable for stochastic multi-armed bandits under the constraint that the employed policy must split trials into a small number of batches. Our results show that a very small number of batches already yields regret bounds close to minimax optimal, and we also evaluate the number of trials in each batch.