Videos

Optimization, Learning and the Universality of Mirror Descent

March 29, 2012
Keywords:
  • Statistical Learning Theory, Stochastic Optimization, Online Learning
Abstract
I will discuss deep connections between Statistical Learning, Online Learning and Optimization. I will show that there is a tight correspondence between the sample size required for learning and the number of local oracle accesses required for optimization, and the same measures of "complexity" (e.g. the fat-shattering dimension or Rademacher complexity) control both of them. Furthermore, I will show how the Mirror Descent method, and in particular its stochastic/online variant, is in a strong sense "universal" for online learning, statistical learning and optimization. That is, for a general class of convex learning/optimization problems, Mirror Descent can always achieve a (nearly) optimal guarantee. In the context of statistical learning, this also implies that for a broad generic class of convex problems, learning can be done optimally (in the worst-case agnostic-PAC sense) with a single pass over the data.