Multiscale analysis of accelerated gradient methods in machine learning
Presenter: Mohammad Farazmand
Mathematics, North Carolina State University
October 28, 2019

Abstract
Accelerated gradient descent iterations are widely used in optimization and, in particular, in machine learning. It is known that, in the continuous-time limit, these iterations converge to a second-order ordinary differential equation, which we refer to as the accelerated gradient flow. Using geometric singular perturbation theory, we show that, under certain conditions, the accelerated gradient flow possesses an attracting invariant slow manifold to which the trajectories of the flow converge asymptotically. We obtain a general explicit expression, in the form of functional series expansions, that approximates the slow manifold to any desired order of accuracy. To leading order, the accelerated gradient flow reduced to this slow manifold coincides with the usual gradient descent. We illustrate the implications of our results with three examples.
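The continuous-time picture above can be illustrated numerically. The sketch below is a minimal illustration, not the talk's construction: it assumes a singularly perturbed flow of the form eps * x'' + x' = -grad f(x) (one common way such a flow is written), and the objective f(x) = 0.5 x^T A x, the value of eps, and the initial conditions are choices made here purely for illustration. After a fast transient of duration O(eps), the trajectory relaxes toward a slow manifold on which, to leading order, the dynamics follow the gradient flow x' = -grad f(x), mirroring the leading-order statement in the abstract.

```python
# Hedged sketch: compare an assumed singularly perturbed accelerated gradient flow,
#   eps * x'' + x' = -grad f(x),
# with the plain gradient flow x' = -grad f(x) on a simple quadratic objective.
import numpy as np
from scipy.integrate import solve_ivp

A = np.diag([1.0, 10.0])           # quadratic objective f(x) = 0.5 * x^T A x (illustrative choice)
grad_f = lambda x: A @ x

eps = 0.01                         # small parameter; eps -> 0 formally recovers the gradient flow

def accelerated_flow(t, z):
    # State z = (x, v) with v = x'; the flow reads eps * v' + v = -grad f(x).
    x, v = z[:2], z[2:]
    return np.concatenate([v, (-grad_f(x) - v) / eps])

def gradient_flow(t, x):
    return -grad_f(x)

x0 = np.array([1.0, 1.0])
v0 = np.array([5.0, -5.0])         # start well off the leading-order slow manifold v = -grad f(x)
t_eval = np.linspace(0.0, 2.0, 400)

# Implicit solver (Radau) because small eps makes the velocity equation stiff.
acc = solve_ivp(accelerated_flow, (0.0, 2.0), np.concatenate([x0, v0]),
                t_eval=t_eval, method="Radau", rtol=1e-8, atol=1e-10)
grd = solve_ivp(gradient_flow, (0.0, 2.0), x0,
                t_eval=t_eval, rtol=1e-8, atol=1e-10)

# After the O(eps) transient, the positions of the accelerated flow should stay
# close to the gradient-flow trajectory, consistent with the leading-order
# slow-manifold reduction described in the abstract.
print("final accelerated-flow position:", acc.y[:2, -1])
print("final gradient-flow position:   ", grd.y[:2, -1])
print("difference at t = 2:", np.linalg.norm(acc.y[:2, -1] - grd.y[:2, -1]))
```

Printing the gap between the two trajectories at the final time gives a quick check that the accelerated flow has collapsed onto gradient-descent-like dynamics; higher-order corrections to the slow manifold, as discussed in the talk, would shrink this gap further.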