
Momentum in Stochastic Gradient Descent and Deep Neural Nets

Presenter
Bao Wang, University of California, Los Angeles (UCLA), Mathematics
January 29, 2020
Abstract
Stochastic gradient-based optimization algorithms play perhaps the most important role in modern machine learning, in particular deep learning. Nesterov accelerated gradient (NAG) is a celebrated technique for accelerating gradient descent; however, the NAG technique fails in stochastic gradient descent (SGD). In this talk, I will discuss some recent progress in leveraging NAG and restart techniques to accelerate SGD. I will also discuss how to leverage momentum to design deep neural nets in a mathematically mechanistic manner. This is joint work with Tan Nguyen, Richard Baraniuk, Andrea Bertozzi, and Stan Osher.
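For context, a standard textbook formulation of the NAG iteration for minimizing a smooth convex function f with step size s (not the specific stochastic variants developed in the talk) is

x_k = y_{k-1} - s\,\nabla f(y_{k-1}), \qquad y_k = x_k + \frac{k-1}{k+2}\,(x_k - x_{k-1}),

where the momentum term \frac{k-1}{k+2}(x_k - x_{k-1}) extrapolates along the previous step. Replacing the full gradient \nabla f with a noisy stochastic estimate is the setting in which, as the abstract notes, the acceleration breaks down, which motivates the restart techniques discussed in the talk.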
Supplementary Materials