MathInstitutes.org

On the convergence of gradient descent for wide two-layer neural networks (Part 1)

Presenter

Francis Bach

October 4, 2021

IMSI

Event: Introduction to Distributed Solutions

Abstract

Many supervised learning methods are naturally cast as optimization problems. For prediction models that are linear in their parameters, this often leads to convex problems for which many guarantees exist. Models that are non-linear in their parameters, such as neural networks, lead to non-convex optimization problems for which guarantees are harder to obtain. In this talk, I will consider two-layer neural networks with homogeneous activation functions where the number of hidden neurons tends to infinity, and show how qualitative convergence guarantees may be derived. I will also highlight open problems related to the quantitative behavior of gradient descent for such models. (Joint work with Lénaïc Chizat)
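To make the setting concrete, here is a minimal sketch of gradient descent on a two-layer network with a positively homogeneous activation. Several choices are assumptions made for illustration and are not taken from the abstract: the ReLU activation, the 1/m output scaling with a step size scaled by m, the squared loss, the toy target y = |x|, and all function names (predict, loss, gradient_step, train). It is not the construction analyzed in the talk, only a small finite-width instance of the kind of model the abstract describes.

```python
import random


def relu(z):
    # ReLU is positively homogeneous: relu(c * z) == c * relu(z) for c >= 0.
    return z if z > 0.0 else 0.0


def predict(params, x):
    # Assumed parameterization: f(x) = (1/m) * sum_j a_j * relu(w_j * x + b_j).
    m = len(params)
    return sum(a * relu(w * x + b) for a, w, b in params) / m


def loss(params, data):
    # Squared loss averaged over the data set (an illustrative choice).
    return sum((predict(params, x) - y) ** 2 for x, y in data) / (2 * len(data))


def gradient_step(params, data, lr):
    # One full-batch gradient descent step on all three parameters of each neuron.
    n = len(data)
    residuals = [predict(params, x) - y for x, y in data]
    updated = []
    for a, w, b in params:
        ga = gw = gb = 0.0
        for (x, _), r in zip(data, residuals):
            pre = w * x + b
            active = 1.0 if pre > 0.0 else 0.0
            ga += r * relu(pre)
            gw += r * a * active * x
            gb += r * a * active
        # The exact gradient carries a factor 1/(n * m); the step size is
        # multiplied by m (assumed scaling), so the two factors of m cancel.
        scale = lr / n
        updated.append((a - scale * ga, w - scale * gw, b - scale * gb))
    return updated


def train(m=50, steps=200, lr=0.1, seed=0):
    rng = random.Random(seed)
    # Hypothetical toy regression problem: learn y = |x| on [-1, 1].
    data = [(i / 10.0, abs(i / 10.0)) for i in range(-10, 11)]
    params = [(rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(m)]
    history = [loss(params, data)]
    for _ in range(steps):
        params = gradient_step(params, data, lr)
        history.append(loss(params, data))
    return history


if __name__ == "__main__":
    history = train()
    print("initial loss:", history[0])
    print("final loss:", history[-1])
```

The objective is non-convex in (a_j, w_j, b_j) even though the prediction is linear in the output weights a_j alone, which is the distinction the abstract draws. Increasing m in train gives a rough empirical feel for the wide-network regime, though a finite run like this says nothing about the infinite-width guarantees themselves.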
