
Implicit and Explicit Regularization in Deep Neural Networks

Presenter
Babak Hassibi - California Institute of Technology
May 18, 2021
Abstract
Deep learning has demonstrably enjoyed a great deal of recent practical success and is arguably the main driver behind the resurgent interest in machine learning and AI. Despite its tremendous empirical achievements, we are far from a theoretical understanding of deep networks. In this talk, we will argue that the success of deep learning is due not only to the special deep architecture of the models, but also to the behavior of the stochastic descent methods used, which play a key role in reaching "good" solutions that generalize well to unseen data. We will connect learning algorithms such as stochastic gradient descent (SGD) and stochastic mirror descent (SMD) to work on H-infinity control from the 1990s, and thereby explain the convergence and implicit-regularization behavior of these algorithms when the models are highly over-parametrized (what is now called the "interpolating regime"). This gives us insight into why deep networks exhibit such powerful generalization abilities, a phenomenon now referred to as "the blessing of dimensionality". The theory also allows us to construct a new algorithm, regularized SMD (RSMD), which provably performs explicit regularization and has far superior generalization performance to SGD on noisy data sets.
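
To make the SGD/SMD distinction concrete, the following is a minimal sketch (not taken from the talk) of a stochastic mirror descent update with a q-norm potential; setting q = 2 recovers ordinary SGD, while other choices of q impose a different implicit bias. The toy over-parametrized regression problem, step size, and variable names are illustrative assumptions, not the presenter's experimental setup.

# Sketch of stochastic mirror descent (SMD) with potential psi(w) = ||w||_q^q / q.
# For q = 2 the mirror map is the identity and the update reduces to SGD.
import numpy as np

def smd_step(w, grad, lr, q=3.0):
    """One SMD update: grad_psi(w_new) = grad_psi(w) - lr * grad."""
    psi_grad = np.sign(w) * np.abs(w) ** (q - 1)      # mirror map grad_psi(w)
    z = psi_grad - lr * grad                          # step taken in mirror space
    # invert the mirror map: find w_new with grad_psi(w_new) = z
    return np.sign(z) * np.abs(z) ** (1.0 / (q - 1))

# Toy interpolating regime: many more parameters (100) than data points (20).
rng = np.random.default_rng(0)
X, y = rng.normal(size=(20, 100)), rng.normal(size=20)
w = rng.normal(scale=1e-3, size=100)                  # initialize near zero
for _ in range(5000):
    i = rng.integers(len(y))                          # sample one data point
    grad = (X[i] @ w - y[i]) * X[i]                   # stochastic gradient of squared loss
    w = smd_step(w, grad, lr=1e-3, q=3.0)

The point of the sketch is only that the choice of potential changes which of the many interpolating solutions the iterates drift toward, which is the implicit-regularization behavior the abstract refers to.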