On the Connection between Deep Neural Networks and Kernel Methods
Presenter
June 28, 2023
Abstract
Recent theoretical work has shown that under certain
conditions, massively overparameterized neural networks are equivalent
to kernel regressors with a family of kernels called Neural Tangent
Kernels (NTKs). My work on this subject aims to better understand the
properties of NTK for various network architectures and relate them to
the inductive bias of real neural networks. In particular, I will
argue that for input data distributed uniformly on the sphere, NTK
favors low-frequency predictions over high-frequency ones, potentially
explaining why overparameterized networks can generalize even when
they perfectly fit their training data. I will further discuss the
behavior of NTK when data is distributed nonuniformly and show that
NTK (with ReLU activation) is closely related to the classical Laplace
kernel, which has a simple closed form. Finally, I will discuss our
analysis of NTK for convolutional networks, which indicates that these
networks are biased toward learning low-frequency target functions
with any higher frequencies concentrated in local regions. Overall,
our results suggest that much insight about neural networks can be
obtained from the analysis of NTK.
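
As a brief illustrative sketch (the normalization and the bandwidth constant below are assumptions for exposition, not results stated in the abstract): the equivalence means that an infinitely wide trained network predicts like a kernel regressor with the NTK \(\Theta\), while the Laplace kernel mentioned above has a simple closed form,

\[
f(x) \;\approx\; \sum_{i=1}^{n} \alpha_i\, \Theta(x, x_i), \qquad \alpha = \Theta(X, X)^{-1} y, \qquad K_{\mathrm{Lap}}(x, x') = \exp\!\left(-c\,\lVert x - x'\rVert\right),
\]

where \((x_i, y_i)\) are the training pairs, \(\Theta(X, X)\) is the NTK Gram matrix on the training set, and \(c > 0\) is an assumed bandwidth parameter.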