
Dan Roy - The Size of Teachers as a Measure of Data Complexity: PAC-Bayes Excess Risk Bounds and Scaling Laws

Presenter: Dan Roy (University of Toronto)
October 16, 2024
Abstract
Recorded 16 October 2024. Dan Roy of the University of Toronto presents "The Size of Teachers as a Measure of Data Complexity: PAC-Bayes Excess Risk Bounds and Scaling Laws" at IPAM's Theory and Practice of Deep Learning Workshop.

We study the generalization properties of randomly initialized neural networks, under the assumption that the network is larger than some unknown "teacher" network that achieves low risk. We extend the analysis of Buzaglo et al. (2024) to allow for student networks of arbitrary width and depth, and to the setting where no (small) teacher network perfectly interpolates the data. We obtain an oracle inequality relating the risk of Gibbs posterior sampling to that of narrow teacher networks. As a result, the sample complexity is once again bounded in terms of the size of narrow teacher networks that themselves achieve small risk. We then introduce a new notion of data complexity, based on the minimal size of a teacher network required to achieve a certain level of excess risk. By comparing the scaling laws resulting from our bounds to those observed in empirical studies, we are able to estimate the data complexity of standard benchmarks according to our measure.

Learn more online at: https://www.ipam.ucla.edu/programs/workshops/workshop-ii-theory-and-practice-of-deep-learning/?tab=overview
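As a rough sketch of the type of result described above (a generic PAC-Bayes bound in standard notation, not the exact statement from the talk): writing L for the population risk, \hat{L} for the empirical risk on n samples, \pi for a prior over networks, and \delta for the confidence level, the classical McAllester-style bound says that with probability at least 1 - \delta, for every posterior \rho,

\[
\mathbb{E}_{h \sim \rho}\, L(h) \;\le\; \mathbb{E}_{h \sim \rho}\, \hat{L}(h) \;+\; \sqrt{\frac{\mathrm{KL}(\rho \,\|\, \pi) + \log(2\sqrt{n}/\delta)}{2n}}.
\]

Heuristically, choosing \rho concentrated near a teacher network h^* makes the KL term scale with the number of parameters of h^*, giving an oracle-style bound of the shape

\[
\mathbb{E}_{h \sim \rho}\, L(h) \;\le\; L(h^*) \;+\; O\!\left(\sqrt{\frac{\mathrm{size}(h^*) + \log(1/\delta)}{n}}\right),
\]

which is the sense in which sample complexity is controlled by teacher size. The data complexity notion from the abstract can then be sketched, with L^* denoting the optimal risk, as

\[
C(\varepsilon) \;=\; \min\{\, \mathrm{size}(h) \;:\; L(h) - L^* \le \varepsilon \,\},
\]

the minimal teacher size needed to reach excess risk at most \varepsilon.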