Nonlinear Meta-learning Can Guarantee Faster Rates
Presenter
January 9, 2024
Abstract
Many recent theoretical works on meta-learning aim to achieve guarantees in leveraging similar representational structures from related tasks towards simplifying a target task. The main aim of theoretical guarantees on the subject is to establish the extent to which convergence rates---in learning a common representation---may scale with the number N of tasks (as well as the number of samples per task). First steps in this setting demonstrate this property when both the shared representation amongst tasks, and task-specific regression functions, are linear. This linear setting readily reveals the benefits of aggregating tasks, e.g., via averaging arguments. In practice, however, the representation is often highly nonlinear, introducing nontrivial biases in each task that cannot easily be averaged out as in the linear case. In the present work, we derive theoretical guarantees for meta-learning with nonlinear representations. In particular, assuming the shared nonlinearity maps to an infinite-dimensional RKHS, we show that additional biases can be mitigated with careful regularization that leverages the smoothness of task-specific regression functions, yielding improved rates that scale with the number of tasks as desired.