Relaxed gradient-type descent methods
Presenter
May 7, 2026
Abstract
Gradient methods and their variants are still widely used in large-scale optimization. The main reasons for their appeal are that they are simple to understand and implement and that they require relatively little memory, making them ideal for large-scale machine learning applications. One particular variation of the method that has received a great deal of attention in the past is based on `relaxing' the traditional optimal step length associated with Cauchy's steepest descent (SD) method. The relaxation avoids the well-known zigzag effect of SD, and every so often the search direction approaches an eigenvector of the underlying Hessian matrix. A number of techniques are described that speed up convergence by exploiting the properties of the Lanczos method once a search direction approximates an eigenvector closely enough. We analyze the proposed strategies and illustrate them on the global minimization of strictly convex functions.
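To fix ideas, the sketch below shows one common form of the relaxation on a strictly convex quadratic: the exact Cauchy step length is scaled by a fixed factor theta in (0, 1). This is only an illustrative assumption; the function name relaxed_steepest_descent, the parameter theta, and the constant relaxation schedule are not taken from the talk, whose relaxation and Lanczos-based acceleration strategies may differ.

import numpy as np

def relaxed_steepest_descent(A, b, x0, theta=0.9, tol=1e-8, max_iter=1000):
    """Minimize f(x) = 0.5 x^T A x - b^T x, with A symmetric positive definite,
    using a relaxed Cauchy step: alpha_k = theta * (g^T g) / (g^T A g)."""
    x = x0.copy()
    for k in range(max_iter):
        g = A @ x - b                        # gradient of the quadratic
        gg = g @ g
        if np.sqrt(gg) < tol:                # stop once the gradient is small
            break
        alpha_cauchy = gg / (g @ (A @ g))    # exact (Cauchy) step length
        x = x - theta * alpha_cauchy * g     # relaxed steepest-descent step
    return x, k

# Example on a small strictly convex quadratic (hypothetical test problem)
rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
A = M @ M.T + 5 * np.eye(5)                  # symmetric positive definite Hessian
b = rng.standard_normal(5)
x_star, iters = relaxed_steepest_descent(A, b, np.zeros(5))
print(iters, np.linalg.norm(A @ x_star - b))

With theta = 1 this reduces to classical SD and exhibits the zigzag behaviour mentioned above; with theta < 1 the iterates tend to align with eigenvector directions of A, which is the property the acceleration techniques build on.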