Polymorphic Vectorization
Presenter
January 14, 2026
Abstract
Vectorization on GPUs is a specific form of parallelization that, loosely speaking, executes the same code on different inputs. This generally makes it hard to use vector hardware to parallelize tasks that are polymorphic, in the sense that the required sequence of instructions differs between tasks. I will explain an approach to this problem that augments the state space of the executed program, and will sketch two applications: (i) certain MCMC algorithms, such as slice sampling or HMC-NUTS, where threads differ in the number of times an inner while loop must be executed; and (ii) mechanical design problems, where each thread must optimize a different part of a coupled mechanical system.
Joint work with Ryan P Adams, Joshua Aduol, Hugh Dance, Pierre Glaser, and Alex Guerra.
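
To make the while-loop difficulty in application (i) concrete, here is a minimal sketch in plain Python. It is an illustration only and should not be read as the method presented in the talk. It assumes a toy task (halve a number until it falls below a threshold) and uses Python lists as stand-ins for vector lanes. The names run_sequential and run_lockstep are hypothetical. The state of each lane is augmented with a "done" flag, so that every lane can execute the same instructions on every iteration even though the lanes need different numbers of iterations.

```python
def run_sequential(x, threshold=1.0):
    """Reference: an ordinary data-dependent while loop for a single task."""
    steps = 0
    while x >= threshold:
        x = x / 2.0
        steps += 1
    return x, steps


def run_lockstep(xs, threshold=1.0):
    """Lock-step version: all lanes run the same instructions each iteration.

    Lists stand in for vector lanes (an assumption made for illustration).
    The per-lane 'done' flag is the extra piece of state.
    """
    values = list(xs)
    steps = [0] * len(values)
    done = [v < threshold for v in values]

    # The shared loop runs until the slowest lane has finished.
    while not all(done):
        # Every lane computes the update unconditionally ...
        proposed = [v / 2.0 for v in values]
        # ... and a masked select keeps the old state in finished lanes.
        values = [v if d else p for v, p, d in zip(values, proposed, done)]
        steps = [s if d else s + 1 for s, d in zip(steps, done)]
        done = [d or v < threshold for v, d in zip(values, done)]
    return values, steps


inputs = [0.5, 3.0, 40.0, 1.0]
lockstep_values, lockstep_steps = run_lockstep(inputs)
reference = [run_sequential(x) for x in inputs]
```

In this sketch the lock-step loop takes as many iterations as the slowest lane, and finished lanes still perform the (discarded) update. That wasted work is the cost that makes naive masking unattractive when iteration counts vary widely between threads.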