Calibration and Validation of Approximate Likelihood Models
Presenter
May 12, 2021
Abstract
Many areas of the physical, engineering, and biological sciences make extensive use of computer simulators to model complex systems. While these simulators may be able to generate realistic synthetic data, they are often poorly suited for the inverse problem of inferring the underlying scientific mechanisms associated with observed real-world phenomena. Hence, a recent trend in the sciences has been to fit approximate models to high-fidelity simulators and then use these approximate models for scientific inference. Inevitably, any downstream analysis will depend on the trustworthiness of the approximate model, the data collected, and the design of the simulations.

In the first part of my talk, I will discuss the problem of validating a forward model. Most validation techniques compare histograms of a few summary statistics from a forward model with those of observed data or, equivalently, with output from a high-resolution but costly model. Here we propose new methods that can provide insight into how two distributions of high-dimensional data (e.g., images or sequences of images) may differ, and whether such differences are statistically significant.

In the second part of my talk, I will discuss the inverse problem of inferring parameters of interest when the likelihood function (that is, the function relating internal parameters to observed data) cannot be evaluated but is implicitly encoded by a forward model. I will describe new machinery that bridges classical statistics with modern machine learning to provide scalable tools and diagnostics for constructing frequentist confidence sets with finite-sample validity in this setting. (Part of this work is joint with Niccolo Dalmasso, Rafael Izbicki, Ilmun Kim, and David Zhao.)
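For concreteness, the sketch below shows, in Python, one standard way to test whether two high-dimensional samples differ: a classifier two-sample test with a permutation p-value. This is illustrative background rather than the speaker's proposed method, and the classifier choice, fold count, and permutation count are all assumptions.

# A minimal sketch (not the speaker's method) of a classifier-based
# two-sample test: if a classifier can distinguish simulated from
# observed samples better than chance, the two distributions differ.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def classifier_two_sample_test(x_sim, x_obs, n_permutations=200, seed=0):
    """Return classifier accuracy and a permutation p-value for H0: P_sim = P_obs."""
    rng = np.random.default_rng(seed)
    X = np.vstack([x_sim, x_obs])  # pooled samples
    y = np.concatenate([np.zeros(len(x_sim)), np.ones(len(x_obs))])

    def cv_accuracy(labels):
        clf = RandomForestClassifier(n_estimators=50, random_state=seed)
        return cross_val_score(clf, X, labels, cv=5).mean()

    observed_acc = cv_accuracy(y)
    # Under H0 the labels are exchangeable, so permuting them yields
    # the null distribution of the accuracy statistic.
    null_accs = np.array([cv_accuracy(rng.permutation(y))
                          for _ in range(n_permutations)])
    p_value = (1 + np.sum(null_accs >= observed_acc)) / (1 + n_permutations)
    return observed_acc, p_value

# Toy usage: two 50-dimensional samples with slightly shifted means.
x_sim = np.random.default_rng(1).normal(0.0, 1.0, size=(300, 50))
x_obs = np.random.default_rng(2).normal(0.2, 1.0, size=(300, 50))
acc, p = classifier_two_sample_test(x_sim, x_obs, n_permutations=50)
print(f"accuracy={acc:.3f}, permutation p-value={p:.3f}")

If the cross-validated accuracy lands in the upper tail of the permutation null, the two distributions differ significantly; inspecting the fitted classifier (for instance, its per-sample class probabilities) then hints at where in the feature space they differ.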
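Similarly, the confidence sets in the second part of the talk rest on inverting a family of hypothesis tests. The following sketch shows the classical Neyman construction with Monte Carlo critical values on a toy Gaussian simulator; the simulator, the hand-picked test statistic, and the parameter grid are all illustrative assumptions, standing in for the learned, scalable estimates described in the talk.

# A minimal sketch of a Neyman construction: invert level-alpha tests,
# estimating each test's critical value purely from simulator output.
import numpy as np

rng = np.random.default_rng(0)

def simulator(theta, n, size):
    """Forward model (illustrative): `size` datasets of n draws from N(theta, 1)."""
    return rng.normal(theta, 1.0, size=(size, n))

def statistic(data, theta):
    """Test statistic for H0: parameter = theta (larger = more evidence against)."""
    return np.abs(data.mean(axis=-1) - theta)

def confidence_set(x_obs, theta_grid, alpha=0.05, n_null=1000):
    """Collect every theta whose level-alpha test accepts the observed data."""
    accepted = []
    for theta in theta_grid:
        null_stats = statistic(simulator(theta, x_obs.size, n_null), theta)
        critical = np.quantile(null_stats, 1 - alpha)  # Monte Carlo critical value
        if statistic(x_obs[None, :], theta)[0] <= critical:
            accepted.append(theta)
    return np.array(accepted)

# Toy usage: observed data generated at the "true" theta = 1.5.
x_obs = rng.normal(1.5, 1.0, size=20)
cs = confidence_set(x_obs, theta_grid=np.linspace(0, 3, 121))
print(f"95% confidence set: [{cs.min():.2f}, {cs.max():.2f}]")

Because each critical value is a valid cutoff for its own null hypothesis, the set of accepted parameters covers the true parameter with probability at least 1 - alpha in finite samples, up to Monte Carlo error; the likelihood is never evaluated, only sampled from.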