Natural Image Statistics Enable Us to Quantitatively Model Visual Grouping and Figure-ground Cues

March 6, 2006
  • Visualization algorithms
  • 76M27
Visual grouping and figure-ground discrimination were first studied by the Gestalt school of visual perception nearly a century ago. Using cleverly constructed examples, the Gestaltists demonstrated the role of factors such as proximity, similarity, curvilinear continuity, and common fate in visual grouping, and of factors such as convexity, size, and symmetry in figure-ground discrimination. However, this left open (at least) three major problems: (1) there was no precise operationalization of these factors for general images, (2) the interaction of these cues was poorly understood, and (3) there was no justification for why these factors might be helpful to an observer interacting with the visual world. Over the last few years, we have been pursuing these problems in the following paradigm: (1) We start with a set of natural images and use human observers to mark the perceptual groups and assign figure-ground labels to the various boundary contours. (2) We construct computational models of various grouping and figure-ground factors. (3) We calibrate and optimally combine the grouping and figure-ground factors by using the principle that vision evolved to be adaptive to the statistics of objects in the natural world. In my talk I will report on two recent results in this paradigm. The first concerns the power of the figure-ground cues, specifically size, lower-region, and convexity. We compared the predictions of a model built from these cues with psychophysics and found a pleasing agreement. The second is an attempt at a unified probabilistic framework for mid-level vision using conditional random fields defined on constrained Delaunay triangulations of image edges. This talk draws on joint work with Charless Fowlkes, David Martin and Xiaofeng Ren; various papers can be found on the web site
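
Step (3) of the paradigm, calibrating and combining cues against human-labeled data, might be sketched roughly as follows. This is only an illustration under stated assumptions, not the method used in the work described: it assumes a logistic model, three signed cue differences (size, lower-region, convexity) measured across a boundary, and synthetic labels standing in for human figure-ground annotations. The function names and the cue strengths are hypothetical.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_synthetic_cues(n, seed=0):
    """Generate toy data standing in for human-labeled boundaries.

    Each sample holds three signed cue differences between the two sides
    of a boundary (size, lower-region, convexity). Label 1 means the
    first side is figure. The strengths are made up for illustration.
    """
    rng = random.Random(seed)
    strengths = (0.8, 0.5, 0.6)  # hypothetical, not measured values
    features, labels = [], []
    for _ in range(n):
        y = rng.randint(0, 1)
        sign = 1.0 if y == 1 else -1.0
        features.append([sign * s + rng.gauss(0.0, 1.0) for s in strengths])
        labels.append(y)
    return features, labels


def fit_logistic(features, labels, lr=0.5, epochs=500):
    """Fit cue weights by batch gradient descent on the logistic loss."""
    n = len(features)
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = p - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict_figure_probability(weights, bias, x):
    """Probability that the first side of the boundary is figure."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


if __name__ == "__main__":
    features, labels = make_synthetic_cues(400, seed=0)
    weights, bias = fit_logistic(features, labels)
    print("cue weights (size, lower-region, convexity):", weights)
    print("P(figure) for a sample:",
          predict_figure_probability(weights, bias, features[0]))
```

In a sketch like this, the fitted weights play the role of the calibrated cue strengths: a larger weight means that cue is more informative about the figure side in the labeled data.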