Locating Outliers in Large Matrices with Adaptive Compressive Sampling

Presenter: Jarvis Haupt
December 8, 2015
Keywords:
  • Adaptive Compressive Sampling
Abstract
Fueled by our increasingly information-based and data-driven society, there is a growing need for computational methods capable of “making sense” of volumes of data that may be very high-dimensional, corrupted, or even largely incomplete. A unifying theme in many modern investigations into problems of this form is to exploit intrinsic low-dimensional structure to facilitate inference. For example, a number of recent efforts have shown that computationally tractable “robust” variants of principal component analysis methods can provably succeed in recovering data vectors originating from some common low-dimensional subspace, even when the data are corrupted by outliers and/or have many missing elements. These tools have found utility in a number of applications in machine learning, bioinformatics, image processing, and collaborative filtering, to name a few. In this talk, we discuss some of our initial work on a related, but complementary, task: rather than estimate vectors in the presence of corruption or outliers, we consider the problem of identifying the locations of corrupted data points or outliers in a large, otherwise low-rank, matrix. Such problems may arise, for example, when identifying malicious responses in survey data, detecting anomalous patterns in network traffic, or even in visual surveillance applications where one seeks to locate anomalies in a scene. We propose a simple two-step adaptive sensing and inference approach to these problems, and provide theoretical guarantees quantifying its performance. Our results show that under somewhat mild assumptions, accurate outlier identification is achievable using very few linear summaries of the rows and columns of the original data matrix (as few as the squared rank of the low-rank component plus the number of outliers, times constant and logarithmic factors).
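As an illustration of the kind of two-step scheme described above, the following is a minimal sketch in NumPy. It is not the talk's exact algorithm: for simplicity it subsamples columns directly rather than taking compressive linear summaries of rows and columns, and the norm-based trimming and the threshold `tau` are heuristic choices of ours. Step 1 estimates the column span of the low-rank background from a few sampled columns; step 2 flags columns whose energy outside that span is large.

```python
import numpy as np

def locate_outliers(M, r, n_sample=None, tau=0.5, seed=0):
    """Illustrative sketch of a two-step outlier-localization scheme.

    Simplifications vs. the approach discussed in the talk:
    direct column subsampling stands in for compressive row/column
    summaries, and the trimming rule and threshold are heuristics.
    """
    rng = np.random.default_rng(seed)
    m, n = M.shape
    k = n_sample if n_sample is not None else min(n, 8 * r)

    # Step 1: subsample k columns; trim the largest-norm samples as a
    # crude guard against sampled outliers, then take a rank-r basis.
    cols = rng.choice(n, size=k, replace=False)
    S = M[:, cols]
    keep = np.argsort(np.linalg.norm(S, axis=0))[: max(r, k - k // 4)]
    U, _, _ = np.linalg.svd(S[:, keep], full_matrices=False)
    U = U[:, :r]  # estimated basis for the low-rank background

    # Step 2: project out the estimated background; columns with a
    # large *relative* residual are declared outliers.
    resid = M - U @ (U.T @ M)
    scores = np.linalg.norm(resid, axis=0) / (np.linalg.norm(M, axis=0) + 1e-12)
    return np.flatnonzero(scores > tau)
```

Note the sample-efficiency angle: step 1 only ever touches `k` columns of the matrix, and step 2 needs only inner products of each column against the `r` basis vectors, consistent with the abstract's point that localizing outliers can be far cheaper than recovering the full low-rank background.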
More generally, our investigation shows that the sample complexity and computational burden associated with the task of identifying outliers from a structured background can be significantly less than for methods that seek to recover or estimate the structured background itself in the presence of outliers. We demonstrate the performance of our approach experimentally on synthetic data, and in an application motivated by saliency map estimation tasks arising in computer vision and automated surveillance.

Biography
Jarvis Haupt received the B.S., M.S., and Ph.D. degrees in electrical engineering from the University of Wisconsin-Madison in 2002, 2003, and 2009, respectively. From 2009 to 2010 he was a Postdoctoral Research Associate in the Dept. of Electrical and Computer Engineering at Rice University in Houston, Texas. He is currently an Assistant Professor in the Dept. of Electrical and Computer Engineering at the University of Minnesota. Professor Haupt is the recipient of several academic awards, including the Wisconsin Academic Excellence Scholarship, the Ford Motor Company Scholarship, the Consolidated Papers and Mead Witter Foundation Tuition Scholarships, the Frank D. Cady Mathematics Scholarship, and the Claude and Dora Richardson Distinguished Fellowship. He received the DARPA Young Faculty Award in 2014. His research interests generally include high-dimensional statistical inference, adaptive sampling techniques, and statistical signal processing and learning theory, with applications in the biological sciences, communications, imaging, and networks.