Sparser Johnson-Lindenstrauss Transforms
Presenter
February 16, 2012
Keywords:
- Mappings
MSC:
- 20M15
Abstract
The Johnson-Lindenstrauss (JL) lemma (1984) states that any n points
in d-dimensional Euclidean space can be embedded into k = O((log
n)/eps^2) dimensions so that all pairwise distances are preserved up
to 1+/-eps. Furthermore, this embedding can be achieved via a linear
mapping. The JL lemma is a useful tool for speeding up solutions to
several high-dimensional problems: closest pair, nearest neighbor,
diameter, minimum spanning tree, etc. It also speeds up some
clustering and string processing algorithms, reduces the amount of
storage required to store a dataset, and can be used to reduce memory
required for numerical linear algebra problems such as linear
regression and low rank approximation.
The original proofs of the JL lemma let the linear mapping be
specified by a random dense k x d matrix (e.g. i.i.d. Gaussian
entries). Thus, performing an embedding requires dense matrix-vector
multiplication. We give the first construction of linear mappings for
JL in which only a subconstant fraction of the embedding matrix is
non-zero, regardless of how eps and n are related, thus always
speeding up the embedding time. Previous constructions only achieved
sparse embedding matrices for 1/eps >> log n.
This is joint work with Daniel Kane (Stanford).