Metric and manifold repair for missing data

September 11, 2020
Abstract: For many machine learning tasks, the input data lie on a low-dimensional manifold embedded in a high-dimensional space and, because of this high-dimensional embedding, most algorithms are inefficient. The typical solution is to reduce the dimension of the input data using a standard dimension reduction algorithm such as {\sc Isomap, Laplacian Eigenmaps} or {\sc LLE}. This approach, however, does not always work in practice, as these algorithms require data that are close to ideal. Unfortunately, most data sets have either missing entries or unacceptably noisy values. That is, real data are far from ideal and these algorithms cannot be applied directly. In this talk, we focus on the case of missing data. Some techniques, such as matrix completion, can be used to fill in missing entries, but these methods do not capture the non-linear structure of the manifold. Here, we present a new algorithm, {\sc MR-Missing}, that extends these previous algorithms and can be used to compute low-dimensional representations of data sets with missing entries. We demonstrate the effectiveness of our algorithm with three different experiments.