GEOMETRIC AND TOPOLOGICAL APPROACHES TO REPRESENTATION LEARNING IN BIOMEDICAL DATA
December 16, 2021
High-throughput, high-dimensional data has become ubiquitous in the biomedical sciences as a result of breakthroughs in measurement technologies and data collection. While these large datasets, containing millions of observations of cells, people, or brain voxels, hold great potential for understanding the generative state space of the data, as well as the drivers of differentiation, disease, and progression, they also pose new challenges in terms of noise, missing data, measurement artifacts, and the so-called “curse of dimensionality.” In this talk, I will cover data-geometric and topological approaches to understanding the shape and structure of the data. First, we show how diffusion geometry and deep learning can be used to obtain representations of the data that enable denoising and dimensionality reduction. Next, we show how to combine diffusion geometry with topology to extract multi-granular features from the data to assist in differential and predictive analysis. On the flip side, we also create a manifold geometry from topological descriptors and show its applications to neuroscience. Finally, we will show how to learn dynamics from static snapshot data using manifold-regularized, neural ODE-based optimal transport. Together, these methods form a complete framework for exploratory and unsupervised analysis of big biomedical data.