Constrained Multimodal Data Mining using Coupled Matrix and Tensor Factorizations
Presenter
July 24, 2023
Abstract
There is an emerging need to jointly analyze heterogeneous multimodal data sets and to capture the underlying patterns in an interpretable way. For instance, joint analysis of omics measurements (e.g., of the metabolome, microbiome, and genome) holds the promise of providing a more complete picture of human health and of revealing better stratifications of people, thereby improving precision medicine and nutrition. Some of these measurements are dynamic and can be arranged as a higher-order tensor (e.g., subjects by metabolites by time), while others are static and take the form of matrices (e.g., subjects by features). Tensor factorizations have proved useful for revealing the underlying patterns in higher-order tensors and have been extended to the joint analysis of data from multiple sources through coupled matrix and tensor factorizations (CMTF). While CMTF-based methods are effective for multimodal data mining, challenges remain, in particular in capturing the underlying patterns in an interpretable way and in understanding the temporal evolution of those patterns. In this talk, we first introduce a flexible algorithmic framework that relies on Alternating Optimization (AO) and the Alternating Direction Method of Multipliers (ADMM) to facilitate the use of a variety of constraints, loss functions, and couplings with linear transformations when fitting CMTF models. Numerical experiments on simulated and real data demonstrate that the proposed AO-ADMM-based approach is accurate, flexible, and computationally efficient, with performance comparable to or better than that of available CMTF algorithms. We then discuss the extension of the framework to the joint analysis of dynamic and static data sets by incorporating alternative tensor factorization approaches, which have shown promising performance in revealing evolving patterns in temporal data.
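
As a rough illustration of the coupling idea in the abstract, the Python sketch below jointly fits a third-order tensor X (e.g., subjects by metabolites by time) and a matrix Y (e.g., subjects by features) that share a factor matrix A in the subjects mode. It uses plain, unconstrained alternating least squares, not the AO-ADMM framework presented in the talk. The function name cmtf_als, the choice of a CP model for the tensor, the squared-error loss, and all sizes in the usage lines are assumptions made only for illustration.

```python
import numpy as np


def cmtf_als(X, Y, rank, n_iter=100, seed=0):
    """Illustrative coupled factorization (hypothetical helper, not the talk's method).

    Models X[i, j, k] ~ sum_r A[i, r] * B[j, r] * C[k, r] and Y ~ A @ V.T,
    with A shared between the tensor and the matrix, by minimizing
    ||X - [[A, B, C]]||^2 + ||Y - A V^T||^2 with alternating least squares.
    """
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    M = Y.shape[1]
    A = rng.standard_normal((I, rank))
    B = rng.standard_normal((J, rank))
    C = rng.standard_normal((K, rank))
    V = rng.standard_normal((M, rank))
    losses = []
    for _ in range(n_iter):
        # Shared factor A: both the tensor term and the matrix term contribute.
        rhs = np.einsum("ijk,jr,kr->ir", X, B, C) + Y @ V
        gram = (B.T @ B) * (C.T @ C) + V.T @ V
        A = np.linalg.solve(gram, rhs.T).T
        # Tensor-only factors B and C.
        rhs = np.einsum("ijk,ir,kr->jr", X, A, C)
        gram = (A.T @ A) * (C.T @ C)
        B = np.linalg.solve(gram, rhs.T).T
        rhs = np.einsum("ijk,ir,jr->kr", X, A, B)
        gram = (A.T @ A) * (B.T @ B)
        C = np.linalg.solve(gram, rhs.T).T
        # Matrix-only factor V.
        V = np.linalg.solve(A.T @ A, A.T @ Y).T
        # Track the coupled squared-error objective after each sweep.
        X_hat = np.einsum("ir,jr,kr->ijk", A, B, C)
        losses.append(np.sum((X - X_hat) ** 2) + np.sum((Y - A @ V.T) ** 2))
    return A, B, C, V, losses


# Usage on synthetic data with a shared subjects-mode factor (sizes are arbitrary).
rng = np.random.default_rng(1)
A0, B0, C0, V0 = (rng.standard_normal((n, 2)) for n in (20, 15, 10, 8))
X = np.einsum("ir,jr,kr->ijk", A0, B0, C0)
Y = A0 @ V0.T
A, B, C, V, losses = cmtf_als(X, Y, rank=2, n_iter=50)
print(losses[0], losses[-1])
```

Because every factor update solves its least-squares subproblem exactly, the coupled objective should not increase from one sweep to the next. Constraints such as nonnegativity, other loss functions, or couplings through linear transformations are outside the scope of this sketch and are what the AO-ADMM framework in the talk is designed to handle.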