Fault-Tolerant, Distributed In-Memory Computing for Large-Scale Linear Algebra and Optimization: An Algorithm–Hardware Co-Design Approach

Presenter
May 4, 2026
Abstract
Modern high-performance digital architectures are increasingly constrained by memory bottlenecks, energy consumption, and data-movement costs. RRAM-based in-memory computing (IMC) has emerged as a promising approach to accelerating optimization and linear algebra workloads at substantially lower energy. However, IMC introduces non-idealities stemming from device characteristics that can severely degrade the accuracy of the underlying computational workloads. This talk presents hardware-software co-design strategies for scalable, fault-tolerant RRAM-based in-memory computation. We begin with MELISO, a novel, generalizable distributed-memory framework for simulating diverse IMC hardware paradigms, including RRAMs. Building on this, we develop a multi-level error-correction mechanism for matrix-vector multiplication that enables resilient computation across device types, allowing low-accuracy, low-energy RRAM devices to compete with high-accuracy alternatives without forfeiting their computational or energy gains. Using MELISO, we demonstrate up to five orders of magnitude improvement in energy consumption and two orders of magnitude reduction in latency for high-dimensional linear algebra workloads.

We then present an RRAM-based distributed Primal-Dual Hybrid Gradient (PDHG) solver targeting linear programs (LPs). We show that RRAM-based PDHG implementations can significantly reduce compute latency and energy consumption relative to GPU-based solvers while delivering competitive solution quality. We present a theoretical convergence analysis under realistic RRAM noise distributions, together with simulation results demonstrating up to two orders of magnitude latency reduction and three orders of magnitude energy savings over GPU baselines for medium-scale LPs.

Finally, we introduce a noise-informed, IMC-oriented Randomized Kaczmarz (IMC-RK) method for solving linear systems on RRAM hardware. We present preliminary results for IMC-RK in which online SNR estimates of RRAM-induced noise serve as the primary row-selection mechanism, benchmarked against offline alternatives. We conclude by surveying open problems and future directions in IMC.
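To make the noise-resilience setting concrete: the abstract does not detail the multi-level error-correction mechanism, but a minimal sketch of the underlying problem, under an assumed multiplicative-Gaussian noise model for analog RRAM reads (real device error statistics are richer), might look like the following. One simple resilience level shown here is read averaging, which shrinks the noise standard deviation by roughly 1/sqrt(reads); the talk's actual mechanism is not specified by the abstract.

```python
import random

def matvec(W, x):
    """Exact (digital) matrix-vector product, for reference."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def noisy_matvec(W, x, sigma, rng):
    # Assumed noise model: each analog multiply is perturbed by
    # multiplicative Gaussian noise of relative std `sigma`.
    return [sum(w * (1.0 + rng.gauss(0.0, sigma)) * xi
                for w, xi in zip(row, x))
            for row in W]

def averaged_matvec(W, x, sigma, reads, rng):
    # One illustrative correction level: average repeated analog reads,
    # reducing the effective noise std by ~1/sqrt(reads).
    acc = [0.0] * len(W)
    for _ in range(reads):
        y = noisy_matvec(W, x, sigma, rng)
        acc = [a + yi for a, yi in zip(acc, y)]
    return [a / reads for a in acc]
```

For example, with a 5% per-cell noise level, averaging 64 reads brings the output of a small product close to the exact digital result at the cost of extra read energy, which is the kind of accuracy/energy trade-off the error-correction mechanism navigates.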
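The abstract does not specify the PDHG variant used in the solver; as a rough noise-free illustration under standard assumptions (a standard-form LP, min c'x subject to Ax = b, x >= 0, with textbook primal-dual updates and step sizes tau, sigma chosen so that tau * sigma * ||A||^2 < 1), one iteration of plain PDHG can be sketched as:

```python
def pdhg(c, A, b, iters=2000, tau=0.3, sigma=0.3):
    """Plain PDHG for min c'x s.t. Ax = b, x >= 0 (noise-free sketch).

    Requires tau * sigma * ||A||^2 < 1 for convergence.
    """
    n, m = len(c), len(b)
    x = [0.0] * n
    y = [0.0] * m
    for _ in range(iters):
        # Primal step: gradient step on c - A'y, projected onto x >= 0.
        aty = [sum(A[i][j] * y[i] for i in range(m)) for j in range(n)]
        x_new = [max(0.0, x[j] - tau * (c[j] - aty[j])) for j in range(n)]
        # Dual step uses the extrapolated point 2*x_new - x.
        xb = [2.0 * x_new[j] - x[j] for j in range(n)]
        axb = [sum(A[i][j] * xb[j] for j in range(n)) for i in range(m)]
        y = [y[i] + sigma * (b[i] - axb[i]) for i in range(m)]
        x = x_new
    return x, y
```

On the toy LP min x1 + 2*x2 subject to x1 + x2 = 1, x >= 0, the iterates spiral into the optimum x = (1, 0). The talk's contribution concerns how such iterations behave when the matrix-vector products A'y and Ax are computed on noisy RRAM crossbars; that noise model is not reproduced here.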
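For readers unfamiliar with Randomized Kaczmarz: the classical method solves Ax = b by repeatedly projecting the iterate onto the hyperplane of a randomly sampled row. The sketch below is a hypothetical illustration of the noise-informed idea, in which per-row sampling weights (here passed in as a fixed `snr` list, standing in for the online SNR estimates the abstract describes) drive the row-selection distribution; the actual IMC-RK selection rule and its online estimation procedure are the talk's contribution and are not reproduced here.

```python
import random

def rk_solve(A, b, snr, iters=2000, seed=0):
    """Randomized Kaczmarz with SNR-weighted row sampling (illustrative).

    `snr` is a hypothetical list of per-row signal-to-noise estimates;
    rows with higher SNR are sampled more often.
    """
    rng = random.Random(seed)
    n = len(A[0])
    x = [0.0] * n
    rows = range(len(A))
    for _ in range(iters):
        # Sample a row index with probability proportional to its SNR weight.
        i = rng.choices(rows, weights=snr)[0]
        ai = A[i]
        # Project x onto the hyperplane a_i' x = b_i.
        resid = b[i] - sum(a * xi for a, xi in zip(ai, x))
        nrm2 = sum(a * a for a in ai)
        x = [xi + (resid / nrm2) * a for xi, a in zip(x, ai)]
    return x
```

For a consistent system the iterates converge linearly to the solution; biasing sampling toward high-SNR rows is one natural way to spend analog reads where they are most reliable.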