Streaming (Hyper)Graph Partitioning Algorithms
Presenter
February 20, 2025
Abstract
Processing large-scale (hyper)graphs with billions of entities and trillions of edges is crucial in fields such as bioinformatics, high-performance computing, as well as navigation and planning problems. Efficient (hyper)graph partitioning, which divides a (hyper)graph into sub-graphs while minimizing inter-block edges, is vital for optimizing parallel computing and enhancing data locality. Traditional in-memory partitioners like METIS and KaHIP deliver high-quality partitions but are often impractical for massive graphs due to their substantial memory requirements. Streaming partitioners address this challenge by reducing memory usage to O(n), where n is the number of nodes, by loading vertices sequentially and assigning them to blocks on-the-fly.
This talk first presents an overview of our recent work on scalable (hyper)graph algorithms, and then focuses on our recent advances in streaming (hyper)graph partitioning. We introduce higher quality streaming partitioning algorithms and present lightning-fast algorithms that reduce computational complexity from O(nk+m) to O(n+m). Additionally, we propose memory-efficient schemes that eliminate the need to store all block IDs for all vertices plainly in an array. Ultimately, we present streaming algorithms capable of partitioning trillion-edge graphs on edge devices.