Presentation 2023-12-06
An Efficient Sparse Matrix Storage Format for Sparse Matrix-Vector Multiplication and Sparse Matrix-Transpose-Vector Multiplication on GPUs
Ryohei Izawa, Yasushi Inoguchi,
PDF Download Page PDF download Page Link
Abstract(in Japanese) (See Japanese page)
Abstract(in English) The utilization of sparse matrix storage formats is widespread across various fields, including scientific computing, machine learning, and statistics. Within these domains, there is a need to perform Sparse Matrix-Vector Multiplication (SpMV) and Sparse Matrix-Transpose-Vector Multiplication (SpMVT) iteratively within a single application. However, executing SpMV and SpMVT on GPUs using existing sparse matrix storage formats presents challenges in terms of memory usage, memory access and load balancing. In our study, we present a novel sparse matrix storage format named GCSB, designed specifically for optimizing SpMV and SpMVT operations on GPUs through the implementation of advanced memory compression techniques. Expanding upon the pre-existing CSB format compatible with CPU-based SpMV and SpMVT, we extend its functionality to the GPU environment. This adaptation enables quicker execution of SpMV and SpMVT in comparison to CSR, achieved by effectively utilizing the L1 cache and ensuring load balancing, while maintaining the theoretical memory usage equivalent to that of CSR. Through our experiments, we demonstrate that GCSB achieves comparable theoretical memory usage to CSR while outperforming CSR in terms of speed on various matrices sourced from the University of Florida Sparse Matrix Collection. GCSB achieves a speedup of up to $1.47times$ speedup on TITAN RTX and $2.75times$ on A100. Furthermore, we show that GCSB reduces the L1 cache miss counts by strategically grouping and rearranging non-zero elements. Additionally, we conduct a qualitative assessment, affirming that GCSB exhibits superior performance, particularly when non-zero elements are widely dispersed throughout the matrix and the proportion of non-zero elements within the matrix is relatively high.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) sparse matrix-vector multiplicationsparse matrix-transpose-vector multiplicationcompressed sparse blocks; CSBcompressed sparse rows; CSRparallelGPU
Paper # CPSY2023-37
Date of Issue 2023-11-28 (CPSY)

Conference Information
Committee CPSY / IPSJ-ARC / IPSJ-HPC
Conference Date 2023/12/5(2days)
Place (in Japanese) (See Japanese page)
Place (in English) Okinawa Industry Support Center
Topics (in Japanese) (See Japanese page)
Topics (in English) Computer Systems, HPC, etc.
Chair Kota Nakajima(Fujitsu Lab.) / 津邑 公暁(名工大) / Takahiro Katagiri(名大)
Vice Chair Yasushi Inoguchi(JAIST) / Tomoaki Tsumura(Nagoya Inst. of Tech.)
Secretary Yasushi Inoguchi(Univ. of Tsukuba) / Tomoaki Tsumura(Hitachi) / (富士通) / (九大)
Assistant Ryuichi Sakamoto(Tokyo Inst. of Tech.) / Takumi Honda(Fujitsu)

Paper Information
Registration To Technical Committee on Computer Systems / Special Interest Group on System Architecture / Special Interest Group on High Performance Computing
Language ENG
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) An Efficient Sparse Matrix Storage Format for Sparse Matrix-Vector Multiplication and Sparse Matrix-Transpose-Vector Multiplication on GPUs
Sub Title (in English)
Keyword(1) sparse matrix-vector multiplicationsparse matrix-transpose-vector multiplicationcompressed sparse blocks; CSBcompressed sparse rows; CSRparallelGPU
1st Author's Name Ryohei Izawa
1st Author's Affiliation Japan Advanced Institute of Science and Technology(JAIST)
2nd Author's Name Yasushi Inoguchi
2nd Author's Affiliation Japan Advanced Institute of Science and Technology(JAIST)
Date 2023-12-06
Paper # CPSY2023-37
Volume (vol) vol.123
Number (no) CPSY-293
Page pp.pp.58-63(CPSY),
#Pages 6
Date of Issue 2023-11-28 (CPSY)