Presentation | 2023-12-06 An Efficient Sparse Matrix Storage Format for Sparse Matrix-Vector Multiplication and Sparse Matrix-Transpose-Vector Multiplication on GPUs Ryohei Izawa, Yasushi Inoguchi, |
---|---|
PDF Download Page | PDF download Page Link |
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | The utilization of sparse matrix storage formats is widespread across various fields, including scientific computing, machine learning, and statistics. Within these domains, there is a need to perform Sparse Matrix-Vector Multiplication (SpMV) and Sparse Matrix-Transpose-Vector Multiplication (SpMVT) iteratively within a single application. However, executing SpMV and SpMVT on GPUs using existing sparse matrix storage formats presents challenges in terms of memory usage, memory access and load balancing. In our study, we present a novel sparse matrix storage format named GCSB, designed specifically for optimizing SpMV and SpMVT operations on GPUs through the implementation of advanced memory compression techniques. Expanding upon the pre-existing CSB format compatible with CPU-based SpMV and SpMVT, we extend its functionality to the GPU environment. This adaptation enables quicker execution of SpMV and SpMVT in comparison to CSR, achieved by effectively utilizing the L1 cache and ensuring load balancing, while maintaining the theoretical memory usage equivalent to that of CSR. Through our experiments, we demonstrate that GCSB achieves comparable theoretical memory usage to CSR while outperforming CSR in terms of speed on various matrices sourced from the University of Florida Sparse Matrix Collection. GCSB achieves a speedup of up to $1.47times$ speedup on TITAN RTX and $2.75times$ on A100. Furthermore, we show that GCSB reduces the L1 cache miss counts by strategically grouping and rearranging non-zero elements. Additionally, we conduct a qualitative assessment, affirming that GCSB exhibits superior performance, particularly when non-zero elements are widely dispersed throughout the matrix and the proportion of non-zero elements within the matrix is relatively high. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | sparse matrix-vector multiplicationsparse matrix-transpose-vector multiplicationcompressed sparse blocks; CSBcompressed sparse rows; CSRparallelGPU |
Paper # | CPSY2023-37 |
Date of Issue | 2023-11-28 (CPSY) |
Conference Information | |
Committee | CPSY / IPSJ-ARC / IPSJ-HPC |
---|---|
Conference Date | 2023/12/5(2days) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | Okinawa Industry Support Center |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | Computer Systems, HPC, etc. |
Chair | Kota Nakajima(Fujitsu Lab.) / 津邑 公暁(名工大) / Takahiro Katagiri(名大) |
Vice Chair | Yasushi Inoguchi(JAIST) / Tomoaki Tsumura(Nagoya Inst. of Tech.) |
Secretary | Yasushi Inoguchi(Univ. of Tsukuba) / Tomoaki Tsumura(Hitachi) / (富士通) / (九大) |
Assistant | Ryuichi Sakamoto(Tokyo Inst. of Tech.) / Takumi Honda(Fujitsu) |
Paper Information | |
Registration To | Technical Committee on Computer Systems / Special Interest Group on System Architecture / Special Interest Group on High Performance Computing |
---|---|
Language | ENG |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | An Efficient Sparse Matrix Storage Format for Sparse Matrix-Vector Multiplication and Sparse Matrix-Transpose-Vector Multiplication on GPUs |
Sub Title (in English) | |
Keyword(1) | sparse matrix-vector multiplicationsparse matrix-transpose-vector multiplicationcompressed sparse blocks; CSBcompressed sparse rows; CSRparallelGPU |
1st Author's Name | Ryohei Izawa |
1st Author's Affiliation | Japan Advanced Institute of Science and Technology(JAIST) |
2nd Author's Name | Yasushi Inoguchi |
2nd Author's Affiliation | Japan Advanced Institute of Science and Technology(JAIST) |
Date | 2023-12-06 |
Paper # | CPSY2023-37 |
Volume (vol) | vol.123 |
Number (no) | CPSY-293 |
Page | pp.pp.58-63(CPSY), |
#Pages | 6 |
Date of Issue | 2023-11-28 (CPSY) |