Presentation 2021-03-03
[Memorial Lecture] Scheduling Sparse Matrix-Vector Multiplication onto Parallel Communication Architecture
Mingfei Yu, Ruitao Gao, Masahiro Fujita
Abstract(in Japanese) (See Japanese page)
Abstract(in English) There is a clear trend toward using hardware such as many-core CPUs, GPUs and FPGAs to carry out the computationally intensive tasks of deep learning implementations, a large proportion of which can be formulated as sparse matrix-vector multiplication (SpMV). In contrast to dense matrix-vector multiplication (DMV), scheduling solutions for SpMV on parallel processors are irregular, so the scheduling problem becomes time-consuming or even infeasible, especially as the size of the involved matrix increases. In this paper, the minimum scheduling problem of 4×4 SpMV on a ring-connected architecture is studied first, and two concepts named multi-input vector and multi-output vector are introduced. We then classify 4×4 sparse matrices, since the parallel schedule of a matrix can be obtained from that of any matrix into which it can be transformed, through simple mutual transformation rather than a time-consuming search. Based on this theory, we put forward a decomposition-based algorithm for larger matrices. With the proposed algorithm, the search space for the minimum schedule is considerably reduced, as the search is guided by known sub-scheduling solutions. Through comparison with an exhaustive-search method and a brute-force-based parallel scheduling method, the proposed algorithm is shown to offer high-quality scheduling solutions: on average it utilizes 65.27% of the sparseness of the involved matrices and achieves 91.39% of the performance of the solutions generated by exhaustive search, with a remarkable saving in compilation time (250 times less) and the best scalability among the above-mentioned approaches.
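As context for the abstract, the following is a minimal, illustrative sketch of SpMV in compressed sparse row (CSR) storage. It is not the paper's scheduling algorithm; the function name, the CSR format and the 4×4 example matrix are assumptions chosen for illustration. It shows that only the nonzero entries generate multiply-accumulate work, which is the sparseness a parallel schedule can exploit, and that the per-row work is irregular, which is why scheduling SpMV is harder than scheduling DMV.

# Illustrative CSR-based SpMV sketch (hypothetical example, not the
# paper's method): only nonzero entries generate multiply-accumulate
# operations, and the work per row is irregular.

def spmv_csr(indptr, indices, data, x):
    """Compute y = A @ x for a sparse matrix A stored in CSR form."""
    n_rows = len(indptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):                        # rows are independent, so
        for k in range(indptr[i], indptr[i + 1]):  # they could be scheduled
            y[i] += data[k] * x[indices[k]]        # onto parallel elements
    return y

# Hypothetical 4x4 matrix with 6 nonzeros out of 16 entries:
# A = [[5, 0, 0, 1],
#      [0, 2, 0, 0],
#      [0, 0, 3, 0],
#      [4, 0, 0, 6]]
indptr  = [0, 2, 3, 4, 6]
indices = [0, 3, 1, 2, 0, 3]
data    = [5.0, 1.0, 2.0, 3.0, 4.0, 6.0]
print(spmv_csr(indptr, indices, data, [1.0, 1.0, 1.0, 1.0]))  # [6.0, 2.0, 3.0, 10.0]

In this example only 6 of the 16 entries produce work; exploiting such a reduction while respecting the constraints of the communication structure (here, a ring-connected architecture) is the scheduling problem the paper addresses.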
Keyword(in Japanese) (See Japanese page)
Keyword(in English) sparse matrix-vector multiplication / parallel computing / communication structure / convolutional neural network
Paper # VLD2020-71, HWS2020-46
Date of Issue 2021-02-24 (VLD, HWS)

Conference Information
Committee HWS / VLD
Conference Date 2021/3/3 (2 days)
Place (in Japanese) (See Japanese page)
Place (in English) Online
Topics (in Japanese) (See Japanese page)
Topics (in English) Design Technology for System-on-Silicon, Hardware Security, etc.
Chair Makoto Ikeda(Univ. of Tokyo) / Daisuke Fukuda(Fujitsu Labs.)
Vice Chair Yasuhisa Shimazaki(Renesas Electronics) / Makoto Nagata(Kobe Univ.) / Kazutoshi Kobayashi(Kyoto Inst. of Tech.)
Secretary Yasuhisa Shimazaki(Kyushu Univ.) / Makoto Nagata(NTT) / Kazutoshi Kobayashi(Hitachi)
Assistant / Takuma Nishimoto(Hitachi)

Paper Information
Registration To Technical Committee on Hardware Security / Technical Committee on VLSI Design Technologies
Language ENG
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) [Memorial Lecture] Scheduling Sparse Matrix-Vector Multiplication onto Parallel Communication Architecture
Sub Title (in English)
Keyword(1) sparse matrix-vector multiplication
Keyword(2) parallel computing
Keyword(3) communication structure
Keyword(4) convolutional neural network
1st Author's Name Mingfei Yu
1st Author's Affiliation The University of Tokyo(Univ. Tokyo)
2nd Author's Name Ruitao Gao
2nd Author's Affiliation The University of Tokyo(Univ. Tokyo)
3rd Author's Name Masahiro Fujita
3rd Author's Affiliation The University of Tokyo(Univ. Tokyo)
Date 2021-03-03
Paper # VLD2020-71, HWS2020-46
Volume (vol) vol.120
Number (no) no.400 (VLD), no.401 (HWS)
Page pp.24-29 (VLD), pp.24-29 (HWS)
#Pages 6
Date of Issue 2021-02-24 (VLD, HWS)