Presentation | 2021-03-03 [Memorial Lecture] Scheduling Sparse Matrix-Vector Multiplication onto Parallel Communication Architecture Mingfei Yu, Ruitao Gao, Masahiro Fujita |
---|---|
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | There is an obvious trend to make use of hardware, including many-core CPUs, GPUs and FPGAs, to conduct the computationally intensive tasks of deep learning implementations, a large proportion of which can be formulated as sparse matrix-vector multiplication (SpMV). In contrast with dense matrix-vector multiplication (DMV), scheduling solutions for SpMV targeting parallel processing turn out to be irregular, leading to the dilemma that scheduling problems are time-consuming or even infeasible, especially as the size of the involved matrix increases. In this paper, the minimum scheduling problem of 4*4 SpMV on a ring-connected architecture is first studied, with two concepts named Multi-Input Vector and Multi-Output Vector introduced. Then, we classify 4*4 sparse matrices, since a parallel schedule for matrices that can be transformed into each other can be obtained simply through mutual transformation, rather than through a time-consuming search. Based on this theory, we put forward a decomposition-based algorithm for larger matrices. With the proposed algorithm, the search space of the minimum schedule is considerably reduced, as the search is guided by known sub-scheduling solutions. Through comparison with an exhaustive search method and a brute force-based parallel scheduling method, the proposed algorithm is shown to offer scheduling solutions of high quality: on average, they utilize 65.27% of the sparseness of the involved matrices and achieve 91.39% of the performance of the solutions generated by exhaustive search, with a remarkable saving in compilation time (250 times less) and the best scalability among the above-mentioned approaches. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | sparse matrix-vector multiplication / parallel computing / communication structure / convolutional neural network |
Paper # | VLD2020-71,HWS2020-46 |
Date of Issue | 2021-02-24 (VLD, HWS) |
Conference Information | |
Committee | HWS / VLD |
---|---|
Conference Date | 2021/3/3 (2 days) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | Online |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | Design Technology for System-on-Silicon, Hardware Security, etc. |
Chair | Makoto Ikeda(Univ. of Tokyo) / Daisuke Fukuda(Fujitsu Labs.) |
Vice Chair | Yasuhisa Shimazaki(Renesas Electronics) / Makoto Nagata(Kobe Univ.) / Kazutoshi Kobayashi(Kyoto Inst. of Tech.) |
Secretary | Yasuhisa Shimazaki(Kyushu Univ.) / Makoto Nagata(NTT) / Kazutoshi Kobayashi(Hitachi) |
Assistant | Takuma Nishimoto(Hitachi) |
Paper Information | |
Registration To | Technical Committee on Hardware Security / Technical Committee on VLSI Design Technologies |
---|---|
Language | ENG |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | [Memorial Lecture] Scheduling Sparse Matrix-Vector Multiplication onto Parallel Communication Architecture |
Sub Title (in English) | |
Keyword(1) | sparse matrix-vector multiplication |
Keyword(2) | parallel computing |
Keyword(3) | communication structure |
Keyword(4) | convolutional neural network |
1st Author's Name | Mingfei Yu |
1st Author's Affiliation | The University of Tokyo(Univ. Tokyo) |
2nd Author's Name | Ruitao Gao |
2nd Author's Affiliation | The University of Tokyo(Univ. Tokyo) |
3rd Author's Name | Masahiro Fujita |
3rd Author's Affiliation | The University of Tokyo(Univ. Tokyo) |
Date | 2021-03-03 |
Paper # | VLD2020-71,HWS2020-46 |
Volume (vol) | vol.120 |
Number (no) | VLD-400,HWS-401 |
Page | pp.24-29(VLD), pp.24-29(HWS) |
#Pages | 6 |
Date of Issue | 2021-02-24 (VLD, HWS) |
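The abstract formulates deep-learning kernels as sparse matrix-vector multiplication (SpMV) and contrasts it with the dense case. As background, here is a minimal sketch of an SpMV kernel over a 4*4 matrix in the common CSR (compressed sparse row) storage format; this is only an illustration of the operation being scheduled, not the paper's ring-connected scheduling algorithm, and the function name and example matrix are our own.

```python
# Illustrative SpMV in CSR form: y = A @ x, touching only the nonzeros of A.
# The per-row irregularity of the inner loop is what makes parallel
# scheduling of SpMV harder than dense matrix-vector multiplication.

def spmv_csr(values, col_idx, row_ptr, x):
    """Multiply a CSR-stored sparse matrix by a dense vector x."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        # row i's nonzeros live in values[row_ptr[i]:row_ptr[i+1]]
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_idx[k]]
    return y

# A 4*4 sparse matrix with 5 nonzeros:
# [[5, 0, 0, 0],
#  [0, 8, 0, 2],
#  [0, 0, 3, 0],
#  [0, 6, 0, 0]]
values = [5.0, 8.0, 2.0, 3.0, 6.0]
col_idx = [0, 1, 3, 2, 1]
row_ptr = [0, 1, 3, 4, 5]
x = [1.0, 2.0, 3.0, 4.0]
print(spmv_csr(values, col_idx, row_ptr, x))  # [5.0, 24.0, 9.0, 12.0]
```

Only 5 multiply-accumulate operations are performed instead of 16, which is the "sparseness" a parallel schedule tries to exploit.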