Presentation | 2020-09-10. With GPU-FPGA Heterogeneous computing, Highly Effective Communication for Distributed Deep Learning. Kenji Tanaka, Yuki Arikawa, Tsuyoshi Ito, Kazutaka Morita, Naru Nemoto, Fumiaki Miura, Kazuhiko Terada, Junji Teramoto, Takashi Sakamoto |
---|---|
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | In distributed deep learning (DL), Allreduce, the collective communication used to share training results between GPUs, is a bottleneck. We develop a network interface card (NIC) that implements an Allreduce circuit on an FPGA, along with a device driver for remote direct memory access (RDMA) between the GPU and the FPGA. Compared with a conventional RDMA system, our system conceals about 90 % of the communication overhead and improves scalability by 20 %. The end-to-end training time in distributed DL with ResNet-50 and ImageNet is reduced to 87.3 %, without any degradation in validation accuracy. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | Distributed Deep Learning / Parallel Computing / Heterogeneous Computing / FPGA |
Paper # | RECONF2020-19 |
Date of Issue | 2020-09-03 (RECONF) |
Conference Information | |
Committee | RECONF |
---|---|
Conference Date | 2020-09-10 (2 days) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | Online |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | Reconfigurable system, etc. |
Chair | Yuichiro Shibata(Nagasaki Univ.) |
Vice Chair | Kentaro Sano(RIKEN) / Yoshiki Yamaguchi(Tsukuba Univ.) |
Secretary | Kentaro Sano(e-trees.Japan) / Yoshiki Yamaguchi(NEC) |
Assistant | Hiroki Nakahara(Tokyo Inst. of Tech.) / Yukitaka Takemura(INTEL) |
Paper Information | |
Registration To | Technical Committee on Reconfigurable Systems |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | With GPU-FPGA Heterogeneous computing, Highly Effective Communication for Distributed Deep Learning |
Sub Title (in English) | |
Keyword(1) | Distributed Deep Learning |
Keyword(2) | Parallel Computing |
Keyword(3) | Heterogeneous Computing |
Keyword(4) | FPGA |
1st Author's Name | Kenji Tanaka |
1st Author's Affiliation | NTT Device Technology Laboratories(NTT) |
2nd Author's Name | Yuki Arikawa |
2nd Author's Affiliation | NTT Device Technology Laboratories(NTT) |
3rd Author's Name | Tsuyoshi Ito |
3rd Author's Affiliation | NTT Device Technology Laboratories(NTT) |
4th Author's Name | Kazutaka Morita |
4th Author's Affiliation | NTT Software Innovation Center(NTT) |
5th Author's Name | Naru Nemoto |
5th Author's Affiliation | NTT Device Technology Laboratories(NTT) |
6th Author's Name | Fumiaki Miura |
6th Author's Affiliation | NTT Software Innovation Center(NTT) |
7th Author's Name | Kazuhiko Terada |
7th Author's Affiliation | NTT Device Technology Laboratories(NTT) |
8th Author's Name | Junji Teramoto |
8th Author's Affiliation | NTT Software Innovation Center(NTT) |
9th Author's Name | Takashi Sakamoto |
9th Author's Affiliation | NTT Device Technology Laboratories(NTT) |
Date | 2020-09-10 |
Paper # | RECONF2020-19 |
Volume (vol) | vol.120 |
Number (no) | RECONF-168 |
Page | pp.1-6(RECONF) |
#Pages | 6 |
Date of Issue | 2020-09-03 (RECONF) |