Presentation 2016-01-20
A performance evaluation of PEACH3
Takahiro Kaneda, Chiharu Tsuruta, Toshihiro Hanawa, Hideharu Amano
Abstract(in Japanese) (See Japanese page)
Abstract(in English) The recent rapid increase in GPU performance has made GPGPU (General-Purpose computing on GPUs) a mainstream approach in high performance computing. For large-scale targets, multi-GPU systems, in which nodes with a few GPUs each are connected through their host CPUs, have become popular. However, such systems suffer from large latency when communicating between GPUs attached to different hosts. To cope with this problem, the Center for Computational Sciences, University of Tsukuba, has developed the Tightly Coupled Accelerators (TCA) architecture, which connects a large number of GPUs directly through dedicated switches called PEACH2 (PCI-Express Adaptive Communication Hub 2). PEACH2 connects to a host directly over PCI Express Generation 2 x8, and forms a ring network by connecting neighboring PEACH2 switches directly, also over PCI Express. However, the bandwidth of PCIe Gen 2 is insufficient for HPC. We therefore developed a new board, PEACH3, with PCIe Gen 3 x8. In this report, we ran communication tests to evaluate the performance of PEACH3. In CPU-GPU communication within a node, it achieved about twice the bandwidth of the CUDA API, with lower latency. In GPU-GPU communication across nodes, PEACH3 achieved almost the same performance as PEACH2. Compared with driver-level CPU-CPU communication, using the API incurred little performance loss.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Heterogeneous cluster / Accelerator / GPU / TCA / PEACH3
Paper # VLD2015-96,CPSY2015-128,RECONF2015-78
Date of Issue 2016-01-12 (VLD, CPSY, RECONF)

Conference Information
Committee VLD / CPSY / RECONF / IPSJ-SLDM / IPSJ-ARC
Conference Date 2016/1/19(3days)
Place (in Japanese) (See Japanese page)
Place (in English) Hiyoshi Campus, Keio University
Topics (in Japanese) (See Japanese page)
Topics (in English) FPGA Applications, etc
Chair Yusuke Matsunaga(Kyushu Univ.) / Yasuhiko Nakashima(NAIST) / Minoru Watanabe(Shizuoka Univ.) / Masahiro Fukui(Ritsumeikan Univ.) / Masahiro Goshima(NII)
Vice Chair Takashi Takenana(NEC) / Koji Nakano(Hiroshima Univ.) / Hidetsugu Irie(Univ. of Tokyo) / Masato Motomura(Hokkaido Univ.) / Yuichiro Shibata(Nagasaki Univ.)
Secretary Takashi Takenana(Ritsumeikan Univ.) / Koji Nakano(Fujitsu Labs.) / Hidetsugu Irie(Fujitsu Labs.) / Masato Motomura(NII) / Yuichiro Shibata(Toshiba) / (Univ. of Tsukuba) / (Sharp)
Assistant Ittetsu Taniguchi(Ritsumeikan Univ.) / Shinya Takameda(NAIST) / Takeshi Ohkawa(Utsunomiya Univ.) / Kazuya Tanikagawa(Hiroshima City Univ.) / Takefumi Miyoshi(e-trees.Japan)

Paper Information
Registration To Technical Committee on VLSI Design Technologies / Technical Committee on Computer Systems / Technical Committee on Reconfigurable Systems / Special Interest Group on System and LSI Design Methodology / Special Interest Group on System Architecture
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) A performance evaluation of PEACH3
Sub Title (in English)
Keyword(1) Heterogeneous cluster
Keyword(2) Accelerator
Keyword(3) GPU
Keyword(4) TCA
Keyword(5) PEACH3
1st Author's Name Takahiro Kaneda
1st Author's Affiliation Keio University(Keio Univ)
2nd Author's Name Chiharu Tsuruta
2nd Author's Affiliation Keio University(Keio Univ)
3rd Author's Name Toshihiro Hanawa
3rd Author's Affiliation The University of Tokyo(UTokyo)
4th Author's Name Hideharu Amano
4th Author's Affiliation Keio University(Keio Univ)
Date 2016-01-20
Paper # VLD2015-96,CPSY2015-128,RECONF2015-78
Volume (vol) vol.115
Number (no) VLD-398,CPSY-399,RECONF-400
Page pp.155-160(VLD), pp.155-160(CPSY), pp.155-160(RECONF)
#Pages 6
Date of Issue 2016-01-12 (VLD, CPSY, RECONF)