Presentation 2021-03-25
Optimizing Data Transfer between CPU and GPU in Model Parallel Training with Mesh TensorFlow
Hironori Yokote, Shinobu Miwa, Hayato Yamaki, Hiroki Honda
Abstract(in Japanese) (See Japanese page)
Abstract(in English) Since deep learning requires an enormous amount of computation time, it is often executed on multiple GPUs. Mesh TensorFlow has been proposed as a language for model parallelism, one of the parallelization approaches for deep learning. In this paper, we optimize data transfer between the CPU and GPUs in model-parallel training with Mesh TensorFlow. Specifically, our optimization transfers training data directly from the CPU to each GPU in parallel, whereas the Mesh TensorFlow sample code routes it through a specific GPU. Our experimental results show that our optimization both reduces data-transfer time and improves the efficiency of GPU-memory utilization.
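To make the transfer pattern in the abstract concrete, the following is a minimal sketch in plain Python that simulates the optimized path: instead of staging the whole batch on one specific GPU and scattering it from there, each GPU receives its shard directly from host memory, with the copies issued in parallel. All names here (`transfer_direct`, the `GPU:n` keys, the thread-based "copy") are hypothetical illustrations, not Mesh TensorFlow API.

```python
# Hypothetical sketch of direct parallel CPU-to-GPU transfer.
# Per-GPU buffers are simulated with a dict; host-to-device copies
# are simulated with threads so the shards move concurrently.
from concurrent.futures import ThreadPoolExecutor

def shard_batch(batch, num_gpus):
    """Split a batch into contiguous shards, one per GPU."""
    size = (len(batch) + num_gpus - 1) // num_gpus
    return [batch[i * size:(i + 1) * size] for i in range(num_gpus)]

def transfer_direct(batch, num_gpus):
    """Copy each shard from host memory to its own GPU in parallel,
    never routing the full batch through a single device."""
    gpu_memory = {}  # stands in for per-GPU device buffers
    shards = shard_batch(batch, num_gpus)

    def copy(gpu_id, shard):
        gpu_memory[f"GPU:{gpu_id}"] = list(shard)  # simulated H2D copy

    with ThreadPoolExecutor(max_workers=num_gpus) as pool:
        for gpu_id, shard in enumerate(shards):
            pool.submit(copy, gpu_id, shard)
    return gpu_memory

mem = transfer_direct(list(range(8)), num_gpus=4)
```

Because no device ever holds the full batch, each GPU's buffer stays shard-sized, which is the source of the memory-efficiency gain the abstract reports.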
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Mesh TensorFlow / Model Parallel / GPU
Paper # CPSY2020-56,DC2020-86
Date of Issue 2021-03-18 (CPSY, DC)

Conference Information
Committee CPSY / DC / IPSJ-SLDM / IPSJ-EMB / IPSJ-ARC
Conference Date 2021/3/25(2days)
Place (in Japanese) (See Japanese page)
Place (in English) Online
Topics (in Japanese) (See Japanese page)
Topics (in English) ETNET2021
Chair Hidetsugu Irie(Univ. of Tokyo) / Hiroshi Takahashi(Ehime Univ.) / Yuichi Nakamura(NEC) / / Hiroshi Inoue(Kyushu Univ.)
Vice Chair Michihiro Koibuchi(NII) / Kota Nakajima(Fujitsu Lab.) / Tatsuhiro Tsuchiya(Osaka Univ.)
Secretary Michihiro Koibuchi(Univ. of Tokyo) / Kota Nakajima(Nagoya Inst. of Tech.) / Tatsuhiro Tsuchiya(Nihon Univ.) / (Chiba Univ.) / (Tokyo City Univ.) / (Kochi Univ. of Tech.)
Assistant Shugo Ogawa(Hitachi) / Eiji Arima(Univ. of Tokyo)

Paper Information
Registration To Technical Committee on Computer Systems / Technical Committee on Dependable Computing / Special Interest Group on System and LSI Design Methodology / Special Interest Group on Embedded Systems / Special Interest Group on System Architecture
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Optimizing Data Transfer between CPU and GPU in Model Parallel Training with Mesh TensorFlow
Sub Title (in English)
Keyword(1) Mesh TensorFlow
Keyword(2) Model Parallel
Keyword(3) GPU
1st Author's Name Hironori Yokote
1st Author's Affiliation The University of Electro-Communications(UEC)
2nd Author's Name Shinobu Miwa
2nd Author's Affiliation The University of Electro-Communications(UEC)
3rd Author's Name Hayato Yamaki
3rd Author's Affiliation The University of Electro-Communications(UEC)
4th Author's Name Hiroki Honda
4th Author's Affiliation The University of Electro-Communications(UEC)
Date 2021-03-25
Paper # CPSY2020-56,DC2020-86
Volume (vol) vol.120
Number (no) CPSY-435,DC-436
Page pp.37-42(CPSY), pp.37-42(DC)
#Pages 6
Date of Issue 2021-03-18 (CPSY, DC)