FPGA Implementation of a Randomly Wired Convolutional Neural Network Inference Circuit and Its Performance Improvement

Ryosuke Kuramochi; Hiroki Nakahara

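Presentation	2021-09-10: FPGA Implementation of a Randomly Wired Convolutional Neural Network Inference Circuit and Its Performance Improvement. Ryosuke Kuramochi (Tokyo Tech), Hiroki Nakahara (Tokyo Tech)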
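Abstract (Japanese)	Convolutional neural networks (CNNs) are widely used in embedded systems, data centers, and elsewhere, and achieve high recognition accuracy on image tasks. High recognition accuracy and low latency are especially important when image processing of streaming video is run in a data center. This work proposes an inference circuit for the Randomly Wired Convolutional Neural Network (RWCNN), a CNN model constructed from a random graph. An RWCNN consists of many convolution layers that have no direct dependencies between them, and the proposed circuit pipelines these layers efficiently. Because processing each convolution layer requires reading the results of several other layers as input, the circuit accesses those results in parallel over multiple HBM2 channels. Pipeline efficiency is further improved by appropriately controlling the order in which the convolution layers are processed. To prevent conflicting memory accesses to HBM2, convolution layers are assigned to HBM2 channels by vertex coloring of a conflict graph. The proposed circuit was implemented on an Alveo U50 FPGA, which is equipped with HBM2, and its performance was measured. Compared with RWCNN inference on a CPU and a GPU, it achieved 12.6 times and 4.93 times higher power efficiency, respectively.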
Abstract (English)	Convolutional neural networks (CNNs) are widely used for image processing tasks in both embedded systems and data centers. In data centers, high accuracy and low latency are desired for tasks such as image processing of streaming video. We propose an FPGA-based low-latency inference architecture for randomly wired convolutional neural networks (RWCNNs), whose layer structures are based on random graph models. Because RWCNNs have many convolution layers with no direct dependencies between them, our architecture can process them efficiently in a pipeline. Each layer takes the calculation results of multiple layers as its input, so we use an FPGA with HBM2 to access the input data in parallel over multiple HBM2 channels. We schedule the order of execution of the layers to improve the pipeline efficiency. We build a conflict graph from the scheduling results, then allocate the calculation results of each layer to the HBM2 channels by coloring the graph. We implemented the proposed architecture on the Alveo U50 FPGA and obtained 12.6 and 4.93 times better power efficiency than a CPU and a GPU, respectively.
Keywords (Japanese)	Deep Learning / CNN / FPGA / RWCNN
Keywords (English)	Deep Learning / CNN / FPGA / RWCNN
Report number	RECONF2021-17
Issue date	2021-09-03 (RECONF)
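
The abstract describes assigning layer outputs to HBM2 channels by coloring a conflict graph. The report's actual construction is not given in this record, so the following is only a minimal sketch under stated assumptions: two layers are assumed to conflict when their outputs are read in the same pipeline step, a plain greedy coloring (highest degree first) stands in for whatever coloring method the authors use, and the names `build_conflict_graph`, `assign_channels`, `simultaneous_reads`, and the layer labels are hypothetical.

```python
from collections import defaultdict


def build_conflict_graph(simultaneous_reads):
    """Build an undirected conflict graph.

    Assumption: each entry of `simultaneous_reads` is a set of layers whose
    outputs are read at the same time, so any two of them conflict.
    """
    graph = defaultdict(set)
    for group in simultaneous_reads:
        for a in group:
            graph[a]  # make sure isolated layers still appear as vertices
            for b in group:
                if a != b:
                    graph[a].add(b)
    return dict(graph)


def assign_channels(graph, num_channels):
    """Greedy vertex coloring: color = memory channel index.

    Vertices are visited in order of decreasing degree; each gets the lowest
    channel not already used by a neighbor. Greedy coloring is a heuristic
    and may need more channels than an optimal coloring would.
    """
    order = sorted(graph, key=lambda v: (-len(graph[v]), v))
    channel_of = {}
    for v in order:
        used = {channel_of[u] for u in graph[v] if u in channel_of}
        free = next((c for c in range(num_channels) if c not in used), None)
        if free is None:
            raise ValueError(f"not enough channels for layer {v!r}")
        channel_of[v] = free
    return channel_of


# Hypothetical schedule: which layer outputs are read together in each step.
simultaneous_reads = [
    {"conv0", "conv1"},
    {"conv1", "conv2", "conv3"},
    {"conv0", "conv3"},
]
graph = build_conflict_graph(simultaneous_reads)
channels = assign_channels(graph, num_channels=4)
print(channels)
```

In this sketch, layers that are never read together (here `conv0` and `conv2`) are free to share a channel, while layers read in the same step always land on different channels and can be fetched in parallel.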

Committee information
Committee	RECONF
Dates	2021/9/10 (one-day meeting)
Venue (Japanese)	Online
Venue (English)	Online
Theme (Japanese)	Reconfigurable systems, general
Theme (English)	Reconfigurable system, etc.
Chair (Japanese)	佐野健太郎(理研)
Chair (English)	Kentaro Sano (RIKEN)
Vice chairs (Japanese)	山口佳樹(筑波大) / 泉知論(立命館大)
Vice chairs (English)	Yoshiki Yamaguchi (Tsukuba Univ.) / Tomonori Izumi (Ritsumeikan Univ.)
Secretaries (Japanese)	小林悠記(NEC) / 中原啓貴(東工大)
Secretaries (English)	Yuuki Kobayashi (NEC) / Hiroki Nakahara (Tokyo Inst. of Tech.)
Assistant secretaries (Japanese)	竹村幸尚(インテル) / 長名保範(琉球大学)
Assistant secretaries (English)	Yukitaka Takemura (Intel) / Yasunori Osana (Ryukyu Univ.)

Paper details
Submitted to	Technical Committee on Reconfigurable Systems
Language of text	JPN
Title (Japanese)	FPGA Implementation of a Randomly Wired Convolutional Neural Network Inference Circuit and Its Performance Improvement
Subtitle (Japanese)
Title (English)	A Low-Latency Inference of Randomly Wired Convolutional Neural Networks on an FPGA
Subtitle (English)
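Keyword 1 (Japanese/English)	Deep Learning / Deep Learning
Keyword 2 (Japanese/English)	CNN / CNN
Keyword 3 (Japanese/English)	FPGA / FPGA
Keyword 4 (Japanese/English)	RWCNN / RWCNN
First author (Japanese/English)	倉持亮佑 / Ryosuke Kuramochi
First author affiliation (Japanese/English)	東京工業大学 (abbr.: 東工大) / Tokyo Institute of Technology (abbr.: Tokyo Tech)
Second author (Japanese/English)	中原啓貴 / Hiroki Nakahara
Second author affiliation (Japanese/English)	東京工業大学 (abbr.: 東工大) / Tokyo Institute of Technology (abbr.: Tokyo Tech)
Presentation date	2021-09-10
Report number	RECONF2021-17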
Volume (vol)	vol.121
Issue (no)	RECONF-175
Page range	pp.1-6 (RECONF)
Page count	6
Issue date	2021-09-03 (RECONF)