Presentation | 2020-03-17 Acceleration of Deep Learning Inference by Model Cascading, Shohei Enomoto, Takeharu Eda |
---|---|
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | In recent years, various applications have emerged with the development of deep learning and the spread of IoT devices. Ideally, these applications would complete their processing on the IoT devices themselves, but it is difficult to create DNN models that are both highly accurate and satisfy the resource constraints of IoT devices. An approach called model cascading has therefore been studied, which realizes high-accuracy and high-speed inference by deploying a lightweight model on the IoT device and a high-accuracy model on the cloud, and offloading inference to the high-accuracy model only when the lightweight model's prediction is not credible. In model cascading, the confidence score, which estimates how confident the lightweight model is in its prediction, is important. Several previous studies obtained the confidence score from prediction probability values, but it is known that these probabilities are not accurate estimates of confidence. In this paper, we propose a method that, when training the lightweight model, simultaneously optimizes the loss function for the normal task and for model cascading, thereby obtaining a confidence score from prediction probabilities that is suitable for model cascading. The proposed method reduced computational cost by up to 36% and communication cost by 41% compared to inference using only ResNet152. |
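The cascading mechanism described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the models here are hypothetical stand-in callables returning logits, the threshold value is an assumption, and confidence is taken as the maximum softmax probability of the lightweight model, which is the common baseline the paper improves upon.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cascade_predict(x, light_model, heavy_model, threshold=0.8):
    """Model cascading: run the lightweight (edge) model first and
    offload to the high-accuracy (cloud) model only when the
    lightweight prediction's confidence falls below the threshold."""
    probs = softmax(light_model(x))
    confidence = max(probs)
    if confidence >= threshold:
        return probs.index(confidence), "edge"
    heavy_probs = softmax(heavy_model(x))
    return heavy_probs.index(max(heavy_probs)), "cloud"

# Hypothetical stand-in models that just return fixed logits.
confident_light = lambda x: [3.0, 0.1, 0.1]    # high max-softmax: stays on edge
uncertain_light = lambda x: [0.5, 0.4, 0.4]    # low max-softmax: offloads
heavy = lambda x: [0.0, 3.0, 0.0]

label, where = cascade_predict(None, confident_light, heavy)   # (0, "edge")
label2, where2 = cascade_predict(None, uncertain_light, heavy)  # (1, "cloud")
```

The computational and communication savings reported in the abstract come from the "edge" branch: whenever the lightweight model is sufficiently confident, the input is never sent to the cloud model at all.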
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | Deep Learning / DNN / Inference / Model Cascading / IoT / Edge Device |
Paper # | PRMU2019-98 |
Date of Issue | 2020-03-09 (PRMU) |
Conference Information | |
Committee | PRMU / IPSJ-CVIM |
---|---|
Conference Date | 2020/3/16 (2 days) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | Yoichi Sato(Univ. of Tokyo) |
Vice Chair | Toru Tamaki(Hiroshima Univ.) / Akisato Kimura(NTT) |
Secretary | Toru Tamaki(NTT) / Akisato Kimura(OMRON SINICX) |
Assistant | Yusuke Uchida(DeNA) / Takayoshi Yamashita(Chubu Univ.) |
Paper Information | |
Registration To | Technical Committee on Pattern Recognition and Media Understanding / Special Interest Group on Computer Vision and Image Media |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Acceleration of Deep Learning Inference by Model Cascading |
Sub Title (in English) | |
Keyword(1) | Deep Learning |
Keyword(2) | DNN |
Keyword(3) | Inference |
Keyword(4) | Model Cascading |
Keyword(5) | IoT |
Keyword(6) | Edge Device |
1st Author's Name | Shohei Enomoto |
1st Author's Affiliation | NIPPON TELEGRAPH AND TELEPHONE CORPORATION(NTT) |
2nd Author's Name | Takeharu Eda |
2nd Author's Affiliation | NIPPON TELEGRAPH AND TELEPHONE CORPORATION(NTT) |
Date | 2020-03-17 |
Paper # | PRMU2019-98 |
Volume (vol) | vol.119 |
Number (no) | PRMU-481 |
Page | pp.203-208 (PRMU) |
#Pages | 6 |
Date of Issue | 2020-03-09 (PRMU) |