Paper Abstract and Keywords |
Presentation |
2020-03-17 16:00
Acceleration of Deep Learning Inference by Model Cascading Shohei Enomoto, Takeharu Eda (NTT) PRMU2019-98 |
Abstract |
(in English) |
In recent years, various applications have appeared due to the development of deep learning and the spread of IoT devices.
It is desirable for these applications to complete their processing on IoT devices, but it is difficult to create DNN models that are both highly accurate and able to satisfy the resource constraints of IoT devices.
Therefore, an approach called model cascading has been studied. It achieves both high-accuracy and high-speed inference by deploying a lightweight model on the IoT device and a high-accuracy model in the cloud, and offloading inference to the high-accuracy model only when the lightweight model's prediction is not credible.
In model cascading, the confidence score, which estimates how confident the lightweight model is in its prediction, is important. Several previous studies derived the confidence score from the model's prediction probabilities, but these probabilities are known to be unreliable estimates of confidence.
In this paper, we propose a method that, when training the lightweight model, simultaneously optimizes the loss for the normal task and for model cascading, so that the confidence score can be obtained from prediction probabilities suited to model cascading.
The proposed method achieved a reduction in computational cost of up to 36% and a reduction in communication cost of 41% compared to the case of inference using only ResNet152. |
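The offloading rule described in the abstract can be illustrated with a minimal sketch. It uses the maximum softmax probability as the confidence score, which is the baseline from previous studies that the abstract calls unreliable, not the proposed training method. The names `cascade_predict`, `toy_light`, `toy_heavy`, and the threshold value 0.8 are hypothetical and chosen only for illustration.

```python
import math
from typing import Callable, List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cascade_predict(
    x: float,
    light_model: Callable[[float], Sequence[float]],
    heavy_model: Callable[[float], Sequence[float]],
    threshold: float = 0.8,  # assumed value, not taken from the paper
) -> Tuple[int, str]:
    """Return (predicted label, where the prediction was made)."""
    probs = softmax(light_model(x))
    confidence = max(probs)  # baseline confidence: max softmax probability
    if confidence >= threshold:
        # The lightweight model is trusted: finish on the device.
        return probs.index(confidence), "edge"
    # Otherwise offload the input to the high-accuracy model.
    heavy_probs = softmax(heavy_model(x))
    return heavy_probs.index(max(heavy_probs)), "cloud"


# Hypothetical stand-ins for the two models; each returns two logits.
def toy_light(x: float) -> List[float]:
    return [2.0 * x, 0.0]


def toy_heavy(x: float) -> List[float]:
    return [0.0, 1.0]


print(cascade_predict(3.0, toy_light, toy_heavy))  # confident input stays on the device
print(cascade_predict(0.1, toy_light, toy_heavy))  # uncertain input is offloaded
```

In a sketch like this, raising the threshold sends more inputs to the cloud model, which trades computational and communication cost for accuracy.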
Keyword |
(in English) |
Deep Learning / DNN / Inference / Model Cascading / IoT / Edge Device |
Reference Info. |
IEICE Tech. Rep., vol. 119, no. 481, PRMU2019-98, pp. 203-208, March 2020. |
Paper # |
PRMU2019-98 |
Date of Issue |
2020-03-09 (PRMU) |
ISSN |
Online edition: ISSN 2432-6380 |
Copyright and reproduction |
All rights are reserved and no part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher. Notwithstanding, instructors are permitted to photocopy isolated articles for noncommercial classroom use without fee. (License No.: 10GA0019/12GB0052/13GB0056/17GB0034/18GB0034) |
Conference Information |
Committee |
PRMU IPSJ-CVIM |
Conference Date |
2020-03-16 - 2020-03-17 |
Paper Information |
Registration To |
PRMU |
Conference Code |
2020-03-PRMU-CVIM |
Language |
Japanese |
Title (in English) |
Acceleration of Deep Learning Inference by Model Cascading |
Keyword(1) |
Deep Learning |
Keyword(2) |
DNN |
Keyword(3) |
Inference |
Keyword(4) |
Model Cascading |
Keyword(5) |
IoT |
Keyword(6) |
Edge Device |
1st Author's Name |
Shohei Enomoto |
1st Author's Affiliation |
NIPPON TELEGRAPH AND TELEPHONE CORPORATION (NTT) |
2nd Author's Name |
Takeharu Eda |
2nd Author's Affiliation |
NIPPON TELEGRAPH AND TELEPHONE CORPORATION (NTT) |
Speaker |
Author-1 |
Date Time |
2020-03-17 16:00:00 |
Presentation Time |
15 minutes |
Registration for |
PRMU |
Volume (vol) |
vol.119 |
Number (no) |
no.481 |
Page |
pp.203-208 |
#Pages |
6 |
|