IEICE Technical Committee Submission System
Conference Paper's Information

Paper Abstract and Keywords
Presentation 2020-03-17 16:00
Acceleration of Deep Learning Inference by Model Cascading
Shohei Enomoto, Takeharu Eda (NTT) PRMU2019-98
Abstract (in Japanese) (See Japanese page) 
(in English) In recent years, the development of deep learning and the spread of IoT devices have given rise to a wide variety of applications.
Ideally, these applications would complete their processing on the IoT devices themselves, but it is difficult to create DNN models that are both highly accurate and within the resource constraints of IoT devices.
Model cascading has therefore been studied as an approach to achieving both high accuracy and high speed: a lightweight model is deployed on the IoT device and a high-accuracy model in the cloud, and inference is offloaded to the high-accuracy model only when the lightweight model's prediction is not credible.
In model cascading, the confidence score, which estimates how confident the lightweight model is in its prediction, is important. Several previous studies obtained the confidence score from the model's prediction probabilities, but these probabilities are known to be inaccurate estimates of confidence.
In this paper, we propose a method that, when training the lightweight model, simultaneously optimizes the loss for the original task and for model cascading, and that obtains the confidence score from prediction probabilities suited to model cascading.
The proposed method achieved a reduction in computational cost of up to 36% and a reduction in communication cost of 41% compared to the case of inference using only ResNet152.
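
The offloading rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses the maximum softmax probability as the confidence score (the baseline that the abstract attributes to previous studies, not the proposed training method), and the names `cascade_predict`, `light_model`, `heavy_model` and the threshold value 0.8 are assumptions made for illustration only.

```python
import math
from typing import Callable, List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw scores into probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cascade_predict(
    x: object,
    light_model: Callable[[object], Sequence[float]],
    heavy_model: Callable[[object], Sequence[float]],
    threshold: float = 0.8,  # hypothetical value; would be tuned in practice
) -> Tuple[int, bool]:
    """Return (predicted class, offloaded?) for one input.

    The lightweight model runs first. Its maximum softmax probability is
    used as the confidence score (an assumption of this sketch). Only when
    the score falls below the threshold is the input passed to the
    high-accuracy model.
    """
    probs = softmax(light_model(x))
    confidence = max(probs)
    if confidence >= threshold:
        return probs.index(confidence), False
    heavy_probs = softmax(heavy_model(x))
    return heavy_probs.index(max(heavy_probs)), True


# Toy stand-ins for the two models: each returns logits over three classes.
def toy_light(x: object) -> Sequence[float]:
    return [5.0, 0.0, 0.0] if x == "easy" else [0.1, 0.0, 0.05]


def toy_heavy(x: object) -> Sequence[float]:
    return [0.0, 3.0, 0.0]


if __name__ == "__main__":
    print(cascade_predict("easy", toy_light, toy_heavy))
    print(cascade_predict("hard", toy_light, toy_heavy))
```

In this sketch a higher threshold sends more inputs to the high-accuracy model, trading computational and communication cost for accuracy. The fraction of inputs that return `True` in the second position corresponds to the offloaded share of the workload.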
Keyword (in Japanese) (See Japanese page) 
(in English) Deep Learning / DNN / Inference / Model Cascading / IoT / Edge Device
Reference Info. IEICE Tech. Rep., vol. 119, no. 481, PRMU2019-98, pp. 203-208, March 2020.
Paper # PRMU2019-98 
Date of Issue 2020-03-09 (PRMU) 
ISSN Online edition: ISSN 2432-6380
Copyright and reproduction
All rights are reserved and no part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher. Notwithstanding, instructors are permitted to photocopy isolated articles for noncommercial classroom use without fee. (License No.: 10GA0019/12GB0052/13GB0056/17GB0034/18GB0034)

Conference Information
Committee PRMU IPSJ-CVIM  
Conference Date 2020-03-16 - 2020-03-17 
Place (in Japanese) (See Japanese page) 
Topics (in Japanese) (See Japanese page) 
Paper Information
Registration To PRMU 
Conference Code 2020-03-PRMU-CVIM 
Language Japanese 
Title (in Japanese) (See Japanese page) 
Sub Title (in Japanese) (See Japanese page) 
Title (in English) Acceleration of Deep Learning Inference by Model Cascading 
Keyword(1) Deep Learning  
Keyword(2) DNN  
Keyword(3) Inference  
Keyword(4) Model Cascading  
Keyword(5) IoT  
Keyword(6) Edge Device  
1st Author's Name Shohei Enomoto  
1st Author's Affiliation NIPPON TELEGRAPH AND TELEPHONE CORPORATION (NTT)
2nd Author's Name Takeharu Eda  
2nd Author's Affiliation NIPPON TELEGRAPH AND TELEPHONE CORPORATION (NTT)
Speaker Author-1 
Date Time 2020-03-17 16:00:00 
Presentation Time 15 minutes 
Volume (vol) vol.119 
Number (no) no.481 
Page pp.203-208 
#Pages 6


The Institute of Electronics, Information and Communication Engineers (IEICE), Japan