Presentation Slide Assessment System using Visual and Semantic Segmentation Features

Shengzhou Yi; Junichiro Matsugami; Toshihiko Yamasaki

Presentation	2022-09-09 Presentation Slide Assessment System using Visual and Semantic Segmentation Features Shengzhou Yi, Junichiro Matsugami, Toshihiko Yamasaki
Abstract(in Japanese)	(See Japanese page)
Abstract(in English)	In this paper, we present a new presentation slide assessment system that can take the structural features of slides into account more easily. Our previous work used a neural network to distinguish novice slides from well-designed ones based on visual and structural features. However, the structural feature extraction relied on the bounding box information of a PPTX file. Therefore, it is unavailable to users who are unwilling to upload editable PPTX files and to those who use other applications such as Google Slides and Keynote. To solve this problem, we extract the semantic segmentation of presentation slides from the slide images as a new form of structural features, replacing the previous structural features extracted from XML files (i.e., PPTX files). The proposed multi-modal Transformer extracts the visual and structural features from the original images and the semantic segmentation results, respectively, to assess the slide design. The prediction targets are the top-10 checkpoints pointed out by professional consultants. Class-imbalanced learning methods are used to address the imbalanced label distribution, and multi-task learning is also applied to improve the accuracy of the proposed model. With the optimal settings of these machine learning methods for each checkpoint, the proposed model, which requires only slide images, achieved an average accuracy of 81.67%, which is comparable to the performance of the previous work that requires both slide images and XML files.
Keyword(in Japanese)	(See Japanese page)
Keyword(in English)	Presentation Slide / Feature Learning / Class Imbalance / Multi-Task Learning
Paper #	MVE2022-13
Date of Issue	2022-09-01 (MVE)
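
The abstract mentions class-imbalanced learning and multi-task learning over several checkpoints without naming the specific methods. As a rough illustration only, the sketch below treats each checkpoint as a binary label, weights the positive class by inverse frequency, and averages a weighted binary cross-entropy over all checkpoints. The inverse-frequency weighting, the 0.5 threshold, and every function name here are assumptions made for illustration, not the authors' implementation.

```python
import math

def positive_weights(labels):
    """Hypothetical helper: per-checkpoint positive-class weight (n_negative / n_positive)."""
    n_tasks = len(labels[0])
    weights = []
    for t in range(n_tasks):
        pos = sum(row[t] for row in labels)
        neg = len(labels) - pos
        # Fall back to 1.0 when a checkpoint has no positive example.
        weights.append(neg / pos if pos > 0 else 1.0)
    return weights

def weighted_bce(prob, label, pos_weight, eps=1e-7):
    """Binary cross-entropy with an extra weight on the positive class."""
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(pos_weight * label * math.log(prob)
             + (1 - label) * math.log(1.0 - prob))

def multitask_loss(probs, labels, pos_weights):
    """Mean weighted BCE over all samples and all checkpoints (tasks)."""
    total, count = 0.0, 0
    for p_row, y_row in zip(probs, labels):
        for p, y, w in zip(p_row, y_row, pos_weights):
            total += weighted_bce(p, y, w)
            count += 1
    return total / count

def average_accuracy(probs, labels, threshold=0.5):
    """Accuracy per checkpoint, then averaged across checkpoints."""
    n_tasks = len(labels[0])
    per_task = []
    for t in range(n_tasks):
        correct = sum(int((p[t] >= threshold) == bool(y[t]))
                      for p, y in zip(probs, labels))
        per_task.append(correct / len(labels))
    return sum(per_task) / n_tasks

# Toy data (made up): 4 slides, 2 checkpoints; rows are slides.
labels = [[1, 1], [0, 1], [0, 0], [0, 0]]
probs = [[0.9, 0.8], [0.2, 0.4], [0.1, 0.3], [0.6, 0.2]]
weights = positive_weights(labels)
loss = multitask_loss(probs, labels, weights)
acc = average_accuracy(probs, labels)
```

In this toy setup the rarer positive label of the first checkpoint receives a larger weight, so missing it costs more than missing a positive of the balanced second checkpoint. A real system would compute the same kind of loss on model outputs inside a deep learning framework rather than on hand-written probabilities.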

Conference Information
Committee	MVE
Conference Date	2022/9/8(2days)
Place (in Japanese)	(See Japanese page)
Place (in English)
Topics (in Japanese)	(See Japanese page)
Topics (in English)
Chair	Kiyoshi Kiyokawa(NAIST)
Vice Chair	Sumaru Niida(KDDI Research)
Secretary	Sumaru Niida(NAIST)
Assistant	Hidehiko Shishido(Univ. of Tsukuba) / Atsushi Nakazawa(Kyoto Univ.) / Naoya Tojo(KDDI Research) / Naoki Hagiyama(NTT)

Paper Information
Registration To	Technical Committee on Media Experience and Virtual Environment
Language	ENG
Title (in Japanese)	(See Japanese page)
Sub Title (in Japanese)	(See Japanese page)
Title (in English)	Presentation Slide Assessment System using Visual and Semantic Segmentation Features
Sub Title (in English)
Keyword(1)	Presentation Slide
Keyword(2)	Feature Learning
Keyword(3)	Class Imbalance
Keyword(4)	Multi-Task Learning
1st Author's Name	Shengzhou Yi
1st Author's Affiliation	The University of Tokyo(UTokyo)
2nd Author's Name	Junichiro Matsugami
2nd Author's Affiliation	Rubato Co., Ltd.(Rubato)
3rd Author's Name	Toshihiko Yamasaki
3rd Author's Affiliation	The University of Tokyo(UTokyo)
Date	2022-09-09
Paper #	MVE2022-13
Volume (vol)	vol.122
Number (no)	MVE-175
Page	pp.16-21(MVE)
#Pages	6
Date of Issue	2022-09-01 (MVE)