Presentation | 2022-09-09 Presentation Slide Assessment System using Visual and Semantic Segmentation Features Shengzhou Yi, Junichiro Matsugami, Toshihiko Yamasaki |
---|---|
PDF Download Page | PDF download Page Link |
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | In this paper, we present a new presentation slide assessment system that can consider structural features of the slides more easily. Our previous work used a neural network to distinguish novice from well-designed presentation slides based on visual and structural features. However, the structural feature extraction was based on the bounding box information of a PPTX file. Therefore, it is unavailable to users who are unwilling to upload editable PPTX files and to those who use other applications such as Google Slides and Keynote. To solve this problem, we extract the semantic segmentation of presentation slides from the slide images as a new format of structural features, replacing the previous structural features extracted from XML files (i.e., PPTX files). The proposed multi-modal Transformer extracts the visual and structural features from the original images and the semantic segmentation results, respectively, to assess the slide design. The prediction targets are the top-10 checkpoints pointed out by professional consultants. Class-imbalanced learning methods are used to address the imbalanced label distribution, and multi-task learning is also applied to improve the accuracy of the proposed model. With the optimal settings of the machine learning methods for each checkpoint, the proposed model, requiring only slide images, achieved an average accuracy of 81.67%, which is comparable to that of the previous work requiring slide images and XML files. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | Presentation Slide / Feature Learning / Class Imbalance / Multi-Task Learning |
Paper # | MVE2022-13 |
Date of Issue | 2022-09-01 (MVE) |
Conference Information | |
Committee | MVE |
---|---|
Conference Date | 2022/9/8 (2 days) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | Kiyoshi Kiyokawa(NAIST) |
Vice Chair | Sumaru Niida(KDDI Research) |
Secretary | Sumaru Niida(NAIST) |
Assistant | Hidehiko Shishido(Univ. of Tsukuba) / Atsushi Nakazawa(Kyoto Univ.) / Naoya Tojo(KDDI Research) / Naoki Hagiyama(NTT) |
Paper Information | |
Registration To | Technical Committee on Media Experience and Virtual Environment |
---|---|
Language | ENG |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Presentation Slide Assessment System using Visual and Semantic Segmentation Features |
Sub Title (in English) | |
Keyword(1) | Presentation Slide |
Keyword(2) | Feature Learning |
Keyword(3) | Class Imbalance |
Keyword(4) | Multi-Task Learning |
1st Author's Name | Shengzhou Yi |
1st Author's Affiliation | The University of Tokyo(UTokyo) |
2nd Author's Name | Junichiro Matsugami |
2nd Author's Affiliation | Rubato Co., Ltd.(Rubato) |
3rd Author's Name | Toshihiko Yamasaki |
3rd Author's Affiliation | The University of Tokyo(UTokyo) |
Date | 2022-09-09 |
Paper # | MVE2022-13 |
Volume (vol) | vol.122 |
Number (no) | MVE-175 |
Page | pp.16-21 (MVE) |
#Pages | 6 |
Date of Issue | 2022-09-01 (MVE) |