Presentation 2022-02-22
Contrastive Self-Supervised Learning Framework for Unsupervised Video Summarization
Xianliang Zhang, Li Tao, Xueting Wang, Toshihiko Yamasaki
Abstract(in Japanese) (See Japanese page)
Abstract(in English) The rapid growth of video data increases the effort viewers must spend to find informative content. This paper presents a framework based on contrastive learning for unsupervised video summarization to help people extract the important parts of those videos. In contrastive learning, anchor-positive and anchor-negative pairs are usually employed to learn a deep representation of the anchor. In our study, we introduce a positive sample obtained by reversing the anchor video, whose summary should likewise be the reverse of the anchor's summary. Meanwhile, we generate an intra-negative video by destroying the temporal relations in the anchor video; its summary should differ substantially from the anchor's. Finally, we design our framework to exploit the similarities and differences between these samples and the anchor through two proposed summary losses. Experimental evaluations on two benchmark datasets show that our proposed framework surpasses the state-of-the-art unsupervised methods in terms of F-score and correlation coefficients. Without using any annotation, our method can even outperform many supervised methods. We also show that our framework can further enhance summarization performance by training on large-scale external data collected from social networks. Quantitative experiments also show that our method can be integrated into other models, yielding better performance and quicker convergence, which indicates the generality of the algorithm.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) contrastive learning / video summarization / large-scale external data / quicker convergence
Paper # ITS2021-44,IE2021-53
Date of Issue 2022-02-14 (ITS, IE)
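
The sample construction described in the English abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the form of the two summary losses, so the mean-squared similarity term, the hinge-style difference term, the `margin` value, and the function names below are all assumptions made for illustration. Frames are represented as plain numbers and `score_fn` stands in for any hypothetical model that returns one importance score per frame.

```python
import random


def make_positive(frames):
    """Reverse the anchor's frame order (the positive sample in the abstract)."""
    return frames[::-1]


def make_intra_negative(frames, seed=0):
    """Shuffle frames to destroy temporal relations (the intra-negative sample)."""
    rng = random.Random(seed)
    shuffled = list(frames)
    rng.shuffle(shuffled)
    return shuffled


def mean_squared_diff(a, b):
    """Mean squared difference between two equal-length score lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def summary_losses(score_fn, frames, margin=0.1, seed=0):
    """Return (similarity_loss, difference_loss) for one anchor video.

    Both loss forms are assumptions, not taken from the paper.
    """
    anchor = score_fn(frames)
    positive = score_fn(make_positive(frames))
    negative = score_fn(make_intra_negative(frames, seed))
    # Assumed similarity term: the positive's scores, reversed back,
    # should match the anchor's scores.
    similarity_loss = mean_squared_diff(anchor, positive[::-1])
    # Assumed difference term: penalize the negative only when its scores
    # are closer to the anchor's than the margin.
    difference_loss = max(0.0, margin - mean_squared_diff(anchor, negative))
    return similarity_loss, difference_loss


if __name__ == "__main__":
    toy_frames = [1.0, 2.0, 3.0, 4.0, 5.0]

    def toy_scorer(fs):
        # Hypothetical per-frame scorer, used only to exercise the code.
        return [f / 10 for f in fs]

    print(summary_losses(toy_scorer, toy_frames))
```

With a scorer that looks at each frame independently, as in the toy example, the similarity term is zero by construction; a temporal model would generally give a non-zero value that training would then reduce.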

Conference Information
Committee IE / ITS / ITE-AIT / ITE-ME / ITE-MMS
Conference Date 2022/2/21(2days)
Place (in Japanese) (See Japanese page)
Place (in English) Online
Topics (in Japanese) (See Japanese page)
Topics (in English) Image Processing, etc.
Chair Kazuya Kodama(NII) / Masahiro Fujii(Utsunomiya Univ.) / Hisaki Nate(Tokyo Polytechnic Univ.) / Hiroyuki Arai(Nippon Inst. of Tech.) / Kenji Machida(NHK)
Vice Chair Hiroyuki Bandoh(NTT) / Toshihiko Yamazaki(Univ. of Tokyo) / Kohei Ohno(Meiji Univ.) / Naohisa Hashimoto(AIST) / / Shogo Muramatsu(Niigata Univ.)
Secretary Hiroyuki Bandoh(KDDI Research) / Toshihiko Yamazaki(Nagoya Inst. of Tech.) / Kohei Ohno(Akita Prefectural Univ.) / Naohisa Hashimoto(NIT, Tsuruoka College) / / Shogo Muramatsu(NHK) / (Hokkaido Univ.)
Assistant Shunsuke Iwamura(NHK) / Shinobu Kudo(NTT) / Masataka Imao(Mitsubishi Electric) / Kenshi Saho(Toyama Prefectural Univ.) / Keiji Jimi(Gunma Univ.)

Paper Information
Registration To Technical Committee on Image Engineering / Technical Committee on Intelligent Transport Systems Technology / Technical Group on Artistic Image Technology / Technical Group on Media Engineering / Technical Group on Multi-media Storage
Language ENG
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Contrastive Self-Supervised Learning Framework for Unsupervised Video Summarization
Sub Title (in English)
Keyword(1) contrastive learning / video summarization / large-scale external data / quicker convergence
1st Author's Name Xianliang Zhang
1st Author's Affiliation The University of Tokyo(UTokyo)
2nd Author's Name Li Tao
2nd Author's Affiliation The University of Tokyo(UTokyo)
3rd Author's Name Xueting Wang
3rd Author's Affiliation CyberAgent AI Lab(CyberAgent AI Lab)
4th Author's Name Toshihiko Yamasaki
4th Author's Affiliation The University of Tokyo(UTokyo)
Date 2022-02-22
Paper # ITS2021-44,IE2021-53
Volume (vol) vol.121
Number (no) ITS-373,IE-374
Page pp.115-120(ITS), pp.115-120(IE)
#Pages 6
Date of Issue 2022-02-14 (ITS, IE)