Summary
International Conference on Machine Vision Applications
2023
Session Number:P1
Number:P1-20
ViTVO: Vision Transformer based Visual Odometry with Attention Supervision
Chiu Chu-Chi, Yang Hsuan-Kung, Chen Hao-Wei, Chen Yu-Wen, Lee Chun-Yi
Publication Date:2023/07/23
Online ISSN:2188-5079
DOI:10.34385/proc.78.P1-20
Summary:
In this paper, we develop ViTVO, a Vision Transformer-based visual odometry (VO) model that uses an attention mechanism to perform visual odometry. Due to the nature of VO, Transformer-based VO models tend to concentrate excessively on a few points, which may degrade accuracy. In addition, noise from dynamic objects usually makes VO tasks more difficult. To overcome these issues, we propose an attention loss applied during training, which uses ground-truth masks or self-supervision to guide the attention maps to focus more on static regions of an image. In our experiments, we demonstrate the superior performance of ViTVO on the Sintel validation set and validate the effectiveness of our attention supervision mechanism in performing VO tasks.
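
The idea of supervising attention with a static-region mask can be illustrated with a minimal sketch. The summary does not give the exact loss, so the formulation below (negative log of the attention mass that falls on static pixels) is an assumption chosen for illustration only, and the names `attention_loss`, `attn`, and `static_mask` are hypothetical rather than taken from the paper.

```python
import numpy as np


def attention_loss(attn, static_mask, eps=1e-8):
    """Illustrative attention loss (an assumption, not the paper's definition).

    attn        : non-negative attention map, shape (H, W)
    static_mask : 1 for static pixels, 0 for dynamic pixels, shape (H, W)
    Returns the negative log of the normalized attention mass on static pixels.
    """
    attn = np.asarray(attn, dtype=np.float64)
    static_mask = np.asarray(static_mask, dtype=np.float64)
    if attn.shape != static_mask.shape:
        raise ValueError("attn and static_mask must have the same shape")

    total = attn.sum()
    if total <= 0:
        raise ValueError("attn must contain some positive attention mass")

    # Normalize so the attention map sums to 1.
    attn = attn / total

    # Fraction of attention that lands on static regions, clipped for a stable log.
    static_mass = np.clip((attn * static_mask).sum(), eps, 1.0)
    return float(-np.log(static_mass))


if __name__ == "__main__":
    mask = np.array([[1, 1], [0, 0]])              # top row static, bottom row dynamic
    uniform = np.full((2, 2), 0.25)                # attention spread evenly
    focused = np.array([[0.5, 0.5], [0.0, 0.0]])   # attention only on static pixels
    print(attention_loss(uniform, mask))
    print(attention_loss(focused, mask))
```

Under this assumed formulation, the loss is zero when all attention lies on static pixels and grows as attention shifts toward dynamic objects. In the self-supervised setting the summary mentions, the mask would have to be estimated rather than given; how that is done is not described here.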