Presentation 2016-10-06
A Method for Predicting Image Region Specified by Query Texts
Kou Endo, Kotarou Funakoshi, Eric Nichols, Masaki Aono
Abstract(in Japanese) (See Japanese page)
Abstract(in English) We propose a method for identifying the region of an image that corresponds to a user's text query. The input to the system is an image and a text describing the region. The output is a bounding box (x, y, w, h), where (x, y) are the coordinates of the upper-left corner of the region, w is its width, and h is its height. We treat region prediction as a regression problem and train a deep neural network model directly on the input image and query text, eliminating the need for external candidate region prediction. In an evaluation on the "ReferIt" dataset, provided by the University of North Carolina, our proposed approach achieves state-of-the-art performance, surpassing both a baseline system that learns an independent regression model for each of the four parameters and the candidate generation and ranking approach of Hu et al. [2].
Keyword(in Japanese) (See Japanese page)
Keyword(in English) deep learning / query texts / bounding box / regression
Paper # IE2016-66
Date of Issue 2016-09-29 (IE)
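
The bounding-box output described in the abstract can be illustrated with a minimal sketch. The class name `BoundingBox` and the function `iou` are hypothetical names chosen for this example. The abstract does not state which evaluation measure the authors use; intersection over union is shown here only as an assumption, because it is a common way to compare a predicted box with a ground-truth box.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Box in the (x, y, w, h) form given in the abstract."""
    x: float  # horizontal coordinate of the upper-left corner
    y: float  # vertical coordinate of the upper-left corner
    w: float  # width
    h: float  # height

    def corners(self):
        """Return (x_min, y_min, x_max, y_max) for this box."""
        return (self.x, self.y, self.x + self.w, self.y + self.h)

    def area(self):
        return self.w * self.h


def iou(a, b):
    """Intersection over union of two boxes.

    Assumption: this measure is illustrative and is not taken from the abstract.
    """
    ax0, ay0, ax1, ay1 = a.corners()
    bx0, by0, bx1, by1 = b.corners()
    # Overlap extent along each axis, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h
    union = a.area() + b.area() - inter
    return inter / union if union > 0 else 0.0
```

A regression model of the kind the abstract describes would emit the four numbers directly, so a prediction could be wrapped as `BoundingBox(x, y, w, h)` and compared with a reference box via `iou(predicted, reference)`.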

Conference Information
Committee IE / ITE-ME / ITE-AIT
Conference Date 2016/10/6(2days)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair Seishi Takamura(NTT) / Miki Haseyama(Hokkaido Univ.) / Tokiichiro Takahashi(TDU)
Vice Chair Takayuki Hamamoto(Tokyo Univ. of Science) / Atsuro Ichigaya(NHK) / Norio Tagawa(Tokyo Metropolitan Univ.)
Secretary Takayuki Hamamoto(NTT) / Atsuro Ichigaya(Chiba Inst. of Tech.) / Norio Tagawa
Assistant Kei Kawamura(KDDI R&D Labs.) / Keita Takahashi(Nagoya Univ.)

Paper Information
Registration To Technical Committee on Image Engineering / Technical Group on Media Engineering / Technical Group on Artistic Image Technology
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) A Method for Predicting Image Region Specified by Query Texts
Sub Title (in English)
Keyword(1) deep learning
Keyword(2) query texts
Keyword(3) bounding box
Keyword(4) regression
1st Author's Name Kou Endo
1st Author's Affiliation Toyohashi University of Technology(TUT)
2nd Author's Name Kotarou Funakoshi
2nd Author's Affiliation Honda Research Institute Japan(HRI-J)
3rd Author's Name Eric Nichols
3rd Author's Affiliation Honda Research Institute Japan(HRI-J)
4th Author's Name Masaki Aono
4th Author's Affiliation Toyohashi University of Technology(TUT)
Date 2016-10-06
Paper # IE2016-66
Volume (vol) vol.116
Number (no) IE-239
Page pp.7-12(IE)
#Pages 6
Date of Issue 2016-09-29 (IE)