Multimodal identification from face to voice, and vice versa

Miyuki Kamachi; Harold Hill; Eric Vatikiotis-Bateson; Karen Lander

Presentation	2002/12/13 Multimodal identification from face to voice, and vice versa Miyuki KAMACHI, Harold HILL, Eric Vatikiotis-BATESON, Karen LANDER
Abstract(in Japanese)	(See Japanese page)
Abstract(in English)	We explore whether audio and visual signals carry common information for speaker identification. We used XAB tasks in which a face (or a voice) speaking a short sentence was learned as X, and participants then chose between two voices (or faces) at test. Sentences used at learning and test were similar but not identical. Experiment 1 showed that performance was significantly better than chance for both face and voice learning. However, in Experiment 2, performance dropped to chance when the stimuli were presented backwards, suggesting that the critical information is spatiotemporal and direction-specific. In Experiment 3, we used sinewave speech (SWS) to limit the information available in the auditory signal. Participants were still able to match SWS to a previously seen face, consistent with the importance of coarse-grained spatiotemporal information for this task. However, in this experiment participants were at chance when going from voice to face, suggesting that identity-specific information is difficult to encode from sinewave speech, although it can be recovered. The results are interpreted as suggesting that coarse-scale dynamic information specific to identity is available both auditorily and visually.
Keyword(in Japanese)	(See Japanese page)
Keyword(in English)	face / voice / speaker identification / sinewave speech
Paper #	HIP2002-47
Date of Issue

Conference Information
Committee	HIP
Conference Date	2002/12/13 (1 day)
Place (in Japanese)	(See Japanese page)
Place (in English)
Topics (in Japanese)	(See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To	Human Information Processing (HIP)
Language	JPN
Title (in Japanese)	(See Japanese page)
Sub Title (in Japanese)	(See Japanese page)
Title (in English)	Multimodal identification from face to voice, and vice versa
Sub Title (in English)
Keyword(1)	face
Keyword(2)	voice
Keyword(3)	speaker identification
Keyword(4)	sinewave speech
1st Author's Name	Miyuki KAMACHI
1st Author's Affiliation	ATR International Human Information Sciences Laboratories
2nd Author's Name	Harold HILL
2nd Author's Affiliation	ATR International Human Information Sciences Laboratories
3rd Author's Name	Eric Vatikiotis-BATESON
3rd Author's Affiliation	ATR International Human Information Sciences Laboratories
4th Author's Name	Karen LANDER
4th Author's Affiliation	University of Manchester, Department of Psychology
Date	2002/12/13
Paper #	HIP2002-47
Volume (vol)	vol.102
Number (no)	534
Page	pp.-
#Pages	6
Date of Issue