Presentation 2002/12/13
Multimodal identification from face to voice, and vice versa
Miyuki KAMACHI, Harold HILL, Eric Vatikiotis-BATESON, Karen LANDER
Abstract(in Japanese) (See Japanese page)
Abstract(in English) We explore whether there is common audio and visual information for speaker identification. As a basic scheme, we used XAB tasks in which a face (or a voice) speaking a short sentence was learned as X, and participants then chose between two voices (or faces) at test. The sentences used at learning and at test were similar but not identical. Experiment 1 showed that performance was significantly better than chance for both face and voice learning. In Experiment 2, however, performance dropped to chance when the stimuli were presented backwards, suggesting that the critical information is spatiotemporal and direction-specific. In Experiment 3, we used sinewave speech (SWS) to limit the information available in the auditory signal. Participants were still able to match SWS to a previously seen face, consistent with the importance of coarse-grained spatiotemporal information for this task. However, in this experiment participants were at chance when going from voice to face, suggesting that identity-specific information is difficult to encode from sinewave speech, although it can be recovered. The results are interpreted as suggesting that there is coarse-scale dynamic information specific to identity that is available both auditorily and visually.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) face / voice / speaker identification / sinewave speech
Paper # HIP2002-47
Date of Issue

Conference Information
Committee HIP
Conference Date 2002/12/13 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Human Information Processing (HIP)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Multimodal identification from face to voice, and vice versa
Sub Title (in English)
Keyword(1) face
Keyword(2) voice
Keyword(3) speaker identification
Keyword(4) sinewave speech
1st Author's Name Miyuki KAMACHI
1st Author's Affiliation ATR International Human Information Sciences Laboratories
2nd Author's Name Harold HILL
2nd Author's Affiliation ATR International Human Information Sciences Laboratories
3rd Author's Name Eric Vatikiotis-BATESON
3rd Author's Affiliation ATR International Human Information Sciences Laboratories
4th Author's Name Karen LANDER
4th Author's Affiliation University of Manchester, Department of Psychology
Date 2002/12/13
Paper # HIP2002-47
Volume (vol) vol.102
Number (no) 534
Page pp.-
#Pages 6