Presentation | 2002/12/13 Multimodal identification from face to voice, and vice versa Miyuki KAMACHI, Harold HILL, Eric Vatikiotis-BATESON, Karen LANDER |
---|---|
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | We explore whether there is common audio and visual information for speaker identification. As a basic scheme, we used XAB tasks in which a face (or a voice) speaking a short sentence was learned as X, and participants then chose between two voices (or faces) at test. Sentences used at learning and test were similar but not identical. Experiment 1 showed that performance was significantly better than chance for both face and voice learning. However, in Experiment 2, performance dropped to chance when the stimuli were presented backwards, suggesting that the critical information is spatiotemporal and direction-specific. In Experiment 3, we used sinewave speech (SWS) to limit the information available in the auditory signal. Participants were still able to match SWS to a previously seen face, consistent with the importance of coarse-grained spatiotemporal information for this task. However, in this experiment participants were at chance when going from voice to face, suggesting that it is difficult to encode identity-specific information from sinewave speech, although it can be recovered. The results are interpreted as suggesting that there is coarse-scale dynamic information specific to identity that is available both auditorily and visually. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | face / voice / speaker identification / sinewave speech |
Paper # | HIP2002-47 |
Date of Issue |
Conference Information | |
Committee | HIP |
---|---|
Conference Date | 2002/12/13 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Human Information Processing (HIP) |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Multimodal identification from face to voice, and vice versa |
Sub Title (in English) | |
Keyword(1) | face |
Keyword(2) | voice |
Keyword(3) | speaker identification |
Keyword(4) | sinewave speech |
1st Author's Name | Miyuki KAMACHI |
1st Author's Affiliation | ATR International Human Information Sciences Laboratories |
2nd Author's Name | Harold HILL |
2nd Author's Affiliation | ATR International Human Information Sciences Laboratories |
3rd Author's Name | Eric Vatikiotis-BATESON |
3rd Author's Affiliation | ATR International Human Information Sciences Laboratories |
4th Author's Name | Karen LANDER |
4th Author's Affiliation | University of Manchester, Department of Psychology |
Date | 2002/12/13 |
Paper # | HIP2002-47 |
Volume (vol) | vol.102 |
Number (no) | 534 |
Page | pp.- |
#Pages | 6 |
Date of Issue |