Presentation 2023-10-14
Comparative study on different speaker embedding spaces focusing on the relation to perceptual inter-speaker similarity
Wakuto Morita, Daisuke Saito, Nobuaki Minematsu
Abstract(in Japanese) (See Japanese page)
Abstract(in English) This study examines the correspondence between inter-speaker similarity computed from speaker embeddings and perceptual speaker similarity obtained from human listening tests. Our previous study showed that the tendency of this correspondence depends on the dimensionality of the embedding space. This paper introduces a speaker embedding method that can encode discriminative information on speaker individuality even in low dimensions, and discusses how differences between embedding methods affect the correspondence with human perception. The experimental results show that 1) there is a general tendency that holds independently of the embedding method, and 2) the degree of change in that tendency depends on the embedding method.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Speaker Embeddings / Human Perception / Triplet Loss / Poincaré Embeddings
Paper # SP2023-31,WIT2023-22
Date of Issue 2023-10-07 (SP, WIT)

Conference Information
Committee WIT / SP / IPSJ-SLP
Conference Date 2023/10/14 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English) Kyushu Institute of Technology
Topics (in Japanese) (See Japanese page)
Topics (in English) Speech and Well-being Information Technology, etc.
Chair Takeaki Shionome(Teikyo Univ.) / Tomoki Toda(Nagoya Univ.) / Tomoki Toda(Nagoya Univ.)
Vice Chair Shinji Sakou(Nagoya Inst. of Tech.)
Secretary Shinji Sakou(AIST) / (Univ. of Toyama) / (Tsukuba Univ. of Tech.)
Assistant Tsubasa Uchida(NHK) / Teppei Miura(National Inst. of Tech., Toyota College) / Ryo Aihara(Mitsubishi Electric) / Daisuke Saito(Univ. of Tokyo)

Paper Information
Registration To Technical Committee on Well-being Information Technology / Technical Committee on Speech / Special Interest Group on Spoken Language Processing
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Comparative study on different speaker embedding spaces focusing on the relation to perceptual inter-speaker similarity
Sub Title (in English)
Keyword(1) Speaker Embeddings
Keyword(2) Human Perception
Keyword(3) Triplet Loss
Keyword(4) Poincaré Embeddings
1st Author's Name Wakuto Morita
1st Author's Affiliation The University of Tokyo(Univ. of Tokyo)
2nd Author's Name Daisuke Saito
2nd Author's Affiliation The University of Tokyo(Univ. of Tokyo)
3rd Author's Name Nobuaki Minematsu
3rd Author's Affiliation The University of Tokyo(Univ. of Tokyo)
Date 2023-10-14
Paper # SP2023-31,WIT2023-22
Volume (vol) vol.123
Number (no) SP-212,WIT-213
Page pp.21-26(SP), pp.21-26(WIT)
#Pages 6
Date of Issue 2023-10-07 (SP, WIT)