Presentation | 2023-10-14: Comparative study on different speaker embedding spaces focusing on the relation to perceptual inter-speaker similarity. Wakuto Morita, Daisuke Saito, Nobuaki Minematsu |
---|---|
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | This study examines how inter-speaker similarity computed from speaker embeddings corresponds to perceptual speaker similarity measured in human listening tests. Our previous study showed that this correspondence depends on the dimensionality of the embedding space. This paper introduces a speaker embedding method that can encode discriminative information on speaker individuality even in low dimensions, and discusses how differences between embedding methods affect the correspondence with human perception. The experimental results show that 1) a general tendency holds independently of the embedding method, and 2) the degree to which that tendency changes depends on the embedding method. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | Speaker Embeddings / Human Perception / Triplet Loss / Poincaré Embeddings |
Paper # | SP2023-31,WIT2023-22 |
Date of Issue | 2023-10-07 (SP, WIT) |
Conference Information | |
Committee | WIT / SP / IPSJ-SLP |
---|---|
Conference Date | 2023/10/14 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | Kyushu Institute of Technology |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | Speech and Well-being Information Technology, etc. |
Chair | Takeaki Shionome(Teikyo Univ.) / Tomoki Toda(Nagoya Univ.) / Tomoki Toda(Nagoya Univ.) |
Vice Chair | Shinji Sakou(Nagoya Inst. of Tech.) |
Secretary | Shinji Sakou(AIST) / (Univ. of Toyama) / (Tsukuba Univ. of Tech.) |
Assistant | Tsubasa Uchida(NHK) / Teppei Miura(National Inst. of Techn. Toyota College) / Ryo Aihara(Mitsubishi Electric) / Daisuke Saito(Univ. of Tokyo) |
Paper Information | |
Registration To | Technical Committee on Well-being Information Technology / Technical Committee on Speech / Special Interest Group on Spoken Language Processing |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Comparative study on different speaker embedding spaces focusing on the relation to perceptual inter-speaker similarity |
Sub Title (in English) | |
Keyword(1) | Speaker Embeddings |
Keyword(2) | Human Perception |
Keyword(3) | Triplet Loss |
Keyword(4) | Poincaré Embeddings |
1st Author's Name | Wakuto Morita |
1st Author's Affiliation | The University of Tokyo(Univ. of Tokyo) |
2nd Author's Name | Daisuke Saito |
2nd Author's Affiliation | The University of Tokyo(Univ. of Tokyo) |
3rd Author's Name | Nobuaki Minematsu |
3rd Author's Affiliation | The University of Tokyo(Univ. of Tokyo) |
Date | 2023-10-14 |
Paper # | SP2023-31,WIT2023-22 |
Volume (vol) | vol.123 |
Number (no) | SP-212,WIT-213 |
Page | pp.21-26(SP), pp.21-26(WIT) |
#Pages | 6 |
Date of Issue | 2023-10-07 (SP, WIT) |