Presentation | 2012-11-08. Intra-speaker spectral parameter variation between utterances of the same sentence and its prediction. Tatsuo INUKAI, Tomoki TODA, Graham NEUBIG, Sakriani SAKTI, Satoshi NAKAMURA |
---|---|
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | In spectral conversion for statistical voice conversion, distance measures between the converted and target parameters, such as mel-cepstral distortion, are often used as evaluation and training metrics. However, even when the same speaker utters the same sentence, the spectral parameters of the utterances vary, so a nonzero distance between them remains. Moreover, in real-time conversion, the converted speech often keeps the original prosodic features of the input speech, because complex conversion of those features is inherently difficult in real time. In that case, the ideal converted speech would be a sample uttered by the target speaker imitating the prosody of the input speech, but the spectral variation caused by such a prosodic change is not considered in current evaluation and training metrics. In this report, we investigate intra-speaker spectral variation between utterances of the same sentence, focusing on the mel-cepstrum as the spectral parameter. We also propose a method for predicting this variation from prosodic parameter differences between the utterances, and conduct experimental evaluations to show its effectiveness. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | voice conversion / training/evaluation metric / spectral variation / utterances of the same sentence / prosodic variation |
Paper # | SP2012-74 |
Date of Issue |
Conference Information | |
Committee | SP |
---|---|
Conference Date | 2012/11/1 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Speech (SP) |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Intra-speaker spectral parameter variation between utterances of the same sentence and its prediction |
Sub Title (in English) | |
Keyword(1) | voice conversion |
Keyword(2) | training/evaluation metric |
Keyword(3) | spectral variation |
Keyword(4) | utterances of the same sentence |
Keyword(5) | prosodic variation |
1st Author's Name | Tatsuo INUKAI |
1st Author's Affiliation | Nara Institute of Science and Technology |
2nd Author's Name | Tomoki TODA |
2nd Author's Affiliation | Nara Institute of Science and Technology |
3rd Author's Name | Graham NEUBIG |
3rd Author's Affiliation | Nara Institute of Science and Technology |
4th Author's Name | Sakriani SAKTI |
4th Author's Affiliation | Nara Institute of Science and Technology |
5th Author's Name | Satoshi NAKAMURA |
5th Author's Affiliation | Nara Institute of Science and Technology |
Date | 2012-11-08 |
Paper # | SP2012-74 |
Volume (vol) | vol.112 |
Number (no) | 281 |
Page | pp.- |
#Pages | 6 |
Date of Issue |
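
The abstract names mel-cepstral distortion as a typical evaluation/training metric. As a rough illustration only, the sketch below computes the commonly used form of that measure between two mel-cepstral sequences. It is not taken from the report: the function name `mel_cepstral_distortion`, the list-of-frames input layout, the exclusion of the 0th (energy) coefficient, and the assumption that the two sequences are already time-aligned (for example by dynamic time warping) are all assumptions made for this example.

```python
import math


def mel_cepstral_distortion(converted, target):
    """Mean mel-cepstral distortion in dB between two aligned sequences.

    Each argument is a list of frames, and each frame is a list of
    mel-cepstral coefficients [c0, c1, ..., cD]. The 0th coefficient is
    skipped (an assumption of this sketch; it mostly reflects energy).
    """
    if not converted or len(converted) != len(target):
        raise ValueError("sequences must be non-empty and time-aligned")

    scale = 10.0 / math.log(10.0)
    total = 0.0
    for frame_x, frame_y in zip(converted, target):
        if len(frame_x) != len(frame_y):
            raise ValueError("frames must have the same number of coefficients")
        # Sum of squared differences over coefficients 1..D.
        squared = sum((a - b) ** 2 for a, b in zip(frame_x[1:], frame_y[1:]))
        total += scale * math.sqrt(2.0 * squared)

    # Average the per-frame distortion over all frames.
    return total / len(converted)
```

Applied to two recordings of the same sentence by the same speaker, a measure of this kind would generally return a nonzero value, which is the intra-speaker variation the report investigates.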