Presentation 2012-11-08
Intra-speaker spectral parameter variation between utterances of the same sentence and its prediction
Tatsuo INUKAI, Tomoki TODA, Graham NEUBIG, Sakriani SAKTI, Satoshi NAKAMURA,
Abstract(in Japanese) (See Japanese page)
Abstract(in English) In the spectral conversion stage of statistical voice conversion, distance measures between the converted and target parameters, such as mel-cepstral distortion, are often used as evaluation and training metrics. However, even when the same speaker utters the same sentence, the spectral parameters of the utterances vary, so a nonzero distance between them remains. Moreover, real-time conversion often generates converted speech that keeps the original prosodic features of the input speech, because complex conversion of those features is inherently difficult in real time. In that case, the ideal converted speech would be a sample uttered by the target speaker imitating the prosody of the input speech, but current evaluation and training metrics do not account for the spectral variation caused by such a prosodic change. In this report, we investigate intra-speaker spectral variation between utterances of the same sentence, using the mel-cepstrum as the spectral parameter. We also propose a method for predicting this variation from the prosodic parameter differences between those utterances, and we conduct experimental evaluations to show its effectiveness.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) voice conversion / training/evaluation metric / spectral variation / utterances of the same sentence / prosodic variation
Paper # SP2012-74
Date of Issue

Conference Information
Committee SP
Conference Date 2012/11/1 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Speech (SP)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Intra-speaker spectral parameter variation between utterances of the same sentence and its prediction
Sub Title (in English)
Keyword(1) voice conversion
Keyword(2) training/evaluation metric
Keyword(3) spectral variation
Keyword(4) utterances of the same sentence
Keyword(5) prosodic variation
1st Author's Name Tatsuo INUKAI
1st Author's Affiliation Nara Institute of Science and Technology
2nd Author's Name Tomoki TODA
2nd Author's Affiliation Nara Institute of Science and Technology
3rd Author's Name Graham NEUBIG
3rd Author's Affiliation Nara Institute of Science and Technology
4th Author's Name Sakriani SAKTI
4th Author's Affiliation Nara Institute of Science and Technology
5th Author's Name Satoshi NAKAMURA
5th Author's Affiliation Nara Institute of Science and Technology
Date 2012-11-08
Paper # SP2012-74
Volume (vol) vol.112
Number (no) 281
Page pp.-
#Pages 6