Presentation | 2003/8/14 Investigation of Power Spectral Density based Channel Equalization Jinfu NI, Hisashi KAWAI, Minoru TSUZAKI |
---|---|
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | To improve the naturalness of concatenative speech synthesis, a large-scale speech corpus is recorded, and the recording may last a long period, from several months to years. Different recording sessions may differ in their transmission channels, and these channel differences more or less cause time-dependent variability in voice quality. This paper presents a study on correcting the channel differences, focusing particularly on a channel equalization method based on long-term power spectral densities (PSDs), with objective and subjective evaluation experiments conducted on 677 utterances of a Japanese sentence recorded by a speaker over two years. We first present the optimal setup of the analysis conditions, such as the frame length and the period of the long term, and then discuss the design of a corrective filter using one of four filter types, namely, LPC-based IIR, MLSA filter, and FIR filters with cepstral and mel-frequency cepstral transformation based channel smoothing. The effect of each filter type on the channel equalization is investigated by comparing the likelihood values of the PSDs of equalized samples with respect to the probability densities estimated from the PSDs of source samples. A preliminary perception test shows that the proposed method is essentially effective in correcting the channel differences without degrading speech naturalness. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | Channel equalization / Corpus based speech synthesis |
Paper # | SP2003-67 |
Date of Issue |
Conference Information | |
Committee | SP |
---|---|
Conference Date | 2003/8/14 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Speech (SP) |
---|---|
Language | ENG |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Investigation of Power Spectral Density based Channel Equalization |
Sub Title (in English) | |
Keyword(1) | Channel equalization |
Keyword(2) | Corpus based speech synthesis |
1st Author's Name | Jinfu NI |
1st Author's Affiliation | ATR Spoken Language Translation Research Laboratories |
2nd Author's Name | Hisashi KAWAI |
2nd Author's Affiliation | ATR Spoken Language Translation Research Laboratories |
3rd Author's Name | Minoru TSUZAKI |
3rd Author's Affiliation | ATR Spoken Language Translation Research Laboratories |
Date | 2003/8/14 |
Paper # | SP2003-67 |
Volume (vol) | vol.103 |
Number (no) | 263 |
Page | pp.- |
#Pages | 6 |
Date of Issue |