Presentation 2003/8/14
Investigation of Power Spectral Density based Channel Equalization
Jinfu NI, Hisashi KAWAI, Minoru TSUZAKI
Abstract(in Japanese) (See Japanese page)
Abstract(in English) To improve the naturalness of concatenative speech synthesis, a large-scale speech corpus is recorded, and the recording may extend over a long period, from several months to years. Recording sessions may differ in their transmission channels, and these channel differences cause, to a greater or lesser degree, time-dependent variability in voice quality. This paper presents a study on correcting the channel differences, focusing on a channel equalization method based on long-term power spectral densities (PSDs), with objective and subjective evaluation experiments conducted on 677 utterances of a Japanese sentence recorded by one speaker over two years. We first present the optimal setup of the analysis conditions, such as the frame length and the period of the long term, and then discuss the design of a corrective filter using one of four filter types: an LPC-based IIR filter, an MLSA filter, and FIR filters with cepstral or mel-frequency cepstral transformation based channel smoothing. The effect of each filter type on the channel equalization is investigated by comparing the likelihood values of the PSDs of equalized samples with respect to the probability densities estimated from the PSDs of source samples. A preliminary perception test shows that the proposed method is basically effective in correcting the channel differences without degrading speech naturalness.
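The core idea of long-term PSD based equalization can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the frame length, the hop size, the Hann window, and the simulated two-tap channel are all assumptions made for illustration, and the sketch stops at a per-bin corrective gain rather than designing any of the LPC, MLSA, or cepstrally smoothed FIR filters mentioned in the abstract.

```python
import cmath
import math
import random


def long_term_psd(signal, frame_len=64, hop=32):
    """Average Hann-windowed periodograms over all frames (long-term PSD estimate)."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1
    psd = [0.0] * n_bins
    n_frames = 0
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        for k in range(n_bins):
            # Naive DFT keeps the sketch dependency-free; use an FFT in practice.
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            psd[k] += abs(acc) ** 2
        n_frames += 1
    if n_frames == 0:
        raise ValueError("signal is shorter than one frame")
    return [p / n_frames for p in psd]


def corrective_gain(session_psd, reference_psd, floor=1e-12):
    """Per-bin magnitude gain that maps the session PSD onto the reference PSD."""
    return [math.sqrt(max(r, floor) / max(s, floor))
            for s, r in zip(session_psd, reference_psd)]


# Hypothetical data: white noise as the reference session, and the same noise
# passed through an assumed channel y[n] = x[n] + 0.5 * x[n-1] as the other session.
random.seed(0)
reference = [random.gauss(0.0, 1.0) for _ in range(4096)]
session = [reference[0]] + [reference[n] + 0.5 * reference[n - 1]
                            for n in range(1, len(reference))]

gain = corrective_gain(long_term_psd(session), long_term_psd(reference))
print("gain at DC:", round(gain[0], 3), "gain at Nyquist:", round(gain[-1], 3))
```

In this toy setup the assumed channel boosts low frequencies and attenuates high ones, so the corrective gain comes out below 1 near DC and above 1 near the Nyquist frequency. A practical system would smooth this gain before turning it into a filter, which is the role the abstract assigns to the cepstral and mel-frequency cepstral transformations.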
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Channel equalization / Corpus based speech synthesis
Paper # SP2003-67
Date of Issue

Conference Information
Committee SP
Conference Date 2003/8/14(1days)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Speech (SP)
Language ENG
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Investigation of Power Spectral Density based Channel Equalization
Sub Title (in English)
Keyword(1) Channel equalization
Keyword(2) Corpus based speech synthesis
1st Author's Name Jinfu NI
1st Author's Affiliation ATR Spoken Language Translation Research Laboratories
2nd Author's Name Hisashi KAWAI
2nd Author's Affiliation ATR Spoken Language Translation Research Laboratories
3rd Author's Name Minoru TSUZAKI
3rd Author's Affiliation ATR Spoken Language Translation Research Laboratories
Date 2003/8/14
Paper # SP2003-67
Volume (vol) vol.103
Number (no) 263
Page pp.-
#Pages 6
Date of Issue