Presentation 2010-12-21
Study on HMM-based F0 Coding for Very Low Bit-Rate Vocoder
Takashi NOSE, Masashi KUMAMOTO, Takao KOBAYASHI
Abstract(in Japanese) (See Japanese page)
Abstract(in English) This paper presents a novel F0 coding technique for a very low bit-rate HMM-based phonetic vocoder. The technique is based on the multi-space distribution HMM (MSD-HMM), with quantized F0 symbols used as a prosodic context. Introducing the F0 symbols makes it possible to model F0 values without manually labeled speech data that includes accent information. In the encoding process, the F0 sequence extracted from an input utterance is converted into a sequence of quantized F0 symbols, and these symbols are transmitted together with the phonemes and state durations obtained by a phoneme recognizer. In the decoding process, context-dependent labels are created from the phonemes and F0 symbols, and the spectral and F0 sequences are generated from the pre-trained MSD-HMM on the basis of a maximum likelihood criterion. Experimental results show that the degradation of F0 quality caused by the coding process is not annoying even when the bit rate for F0 is less than 50 bit/s.
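The encoder-side step described in the abstract, converting an extracted F0 sequence into quantized F0 symbols, can be illustrated with a minimal sketch. This record does not give the paper's actual quantization scheme, so everything below is an assumption made for illustration: one symbol per phoneme segment, uniform bins on the log-F0 axis, a reserved unvoiced symbol, fixed-length coding, and the hypothetical names `quantize_f0`, `f0_bit_rate`, `levels`, `f0_min`, and `f0_max`.

```python
import math

UNVOICED = 0  # hypothetical symbol for segments with no voiced frames


def quantize_f0(f0_frames, segments, levels=4, f0_min=80.0, f0_max=400.0):
    """Map each phoneme segment to one quantized F0 symbol (illustrative only).

    f0_frames: per-frame F0 in Hz, with 0.0 marking unvoiced frames.
    segments:  (start, end) frame indices per phoneme, end exclusive.
    Returns UNVOICED or a symbol in 1..levels for each segment.
    """
    lo, hi = math.log(f0_min), math.log(f0_max)
    symbols = []
    for start, end in segments:
        voiced = [math.log(f) for f in f0_frames[start:end] if f > 0.0]
        if not voiced:
            symbols.append(UNVOICED)
            continue
        mean_log_f0 = sum(voiced) / len(voiced)
        # Uniform bins on the log-F0 axis, clamped to the assumed range.
        pos = (mean_log_f0 - lo) / (hi - lo)
        index = min(levels - 1, max(0, int(pos * levels)))
        symbols.append(index + 1)
    return symbols


def f0_bit_rate(num_symbols, duration_sec, levels=4):
    """Bits per second when each symbol is sent with a fixed-length code."""
    bits_per_symbol = math.ceil(math.log2(levels + 1))  # +1 for UNVOICED
    return num_symbols * bits_per_symbol / duration_sec


# Four phoneme segments of two frames each (made-up values, not paper data).
f0 = [0.0, 0.0, 100.0, 100.0, 200.0, 200.0, 390.0, 390.0]
segs = [(0, 2), (2, 4), (4, 6), (6, 8)]
symbols = quantize_f0(f0, segs)
rate = f0_bit_rate(num_symbols=12, duration_sec=1.0)
```

Under these assumptions, a speaking rate of 12 phonemes per second with four voiced levels costs 3 bits per symbol, which is why a symbol-per-phoneme representation can plausibly stay in the tens of bits per second. The figures are illustrative and are not results from the paper.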
Keyword(in Japanese) (See Japanese page)
Keyword(in English) phonetic vocoder / HMM-based speech synthesis / very low bit-rate speech coding / quantized F0 context / multi-space distribution HMM (MSD-HMM)
Paper # NLC2010-28,SP2010-101
Date of Issue

Conference Information
Committee NLC
Conference Date 2010/12/13 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Natural Language Understanding and Models of Communication (NLC)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Study on HMM-based F0 Coding for Very Low Bit-Rate Vocoder
Sub Title (in English)
Keyword(1) phonetic vocoder
Keyword(2) HMM-based speech synthesis
Keyword(3) very low bit-rate speech coding
Keyword(4) quantized F0 context
Keyword(5) multi-space distribution HMM (MSD-HMM)
1st Author's Name Takashi NOSE
1st Author's Affiliation Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology
2nd Author's Name Masashi KUMAMOTO
2nd Author's Affiliation Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology
3rd Author's Name Takao KOBAYASHI
3rd Author's Affiliation Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology
Date 2010-12-21
Paper # NLC2010-28,SP2010-101
Volume (vol) vol.110
Number (no) 356
Page pp.-
#Pages 6
Date of Issue