Presentation | 2016-01-14 [Invited Talk] Articulatory controllable statistical parametric speech synthesis using EMA data | Junichi Yamagishi |
---|---|
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | This paper describes speech processing work in which articulator movements are used in conjunction with the acoustic speech signal and/or linguistic information. By “articulator movements”, we mean the changing positions of human speech articulators such as the tongue and lips, which may be recorded by EMA, amongst other articulography techniques. In this paper we provide an overview of statistical voice conversion and speech synthesis techniques that use articulator movements as part of the process of generating synthetic speech. Statistical parametric speech synthesis is able to synthesise highly intelligible and smooth speech sounds. In addition, the HMM's parameters can be adapted using a small amount of training data to diversify the characteristics of synthetic speech. However, this approach still has some limitations. The structure of conventional HMM-based acoustic models is akin to a black box, without explicit correspondence to the speech production mechanism, so it is difficult to integrate phonetic knowledge concerning the properties of speech directly into acoustic feature prediction. By incorporating articulatory signals we can explicitly introduce articulator movements into the speech synthesis framework and make the synthesis outputs “articulatorily controllable”, meaning that we can manipulate synthetic speech intuitively not only in the acoustic domain but also in the articulatory domain. In this paper we give an overview of several systems that we have built and experiments that we have conducted for this purpose. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | HMM speech synthesis / Articulatory movement / EMA / Multiple regression HMM |
Paper # | SP2015-88 |
Date of Issue | 2016-01-07 (SP) |
Conference Information | |
Committee | SP |
---|---|
Conference Date | 2016/1/14 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | Sunpian Kawasaki |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | Synthesis, Generation, Prosody, etc. |
Chair | Kazunori Mano(Shibaura Inst. of Tech.) |
Vice Chair | Norihide Kitaoka(Tokushima Univ.) |
Secretary | Norihide Kitaoka(Tokyo City Univ.) |
Assistant | Takashi Nose(Tohoku Univ.) / Taichi Asami(NTT) |
Paper Information | |
Registration To | Technical Committee on Speech |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | [Invited Talk] Articulatory controllable statistical parametric speech synthesis using EMA data |
Sub Title (in English) | |
Keyword(1) | HMM speech synthesis |
Keyword(2) | Articulatory movement |
Keyword(3) | EMA |
Keyword(4) | Multiple regression HMM |
1st Author's Name | Junichi Yamagishi |
1st Author's Affiliation | National Institute of Informatics/University of Edinburgh(NII/Univ. Edinburgh) |
Date | 2016-01-14 |
Paper # | SP2015-88 |
Volume (vol) | vol.115 |
Number (no) | SP-392 |
Page | pp.19-24 (SP) |
#Pages | 6 |
Date of Issue | 2016-01-07 (SP) |