Presentation | 2015-12-03 Deep Auto-encoder based Low-dimensional Feature Extraction using FFT Spectral Envelopes in Statistical Parametric Speech Synthesis Shinji Takaki, Junichi Yamagishi |
---|---|
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | In state-of-the-art statistical parametric speech synthesis systems, a speech analysis module, e.g. STRAIGHT spectral analysis, is generally used to obtain accurate and stable spectral envelopes, and low-dimensional acoustic features extracted from these envelopes are then used to train acoustic models. However, the spectral envelope estimation algorithms used in such speech analysis modules include various processing steps derived from human knowledge. In this paper, we investigate deep auto-encoder based, non-linear, data-driven and unsupervised low-dimensional feature extraction using FFT spectral envelopes for statistical parametric speech synthesis. Experimental results show that a text-to-speech synthesis system using deep auto-encoder based low-dimensional feature extraction from FFT spectral envelopes is a promising approach. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | Statistical parametric speech synthesis / DNN / Deep Auto-encoder / Spectral envelope / Vocoder |
Paper # | SP2015-81 |
Date of Issue | 2015-11-25 (SP) |
Conference Information | |
Committee | NLC / IPSJ-NL / SP / IPSJ-SLP |
---|---|
Conference Date | 2015/12/2 (3 days) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | Nagoya Inst of Tech. |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | The Second Natural Language Processing Symposium & The 17th Spoken Language Symposium |
Chair | Koichi Takeuchi(Okayama Univ.) / Kentaro Inui(Tohoku Univ.) / Kazunori Mano(Shibaura Inst. of Tech.) / Koichi Shinoda(Tokyo Inst. of Tech.) |
Vice Chair | Hiroshi Kanayama(IBM) / Makoto Ichise(NTT DoCoMo) / / Norihide Kitaoka(Tokushima Univ.) |
Secretary | Hiroshi Kanayama(Univ. of Tokyo/Hottolink) / Makoto Ichise(Ryukoku Univ.) / (Osaka Univ.) / Norihide Kitaoka(Tohoku Univ.) / (Mixi Co. Ltd.) |
Assistant | Kazutaka Shimada(Kyushu Inst. of Tech.) / Ryuichiro Higashinaka(NTT) / / Takashi Nose(Tohoku Univ.) / Taichi Asami(NTT) |
Paper Information | |
Registration To | Technical Committee on Natural Language Understanding and Models of Communication / Special Interest Group on Natural Language / Technical Committee on Speech / Special Interest Group on Spoken Language Processing |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Deep Auto-encoder based Low-dimensional Feature Extraction using FFT Spectral Envelopes in Statistical Parametric Speech Synthesis |
Sub Title (in English) | |
Keyword(1) | Statistical parametric speech synthesis |
Keyword(2) | DNN |
Keyword(3) | Deep Auto-encoder |
Keyword(4) | Spectral envelope |
Keyword(5) | Vocoder |
1st Author's Name | Shinji Takaki |
1st Author's Affiliation | National Institute of Informatics(NII) |
2nd Author's Name | Junichi Yamagishi |
2nd Author's Affiliation | National Institute of Informatics(NII) |
Date | 2015-12-03 |
Paper # | SP2015-81 |
Volume (vol) | vol.115 |
Number (no) | SP-346 |
Page | pp.99-104 (SP) |
#Pages | 6 |
Date of Issue | 2015-11-25 (SP) |