Paper Abstract and Keywords |
Presentation |
2021-03-04 09:00
Optimization source-filter based speech waveform generation using adversarial training Hayato Mitsui, Yosuke Sugiura, Nozomiko Yasui, Tetsuya Shimamura (Saitama Univ.) SIS2020-35 |
Abstract |
(in Japanese) |
(See Japanese page) |
(in English) |
This research aims to improve the accuracy of a deep-learning, source-filter based speech waveform generation model. Although such a model can be implemented at lower computational cost than WaveNet, which is based on PixelCNN, it produces low-quality speech. To maintain the naturalness of the generated speech, we introduce a multi-task training architecture that uses adversarial training. In the proposed method, the adversarial training follows the MelGAN architecture. Experimental results show that the proposed method recovers the speech dynamics that are lost with the conventional method. |
Keyword |
(in Japanese) |
(See Japanese page) |
(in English) |
Deep Learning / Speech synthesis / Source-Filter theory / Adversarial training |
Reference Info. |
IEICE Tech. Rep., vol. 120, no. 415, SIS2020-35, pp. 1-4, March 2021. |
Paper # |
SIS2020-35 |
Date of Issue |
2021-02-25 (SIS) |
ISSN |
Online edition: ISSN 2432-6380 |
Copyright and reproduction |
All rights are reserved and no part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher. Notwithstanding, instructors are permitted to photocopy isolated articles for noncommercial classroom use without fee. (License No.: 10GA0019/12GB0052/13GB0056/17GB0034/18GB0034) |