Committee | Date Time | Place | Paper Title / Authors | Abstract | Paper #
SIS |
2021-03-04 09:00 |
Online |
Online |
Optimization of source-filter based speech waveform generation using adversarial training Hayato Mitsui, Yosuke Sugiura, Nozomiko Yasui, Tetsuya Shimamura (Saitama Univ.) SIS2020-35 |
This research aims to improve the accuracy of the source-filter based speech waveform generation model using deep learni... [more] |
SIS2020-35 pp.1-4 |
EA, US, SP, SIP, IPSJ-SLP [detail] |
2021-03-03 14:05 |
Online |
Online |
[Poster Presentation] A unified source-filter network for neural vocoder Reo Yoneyama, Yi-Chiao Wu, Tomoki Toda (Nagoya Univ.) EA2020-69 SIP2020-100 SP2020-34 |
In this paper, we propose a method to develop a neural vocoder using a single network based on the source-filter theory.... [more] |
EA2020-69 SIP2020-100 SP2020-34 pp.57-62 |
EA, US, SP, SIP, IPSJ-SLP [detail] |
2021-03-03 14:05 |
Online |
Online |
[Poster Presentation] Investigation of DNN-based speech synthesis utilizing oral reading skills obtained from large scale subjective evaluation Shun Akui (UTokyo), Yusuke Ijima (NTT), Daisuke Saito, Nobuaki Minematsu (UTokyo) EA2020-71 SIP2020-102 SP2020-36 |
So far, we have suggested the value of 'oral reading skill' based on a listening evaluation experiment as a quantit... [more] |
EA2020-71 SIP2020-102 SP2020-36 pp.68-73 |
EA, US, SP, SIP, IPSJ-SLP [detail] |
2021-03-03 17:10 |
Online |
Online |
An investigation of rhythm-based speaker embeddings for phoneme duration modeling Kenichi Fujita, Atsushi Ando, Yusuke Ijima (NTT) EA2020-77 SIP2020-108 SP2020-42 |
In this study, we propose a speaker embedding method suitable for modeling phoneme duration for each individual i... [more] |
EA2020-77 SIP2020-108 SP2020-42 pp.103-108 |
NLC, IPSJ-NL, SP, IPSJ-SLP [detail] |
2020-12-02 13:50 |
Online |
Online |
Multi-Modal Emotion Recognition by Integrating Acoustic and Linguistic Features Ryotaro Nagase, Takahiro Fukumori, Yoichi Yamashita (Ritsumeikan Univ.) NLC2020-14 SP2020-17 |
In recent years, advanced techniques of deep learning have improved the performance of Speech Emotion Recognition as ... [more] |
NLC2020-14 SP2020-17 pp.7-12 |
WIT, SP, IPSJ-SLP [detail] |
2020-10-22 13:00 |
Online |
Online |
[Invited Talk] NHK's activities on Japanese end-to-end speech synthesis Kiyoshi Kurihara (NHK) SP2020-11 WIT2020-12 |
The main business of NHK (Japan Broadcasting Corporation) is the production and broadcasting of programs. Many programs ... [more] |
SP2020-11 WIT2020-12 pp.19-20 |
EA, ASJ-H |
2020-07-21 11:00 |
Online |
Online |
Possibilities of Gamification for Learning How to Use an Interactive Speech Synthesizer "Voice Pad" Daiki Goto (Hokkai Gakuen Univ.), Naofumi Aoki, Keisuke Ai (Hokkaido Univ.), Kunitoshi Motoki (Hokkai Gakuen Univ.) EA2020-11 |
This study has developed an interactive speech synthesizer that enables users to synthesize speech as if playing musical... [more] |
EA2020-11 pp.63-66 |
WIT, IPSJ-AAC |
2020-03-15 15:20 |
Ibaraki |
Tsukuba University of Technology (Cancelled but technical report was issued) |
Developing a communication system for an ALS patient with his voice -- Towards the patient's and caretakers' QOL improvement -- Akemi Ishii Iida, Daishi Miura, Yuko Yamashita (SIT), Satoshi Watanabe (HTS Tokyo), Chen Feng, Midori Sugaya (SIT) WIT2019-66 |
This paper describes our ongoing work on developing a communication assistive system for an ALS patient who already had ... [more] |
WIT2019-66 pp.171-176 |
ET |
2020-03-07 10:35 |
Kagawa |
National Institute of Technology, Kagawa College (Cancelled but technical report was issued) |
Implementation and evaluation of language learning support module applying speech recognition Yusuke Kawamura, Chunxiang Chen, Renfeng Hou (PUH) ET2019-87 |
Speech recognition and speech synthesis technology have advanced remarkably owing to the development of deep... [more] |
ET2019-87 pp.63-67 |
SP, EA, SIP |
2020-03-02 09:20 |
Okinawa |
Okinawa Industry Support Center (Cancelled but technical report was issued) |
Investigation of neural speech rate conversion with multi-speaker WaveNet vocoder Takuma Okamoto (NICT), Keisuke Matsubara (Kobe Univ./NICT), Tomoki Toda (Nagoya Univ./NICT), Yoshinori Shiga, Hisashi Kawai (NICT) EA2019-101 SIP2019-103 SP2019-50 |
Speech rate conversion technology, which can expand or compress speech waveforms without changing the pitch of sound, is con... [more] |
EA2019-101 SIP2019-103 SP2019-50 pp.1-6 |
SP, EA, SIP |
2020-03-02 13:00 |
Okinawa |
Okinawa Industry Support Center (Cancelled but technical report was issued) |
The Effectiveness of Additional Context in DNN-based Spontaneous Speech Synthesis Yuki Yamashita, Tomoki Koriyama, Yuki Saito, Shinnosuke Takamichi (UTokyo), Yusuke Ijima, Ryo Masumura (NTT), Hiroshi Saruwatari (UTokyo) EA2019-112 SIP2019-114 SP2019-61 |
In DNN-based speech synthesis, contexts, which are input features of DNN, can be used not only for the representation of... [more] |
EA2019-112 SIP2019-114 SP2019-61 pp.65-70 |
SP, EA, SIP |
2020-03-03 09:00 |
Okinawa |
Okinawa Industry Support Center (Cancelled but technical report was issued) |
[Poster Presentation] Initial analysis of oral reading skills obtained from large scale subjective evaluation Takuya Ozuru (Univ. of Tokyo), Yusuke Ijima (NTT), Daisuke Saito, Nobuaki Minematsu (Univ. of Tokyo) EA2019-135 SIP2019-137 SP2019-84 |
The speech of professional newscasters readily suggests their occupation, namely newscaster. So far, we have analyzed pr... [more] |
EA2019-135 SIP2019-137 SP2019-84 pp.195-200 |
SP, EA, SIP |
2020-03-03 09:00 |
Okinawa |
Okinawa Industry Support Center (Cancelled but technical report was issued) |
[Poster Presentation] An Educational Study on Prosodic Symbols and Their Acoustic Realization Using Japanese End-to-end Speech Synthesis Fuki Yoshizawa (UTokyo), Tadashi Kumano (NHK), Nobuaki Minematsu (UTokyo), Kiyoshi Kurihara (NHK) EA2019-137 SIP2019-139 SP2019-86 |
In order to examine the educational effect of presenting prosodic symbols to learners of Japanese, a method was proposed... [more] |
EA2019-137 SIP2019-139 SP2019-86 pp.207-212 |
SP, EA, SIP |
2020-03-03 09:00 |
Okinawa |
Okinawa Industry Support Center (Cancelled but technical report was issued) |
A Study for HMM-based embedded speech synthesis using a large-scale speech corpus Nobuyuki Nishizawa, Tomohiro Obara, Hiromi Ishizaki (KDDI Research, Inc.) EA2019-141 SIP2019-143 SP2019-90 |
This study shows that our speech synthesis system based on HMM speech synthesis for embedded devices can perform real-ti... [more] |
EA2019-141 SIP2019-143 SP2019-90 pp.231-236 |
ITE-HI, IE, ITS, ITE-MMS, ITE-ME, ITE-AIT [detail] |
2020-02-27 11:25 |
Hokkaido |
Hokkaido Univ. (Cancelled but technical report was issued) |
Speech synthesis from electromyography using CNN Kiyotaka Miyasaka, Yuji Sakamoto (Hokkaido Univ.) |
Speech communication can provide various expressions intuitively. However, when communicating with aphonic people or in no... [more] |
|
SP |
2020-01-29 11:30 |
Toyama |
|
Application of Deep Gaussian Process to Multi-Speaker Text-to-Speech Synthesis using Speaker Codes Kentaro Mitsui, Tomoki Koriyama, Hiroshi Saruwatari (UTokyo) SP2019-49 |
Speaker codes are widely used to achieve multi-speaker text-to-speech synthesis. Conventionally, Deep Neural Network (D... [more] |
SP2019-49 pp.31-36 |
NLC, IPSJ-NL, SP, IPSJ-SLP (Joint) [detail] |
2019-12-06 10:35 |
Tokyo |
NHK Science & Technology Research Labs. |
[Invited Talk] Progress and prospects of statistical speech synthesis Keiichi Tokuda (Nagoya Inst. of Tech.) SP2019-35 |
The basic problem of statistical speech synthesis is quite simple: we have a speech database for training, i.e., a set o... [more] |
SP2019-35 pp.11-12 |
NLC, IPSJ-NL, SP, IPSJ-SLP (Joint) [detail] |
2019-12-06 13:55 |
Tokyo |
NHK Science & Technology Research Labs. |
[Poster Presentation] Effectiveness of sequence-to-sequence acoustic modeling using automatically generated labels Kiyoshi Kurihara, Nobumasa Seiyama, Tadashi Kumano (NHK) SP2019-37 |
We have proposed a method that uses yomigana (Japanese character readings) and prosodic symbols as input for sequence-to... [more] |
SP2019-37 pp.49-54 |
NLC, IPSJ-NL, SP, IPSJ-SLP (Joint) [detail] |
2019-12-06 13:55 |
Tokyo |
NHK Science & Technology Research Labs. |
[Poster Presentation] Synthetic speech-based sound masking for privacy protection when speaking to smartphones in public space Takahiro Tsugui, Shinji Takaki, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda (Nagoya Inst. of Tech.) SP2019-38 |
In this paper, we propose a synthetic speech-based sound masking method that protects the privacy when speaking to smart... [more] |
SP2019-38 pp.55-60 |
SP |
2019-08-28 14:40 |
Kyoto |
Kyoto Univ. |
[Poster Presentation] Analysis of prosodic differences between a newscaster and amateur speakers using partial-substituted synthetic speech Takuya Ozuru (Univ. of Tokyo), Yusuke Ijima (NTT), Daisuke Saito, Nobuaki Minematsu (Univ. of Tokyo) SP2019-11 |
This paper analyzes prosodic differences between a professional newscaster and amateur speakers which affects listeners’... [more] |
SP2019-11 pp.13-18 |