Committee |
Date Time |
Place |
Paper Title / Authors |
Abstract |
Paper # |
SP, EA, SIP |
2020-03-02 09:20 |
Okinawa |
Okinawa Industry Support Center (Cancelled but technical report was issued) |
Investigation of neural speech rate conversion with multi-speaker WaveNet vocoder Takuma Okamoto (NICT), Keisuke Matsubara (Kobe Univ./NICT), Tomoki Toda (Nagoya Univ./NICT), Yoshinori Shiga, Hisashi Kawai (NICT) EA2019-101 SIP2019-103 SP2019-50 |
Speech rate conversion technology, which can expand or compress speech waveforms without changing pitch of sound, is con... |
EA2019-101 SIP2019-103 SP2019-50 pp.1-6 |
SP, EA, SIP |
2020-03-03 09:00 |
Okinawa |
Okinawa Industry Support Center (Cancelled but technical report was issued) |
Evaluation of vocal personality and expression for speech synthesized by non-parallel voice conversion with narrative speech Ryotaro Nagase, Keisuke Imoto, Ryosuke Yamanishi, Yoichi Yamashita (Ritsumeikan Univ.) EA2019-138 SIP2019-140 SP2019-87 |
In the technology of voice conversion, reproduction of emotion and intonation, pause is one of the research issues. Howe... |
EA2019-138 SIP2019-140 SP2019-87 pp.213-218 |
SP, EA, SIP |
2020-03-03 09:00 |
Okinawa |
Okinawa Industry Support Center (Cancelled but technical report was issued) |
LARGE-CONTEXT POINTER-GENERATOR NETWORKS FOR SPOKEN-TO-WRITTEN STYLE CONVERSION Mana Ihori, Akihiko Takashima, Ryo Masumura (NTT) EA2019-142 SIP2019-144 SP2019-91 |
This paper introduces a spoken-to-written style conversion method that is suitable for handling a series of text such as... |
EA2019-142 SIP2019-144 SP2019-91 pp.237-242 |
SP |
2018-08-27 11:35 |
Kyoto |
Kyoto Univ. |
[Poster Presentation]
An Experimental Study on Transforming the Emotion in Speech using GAN Kenji Yasuda, Ryohei Orihara, Yuichi Sei, Yasuyuki Tahara, Akihiko Ohsuga (UEC) SP2018-26 |
In domain transfer task deep learning has made it possible to generate more natural and highly accurate output. Especial... |
SP2018-26 pp.19-22 |
AI |
2018-07-02 15:50 |
Hokkaido |
|
Transforming the Emotion in Speech using CycleGAN Kenji Yasuda, Ryohei Orihara, Yuichi Sei, Yasuyuki Tahara, Akihiko Ohsuga (UEC) AI2018-11 |
In domain transfer task deep learning makes it possible to generate more natural and highly accurate output. Especially ... |
AI2018-11 pp.61-66 |
PRMU, SP |
2018-06-28 14:40 |
Nagano |
|
Study of improving speech intelligibility for glossectomy patients via voice conversion with sound and lip movement. Seiya Ogino, Hiroki Murakami, Sunao Hara, Masanobu Abe (Okayama Univ.) PRMU2018-23 SP2018-3 |
In this paper, we propose the multimodal voice conversion based on Deep Neural Network using audio and lip movement info... |
PRMU2018-23 SP2018-3 pp.7-12 |
SIP, EA, SP, MI (Joint) |
2018-03-20 09:00 |
Okinawa |
|
[Poster Presentation]
A Hybrid Approach on Electrolaryngeal Speech Enhancement based on Spectral Differential Features and Noise Suppression Mohammad Eshghi, Kazuhiro Kobayashi, Tomoki Toda (Nagoya Univ.) EA2017-141 SIP2017-150 SP2017-124 |
This work presents a hybrid approach for enhancing the quality of the electrolaryngeal (EL) speech. Current hybrid enhan... |
EA2017-141 SIP2017-150 SP2017-124 pp.221-226 |
HCS |
2018-03-13 17:05 |
Miyagi |
Research Institute of Electrical Communication, Tohoku University |
Effect of video-to-audio synchronization in real-time speech rate converted conversations Kazuki Osanai, Hiroko Tokunaga, Naoki Mukawa, Hiroto Saito (Tokyo Denki Univ.) HCS2017-105 |
Speech rate conversion (SRC) is a technology that converts playback speed of speeches while maintaining their vocal pitc... |
HCS2017-105 pp.71-76 |
SP, ASJ-H |
2018-01-20 13:50 |
Tokyo |
The University of Tokyo |
DNN Based Voice Conversion Method Considering Outputs of Multiple Networks Takuya Fujioka, Sun Qinghua (Hitachi) SP2017-68 |
In many conventional statistical voice conversion methods, the relations of source and target speech on all frames are e... |
SP2017-68 pp.11-15 |
NLC, IPSJ-NL, SP, IPSJ-SLP (Joint) |
2017-12-21 12:50 |
Tokyo |
Waseda Univ. Green Computing Systems Research Organization |
[Poster Presentation]
An Evaluation of Speech Waveform Modification Methods towards Improvement of Speech Intelligibility in Noisy Environment Tomohiro Takeyama, Kazuhiro Kobayashi, Tomoki Toda, Kazuya Takeda (Nagoya Univ.) SP2017-57 |
In this research, in order to improve speech intelligibility for a listener under the noisy environment, we propose a te... |
SP2017-57 pp.11-16 |
MBE, NC (Joint) |
2017-11-24 13:50 |
Miyagi |
Tohoku University |
Analyses of Neural Language Model and Its Application to Transformation of Individuality in Speech Tatsuya Takeuchi, Masafumi Hagiwara (Keio Univ.) NC2017-29 |
In this research, we aim to convert the output indirectly by changing the internal state of the neural language model us... |
NC2017-29 pp.13-18 |
SP, IPSJ-SLP (Joint) |
2017-07-27 14:30 |
Miyagi |
Akiu Resort Hotel Crescent |
[Invited Talk]
Synthesis, Recognition and Conversion of Various Speech Using Deep Learning and Their Applications Takashi Nose (Tohoku Univ.) SP2017-16 |
This paper focuses on synthesis, recognition and conversion of various speech in the speech processing using deep learni... |
SP2017-16 pp.3-8 |
PRMU, SP |
2017-06-22 14:45 |
Miyagi |
|
Postfiltering of STFT Spectrograms Based on Generative Adversarial Networks Takuhiro Kaneko (NTT), Shinji Takaki (NII), Hirokazu Kameoka (NTT), Junichi Yamagishi (NII) PRMU2017-28 SP2017-4 |
This paper presents postfiltering of short-term Fourier transform (STFT) spectrograms based on Generative Adversarial Ne... |
PRMU2017-28 SP2017-4 pp.17-22 |
HCS, HIP, HI-SIGCOASTER |
2017-05-17 10:20 |
Okinawa |
Okinawa Industry Support Center |
Comparison of audio-visual synchronous/asynchronous playback in real-time speech rate converted conversations
-- Will the speaker's addressing strength change? -- Kazuki Osanai (Graduate School of Tokyo Denki Univ.), Hiroko Tokunaga, Naoki Mukawa, Hiroto Saito (Tokyo Denki Univ.) HCS2017-29 HIP2017-29 |
Speech rate conversion (SRC) is a technology that converts playback speed of speeches while maintaining their vocal pitc... |
HCS2017-29 HIP2017-29 pp.189-194 |
HCS |
2017-03-16 15:45 |
Miyagi |
|
Design of turn talking behaviors for participants in speech rate converted conversation
-- Effects of delay time visualization on next speaker's utterances -- Kosuke Kumagai, Hiroko Tokunaga, Naoki Mukawa, Hiroto Saito (Tokyo Denki Univ.) HCS2016-117 |
Speech rate conversion (SRC) is one of the most potential techniques for supporting hearing impaired. SRC is a technique... |
HCS2016-117 pp.155-160 |
SP |
2017-01-21 11:00 |
Tokyo |
The University of Tokyo |
[Poster Presentation]
A Study on Singer-Independent Singing Voice Conversion Using Read Speech Based on Neural Network Harunori Koike, Takashi Nose, Akinori Ito (Tohoku Univ.) SP2016-67 |
There is a problem that the conventional method requires the speech of the source speaker for training. We proposed a me... |
SP2016-67 pp.17-22 |
SP |
2017-01-21 11:00 |
Tokyo |
The University of Tokyo |
[Poster Presentation]
Evaluation of DNN-Based Voice Conversion Deceiving Anti-spoofing Verification Yuki Saito, Shinnosuke Takamichi, Hiroshi Saruwatari (UT) SP2016-69 |
This paper proposes a novel training algorithm for high-quality Deep Neural Network (DNN)-based voice conversion. To imp... |
SP2016-69 pp.29-34 |
SP |
2016-08-24 16:15 |
Kyoto |
ACCMS, Kyoto Univ. |
[Poster Presentation]
Joint Enhancement of Spectral and Cepstral Sequences of Noisy Speech Li Li (Univ. Tsukuba), Hirokazu Kameoka, Takuya Higuchi (NTT), Hiroshi Saruwatari (Univ. Tokyo), Shoji Makino (Univ. Tsukuba) SP2016-32 |
While spectral domain speech enhancement algorithms using non-negative matrix factorization (NMF) are powerful in terms ... |
SP2016-32 pp.29-32 |
HCS, HIP, HI-SIGCOASTER |
2016-05-19 11:15 |
Okinawa |
Okinawa Industry Support Center |
System Design and Evaluations of Visualizing Hearing Termination Time among participants for real-time speech rate converted conversations Kosuke Kumagai, Hiroko Tokunaga, Naoki Mukawa, Hiroto Saito (Tokyo Denki Univ.) HCS2016-19 HIP2016-19 |
Speech rate conversion (SRC) is a technique that converts playback speed of speeches while maintaining the vocal pitches... |
HCS2016-19 HIP2016-19 pp.145-150 |
EA, SP, SIP |
2016-03-29 09:00 |
Oita |
Beppu International Convention Center B-ConPlaza |
[Poster Presentation]
Quality improvement of HMM-based synthesized speech based on decomposition of naturalness and intelligibility using asymmetric bilinear model with non-negative matrix factorization Anh-Tuan Dinh, Masato Akagi (JAIST) EA2015-113 SIP2015-162 SP2015-141 |
HMM-based synthesized voices are intelligible but not natural especially in limited data condition because of over-smoot... |
EA2015-113 SIP2015-162 SP2015-141 pp.261-266 |