Committee | Date Time | Place | Paper Title / Authors | Abstract | Paper #
WIT, SP, IPSJ-SLP [detail] |
2020-10-22 13:00 |
Online |
Online |
[Invited Talk]
NHK's activities on Japanese end-to-end speech synthesis Kiyoshi Kurihara (NHK) SP2020-11 WIT2020-12 |
The main business of NHK (Japan Broadcasting Corporation) is the production and broadcasting of programs. Many programs ... [more] |
SP2020-11 WIT2020-12 pp.19-20 |
SIS |
2020-03-06 15:00 |
Saitama |
Saitama Hall (Cancelled but technical report was issued) |
Adversarial Training using Self-Attention Architecture for Speech Enhancement Network Yosuke Sugiura, Tetsuya Shimamura (Saitama Univ.) SIS2019-59 |
In this paper, we propose a new adversarial training method for improving the performance of the speech enhancement network. In th... [more] |
SIS2019-59 pp.125-129 |
SP, EA, SIP |
2020-03-02 09:20 |
Okinawa |
Okinawa Industry Support Center (Cancelled but technical report was issued) |
Investigation of neural speech rate conversion with multi-speaker WaveNet vocoder Takuma Okamoto (NICT), Keisuke Matsubara (Kobe Univ./NICT), Tomoki Toda (Nagoya Univ./NICT), Yoshinori Shiga, Hisashi Kawai (NICT) EA2019-101 SIP2019-103 SP2019-50 |
Speech rate conversion technology, which can expand or compress speech waveforms without changing the pitch of the sound, is con... [more] |
EA2019-101 SIP2019-103 SP2019-50 pp.1-6 |
SP, EA, SIP |
2020-03-02 13:00 |
Okinawa |
Okinawa Industry Support Center (Cancelled but technical report was issued) |
Data augmentation for ASR system by using locally time-reversed speech
-- Temporal inversion of feature sequence -- Takanori Ashihara, Tomohiro Tanaka, Takafumi Moriya, Ryo Masumura, Yusuke Shinohara, Makio Kashino (NTT) EA2019-110 SIP2019-112 SP2019-59 |
Data augmentation is one of the techniques to mitigate overfitting and improve robustness against several acoustic varia... [more] |
EA2019-110 SIP2019-112 SP2019-59 pp.53-58 |
SP, EA, SIP |
2020-03-02 15:45 |
Okinawa |
Okinawa Industry Support Center (Cancelled but technical report was issued) |
Performance evaluation of distilling knowledge using encoder-decoder for CTC-based automatic speech recognition systems Takafumi Moriya, Hiroshi Sato, Tomohiro Tanaka, Takanori Ashihara, Ryo Masumura, Yusuke Shinohara (NTT) EA2019-131 SIP2019-133 SP2019-80 |
We present a novel training approach for connectionist temporal classification (CTC) -based automatic speech recognition... [more] |
EA2019-131 SIP2019-133 SP2019-80 pp.175-180 |
SP, EA, SIP |
2020-03-03 09:00 |
Okinawa |
Okinawa Industry Support Center (Cancelled but technical report was issued) |
[Poster Presentation]
An Educational Study on Prosodic Symbols and Their Acoustic Realization Using Japanese End-to-end Speech Synthesis Fuki Yoshizawa (UTokyo), Tadashi Kumano (NHK), Nobuaki Minematsu (UTokyo), Kiyoshi Kurihara (NHK) EA2019-137 SIP2019-139 SP2019-86 |
In order to examine the educational effect of presenting prosodic symbols to learners of Japanese, a method was proposed... [more] |
EA2019-137 SIP2019-139 SP2019-86 pp.207-212 |
SP, EA, SIP |
2020-03-03 09:00 |
Okinawa |
Okinawa Industry Support Center (Cancelled but technical report was issued) |
Evaluation of vocal personality and expression for speech synthesized by non-parallel voice conversion with narrative speech Ryotaro Nagase, Keisuke Imoto, Ryosuke Yamanishi, Yoichi Yamashita (Ritsumeikan Univ.) EA2019-138 SIP2019-140 SP2019-87 |
In voice conversion technology, the reproduction of emotion, intonation, and pauses is one of the research issues. Howe... [more] |
EA2019-138 SIP2019-140 SP2019-87 pp.213-218 |
SP, EA, SIP |
2020-03-03 09:00 |
Okinawa |
Okinawa Industry Support Center (Cancelled but technical report was issued) |
[Poster Presentation]
A Comparison of Language Models for a Design of Reduced Phoneme Set Shuji Komeiji, Toshihisa Tanaka (TUAT), Koichi Shinoda (Tokyo Tech) EA2019-152 SIP2019-154 SP2019-101 |
Language models for the design of a reduced phoneme set are compared with each other. The reduction of the phoneme set improves ... [more] |
EA2019-152 SIP2019-154 SP2019-101 pp.295-300 |
SP |
2020-01-29 11:30 |
Toyama |
|
Application of Deep Gaussian Process to Multi-Speaker Text-to-Speech Synthesis using Speaker Codes Kentaro Mitsui, Tomoki Koriyama, Hiroshi Saruwatari (UTokyo) SP2019-49 |
Speaker codes are widely used to achieve multi-speaker text-to-speech synthesis. Conventionally, Deep Neural Network (D... [more] |
SP2019-49 pp.31-36 |
EA |
2019-12-12 14:25 |
Fukuoka |
Kyushu Inst. Tech. |
Performance improvement of speech enhancement network by multitask learning including noise information Haruki Tanaka (NITTC), Yosuke Sugiura, Nozomiko Yasui, Tetsuya Shimamura (Saitama Univ.), Ryoichi Miyazaki (NITTC) EA2019-70 |
In the signal processing field, there is a growing interest in speech enhancement. Recently, a lot of speech enhancement ... [more] |
EA2019-70 pp.31-36 |
NLC, IPSJ-NL, SP, IPSJ-SLP (Joint) [detail] |
2019-12-06 10:35 |
Tokyo |
NHK Science & Technology Research Labs. |
[Invited Talk]
Progress and prospects of statistical speech synthesis Keiichi Tokuda (Nagoya Inst. of Tech.) SP2019-35 |
The basic problem of statistical speech synthesis is quite simple: we have a speech database for training, i.e., a set o... [more] |
SP2019-35 pp.11-12 |
WIT, HI-SIGACI |
2019-12-04 14:55 |
Tokyo |
AIST Tokyo Waterfront (TBD) |
Development of language function training support system for medical welfare and education Mio Sakuma (NIT, Sendai College), Shigeharu Ono (JAIST), Chie Sakuma (Kanagami Hospital), Takahiro Yonamine (NIT, Okinawa College) WIT2019-37 |
We have developed the language function training support system using Android tablet-type devices to reduce the burden o... [more] |
WIT2019-37 pp.39-44 |
WIT, SP |
2019-10-26 17:00 |
Kagoshima |
Daiichi Institute of Technology |
Neural Whispered Speech Detection with Imbalanced Learning Takanori Ashihara, Yusuke Shinohara, Hiroshi Sato, Takafumi Moriya, Kiyoaki Matsui, Yoshikazu Yamaguchi (NTT) SP2019-26 WIT2019-25 |
In this paper, we present a neural whispered-speech detection technique that offers utterance-level classification of wh... [more] |
SP2019-26 WIT2019-25 pp.51-56 |
SP |
2019-08-28 14:40 |
Kyoto |
Kyoto Univ. |
[Poster Presentation]
An investigation on training of WaveNet vocoder in end-to-end text-to-speech Kazuki Yasuhara, Tomoki Hayashi, Tomoki Toda (Nagoya Univ.) SP2019-14 |
In this paper, we investigate the training of WaveNet vocoder in end-to-end text-to-speech. Tacotron 2, which is an end-... [more] |
SP2019-14 pp.31-36 |
EA, SIP, SP |
2019-03-14 13:30 |
Nagasaki |
i+Land nagasaki (Nagasaki-shi) |
[Poster Presentation]
Use and evaluation of Tacotron and context features in rakugo speech synthesis Shuhei Kato (SOKENDAI/NII), Shinji Takaki, Junichi Yamagishi (NII), Yusuke Yasuda (SOKENDAI/NII), Xin Wang (NII) EA2018-126 SIP2018-132 SP2018-88 |
We have been working on constructing rakugo (a traditional Japanese verbal entertainment) speech synthesis toward speech... [more] |
EA2018-126 SIP2018-132 SP2018-88 pp.161-166 |
EA, SIP, SP |
2019-03-15 13:30 |
Nagasaki |
i+Land nagasaki (Nagasaki-shi) |
[Poster Presentation]
A Design of Reduced Phoneme Set Based on a Language Model Shuji Komeiji, Toshihisa Tanaka (Tokyo Univ. of Agriculture and Tech.) EA2018-134 SIP2018-140 SP2018-96 |
A design of reduced phoneme set based on a language model is proposed. The reduction of the phoneme set improves discrim... [more] |
EA2018-134 SIP2018-140 SP2018-96 pp.205-210 |
EA, SIP, SP |
2019-03-15 13:30 |
Nagasaki |
i+Land nagasaki (Nagasaki-shi) |
[Poster Presentation]
Data augmentation using multiple databases for end-to-end dysarthric speech recognition Yuki Takashima, Tetsuya Takiguchi, Yasuo Ariki (Kobe Univ.) EA2018-156 SIP2018-162 SP2018-118 |
We present in this paper an end-to-end speech recognition system for a Japanese person with an articulation disorder res... [more] |
EA2018-156 SIP2018-162 SP2018-118 pp.335-340 |
SP |
2019-01-27 11:30 |
Ishikawa |
Kanazawa-Harmonie |
Multimodal Data Augmentation for Visual Speech Recognition using Deep Canonical Correlation Analysis Masaki Shimonishi, Satoshi Tamura, Satoru Hayamizu (Gifu Univ.) SP2018-60 |
This paper proposes a new data augmentation strategy for deep learning, in which feature vectors in one modality can be... [more] |
SP2018-60 pp.41-45 |
AI |
2018-12-07 15:55 |
Fukuoka |
|
Toyoaki Kuwahara, Yuichi Sei, Yasuyuki Tahara, Akihiko Ohsuga (UEC) AI2018-30 |
Emotion estimation from speech can now be performed with higher precision owing to the development of deep learni... [more] |
AI2018-30 pp.25-29 |
WIT, SP |
2018-10-27 13:50 |
Fukuoka |
Kyushu Institute of Technology (Kitakyushu) |
Proposal of Esophageal Speech Training Device with Myoelectric Signal
-- Identification of Myoelectric Signal Detection Spot for Training Device -- Katsutoshi Oe (DIT), Ryoya Nakamura (Kyutech), Kazutaka Hosokawa (DIT) SP2018-34 WIT2018-22 |
Patients who undergo laryngectomy lose their voices. One of the speech production substitutes that are used by vo... [more] |
SP2018-34 WIT2018-22 pp.13-16 |