IEICE Technical Committee Submission System
Tech. Rep. Archives

All Technical Committee Conferences  (Searched in: All Years)

Search Results: Conference Papers
 Conference Papers (Available on Advance Programs)  (Sort by: Date Descending)
Results 1 - 20 of 247
Committee | Date | Time | Place | Paper Title / Authors | Abstract | Paper #
SIP, SP, EA, IPSJ-SLP [detail] 2024-03-01
09:30
Okinawa
(Primary: On-site, Secondary: Online)
An experimental survey on speaker embedding spaces for controlling speaker identity in speech synthesis system
Wakuto Morita, Daisuke Saito, Nobuaki Minematsu (Univ. of Tokyo) EA2023-93 SIP2023-140 SP2023-75
This study investigated the influence of the discriminability of speaker encoders on speech synthesis models that can co... [more] EA2023-93 SIP2023-140 SP2023-75
pp.190-195
SIP, SP, EA, IPSJ-SLP [detail] 2024-03-01
09:30
Okinawa
(Primary: On-site, Secondary: Online)
Multi-Dialect Speech Synthesis with Interpretable Accent Latent Variable based on VQ-VAE
Kazuki Yamauchi, Yuki Saito, Hiroshi Saruwatari (UTokyo) EA2023-98 SIP2023-145 SP2023-80
In this paper, we address two tasks: "Intra-dialect Text-to-Speech (TTS)," aiming to synthesize speech in the same diale... [more] EA2023-98 SIP2023-145 SP2023-80
pp.220-225
SIP, SP, EA, IPSJ-SLP [detail] 2024-03-01
10:40
Okinawa
(Primary: On-site, Secondary: Online)
Intermediate speaker speech synthesis between two speakers using x-vector speaker space
Sota Hosoi, Takahiro Kinouchi, Yukoh Wakabayashi, Norihide Kitaoka (TUT) EA2023-103 SIP2023-150 SP2023-85
Recent advancements in speech synthesis technologies have enabled the synthesis of speeches of speakers not in the train... [more] EA2023-103 SIP2023-150 SP2023-85
pp.250-255
SIP, SP, EA, IPSJ-SLP [detail] 2024-03-01
10:40
Okinawa
(Primary: On-site, Secondary: Online)
An Investigation on the Speech Recovery from EEG Signals Using Transformer
Tomoaki Mizuno (The Univ. of Electro-Communications), Takuya Kishida (Aichi Shukutoku Univ.), Natsue Yoshimura (Tokyo Tech), Toru Nakashika (The Univ. of Electro-Communications) EA2023-108 SIP2023-155 SP2023-90
Synthesizing full speech from electroencephalography (EEG) signals is a challenging task. In this paper, speech reconstru... [more] EA2023-108 SIP2023-155 SP2023-90
pp.277-282
SIP, SP, EA, IPSJ-SLP [detail] 2024-03-01
16:35
Okinawa
(Primary: On-site, Secondary: Online)
Discrimination of rotation direction of virtual sound source in binaural synthesis using sound source radiation characteristics
Orie Nishiyama (Chiba Institute of Technology), Toshiharu Horiuchi, Shota Okubo (KDDI Research, Inc.), Yoshifumi Chisaki (Chiba Institute of Technology) EA2023-125 SIP2023-172 SP2023-107
In order to provide the sensation of being there, research has been conducted on realistic communication that acquires, ... [more] EA2023-125 SIP2023-172 SP2023-107
pp.376-381
SP, NLC, IPSJ-SLP, IPSJ-NL [detail] 2023-12-03
10:00
Tokyo Kikai-Shinko-Kaikan Bldg.
(Primary: On-site, Secondary: Online)
Improvement of Tacotron2 text-to-speech model based on masking operation and positional attention mechanism
Tong Ma, Daisuke Saito, Nobuaki Minematsu (Univ. of Tokyo) NLC2023-17 SP2023-37
 [more] NLC2023-17 SP2023-37
pp.19-24
SP, NLC, IPSJ-SLP, IPSJ-NL [detail] 2023-12-03
11:05
Tokyo Kikai-Shinko-Kaikan Bldg.
(Primary: On-site, Secondary: Online)
[Poster Presentation] Self-supervised learning model based emotion transfer and intensity control technology for expressive speech synthesis
Wei Li, Nobuaki Minematsu, Daisuke Saito (Univ. of Tokyo) NLC2023-21 SP2023-41
Emotion transfer techniques, which transfer the speaking style from the reference speech to the target speech, are wi... [more] NLC2023-21 SP2023-41
pp.43-48
PRMU, IPSJ-CVIM, IPSJ-DCC, IPSJ-CGVI 2023-11-17
09:20
Tottori
(Primary: On-site, Secondary: Online)
Co-speech Gesture Generation with Variational Auto Encoder
Shihichi Ka, Koichi Shinoda (Tokyo Tech) PRMU2023-29
Co-speech gesture generation is the study of generating gestures from speech. In prior works, deterministic methods lear... [more] PRMU2023-29
pp.74-79
SP, IPSJ-MUS, IPSJ-SLP [detail] 2023-06-23
13:50
Tokyo
(Primary: On-site, Secondary: Online)
[Poster Presentation] MS-Harmonic-Net++ vs SiFi-GAN: Comparison of fundamental frequency controllable fast neural waveform generative models.
Sota Shimizu (Kobe Univ./NICT), Takuma Okamoto (NICT), Ryoichi Takashima (Kobe Univ.), Yamato Ohtani (NICT), Tetsuya Takiguchi (Kobe Univ.), Tomoki Toda (Nagoya Univ./NICT), Hisashi Kawai (NICT) SP2023-5
Although Harmonic-Net+ has been proposed as a fundamental frequency (fo) and speech rate (SR) controllable fast neural v... [more] SP2023-5
pp.20-25
SP, IPSJ-MUS, IPSJ-SLP [detail] 2023-06-24
13:50
Tokyo
(Primary: On-site, Secondary: Online)
Fast Neural Waveform Generation Model With Fully Connected Upsampling
Haruki Yamashita (Kobe Univ/NICT), Takuma Okamoto (NICT), Ryoichi Takashima (Kobe Univ), Yamato Ohtani (NICT), Tetsuya Takiguchi (Kobe Univ), Tomoki Toda (Nagoya Univ/NICT), Hisashi Kawai (NICT) SP2023-15
In recent years, in text-to-speech synthesis, it is required to improve the inference speed while keeping the quality. ... [more] SP2023-15
pp.73-78
SP, IPSJ-MUS, IPSJ-SLP [detail] 2023-06-24
13:50
Tokyo
(Primary: On-site, Secondary: Online)
Effect of pause length ratio in speech length on the perception of speech rate induced by speech length
Maho Tamakawa, Shuichi Sakamoto (Tohoku Univ.) SP2023-23
The goal of this study is to investigate the mechanism of the perception of speech rate. In this preliminary study, we i... [more] SP2023-23
pp.114-118
SP, IPSJ-MUS, IPSJ-SLP [detail] 2023-06-24
13:50
Tokyo
(Primary: On-site, Secondary: Online)
Evaluation of multi-speaker text-to-speech synthesis using a corpus for speech recognition with x-vectors for various speech styles
Koki Hida (Wakayama Univ/NICT), Takuma Okamoto (NICT), Ryuichi Nisimura (Wakayama Univ), Yamato Ohtani (NICT), Tomoki Toda (Nagoya Univ/NICT), Hisashi Kawai (NICT) SP2023-25
We have implemented multi-speaker end-to-end text-to-speech synthesis based on JETS using x-vectors as speaker embedding... [more] SP2023-25
pp.125-130
SP, IPSJ-SLP, EA, SIP [detail] 2023-02-28
09:10
Okinawa
(Primary: On-site, Secondary: Online)
Comparison of fundamental frequency controllable fast neural waveform generative models.
Sota Shimizu (Kobe Univ./NICT), Takuma Okamoto (NICT), Ryoichi Takashima, Tetsuya Takiguchi (Kobe Univ.), Tomoki Toda (Nagoya Univ./NICT), Hisashi Kawai (NICT) EA2022-75 SIP2022-119 SP2022-39
Neural vocoders, which reconstruct speech waveforms from acoustic features with deep neural networks, have significantly... [more] EA2022-75 SIP2022-119 SP2022-39
pp.1-6
SP, IPSJ-SLP, EA, SIP [detail] 2023-02-28
09:30
Okinawa
(Primary: On-site, Secondary: Online)
MS-FC-HiFiGAN: Fast Neural Waveform Generation Model With Learnable Lightweight Upsampling
Haruki Yamashita (Kobe Univ/NICT), Takuma Okamoto (NICT), Ryoichi Takashima, Tetsuya Takiguchi (Kobe Univ), Tomoki Toda (Nagoya Univ/NICT), Hisashi Kawai (NICT) EA2022-76 SIP2022-120 SP2022-40
In recent years, in text-to-speech synthesis, it is required to improve the inference speed while keeping the quality. ... [more] EA2022-76 SIP2022-120 SP2022-40
pp.7-12
SP, IPSJ-SLP, EA, SIP [detail] 2023-02-28
09:50
Okinawa
(Primary: On-site, Secondary: Online)
End-to-End Speech Synthesis Based on Articulatory Movements Captured by Real-time MRI
Yuto Otani, Shun Sawada, Hidefumi Ohmura, Kouichi Katsurada (Tokyo Univ. Sci.) EA2022-77 SIP2022-121 SP2022-41
We propose an end-to-end deep learning model for speech synthesis based on articulatory movements captured by real-time ... [more] EA2022-77 SIP2022-121 SP2022-41
pp.13-18
SP, IPSJ-SLP, EA, SIP [detail] 2023-02-28
13:00
Okinawa
(Primary: On-site, Secondary: Online)
[Invited Talk] Multiple sound spot synthesis meets multilingual speech synthesis -- Implementation is really all we need --
Takuma Okamoto (NICT) EA2022-87 SIP2022-131 SP2022-51
A multilingual multiple sound spot synthesis system is implemented as a user interface for real-time speech translation ... [more] EA2022-87 SIP2022-131 SP2022-51
pp.73-76
SP, IPSJ-SLP, EA, SIP [detail] 2023-03-01
11:00
Okinawa
(Primary: On-site, Secondary: Online)
Representation and Prediction of Accent Phrase Prosodic Features in Japanese Text-to-Speech
Masaki Sato, Shinnosuke Takamichi, Hiroshi Saruwatari (The Univ. of Tokyo) EA2022-108 SIP2022-152 SP2022-72
In order to use speech synthesis in a variety of situations such as dialogue systems and emotional expression in audiobo... [more] EA2022-108 SIP2022-152 SP2022-72
pp.197-202
SP, IPSJ-SLP, EA, SIP [detail] 2023-03-01
11:20
Okinawa
(Primary: On-site, Secondary: Online)
An Investigation of Text-to-Speech Synthesis Using Voice Conversion and x-vector Embedding Sympathizing Emotion of Input Audio for Spoken Dialogue Systems
Shunichi Kohara, Masanobu Abe, Sunao Hara (Okayama Univ.) EA2022-109 SIP2022-153 SP2022-73
In this paper, we propose a Text-to-Speech synthesis method to synthesize the same emotional expression as the input spe... [more] EA2022-109 SIP2022-153 SP2022-73
pp.203-208
EA, US (Joint) 2022-12-22
13:30
Hiroshima Satellite Campus Hiroshima
[Poster Presentation] Quality Improvement of Children's Speech with Multiple Inputs of Speaker Vectors in a General Purpose Vocoder
Satoshi Yoshida, Ken'ichi Furuya (Oita Univ.), Hideyuki Mizuno (SUS) EA2022-64
Neural vocoders used in speech synthesis are capable of synthesizing high-quality speech that is indistinguishable from ... [more] EA2022-64
pp.18-23
NLC, IPSJ-NL, SP, IPSJ-SLP [detail] 2022-11-30
15:30
Tokyo
(Primary: On-site, Secondary: Online)
Semi-supervised joint training of text to speech and automatic speech recognition using unpaired text data
Naoki Makishima, Satoshi Suzuki, Atsushi Ando, Ryo Masumura (NTT) NLC2022-14 SP2022-34
This paper presents a novel joint training of text to speech (TTS) and automatic speech recognition (ASR) with small amo... [more] NLC2022-14 SP2022-34
pp.27-32


The Institute of Electronics, Information and Communication Engineers (IEICE), Japan