Committee |
Date Time |
Place |
Paper Title / Authors |
Abstract |
Paper # |
EA |
2024-05-22 14:15 |
Online |
Online |
TBD
-- TBD -- Tsubasa Ochiai (NTT), Kazuma Iwamoto (Doshisha Univ.), Marc Delcroix, Rintaro Ikeshita, Hiroshi Sato, Shoko Araki (NTT), Shigeru Katagiri (Doshisha Univ.) |
(To be available after the conference date) |
|
EA |
2024-05-22 16:50 |
Online |
Online |
[Invited Talk]
Fundamentals of Diffusion-based Generative Models and their Application to Speech Enhancement and Separation Scheibler Robin (LY Corp.) |
(To be available after the conference date) |
|
CAS, CS |
2024-03-14 13:30 |
Okinawa |
|
Characterization of Semantic Communications in Speech Signal Transmission Futo Iwanaga, Daisuke Umehara (Kyoto Inst. of Tech.) CAS2023-118 CS2023-111 |
In recent years, the volume of data in data communication has surged, Characterization of Semantic Communications in Spe... |
CAS2023-118 CS2023-111 pp.41-46 |
SIP, SP, EA, IPSJ-SLP |
2024-02-29 15:45 |
Okinawa |
(Primary: On-site, Secondary: Online) |
|
We have developed automatic speech recognition and dialect identification techniques by using COJADS, a corpus of Japane... |
|
SIP, SP, EA, IPSJ-SLP |
2024-03-01 09:30 |
Okinawa |
(Primary: On-site, Secondary: Online) |
An experimental survey on speaker embedding spaces for controlling speaker identity in speech synthesis system Wakuto Morita, Daisuke Saito, Nobuaki Minematsu (Univ. of Tokyo) EA2023-93 SIP2023-140 SP2023-75 |
This study investigated the influence of the discriminability of speaker encoders on speech synthesis models that can co... |
EA2023-93 SIP2023-140 SP2023-75 pp.190-195 |
SIP, SP, EA, IPSJ-SLP |
2024-03-01 09:30 |
Okinawa |
(Primary: On-site, Secondary: Online) |
SELECTING N-LOWEST SCORES FOR TRAINING MOS PREDICTION MODELS Yuto Kondo, Hirokazu Kameoka, Kou Tanaka, Takuhiro Kaneko (NTT) EA2023-94 SIP2023-141 SP2023-76 |
Automatic speech quality assessment (SQA) is a task to evaluate the quality of speech samples without resorting to time-... |
EA2023-94 SIP2023-141 SP2023-76 pp.196-201 |
SIP, SP, EA, IPSJ-SLP |
2024-03-01 09:30 |
Okinawa |
(Primary: On-site, Secondary: Online) |
Improving training recipe of Remixed2Remixed for speech enhancement Li Li, Shogo Seki (CyberAgent) EA2023-95 SIP2023-142 SP2023-77 |
In the use of deep learning for speech enhancement, supervised learning models that use pairs of clean speech and artifi... |
EA2023-95 SIP2023-142 SP2023-77 pp.202-207 |
SIP, SP, EA, IPSJ-SLP |
2024-03-01 09:30 |
Okinawa |
(Primary: On-site, Secondary: Online) |
Multi-Dialect Speech Synthesis with Interpretable Accent Latent Variable based on VQ-VAE Kazuki Yamauchi, Yuki Saito, Hiroshi Saruwatari (UTokyo) EA2023-98 SIP2023-145 SP2023-80 |
In this paper, we address two tasks: "Intra-dialect Text-to-Speech (TTS)," aiming to synthesize speech in the same diale... |
EA2023-98 SIP2023-145 SP2023-80 pp.220-225 |
SIP, SP, EA, IPSJ-SLP |
2024-03-01 10:40 |
Okinawa |
(Primary: On-site, Secondary: Online) |
Intermediate speaker speech synthesis between two speakers using x-vector speaker space Sota Hosoi, Takahiro Kinouchi, Yukoh Wakabayashi, Norihide Kitaoka (TUT) EA2023-103 SIP2023-150 SP2023-85 |
Recent advancements in speech synthesis technologies have enabled the synthesis of speeches of speakers not in the train... |
EA2023-103 SIP2023-150 SP2023-85 pp.250-255 |
SIP, SP, EA, IPSJ-SLP |
2024-03-01 10:40 |
Okinawa |
(Primary: On-site, Secondary: Online) |
Substitution of Implicit Linguistic Information in Beam Search Decoding Using CTC-based Speech Recognition Models Tatsunari Takagi, Yukoh Wakabayashi (TUT), Atsunori Ogawa (NTT), Norihide Kitaoka (TUT) EA2023-106 SIP2023-153 SP2023-88 |
The rise of neural networks in the field of automatic speech recognition has notably improved the accuracy of speech rec... |
EA2023-106 SIP2023-153 SP2023-88 pp.268-273 |
SIP, SP, EA, IPSJ-SLP |
2024-03-01 16:35 |
Okinawa |
(Primary: On-site, Secondary: Online) |
Simulation Evaluation of Speech Detection Based on Distributed Sound-to-Light Conversion Device Blinkies Satoshi Motoyama, Natsuki Ueno, Masahiro Yasuda (TMU), Yuma Kinoshita (Tokai Univ.), Nobutaka Ono (TMU) EA2023-126 SIP2023-173 SP2023-108 |
The purpose of this study is speech detection using the distributed sound-to-light conversion device Blinkies. As an ini... |
EA2023-126 SIP2023-173 SP2023-108 pp.382-387 |
SP, NLC, IPSJ-SLP, IPSJ-NL |
2023-12-02 16:00 |
Tokyo |
Kikai-Shinko-Kaikan Bldg. (Primary: On-site, Secondary: Online) |
Development and effects of English speech training drills to improve perception and production skills seamlessly with interactive gamification Nobuaki Minematsu, Yingxiang Gao (UTokyo), Noriko Nakanishi (KGU), Yusuke Inoue, Hiroaki Mizuno (Carriage) NLC2023-15 SP2023-35 |
To improve aural/oral proficiency in English, various skills have to be acquired such as 1) spoken word perception, 2) m... |
NLC2023-15 SP2023-35 pp.7-12 |
EMM, EA, ASJ-H |
2023-11-23 13:00 |
Toyama |
|
[Poster Presentation]
** , (**) |
As a study of speech intelligibility estimation methods using speech recognition, we simulated a subjective evaluation t... |
EA2023-45 EMM2023-76 pp.93-97 |
PRMU, IPSJ-CVIM, IPSJ-DCC, IPSJ-CGVI |
2023-11-17 09:20 |
Tottori |
(Primary: On-site, Secondary: Online) |
Co-speech Gesture Generation with Variational Auto Encoder Shihichi Ka, Koichi Shinoda (Tokyo Tech) PRMU2023-29 |
Co-speech gesture generation is the study of generating gestures from speech. In prior works, deterministic methods lear... |
PRMU2023-29 pp.74-79 |
WIT, SP, IPSJ-SLP |
2023-10-14 16:40 |
Fukuoka |
Kyushu Institute of Technology (Primary: On-site, Secondary: Online) |
Sequence-to-sequence Voice Conversion for Electrolaryngeal Speech Enhancement with Multi-stage Pretraining and Fine-tuning Techniques Ding Ma, Lester Phillip Violeta, Kazuhiro Kobayashi, Tomoki Toda (Nagoya Univ.) SP2023-32 WIT2023-23 |
Sequence-to-sequence (seq2seq) voice conversion (VC) models have great potential for electrolaryngeal (EL) speech to nor... |
SP2023-32 WIT2023-23 pp.27-32 |
WIT, SP, IPSJ-SLP |
2023-10-14 17:05 |
Fukuoka |
Kyushu Institute of Technology (Primary: On-site, Secondary: Online) |
Electrolaryngeal Speech Enhancement through Strong Linguistic Encoding Methods Lester Phillip Violeta, Wen-Chin Huang, Ding Ma, Ryuichi Yamamoto, Kazuhiro Kobayashi, Tomoki Toda (Nagoya Univ.) SP2023-33 WIT2023-24 |
Although pretraining and fine-tuning approaches have proven to work well in speech intelligibility enhancement, various ... |
SP2023-33 WIT2023-24 pp.33-38 |
SP, IPSJ-MUS, IPSJ-SLP |
2023-06-23 13:50 |
Tokyo |
(Primary: On-site, Secondary: Online) |
[Poster Presentation]
Generation of colored subtitle images based on emotional information of speech utterances Fumiya Nakamura (Kobe Univ.), Ryo Aihara (Mitsubishi Electric), Ryoichi Takashima, Tetsuya Takiguchi (Kobe Univ.), Yusuke Itani (Mitsubishi Electric) SP2023-11 |
Conventional automatic subtitle generation systems based on speech recognition do not take into account paralinguistic i... |
SP2023-11 pp.54-59 |
SP, IPSJ-MUS, IPSJ-SLP |
2023-06-23 13:50 |
Tokyo |
(Primary: On-site, Secondary: Online) |
Streaming End-to-End speech recognition using a CTC decoder with substituted linguistic information Tatsunari Takagi (TUT), Atsunori Ogawa (NTT), Norihide Kitaoka, Yukoh Wakabayashi (TUT) SP2023-12 |
Speech recognition technology has been employed in various fields due to the enhancement of speech recognition model acc... |
SP2023-12 pp.60-64 |
SP, IPSJ-MUS, IPSJ-SLP |
2023-06-24 13:50 |
Tokyo |
(Primary: On-site, Secondary: Online) |
Domain adaptation of speech recognition models based on self-supervised learning using target domain speech Takahiro Kinouchi (TUT), Atsunori Ogawa (NTT), Yukoh Wakabayashi, Norihide Kitaoka (TUT) SP2023-19 |
In this study, we propose a domain adaptation method using only speech data in the target domain without using transcrib... |
SP2023-19 pp.91-96 |
ET |
2023-03-14 14:10 |
Tokushima |
Tokushima University (Primary: On-site, Secondary: Online) |
HMD-type customer service training support system using eye tracking Takeru Oue, Yukihiro Matsubara, Kousuke Mouri, Masaru Okamoto (Hiroshima City Univ.) ET2022-71 |
In this paper, a customer service training support system using an HMD and an eye-tracking approach is developed. By using this... |
ET2022-71 pp.73-78 |