Committee | Date Time | Place | Paper Title / Authors | Abstract | Paper #
SIP, SP, EA, IPSJ-SLP [detail] |
2024-03-01 09:30 |
Okinawa |
(Primary: On-site, Secondary: Online) |
Evaluation of Automatic Speech Recognition for Deaf and Hard-of-Hearing People by Speaker Adaptation. Kaito Takahashi, Takahiro Kinouchi, Yukoh Wakabayashi (TUT), Kengo Ohta (NITAC), Akio Kobayashi (Yamato Univ.), Norihide Kitaoka (TUT) EA2023-102 SIP2023-149 SP2023-84 |
Communication between normal-hearing people and the deaf generally uses sign language, written communication, and spe... [more]
EA2023-102 SIP2023-149 SP2023-84 pp.244-249 |
SP, IPSJ-MUS, IPSJ-SLP [detail] |
2023-06-24 13:50 |
Tokyo |
(Primary: On-site, Secondary: Online) |
Evaluation of multi-speaker text-to-speech synthesis using a corpus for speech recognition with x-vectors for various speech styles Koki Hida (Wakayama Univ/NICT), Takuma Okamoto (NICT), Ryuichi Nisimura (Wakayama Univ), Yamato Ohtani (NICT), Tomoki Toda (Nagoya Univ/NICT), Hisashi Kawai (NICT) SP2023-25 |
We have implemented multi-speaker end-to-end text-to-speech synthesis based on JETS using x-vectors as speaker embedding... [more] |
SP2023-25 pp.125-130 |
EA, US (Joint) |
2022-12-22 16:50 |
Hiroshima |
Satellite Campus Hiroshima |
[Poster Presentation]
Data augmentation method for machine learning on speech data Tsubasa Maruyama (Tokyo Tech), Tsutomu Ikegami (AIST), Toshio Endo (Tokyo Tech), Takahiro Hirofuchi (AIST) EA2022-68 |
In machine learning, data augmentation is a method to enhance the number and diversity of data by adding transformations... [more] |
EA2022-68 pp.42-48 |
SP, IPSJ-MUS, IPSJ-SLP [detail] |
2022-06-17 15:00 |
Online |
Online |
Representation and analytical normalization for vocal-tract-length transformation by group theory Atsushi Miyashita, Tomoki Toda (Nagoya Univ) SP2022-11 |
In automatic speech recognition, a recognition result should be invariant with respect to acoustic changes caused by dif... [more] |
SP2022-11 pp.41-46 |
CNR, BioX |
2022-03-04 14:20 |
Online |
Online |
Synthesizing Deep Master Voices for Wolf Attacking on Speaker Recognition Systems Jun Tsuchiya, Masakatsu Nishigaki, Tetsushi Ohki (Shizuoka Univ.) BioX2021-53 CNR2021-34 |
In this paper, we propose an attack on speaker verification systems by Deep Master Voice using GAN-based wolf voice. GAN-... [more]
BioX2021-53 CNR2021-34 pp.33-38 |
SP, IPSJ-SLP, IPSJ-MUS |
2021-06-18 15:00 |
Online |
Online |
Protection method using audio processing against Audio Adversarial Examples Taisei Yamamoto, Yuya Tarutani, Yukinobu Fukusima, Tokumi Yokohira (Okayama Univ) SP2021-4
Machine learning technology has improved the accuracy of voice recognition, and demand for voice recognition... [more]
SP2021-4 pp.19-24 |
EA, US, SP, SIP, IPSJ-SLP [detail] |
2021-03-03 17:10 |
Online |
Online |
An investigation of rhythm-based speaker embeddings for phoneme duration modeling Kenichi Fujita, Atsushi Ando, Yusuke Ijima (NTT) EA2020-77 SIP2020-108 SP2020-42 |
In this study, we propose a speaker embedding method suitable for modeling phoneme duration for each individual i... [more]
EA2020-77 SIP2020-108 SP2020-42 pp.103-108 |
EA, US, SP, SIP, IPSJ-SLP [detail] |
2021-03-03 17:35 |
Online |
Online |
[Short Paper]
Comparison of End-to-End Models for Joint Speaker and Speech Recognition Kak Soky (Kyoto Univ.), Sheng Li (NICT), Masato Mimura, Chenhui Chu, Tatsuya Kawahara (Kyoto Univ.) EA2020-78 SIP2020-109 SP2020-43 |
In this paper, we investigate the effectiveness of using speaker information on the performance of speaker-imbalanced au... [more] |
EA2020-78 SIP2020-109 SP2020-43 pp.109-113 |
OCS, PN, NS (Joint) |
2019-06-21 13:20 |
Iwate |
MALIOS(Morioka) |
An Approach to Defending against Attacks on Smart Speakers Yuya Tarutani (Okayama Univ.), Kensuke Ueda, Yoshiaki Kato (Mitsubishi Electric) NS2019-39
Smart speakers have become widespread as a voice operation interface. Users can operate various functions such as home ... [more]
NS2019-39 pp.23-28 |
EA, SIP, SP |
2019-03-15 10:00 |
Nagasaki |
i+Land nagasaki (Nagasaki-shi) |
Consideration on Effectiveness of Relative Phase from Residual Speech for Speaker Recognition Seiichi Nakagawa, Kazumasa Yamamoto (Chubu Univ.) EA2018-130 SIP2018-136 SP2018-92
We have focused on the phase spectrum for speaker recognition and proposed relative phase as a feature parameter for spea... [more]
EA2018-130 SIP2018-136 SP2018-92 pp.185-190 |
EA, SIP, SP |
2019-03-15 13:30 |
Nagasaki |
i+Land nagasaki (Nagasaki-shi) |
[Poster Presentation]
Simultaneous Japanese Flexible-Keyword Detection and Speaker Recognition for Low-Resource Devices Hiroshi Fujimura (TOSHIBA) EA2018-157 SIP2018-163 SP2018-119 |
We propose a novel method of simultaneous keyword detection and speaker recognition for low resource devices in this stu... [more] |
EA2018-157 SIP2018-163 SP2018-119 pp.341-346 |
MBE |
2019-02-01 14:35 |
Saga |
Saga University |
A Speaker Recognition Framework Using Sound Spectrogram and Convolutional Neural Network Based Deep Learning Technique and Performance Evaluation with a Large-Scale Dataset of Human Speech Ikumi Osaki, Masaki Kyoso (Tokyo City Univ.) MBE2018-81 |
In order to realize highly accurate speaker recognition, it is necessary to have universal applicability to various spea... [more] |
MBE2018-81 pp.117-121 |
SP |
2019-01-27 11:05 |
Ishikawa |
Kanazawa-Harmonie |
A Speaker Recognition Performance Measure based on the Adaptation Quickness and Final Accuracy for Spoken Dialog Systems Junko Takami, Takeshi Kawabata (KGU) SP2018-59 |
For constructing a user-friendly spoken dialog system, it is important to recognize "Who is the user?" and to choose appro... [more]
SP2018-59 pp.35-40 |
EA, US (Joint) |
2019-01-22 14:00 |
Kyoto |
Doshisha Univ. |
[Poster Presentation]
On speaker identification under multiple-talker condition using frequency domain binaural model Kai Kiyota, Irwansyah, Kousuke Matsuoka, Tsuyoshi Usagawa (Kumamoto Univ.) EA2018-94 |
In order to realize a speech recognition system suitable for small meeting logging with speaker identification, it i... [more]
EA2018-94 pp.7-12 |
NLC, IPSJ-NL, SP, IPSJ-SLP (Joint) [detail] |
2017-12-21 12:50 |
Tokyo |
Waseda Univ. Green Computing Systems Research Organization |
[Poster Presentation]
Development of Speaker/Environment-Dependent Acoustic Model for Non-Audible Murmur Recognition Based on DNN Adaptation Seita Noda, Tomoki Hayashi, Tomoki Toda, Kazuya Takeda (Nagoya Univ.) SP2017-56 |
In this research, we aim to improve the performance of non-audible murmur (NAM) recognition towards the development of s... [more] |
SP2017-56 pp.7-10 |
SP, IPSJ-SLP (Joint) |
2017-07-28 11:15 |
Miyagi |
Akiu Resort Hotel Crescent |
Speaker Diarization for Face-to-Face Dialog of Service Counters Based on Appearance Pattern of Speakers Mizuki Watabe (NTT DOCOMO), Atsushi Ando, Hosana Kamiyama, Satoshi Kobashikawa, Yushi Aono (NTT), Takanobu Oba, Yoshinori Isoda (NTT DOCOMO) SP2017-19 |
This paper proposes a speaker diarization method for face-to-face dialogue at service counters using appearance pattern ... [more]
SP2017-19 pp.21-26 |
SP, SIP, EA |
2017-03-02 09:00 |
Okinawa |
Okinawa Industry Support Center |
[Poster Presentation]
Hardware Speech Sensor Based on Deep Neural Network Feature Extractor and Template Matching Yi Liu, Boyu Qian, Jian Wang, Takahiro Shinozaki (Titech) EA2016-135 SIP2016-190 SP2016-130 |
We explore the possibility of combination of a DNN-based feature extractor and template based matching for keyword detec... [more] |
EA2016-135 SIP2016-190 SP2016-130 pp.297-300 |
SP, IPSJ-SLP, NLC, IPSJ-NL (Joint) [detail] |
2016-12-20 11:20 |
Tokyo |
NTT Musashino R&D |
Speaker Recognition Based on Features through 1-Dimensional Convolutional Neural Network Shohei Sonoda, Yufu Kasahara, Masato Inoue (Waseda Univ) SP2016-52 |
Most speaker recognition methods utilize voice features such as the mel-frequency cepstral coefficients (MFCCs) an... [more]
SP2016-52 pp.17-21 |
SP, IPSJ-SLP, NLC, IPSJ-NL (Joint) [detail] |
2016-12-20 11:45 |
Tokyo |
NTT Musashino R&D |
Study on i-vector based speaker verification using rank for short utterances Misaki Tsujikawa (Panasonic/Sokendai), Tsuyoki Nishikawa (Panasonic), Tomoko Matsui (ISM) SP2016-53 |
Generally, short utterance test data seriously degrades the accuracy of speaker verification. However, in many voice-ope... [more] |
SP2016-53 pp.23-26 |
SP, IPSJ-SLP, NLC, IPSJ-NL (Joint) [detail] |
2016-12-20 15:10 |
Tokyo |
NTT Musashino R&D |
[Poster Presentation]
Deep Neural Network Using Fundamental Frequency For Noise Robust Speaker Recognition Yoshihiro Suzuki, Yosuke Sugiura, Tetsuya Shimamura (Saitama Univ.) SP2016-58 |
In this paper, we propose a neural network architecture for speaker recognition to simplify the learning process. In the pro... [more]
SP2016-58 pp.53-56 |