Committee | Date Time | Place | Paper Title / Authors | Abstract | Paper #
SP, EA, SIP |
2020-03-03 09:00 |
Okinawa |
Okinawa Industry Support Center (Cancelled but technical report was issued) |
LARGE-CONTEXT POINTER-GENERATOR NETWORKS FOR SPOKEN-TO-WRITTEN STYLE CONVERSION Mana Ihori, Akihiko Takashima, Ryo Masumura (NTT) EA2019-142 SIP2019-144 SP2019-91 |
This paper introduces a spoken-to-written style conversion method that is suitable for handling a series of texts such as... [more] |
EA2019-142 SIP2019-144 SP2019-91 pp.237-242 |
SP, EA, SIP |
2020-03-03 09:00 |
Okinawa |
Okinawa Industry Support Center (Cancelled but technical report was issued) |
[Poster Presentation]
A Comparison of Language Models for a Design of Reduced Phoneme Set Shuji Komeiji, Toshihisa Tanaka (TUAT), Koichi Shinoda (titech) EA2019-152 SIP2019-154 SP2019-101 |
Language models for the design of a reduced phoneme set are compared with each other.
The reduction of the phoneme set improves ... [more] |
EA2019-152 SIP2019-154 SP2019-101 pp.295-300 |
EMM |
2020-01-27 13:00 |
Miyagi |
Tohoku Univ. |
Suppression of Dialog System Speech by Embedding Marker Signal into High Frequency Band Shunsuke Saga, Akinori Ito (Tohoku Univ.) EMM2019-94 |
Spoken dialog systems such as smart speakers have become popular and are used in home environments. A problem will occ... [more] |
EMM2019-94 pp.1-6 |
NLP, NC (Joint) |
2020-01-25 10:10 |
Okinawa |
Miyakojima Marine Terminal |
Application of Chaotic Neural Network Reservoir to Speech Recognition Maakito Inoue, Keisuke Fukuda, Yoshihiko Horio (Tohoku Univ.) NLP2019-103 |
The neural network reservoir is a learning model based on a recurrent neural network. The chaotic neural network ... [more] |
NLP2019-103 pp.95-98 |
HCGSYMPO (2nd) |
2019-12-11 - 2019-12-13 |
Hiroshima |
Hiroshima-ken Joho Plaza (Hiroshima) |
Crosslingual Emotion Recognition using English and Japanese Speech Data Yuta Nirasawa, Atom Scotto, Ryota Sakuma, Yuki Hujita, Keiichi Zempo (Tsukuba Univ.) |
Since research in Speech Emotion Recognition (SER) is performed mostly with English data, applying these models to Japan... [more] |
|
NLC, IPSJ-NL, SP, IPSJ-SLP (Joint) [detail] |
2019-12-06 16:25 |
Tokyo |
NHK Science & Technology Research Labs. |
An evaluation of representation learning using phoneme posteriorgrams and data augmentation in speech emotion recognition Shintaro Okada (Nagoya Univ.), Atsushi Ando (Nagoya Univ./NTT), Tomoki Toda (Nagoya Univ.) SP2019-43 |
This paper presents a new speech emotion recognition method based on representation learning and data augmentation.
To ... [more] |
SP2019-43 pp.91-96 |
WIT, SP |
2019-10-27 09:00 |
Kagoshima |
Daiichi Institute of Technology |
Extraction of linguistic representation and syllable recognition from EEG signal of speech-imagery Kentaro Fukai, Hidefumi Ohmura, Kouichi Katsurada (Tokyo Univ. of Science), Satoka Hirata, Yurie Iribe (Aichi Prefectural Univ.), Mingchua Fu, Ryo Taguchi (Nagoya Inst. of Technology), Tsuneo Nitta (Waseda Univ./Toyohashi Univ. of Technology) SP2019-28 WIT2019-27 |
Speech imagery recognition from Electroencephalogram (EEG) is one of the challenging technologies for non-invasive brain... [more] |
SP2019-28 WIT2019-27 pp.63-68 |
WIT, SP |
2019-10-27 09:20 |
Kagoshima |
Daiichi Institute of Technology |
Word Recognition using word likelihood vector from speech-imagery EEG Satoka Hirata, Yurie Iribe (Aichi Prefectural Univ.), Kentaro Fukai, Kouichi Katsurada (Tokyo Univ. of Science), Tsuneo Nitta (Waseda Univ./Toyohashi Univ. of Tech.) SP2019-29 WIT2019-28 |
Previous research suggests that humans can operate machines using their electroencephalogram, a technology called BCI (Brain Compute... [more] |
SP2019-29 WIT2019-28 pp.69-73 |
WIT, SP |
2019-10-27 10:30 |
Kagoshima |
Daiichi Institute of Technology |
A Method to Reduce Ambiguity in Identifying the Muscle Activation Time of Each EMG Channel in Isolated Inaudible Single Syllable Recognition Hidetoshi Nagai (KIT) SP2019-32 WIT2019-31 |
In inaudible speech recognition using surface EMG, consonant recognition is one of the difficult problems. When phonemes... [more] |
SP2019-32 WIT2019-31 pp.87-92 |
EA, ASJ-H, ASJ-AA |
2019-07-17 16:20 |
Hokkaido |
SAPPORO COMMUNITY PLAZA |
Audio database construction based on video and subtitle data posted on the Web Yuko Takai, Naofumi Aoki, Yoshinori Dobashi (Hokkaido Univ.) EA2019-22 |
Using sound data contained in videos posted on the website and user-created subtitle data attached to the videos, the ... [more] |
EA2019-22 pp.113-116 |
OCS, PN, NS (Joint) |
2019-06-21 13:20 |
Iwate |
MALIOS(Morioka) |
A Proposed Approach to Defending against Attacks on Smart Speakers Yuya Tarutani (Okayama Univ.), Kensuke Ueda, Yoshiaki Kato (Mitsubishi Electric) NS2019-39 |
Smart speakers have become widespread as a voice operation interface. Users can operate various functions such as home ... [more] |
NS2019-39 pp.23-28 |
EA, SIP, SP |
2019-03-15 10:00 |
Nagasaki |
i+Land nagasaki (Nagasaki-shi) |
Consideration on Effectiveness of Relative Phase from Residual Speech for Speaker Recognition Seiichi Nakagawa, Kazumasa Yamamoto (Chubu Univ.) EA2018-130 SIP2018-136 SP2018-92 |
We have focused on the phase spectrum for speaker recognition, and have proposed relative phase as a feature parameter for spea... [more] |
EA2018-130 SIP2018-136 SP2018-92 pp.185-190 |
EA, SIP, SP |
2019-03-15 10:25 |
Nagasaki |
i+Land nagasaki (Nagasaki-shi) |
Neural Language Models based on Conditional Hierarchical Recurrent Encoder-Decoder for Multi-Party Conversational Speech Recognition Ryo Masumura, Tomohiro Tanaka, Atsushi Ando, Takanobu Oba, Yushi Aono (NTT) EA2018-131 SIP2018-137 SP2018-93 |
This paper presents fully neural network based language models (LMs) that can leverage long-range conversational context... [more] |
EA2018-131 SIP2018-137 SP2018-93 pp.191-196 |
EA, SIP, SP |
2019-03-15 13:30 |
Nagasaki |
i+Land nagasaki (Nagasaki-shi) |
[Poster Presentation]
A Design of Reduced Phoneme Set Based on a Language Model Shuji Komeiji, Toshihisa Tanaka (Tokyo Univ. of Agriculture and Tech.) EA2018-134 SIP2018-140 SP2018-96 |
A design of reduced phoneme set based on a language model is proposed. The reduction of the phoneme set improves discrim... [more] |
EA2018-134 SIP2018-140 SP2018-96 pp.205-210 |
EA, SIP, SP |
2019-03-15 13:30 |
Nagasaki |
i+Land nagasaki (Nagasaki-shi) |
[Poster Presentation]
Data augmentation using multiple databases for end-to-end dysarthric speech recognition Yuki Takashima, Tetsuya Takiguchi, Yasuo Ariki (Kobe Univ.) EA2018-156 SIP2018-162 SP2018-118 |
We present in this paper an end-to-end speech recognition system for a Japanese person with an articulation disorder res... [more] |
EA2018-156 SIP2018-162 SP2018-118 pp.335-340 |
WIT, IPSJ-AAC |
2019-03-09 16:40 |
Ibaraki |
Tsukuba University of Technology |
[Special Talk]
Applications of Well-being Information Technology to Broadcasting Service Tomoyasu Komori (NHK STRL) WIT2018-75 |
NHK has researched and developed human-friendly broadcasting services such as commentary broadcasting and teletext ... [more] |
WIT2018-75 pp.105-106 |
NLC, IPSJ-IFAT |
2019-02-07 15:30 |
Kyoto |
Ryukoku University Omiya Campus |
[Invited Talk]
ForeSight Voice Mining, a voice mining system for contact centers Kazuhiro Arai (NTT-TX) NLC2018-40 |
This paper describes ForeSight Voice Mining that NTT TechnoCross Corp. provides for contact centers. ForeSight Voice Min... [more] |
NLC2018-40 pp.27-32 |
SP |
2019-01-26 16:25 |
Ishikawa |
Kanazawa-Harmonie |
[Fellow Memorial Lecture]
Machine, human and sound communication Akinori Ito (Tohoku Univ.) SP2018-55 |
Speech is the most important modality for human-human communication. Since the invention of electrical speech communication, ... [more] |
SP2018-55 p.19 |
SP |
2019-01-27 11:30 |
Ishikawa |
Kanazawa-Harmonie |
Multimodal Data Augmentation for Visual Speech Recognition using Deep Canonical Correlation Analysis Masaki Shimonishi, Satoshi Tamura, Satoru Hayamizu (Gifu University) SP2018-60 |
This paper proposes a new data augmentation strategy for deep learning, in which feature vectors in one modality can be... [more] |
SP2018-60 pp.41-45 |
HCGSYMPO (2nd) |
|
Mie |
Sinfonia Technology Hibiki Hall Ise |
Mood Improvement by Multiple Personality Assistant Agent in Speech Recognition Failure Takehiro Hondo, Ippei Naganuma, Kazuki Kobayashi (Shinshu Univ.) |
This paper proposes a method to create a mood in a human-agent speech interaction and investigates the created mood by a... [more] |
|