Presentation | 2020-12-02: Multi-Modal Emotion Recognition by Integrating of Acoustic and Linguistic Features (Ryotaro Nagase, Takahiro Fukumori, Yoichi Yamashita)
---|---
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | In recent years, advanced deep learning techniques have improved the performance of speech emotion recognition (SER), as they have for speech synthesis and speech recognition. Moreover, multi-modal emotion recognition, which integrates linguistic or facial-image features with acoustic features, has also outperformed conventional methods. In this paper, we propose an SER method that uses acoustic and linguistic features at the utterance level. First, speech and text emotion recognition models are trained on a Japanese emotional speech corpus. We then aim to improve accuracy through early fusion, which fuses the linguistic and acoustic features, and late fusion, which fuses the values predicted by each model (see the fusion sketch after this table). |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | Speech Emotion Recognition / Text Emotion Recognition / Multi-Modal / early-fusion / late-fusion |
Paper # | NLC2020-14,SP2020-17 |
Date of Issue | 2020-11-25 (NLC, SP) |
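The abstract describes two standard ways of combining the acoustic and linguistic modalities: early fusion (concatenating the two utterance-level feature vectors before a single classifier) and late fusion (combining the predictions of separately trained speech and text models). The sketch below is a minimal illustration of those two schemes, not the authors' implementation: PyTorch, the feature dimensions, the four-class emotion set, and the interpolation weight `alpha` are all assumptions introduced here.

```python
# Minimal sketch of the early-/late-fusion schemes named in the abstract.
# Dimensions, class count, and alpha are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_EMOTIONS = 4      # assumption: e.g. neutral / joy / sadness / anger
ACOUSTIC_DIM = 256    # assumption: utterance-level acoustic embedding size
LINGUISTIC_DIM = 768  # assumption: utterance-level text embedding size

class EarlyFusionClassifier(nn.Module):
    """Early fusion: concatenate the two utterance-level feature
    vectors and classify the joint representation with one network."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(ACOUSTIC_DIM + LINGUISTIC_DIM, 256),
            nn.ReLU(),
            nn.Linear(256, NUM_EMOTIONS),
        )

    def forward(self, acoustic, linguistic):
        fused = torch.cat([acoustic, linguistic], dim=-1)
        return self.net(fused)  # logits over emotion classes

def late_fusion(speech_logits, text_logits, alpha=0.5):
    """Late fusion: combine per-model predictions, here as a weighted
    average of class posteriors (alpha is an assumed hyperparameter)."""
    p_speech = F.softmax(speech_logits, dim=-1)
    p_text = F.softmax(text_logits, dim=-1)
    return alpha * p_speech + (1.0 - alpha) * p_text

# Usage with random stand-in features for a batch of 8 utterances.
acoustic = torch.randn(8, ACOUSTIC_DIM)
linguistic = torch.randn(8, LINGUISTIC_DIM)
early = EarlyFusionClassifier()
print(early(acoustic, linguistic).shape)                # torch.Size([8, 4])
print(late_fusion(torch.randn(8, NUM_EMOTIONS),
                  torch.randn(8, NUM_EMOTIONS)).shape)  # torch.Size([8, 4])
```

The two schemes trade off differently: early fusion lets the classifier learn cross-modal interactions but requires joint training, while late fusion keeps the speech and text models independent and only mixes their outputs.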
Conference Information |
---|---
Committee | NLC / IPSJ-NL / SP / IPSJ-SLP
Conference Date | 2020/12/2 (2 days)
Place (in Japanese) | (See Japanese page) |
Place (in English) | Online |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | Kazutaka Shimada(Kyushu Inst. of Tech.) / Satoshi Sekine(RIKEN) / Hisashi Kawai(NICT) / Norihide Kitaoka(Toyohashi Univ. of Tech.)
Vice Chair | Mitsuo Yoshida(Toyohashi Univ. of Tech.) / Takeshi Kobayakawa(NHK) |
Secretary | Mitsuo Yoshida(Univ. of Tokyo) / Takeshi Kobayakawa(Hiroshima Univ. of Economics) / (Denso IT Lab.) / (Otaru Univ. of Commerce) / (Ibaraki Univ.)
Assistant | Kanjin Takahashi(Sansan) / Ko Mitsuda(NTT) / / Yusuke Ijima(NTT) |
Paper Information |
---|---
Registration To | Technical Committee on Natural Language Understanding and Models of Communication / Special Interest Group on Natural Language / Technical Committee on Speech / Special Interest Group on Spoken Language Processing
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Multi-Modal Emotion Recognition by Integrating of Acoustic and Linguistic Features |
Sub Title (in English) | |
Keyword(1) | Speech Emotion Recognition |
Keyword(2) | Text Emotion Recognition |
Keyword(3) | Multi-Modal |
Keyword(4) | early-fusion |
Keyword(5) | late-fusion |
1st Author's Name | Ryotaro Nagase |
1st Author's Affiliation | Ritsumeikan University(Ritsumeikan Univ.) |
2nd Author's Name | Takahiro Fukumori |
2nd Author's Affiliation | Ritsumeikan University(Ritsumeikan Univ.) |
3rd Author's Name | Yoichi Yamashita |
3rd Author's Affiliation | Ritsumeikan University(Ritsumeikan Univ.) |
Date | 2020-12-02 |
Paper # | NLC2020-14,SP2020-17 |
Volume (vol) | vol.120 |
Number (no) | no.270 (NLC), no.271 (SP)
Page | pp.7-12 (NLC), pp.7-12 (SP)
#Pages | 6 |
Date of Issue | 2020-11-25 (NLC, SP) |