[Poster Presentation] Japanese Word Intelligibility Estimation by End-to-End Speech Recognition Using Large-Scale Pre-trained Models

服部 真稀; 近藤 和弘

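Presentation	2023-11-23 [Poster Presentation] Japanese Word Intelligibility Estimation by End-to-End Speech Recognition Using Large-Scale Pre-trained Models. 服部真稀 (Yamagata Univ.), 近藤和弘 (Yamagata Univ.)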
Abstract	As part of a study of speech intelligibility estimation using speech recognition, we simulated a subjective evaluation test with end-to-end speech recognition models built on large-scale pre-trained models and estimated word intelligibility from the recognizer output. The pre-trained models were fine-tuned on a small amount of data to realize the target task, and this preliminary study was restricted to a model for a specific set of test words. Both the correlation with subjective scores and the estimation error were better than those reported in previous studies, which suggests the approach is promising as a basis for a future general-purpose intelligibility prediction model.
Keywords	Speech intelligibility / Speech recognition
Report number	EA2023-45, EMM2023-76
Publication date	2023-11-16 (EA, EMM)

Technical Committee Meeting Information
Technical committee	EMM / EA / ASJ-H
Dates	2023/11/23 (two-day meeting starting on this date)
Venue	University Consortium Toyama, Ekimae Campus, Training Room 1
Theme	[Beginners Session] Engineering/Electro Acoustics, Content Processing, Digital Watermarking, Psychological and Physiological Acoustics, and Related Topics
Chair (Japanese)	新見道治(九工大) / 小野順貴(都立大)
Chair (English)	Michiharu Niimi(Kyushu Inst. of Tech.) / Nobutaka Ono(Tokyo Metropolitan Univ.)
Vice chair (Japanese)	薗田光太郎(長崎大) / 姜玄浩(東京高専) / 西浦敬信(立命館大) / 梶川嘉延(関西大)
Vice chair (English)	Kotaro Sonoda(Nagasaki Univ.) / Hyunho Kang(NIT, Tokyo) / Takanobu Nishiura(Ritsumeikan Univ.) / Yoshinobu Kajikawa(Kansai Univ.)
Secretary (Japanese)	梶山朋子(広島市大) / 酒澤茂之(大阪工大) / 若山圭吾(NTT) / 伊藤信貴(東大)
Secretary (English)	Tomoko Kajiyama(Hiroshima City Univ.) / Shigeyuki Sakazawa(Osaka Inst. of Tech.) / Keigo Wakayama(NTT) / Nobutaka Ito(Univ. of Tokyo)
Assistant secretary (Japanese)	青木直史(北大) / 中村和晃(東京理科大) / 中山雅人(阪産大) / 矢田部浩平(東京農工大)
Assistant secretary (English)	Naofumi Aoki(Hokkaido Univ.) / Kazuaki Nakamura(Tokyo Univ. of Science) / Masato Nakayama(OSU) / Kouhei Yatabe(TUAT)

Paper Details
Submitting committees	Technical Committee on Enriched MultiMedia / Technical Committee on Engineering Acoustics / Auditory Research Meeting
Language of text	JPN
Title (Japanese)	［ポスター講演］大規模事前学習モデルを用いたEnd-to-End音声認識による日本語単語了解度推定
Subtitle (Japanese)
Title (English)	[Poster Presentation] **
Subtitle (English)
Keyword (1)	Speech intelligibility
Keyword (2)	Speech recognition
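First author name (Japanese/English)	服部真稀
First author affiliation (Japanese/English)	Yamagata University (abbrev.: Yamagata Univ.)
Second author name (Japanese/English)	近藤和弘
Second author affiliation (Japanese/English)	Yamagata University (abbrev.: Yamagata Univ.)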
Presentation date	2023-11-23
Report number	EA2023-45, EMM2023-76
Volume	vol.123
Issue number	EA-278, EMM-279
Page range	pp.93-97 (EA), pp.93-97 (EMM)
Number of pages	5
Publication date	2023-11-16 (EA, EMM)