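Methods for Selecting Keywords and Calculating Similarities of Documents for Extracting Relationship of Meeting Minutes Generated by Speech Recognition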

Motoki Ito; Seikoh Nishita

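Presentation	2015-06-05 Methods for Selecting Keywords and Calculating Similarities of Documents for Extracting Relationship of Meeting Minutes Generated by Speech Recognition. Motoki Ito (Takushoku Univ.), Seikoh Nishita (Takushoku Univ.)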
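Abstract (translated from Japanese)	A method has been proposed that applies collective entity resolution to compute the similarity between meeting-minute texts generated automatically by speech recognition. The method assumes that the minutes contain recognition errors. In the evaluation experiments of the earlier study, the computed similarities did not match expectations, which showed that some of the method's processing steps needed improvement. Building on that method, this paper proposes a better way to select words and phrases suited to clustering and a more accurate way to compute similarity.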
Abstract (English)	An earlier study proposed a technique that uses collective entity resolution to extract relationships among meeting minutes generated by a speech recognition system. The technique assumes that its input, the meeting minutes, contains words misrecognized by the system. The empirical evaluation in that study found that the technique did not perform as expected. This paper improves the technique by refining two of its steps: keyword extraction and similarity calculation.
Keywords (Japanese)	音声認識 / 議事録 / テキストマイニング / エンティティレゾリューション
Keywords (English)	Speech recognition / meeting minutes / text mining / entity resolution
Report number	TL2015-9,NLC2015-9
Date of issue	2015-05-28 (TL, NLC)

Committee meeting information
Committee	NLC / TL
Dates	2015/6/4 (two-day meeting starting on this date)
Venue (Japanese)	徳島大学
Venue (English)	The University of Tokushima
Theme (Japanese)	言語処理・言語分析の社会応用，および一般
Theme (English)	Applications of natural language processing and linguistic analysis, and general topics in NLP
Chairs (Japanese)	竹内孔一(岡山大) / 近藤公久(工学院大)
Chairs (English)	Koichi Takeuchi (Okayama Univ.) / Tadahisa Kondo (Kogakuin Univ.)
Vice chairs (Japanese)	金山博(日本IBM) / 市瀬眞(NTTドコモ) / 久保村千明(山野美容芸術短大) / 鈴木雅実(KDDI研)
Vice chairs (English)	Hiroshi Kanayama (IBM) / Makoto Ichise (NTT DoCoMo) / Chiaki Kubomura (Yamano College of Aesthetics) / Masami Suzuki (KDDI R&D Labs.)
Secretaries (Japanese)	榊剛史(東大/ホットリンク) / 渡辺靖彦(龍谷大) / 乾孝司(筑波大) / 黒田航(杏林大)
Secretaries (English)	Takeshi Sakaki (Univ. of Tokyo/Hottolink) / Yasuhiko Watanabe (Ryukoku Univ.) / Takashi Inui (Univ. of Tsukuba) / Ko Kuroda (Kyorin Univ.)
Assistant secretaries (Japanese)	嶋田和孝(九工大) / 東中竜一郎(NTT) / 富田英司(愛媛大) / 坪田康(京大)
Assistant secretaries (English)	Kazutaka Shimada (Kyushu Inst. of Tech.) / Ryuichiro Higashinaka (NTT) / Eiji Tomida (Ehime Univ.) / Yasushi Tsubota (Kyoto Univ.)

Paper details
Submitted to	Technical Committee on Natural Language Understanding and Models of Communication / Technical Committee on Thought and Language
Paper language	JPN
Title (Japanese)	音声認識された議事録間の関係抽出における特徴語の選定法と文書間の類似度計算法
Subtitle (Japanese)	
Title (English)	Methods for Selecting Keywords and Calculating Similarities of Documents for Extracting Relationship of Meeting Minutes Generated by Speech Recognition
Subtitle (English)	
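Keyword (1) (Japanese/English)	音声認識 / Speech recognition
Keyword (2) (Japanese/English)	議事録 / meeting minutes
Keyword (3) (Japanese/English)	テキストマイニング / text mining
Keyword (4) (Japanese/English)	エンティティレゾリューション / entity resolution
First author (Japanese/English)	伊藤本気 / Motoki Ito
First author affiliation (Japanese/English)	拓殖大学 (abbr.: 拓殖大) / Takushoku University (abbr.: Takushoku Univ.)
Second author (Japanese/English)	西田誠幸 / Seikoh Nishita
Second author affiliation (Japanese/English)	拓殖大学 (abbr.: 拓殖大) / Takushoku University (abbr.: Takushoku Univ.)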
Presentation date	2015-06-05
Report number	TL2015-9,NLC2015-9
Volume (vol)	vol.115
Issue (no)	TL-69,NLC-70
Page range	pp.49-54 (TL), pp.49-54 (NLC)
Number of pages	6
Date of issue	2015-05-28 (TL, NLC)