Large Vocabulary Continuous Speech Recognition Engine Julius ver. 4 (Systems, 9th Spoken Language Symposium)

Akinobu Lee (李 晃伸)

Presentation title	2007/12/13 Large Vocabulary Continuous Speech Recognition Engine Julius ver. 4 (Systems, 9th Spoken Language Symposium), Akinobu Lee
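Abstract (Japanese)	Version 4 of the large vocabulary continuous speech recognition engine Julius was released in December 2007. In ver. 4, the first major version update in seven years, the internal structure was modularized and the source code was entirely reorganized, greatly improving portability and flexibility. As a result, the engine core has been turned into a library that can be embedded in other applications, and mechanisms for interfacing with external code, such as callbacks and plugins, have been put in place, making it easy to extend functionality and change the configuration. For language models, word N-grams and grammars can now be handled equally by a single binary, and Julian has been unified into Julius. Furthermore, multi-decoding is now possible, in which multiple language models and acoustic models are combined arbitrarily and recognized in parallel by one engine. Basic capabilities have also been extended and strengthened: isolated word recognition has been newly added as a language model type, and major enhancements include support for arbitrary-length N-grams of 4-gram and above, incorporation of external linguistic constraints via user functions, GMM-based VAD and decoder-based VAD, and confusion network generation. Performance remains equivalent to the previous version, and memory usage has also been reduced.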
Abstract (English)	Version 4.0 of the large vocabulary continuous speech recognition engine "Julius" was released in December 2007 as a major upgrade from version 3.0. The entire code base was analyzed and its data structures were reorganized to improve modularity and flexibility. The improved structure allows Julius to be compiled as an external library and incorporated into various user applications. A simple callback API and plugin facilities have been newly built so that the engine can be controlled directly and live from external applications, which enables easy but tight integration with them. In addition, the grammar-based recognizer Julian has been incorporated into Julius, so N-gram and grammar can be handled by the same executable. Furthermore, it fully supports multi-decoding using multiple LMs, AMs, and arbitrary combinations of them. It now supports long N-grams (N unlimited), user-defined LM functions, GMM-based VAD and a newly proposed decoder-based VAD, confusion network generation, and many other new functions. The memory requirement has also been reduced, while keeping the same accuracy.
Keywords (Japanese)	Large vocabulary continuous speech recognition / Julius / N-gram / Multi-decoding / Voice activity detection
Keywords (English)	LVCSR / Julius / N-gram / VAD
Report number	NLC2007-85,SP2007-148

Technical committee information
Technical committee	NLC
Conference dates	2007/12/13 (1 day)

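Paper details
Technical committee applied to	Natural Language Understanding and Models of Communication (NLC)
Language of text	JPN
Title (Japanese)	Large Vocabulary Continuous Speech Recognition Engine Julius ver. 4 (Systems, 9th Spoken Language Symposium)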
Title (English)	Large Vocabulary Continuous Speech Recognition Engine Julius ver. 4
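Keyword (1) (Japanese/English)	Large vocabulary continuous speech recognition / LVCSR
Keyword (2) (Japanese/English)	Julius / Julius
Keyword (3) (Japanese/English)	N-gram / N-gram
Keyword (4) (Japanese/English)	Multi-decoding / VAD
Keyword (5) (Japanese/English)	Voice activity detection
First author name (Japanese/English)	李晃伸 / Akinobu LEE
First author affiliation (Japanese/English)	Graduate School of Engineering, Nagoya Institute of Technology / Faculty of Engineering, Nagoya Institute of Technology
Presentation date	2007/12/13
Report number	NLC2007-85,SP2007-148
Volume (vol)	vol.107
Issue (no)	405
Page range	pp.-
Number of pages	6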