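Audio Signal Segmentation and Classification using Fuzzy C-Means Clustering : A Study on Definition of Distance between Speech and Music Class (Recognition, ITS Image Processing, Video Media, and General)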

Naoki NITANDA; Miki HASEYAMA; Hideo KITAJIMA

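Presentation title	2005-02-03 Audio Signal Segmentation and Classification using Fuzzy C-Means Clustering : A Study on Definition of Distance between Speech and Music Class (Recognition, ITS Image Processing, Video Media, and General) Naoki NITANDA, Miki HASEYAMA, Hideo KITAJIMA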
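Abstract (translated from Japanese)	When building a retrieval system for video material that integrates the video and audio signals, segmentation and classification of the material are required as preprocessing. We previously focused on the audio part of such material and proposed a method that detects the times at which the audio signal switches (audio cuts) and classifies the segments bounded by audio cuts into five classes: silence, speech, music, speech with music, and speech with noise. This report focuses on speech with music and proposes a method that expresses the distance between speech with music and speech, and the distance between speech with music and music (inter-class distances), using the membership values obtained from the fuzzy c-means method. The inter-class distances computed by the proposed method make it possible to examine whether a speech-with-music signal is more similar to speech or to music.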
Abstract (English)	Automatic segmentation and classification of audio signals is required for audiovisual indexing, and we have previously proposed an audio signal segmentation and classification method. That method segments the audio signal at the boundaries where its content changes and classifies the resulting segments into five audio classes: silence, speech, music, speech with music, and speech with noise. This paper defines a distance between the speech and music classes in order to judge whether a speech-with-music segment is more similar to the speech class or to the music class. The proposed method consists of three steps: (1) audio features that represent the characteristics of speech, music, and speech-with-music signals are extracted; (2) principal component analysis is applied to the extracted audio features; (3) fuzzy c-means clustering is applied to the principal components, and the distance is computed from the membership values obtained by the fuzzy clustering. Experimental results obtained by applying the proposed method to real audio signals are shown to verify its high performance.
Keywords (Japanese)	オーディオ信号 / 分割 / 分類 / インデキシング / ファジィc-means法
Keywords (English)	audio signal / segmentation / classification / indexing / fuzzy c-means
Report number	ITS2004-49, IE2004-183
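The three steps named in the abstract (feature extraction, principal component analysis, fuzzy c-means clustering) can be sketched in Python with NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The feature vectors are synthetic, and the number of clusters (2), the fuzzifier m = 2, and the number of principal components (2) are arbitrary choices. In particular, the distance used here (one minus the mean membership of the speech-with-music frames in each cluster) is a hypothetical stand-in, since this record does not give the paper's actual definition.

```python
import numpy as np

def pca(X, n_components):
    """Project the rows of X onto the top principal components (via SVD)."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def memberships(X, centers, m=2.0):
    """Standard fuzzy c-means membership of every row of X in every cluster."""
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    d = np.maximum(d, 1e-12)               # guard against division by zero
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def fuzzy_c_means(X, c=2, m=2.0, n_iter=200, tol=1e-8, seed=0):
    """Alternate membership and center updates until the centers stop moving."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=c, replace=False)]
    for _ in range(n_iter):
        W = memberships(X, centers, m) ** m
        new_centers = (W.T @ X) / W.sum(axis=0)[:, None]
        shift = np.abs(new_centers - centers).max()
        centers = new_centers
        if shift < tol:
            break
    return memberships(X, centers, m), centers

def demo(seed=0):
    rng = np.random.default_rng(seed)
    # Step 1 (assumption): synthetic 6-D "audio features" per frame.
    speech = rng.normal(0.0, 0.5, size=(100, 6))
    music = rng.normal(4.0, 0.5, size=(100, 6))
    mixed = rng.normal(1.0, 0.5, size=(100, 6))   # speech with music
    # Step 2: principal component analysis of all extracted features.
    Z = pca(np.vstack([speech, music, mixed]), n_components=2)
    # Step 3: fuzzy c-means on the principal components.
    U, _ = fuzzy_c_means(Z, c=2)
    # Label the clusters using the frames known to be speech.
    speech_k = int(np.argmax(U[:100].mean(axis=0)))
    music_k = 1 - speech_k
    # Hypothetical inter-class distance: 1 - mean membership.
    U_mixed = U[200:]
    d_speech = 1.0 - U_mixed[:, speech_k].mean()
    d_music = 1.0 - U_mixed[:, music_k].mean()
    return U, d_speech, d_music

U, d_speech, d_music = demo()
print("distance to speech:", d_speech)
print("distance to music: ", d_music)
```

In this toy setup the speech-with-music frames are generated closer to the speech frames, so the smaller of the two printed values should be the distance to speech. With real audio, the synthetic arrays would be replaced by features extracted from each segment.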

Technical committee information
Technical committee	ITS
Conference dates	2005/1/27 (1 day)

Paper details
Registered technical committee	Intelligent Transport Systems Technology (ITS)
Paper language	JPN
Title (Japanese)	ファジイc-means法を用いたオーディオ信号の分割・分類法 : 音声及び音楽クラス間の距離の定義に関する考察(認識, ITS画像処理, 映像メディア及び一般)
Title (English)	Audio Signal Segmentation and Classification using Fuzzy C-Means Clustering : A Study on Definition of Distance between Speech and Music Class
Keyword (1) (Japanese/English)	オーディオ信号 / audio signal
Keyword (2) (Japanese/English)	分割 / segmentation
Keyword (3) (Japanese/English)	分類 / classification
Keyword (4) (Japanese/English)	インデキシング / indexing
Keyword (5) (Japanese/English)	ファジィc-means法 / fuzzy c-means
First author name (Japanese/English)	二反田直己 / Naoki NITANDA
First author affiliation (Japanese/English)	北海道大学大学院情報科学研究科 / Graduate School of Information Science and Technology, Hokkaido University
Second author name (Japanese/English)	長谷山美紀 / Miki HASEYAMA
Second author affiliation (Japanese/English)	北海道大学大学院情報科学研究科 / Graduate School of Information Science and Technology, Hokkaido University
Third author name (Japanese/English)	北島秀夫 / Hideo KITAJIMA
Third author affiliation (Japanese/English)	北海道大学大学院情報科学研究科 / Graduate School of Information Science and Technology, Hokkaido University
Presentation date	2005-02-03
Report number	ITS2004-49, IE2004-183
Volume (vol)	vol.104
Issue (no)	646
Page range	pp.-
Number of pages	6