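A Study on Improving Lyrics Recognition Performance Using Read-Speech-to-Singing-Voice Feature Transformation and Speaker Adaptation (Speech Recognition, 16th Spoken Language Symposium)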

川井 大陸; 山本 一公; 中川 聖一

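Presentation	2014/12/8 A Study on Improving Lyrics Recognition Performance Using Read-Speech-to-Singing-Voice Feature Transformation and Speaker Adaptation (Speech Recognition, 16th Spoken Language Symposium)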
Abstract(in Japanese)	(See Japanese page)
Abstract(in English)	As a first step, we consider Japanese lyrics recognition in unaccompanied monophonic singing, that is, singing with no musical instruments. To model singing well, we use an n-gram language model (LM) built from a lyrics corpus, singing-adapted GMM-HMM-based acoustic models (AMs), and multiple pronunciation lexicons that account for vowel lengthening. We adapt the read-speech AMs to sung speech using two approaches: MAP adaptation and neural network-based feature transformation. For adaptation to singing, we use 40 pieces of music sung by 40 male singers. For speaker adaptation, we use one piece of music sung by the same male singer as in the test data. Because singing often lengthens the duration of each vowel, we augment the lexicon with additional pronunciation variants. Evaluation is performed on a test set of 7 pieces of commercial music sung by 7 male singers. In the experiments, our system achieved a syllable accuracy of 46.1% (phoneme accuracy of 59.0%) and a word accuracy of 25.9% on male monophonic Japanese singing. This is higher than the accuracy of a conventional system based on a newspaper LM and a read-speech AM.
Keyword(in Japanese)	(See Japanese page)
Keyword(in English)	lyrics recognition / read-sung speech transformation / MAP adaptation / vowel-lengthening
Paper #	Vol.2014-SLP-104 No.2
Date of Issue
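
The abstract mentions augmenting pronunciation variations to handle vowel lengthening in singing. The record does not say how the variants are generated, so the following is only a minimal sketch under assumptions: the function name `lengthened_variants`, the five-vowel set, the `max_repeat` limit, and the example word are all hypothetical and are not taken from the paper.

```python
from itertools import product

# Assumed vowel inventory for romanized Japanese phonemes (illustrative only).
VOWELS = {"a", "i", "u", "e", "o"}


def lengthened_variants(phonemes, max_repeat=2):
    """Return every pronunciation in which each vowel is repeated
    between 1 and max_repeat times; consonants are left unchanged."""
    options = []
    for p in phonemes:
        if p in VOWELS:
            # A vowel may appear once (original) or be stretched.
            options.append([[p] * n for n in range(1, max_repeat + 1)])
        else:
            options.append([[p]])
    variants = []
    for choice in product(*options):
        # Flatten the chosen groups back into one phoneme sequence.
        variants.append([p for group in choice for p in group])
    return variants


# Hypothetical lexicon entry: the word "sora" with phonemes s o r a.
variants = lengthened_variants(["s", "o", "r", "a"])
for v in variants:
    print("sora", " ".join(v))
```

The number of variants grows exponentially with the number of vowels in a word, so a real lexicon would likely need a small `max_repeat` or some pruning of the generated entries.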

Conference Information
Committee	SP
Conference Date	2014/12/8 (1 day)
Place (in Japanese)	(See Japanese page)
Place (in English)
Topics (in Japanese)	(See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To	Speech (SP)
Language	JPN
Title (in Japanese)	(See Japanese page)
Sub Title (in Japanese)	(See Japanese page)
Title (in English)
Sub Title (in English)
Keyword(1)	lyrics recognition
Keyword(2)	read-sung speech transformation
Keyword(3)	MAP adaptation
Keyword(4)	vowel-lengthening
1st Author's Name
1st Author's Affiliation	Toyohashi University of Technology
Date	2014/12/8
Paper #	Vol.2014-SLP-104 No.2
Volume (vol)	vol.114
Number (no)	365
Page	pp.-
#Pages	6
Date of Issue