Diversity-driven Semi-supervised Ensemble DNN Acoustic Model Training

Presentation	2016-08-25 Diversity-driven Semi-supervised Ensemble DNN Acoustic Model Training Sheng Li, Xugang Lu, Shinsuke Sakai, Tatsuya Kawahara
Abstract(in English)	We focus on effectively training DNN (Deep Neural Network) acoustic models for Chinese spoken lectures using only limited labeled speech and abundant unlabeled speech. Most semi-supervised DNN training methods use the unlabeled data selectively, and previous ensemble DNN training methods work only in a supervised setting; in contrast, we propose a more general ensemble training method that uses both labeled and unlabeled data. In the proposed method, a pair of models is trained in parallel with diverse labels generated for the unlabeled data. Together with the standard cross-entropy, the KL divergence between the individual models over the unlabeled data is incorporated during training. Experiments show that the proposed method effectively utilizes unlabeled data and outperforms other well-established semi-supervised methods.
Keyword(in English)	Speech recognition / Acoustic model / DNN / Semi-supervised training / Ensemble training
Paper #	SP2016-40
Date of Issue	2016-08-17 (SP)

Conference Information
Committee	SP
Conference Date	2016/8/24 (2 days)
Place (in English)	ACCMS, Kyoto Univ.
Topics (in English)	Audio event processing, etc.
Chair	Kazunori Mano(Shibaura Inst. of Tech.)
Vice Chair	Hiroki Mori(Utsunomiya Univ.)
Secretary	Hiroki Mori(Kobe Univ.)
Assistant	Taichi Asami(NTT) / Kei Hashimoto(Nagoya Inst. of Tech.)

Paper Information
Registration To	Technical Committee on Speech
Language	ENG
Title (in English)	Diversity-driven Semi-supervised Ensemble DNN Acoustic Model Training
Sub Title (in English)
Keyword(1)	Speech recognition / Acoustic model / DNN / Semi-supervised training / Ensemble training
1st Author's Name	Sheng Li
1st Author's Affiliation	Kyoto University(Kyoto Univ.)
2nd Author's Name	Xugang Lu
2nd Author's Affiliation	National Institute of Information and Communications Technology(NICT)
3rd Author's Name	Shinsuke Sakai
3rd Author's Affiliation	Kyoto University(Kyoto Univ.)
4th Author's Name	Tatsuya Kawahara
4th Author's Affiliation	Kyoto University(Kyoto Univ.)
Date	2016-08-25
Paper #	SP2016-40
Volume (vol)	vol.116
Number (no)	SP-189
Page	pp.71-76(SP)
#Pages	6
Date of Issue	2016-08-17 (SP)