Presentation 2012-07-21
WFST-based Structured Classification of Features Extracted by Using Deep Neural Networks
Yotaro KUBO, Takaaki HORI, Atsushi NAKAMURA,
Abstract(in Japanese) (See Japanese page)
Abstract(in English) Multilayer perceptrons with more than two hidden layers are known to be efficient at modeling complex classification processes. However, because of local optima and plateaus in their training objective functions, these perceptrons had not been used in practice. Recently, a heuristic method that initializes the parameters with values obtained by unsupervised training of neural networks has enabled the practical use of such perceptrons. By introducing multiple hidden layers, the total number of units needed to accurately model nonlinear classification processes becomes smaller than in single-hidden-layer networks. Consequently, the main contribution of deep processing can be understood as an enhancement of feature representations. On the other hand, an approach called structured classification has been attracting the attention of speech researchers because it enables direct modeling of sequence-to-sequence classification. However, feature transformation is known to be important in this approach because it typically treats sequence classification as a linear classification process. In this paper, we attempt to combine these two approaches in order to enhance both sides: feature representations and label representations. Specifically, we introduce a structured classification method based on weighted finite-state transducers into multilayer perceptron-based speech recognition systems.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Automatic Speech Recognition / Structured Classification / Deep Learning / Temporal Features
Paper # SP2012-57
Date of Issue

Conference Information
Committee SP
Conference Date 2012/7/12 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Speech (SP)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) WFST-based Structured Classification of Features Extracted by Using Deep Neural Networks
Sub Title (in English)
Keyword(1) Automatic Speech Recognition
Keyword(2) Structured Classification
Keyword(3) Deep Learning
Keyword(4) Temporal Features
1st Author's Name Yotaro KUBO
1st Author's Affiliation NTT Communication Science Laboratories
2nd Author's Name Takaaki HORI
2nd Author's Affiliation NTT Communication Science Laboratories
3rd Author's Name Atsushi NAKAMURA
3rd Author's Affiliation NTT Communication Science Laboratories
Date 2012-07-21
Paper # SP2012-57
Volume (vol) vol.112
Number (no) 141
Page pp.-
#Pages 6
Date of Issue