Presentation | 2004/12/13 Mixtures of Probabilistic Principal Component Analyzers in Speech Recognition Mike SCHUSTER, Takaaki HORI, Atsushi NAKAMURA |
---|---|
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | This paper describes the application of Mixtures of Probabilistic Principal Component Analyzers (MPPCA) for modeling the observation distributions in a speech recognition system. The MPPCA model is a mixture of Gaussians with a constrained covariance that approximates a full covariance with fewer effective parameters, and whose complexity can be controlled by the user. The paper summarizes the necessary basics of the MPPCA model, describes a simple extension of the basic model that sets the user-defined complexity of the constrained covariance in a more automatic way, and describes how to deal with numerical problems occurring in typical speech recognition systems. The MPPCA model is tested against a diagonal covariance model and a full covariance model for our best acoustic model so far, with 5000 quinphone clustered states and 80000 Gaussians in total, on a large, spontaneous Japanese speech task. Results show that we can improve error rates on the standard test set from 22.2% to 19.7% by moving to full covariances. Several of the MPPCA models tested reach the same error rates with fewer effective parameters but fail to improve over full covariances; possible reasons are discussed. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | Speech recognition / covariance modeling / Probabilistic Principal Component Analysis |
Paper # | NLC2004-52,SP2004-92 |
Date of Issue |
Conference Information | |
Committee | NLC |
---|---|
Conference Date | 2004/12/13 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Natural Language Understanding and Models of Communication (NLC) |
---|---|
Language | ENG |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Mixtures of Probabilistic Principal Component Analyzers in Speech Recognition |
Sub Title (in English) | |
Keyword(1) | Speech recognition |
Keyword(2) | covariance modeling |
Keyword(3) | Probabilistic Principal Component Analysis |
1st Author's Name | Mike SCHUSTER |
1st Author's Affiliation | Nippon Telegraph and Telephone Corporation, NTT Communication Science Laboratories |
2nd Author's Name | Takaaki HORI |
2nd Author's Affiliation | Nippon Telegraph and Telephone Corporation, NTT Communication Science Laboratories |
3rd Author's Name | Atsushi NAKAMURA |
3rd Author's Affiliation | Nippon Telegraph and Telephone Corporation, NTT Communication Science Laboratories |
Date | 2004/12/13 |
Paper # | NLC2004-52,SP2004-92 |
Volume (vol) | vol.104 |
Number (no) | 538 |
Page | pp.- |
#Pages | 5 |
Date of Issue |