Paper Abstract and Keywords |
Presentation |
2018-02-16 13:10
The effect of increasing the number of channels with multi-channel non-negative matrix factorization for noisy speech recognition Takanobu Uramoto (Oita Univ.), Youhei Okato, Toshiyuki Hanazawa (Mitsubishi Electric), Iori Miura, Shingo Uenohara, Ken'ich Furuya (Oita Univ.) EA2017-99 |
Abstract |
(in English) |
Nonnegative matrix factorization (NMF) factorizes a non-negative matrix into the product of two non-negative matrices. In the field of acoustics, multichannel NMF (MNMF) has been proposed. MNMF extends NMF to multiple channels so that it can exploit spatial information, and it can perform highly accurate sound source separation. However, conventional MNMF tends to become trapped in local minima because its model has too many free parameters, which makes the separation performance depend on the initial values. This dependence becomes more pronounced as the number of channels increases. This paper focuses on the spatial correlation matrices, which have the most significant initial-value dependence. We propose methods for initializing the spatial correlation matrices from data separated with a binary mask and from the steering vector obtained by mask emphasis based on the EM algorithm. Automatic speech recognition experiments on signals observed with 2 ch and 6 ch under noisy conditions showed a lower word error rate than random initialization for both channel counts. Since high recognition performance was obtained with 6 ch, we confirmed the effectiveness of the proposed initialization and the improvement in recognition performance from increasing the number of channels. |
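As a rough illustration of the single-channel NMF that the abstract starts from, the sketch below factorizes a non-negative matrix with the standard multiplicative updates for the Euclidean cost. It is not the authors' MNMF or their initialization method; the function name `nmf`, the `rank`, `n_iter`, and `seed` parameters, and the toy data are all assumptions made for this example. The random starting point is exposed through `seed` because the result depends on the initial values, which is the issue the abstract raises for the multichannel case.

```python
import numpy as np

def nmf(V, rank, n_iter=200, seed=0, eps=1e-12):
    """Approximate non-negative V (F x T) as W (F x rank) @ H (rank x T).

    Hypothetical helper for illustration only: multiplicative updates
    for the Euclidean (Frobenius) cost, started from random values.
    """
    rng = np.random.default_rng(seed)
    F, T = V.shape
    # Random non-negative initial values; a different seed can lead
    # to a different local minimum.
    W = rng.random((F, rank)) + eps
    H = rng.random((rank, T)) + eps
    errors = []
    for _ in range(n_iter):
        # Element-wise updates keep W and H non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        errors.append(np.linalg.norm(V - W @ H))
    return W, H, errors

# Toy non-negative matrix with an exact rank-3 structure (made-up data).
data_rng = np.random.default_rng(1)
V = data_rng.random((20, 3)) @ data_rng.random((3, 30))

W, H, errors = nmf(V, rank=3)
print(f"reconstruction error: first iteration {errors[0]:.4f}, last {errors[-1]:.4f}")
```

Running `nmf` with several values of `seed` and comparing the final errors is one simple way to observe the initial-value dependence on a toy problem.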
Keyword |
(in English) |
sound source separation / noise reduction / nonnegative matrix factorization (NMF) / multi-channel NMF / automatic speech recognition |
Reference Info. |
IEICE Tech. Rep., vol. 117, no. 430, EA2017-99, pp. 33-38, Feb. 2018. |
Paper # |
EA2017-99 |
Date of Issue |
2018-02-08 (EA) |
ISSN |
Print edition: ISSN 0913-5685 Online edition: ISSN 2432-6380 |
Copyright and reproduction |
All rights are reserved and no part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher. Notwithstanding, instructors are permitted to photocopy isolated articles for noncommercial classroom use without fee. (License No.: 10GA0019/12GB0052/13GB0056/17GB0034/18GB0034) |