Paper Abstract and Keywords |
Presentation |
2019-10-26 17:00
Neural Whispered Speech Detection with Imbalanced Learning
Takanori Ashihara, Yusuke Shinohara, Hiroshi Sato, Takafumi Moriya, Kiyoaki Matsui, Yoshikazu Yamaguchi (NTT)
SP2019-26 WIT2019-25 |
Abstract |
(in English) |
In this paper, we present a neural whispered-speech detection technique that classifies utterances as whispered or non-whispered when the training data are imbalanced between the two classes.
Previous studies have shown that machine learning models trained on large numbers of whispered and non-whispered utterances perform remarkably well for whispered speech detection.
However, it is often difficult to collect large numbers of whispered utterances.
We therefore propose a method to train neural whispered speech detectors from a small number of whispered utterances combined with a large number of non-whispered utterances.
In doing so, special care is taken to ensure that severely imbalanced datasets can effectively train neural networks.
Specifically, we use a class-aware sampling method for training neural networks.
To evaluate the networks, we gather test samples recorded by both condenser and smartphone microphones at different distances from the speakers to simulate practical environments.
Experiments show the importance of imbalanced learning in enhancing the performance of utterance-level classifiers. |
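The abstract names class-aware sampling but gives no details, so the following is only a minimal sketch of one common form of the idea, not the authors' implementation. It assumes each draw first picks a class uniformly at random and then picks an utterance uniformly within that class, so the rare whispered class fills roughly half of every mini-batch. The function name `class_aware_batches`, the label encoding (0 = non-whispered, 1 = whispered), and the toy class sizes are all hypothetical.

```python
import random
from collections import defaultdict


def class_aware_batches(labels, batch_size, num_batches, seed=0):
    """Yield lists of dataset indices, drawing classes uniformly.

    Hypothetical helper: each draw picks a class uniformly at random,
    then picks one example uniformly from that class (with replacement).
    """
    rng = random.Random(seed)

    # Group example indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    classes = sorted(by_class)

    for _ in range(num_batches):
        batch = []
        for _ in range(batch_size):
            cls = rng.choice(classes)                # uniform over classes
            batch.append(rng.choice(by_class[cls]))  # uniform within class
        yield batch


# Toy imbalanced data (assumed sizes): 1000 non-whispered, 20 whispered.
labels = [0] * 1000 + [1] * 20
batches = list(class_aware_batches(labels, batch_size=32, num_batches=200))

# Fraction of sampled utterances that are whispered.
whisper_fraction = sum(labels[i] for b in batches for i in b) / (32 * 200)
```

Under these assumptions the whispered class makes up about 2% of the raw data but about half of the sampled utterances. Because sampling is with replacement, the few whispered utterances are revisited many times, which is the usual trade-off of this kind of resampling.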
Keyword |
(in English) |
whispered speech / vocal effort / deep neural networks / imbalanced learning / class-aware sampling |
Reference Info. |
IEICE Tech. Rep., vol. 119, no. 250, SP2019-26, pp. 51-56, Oct. 2019. |
Paper # |
SP2019-26 |
Date of Issue |
2019-10-19 (SP, WIT) |
ISSN |
Print edition: ISSN 0913-5685 Online edition: ISSN 2432-6380 |
Copyright and reproduction |
All rights are reserved and no part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher. Notwithstanding, instructors are permitted to photocopy isolated articles for noncommercial classroom use without fee. (License No.: 10GA0019/12GB0052/13GB0056/17GB0034/18GB0034) |
Conference Information |
Committee |
WIT SP |
Conference Date |
2019-10-26 - 2019-10-27 |
Place (in English) |
Daiichi Institute of Technology |
Paper Information |
Registration To |
SP |
Conference Code |
2019-10-WIT-SP |
Language |
Japanese |
Title (in English) |
Neural Whispered Speech Detection with Imbalanced Learning |
Keyword(1) |
whispered speech |
Keyword(2) |
vocal effort |
Keyword(3) |
deep neural networks |
Keyword(4) |
imbalanced learning |
Keyword(5) |
class-aware sampling |
1st Author's Name |
Takanori Ashihara |
1st Author's Affiliation |
NTT Corporation (NTT) |
2nd Author's Name |
Yusuke Shinohara |
2nd Author's Affiliation |
NTT Corporation (NTT) |
3rd Author's Name |
Hiroshi Sato |
3rd Author's Affiliation |
NTT Corporation (NTT) |
4th Author's Name |
Takafumi Moriya |
4th Author's Affiliation |
NTT Corporation (NTT) |
5th Author's Name |
Kiyoaki Matsui |
5th Author's Affiliation |
NTT Corporation (NTT) |
6th Author's Name |
Yoshikazu Yamaguchi |
6th Author's Affiliation |
NTT Corporation (NTT) |
Speaker |
Author-1 |
Date Time |
2019-10-26 17:00:00 |
Presentation Time |
20 minutes |
Registration for |
SP |
Paper # |
SP2019-26, WIT2019-25 |
Volume (vol) |
vol.119 |
Number (no) |
no.250(SP), no.251(WIT) |
Page |
pp.51-56 |
#Pages |
6 |
|