Scream Enhancement Using Wave-U-Net

Riku Kasai; Noboru Hayasaka; Takuya Futagami; Yoshikazu Miyanaga

International Workshop on Smart Info-Media Systems in Asia

2021

Session Number:SS1

Number:SS1-2

pp.5-8

Publication Date:2021/9/20

Online ISSN:2188-5079

DOI:10.34385/proc.66.SS1-2

Summary:

We previously proposed a sound-based scream-detection system for crime prevention, because sound information can be more privacy-friendly than image information. The system performed well even in various noisy environments. However, it could not clarify noisy screams because it does not include noise-reduction processing. Since recorded screams will be heard by humans during trials and investigations, scream clarification is essential, and a noise-reduction method is therefore necessary for the scream-detection system. We applied a deep-learning-based noise-reduction method, Wave-U-Net, to noisy screams. Wave-U-Net is a sound-source separation method that has also been used for noise reduction. However, it has so far been applied only to speech, and there have been no reports of its application to screams. In simulations in various noisy environments, Wave-U-Net improved the segmental signal-to-noise ratio by 14 dB or more, confirming that it is an effective method for scream enhancement. We also found that Wave-U-Net improved the perceptual evaluation of speech quality (PESQ) score.
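
For readers unfamiliar with the metric, the following is a minimal sketch of how a segmental signal-to-noise ratio improvement could be computed. The summary does not state the frame length or clamping limits used, so the 256-sample frames and the [-10, 35] dB clamp below are assumptions taken from common practice. The function name `segmental_snr` is hypothetical, and the "enhanced" signal is a synthetic stand-in (the clean tone plus attenuated noise), not Wave-U-Net output.

```python
import math
import random


def segmental_snr(clean, estimate, frame_len=256, lo_db=-10.0, hi_db=35.0):
    """Mean per-frame SNR in dB between a clean reference and an estimate.

    frame_len, lo_db and hi_db are assumed values, not taken from the paper.
    """
    if len(clean) != len(estimate):
        raise ValueError("signals must have the same length")
    eps = 1e-12  # avoids division by zero and log of zero
    frame_snrs = []
    # Non-overlapping frames; a trailing partial frame is ignored.
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        s = clean[start:start + frame_len]
        e = estimate[start:start + frame_len]
        signal_energy = sum(x * x for x in s)
        error_energy = sum((x - y) ** 2 for x, y in zip(s, e))
        snr_db = 10.0 * math.log10((signal_energy + eps) / (error_energy + eps))
        # Clamp each frame so silent or perfectly matched frames do not dominate.
        frame_snrs.append(min(max(snr_db, lo_db), hi_db))
    if not frame_snrs:
        raise ValueError("signal is shorter than one frame")
    return sum(frame_snrs) / len(frame_snrs)


# Synthetic demonstration: a 440 Hz tone stands in for the clean scream.
rng = random.Random(0)
sample_rate = 16000
clean = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(sample_rate)]
noise = [rng.uniform(-0.5, 0.5) for _ in clean]
noisy = [c + v for c, v in zip(clean, noise)]
# Stand-in for an enhanced signal: the same noise attenuated to one tenth.
enhanced = [c + 0.1 * v for c, v in zip(clean, noise)]

improvement = segmental_snr(clean, enhanced) - segmental_snr(clean, noisy)
print(f"Segmental SNR improvement: {improvement:.2f} dB")
```

The improvement is the difference between the segmental SNR of the enhanced signal and that of the noisy input, both measured against the same clean reference. This presumably corresponds to the quantity reported above as 14 dB or more.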