Development and commercialization of low distortion noise suppressors and noise cancellers

  The Third Generation Partnership Project (3GPP), which is in charge of international mobile network standards, tried to establish an international standard for single-channel noise suppression (a noise suppressor) ahead of the debut of 3G mobile phone handsets in the 2000s. However, the six proposed technologies were too immature to satisfy all the requirements, so 3GPP instead completed an international standard on the requirements and the conformance assessment process [1]. The awardees successfully developed a noise suppressor, NS-WiNE (Noise Suppressor with WeIghted Noise Estimation), that satisfies all the 3GPP requirements, and obtained 3GPP endorsements [4] for a basic algorithm [2] and a low-complexity algorithm [3].
  The key to high speech quality is weighted noise estimation [2], in which the input signal power is distributed in proportion to an estimated signal-to-noise ratio (SNR). It is effective for nonstationary noise, which had conventionally been challenging, and led to a 90% reduction of residual noise. The awardees also redesigned the highpass filter at the input to operate after the Fourier transform and introduced nonuniform subband decomposition, whose bandwidth is set wider in the less sensitive high frequencies, to achieve a 55% reduction in total computations without compromising speech quality. The award recipients developed a downlink-playback model [5] that achieves low distortion for receive-side signals in the mobile network and for playback signals from recording media. Maximum suppression is adaptively controlled by speech presence and a long-term SNR to reduce speech distortion. A spectral gain shared among multiple channels guarantees no shift of the sound image and limits the computations required to process multichannel input, which is inevitable in remote conferencing and voice recorders. A user-controllable spectral gain that gives the best balance between residual noise and speech distortion is incorporated in the MPEG (Moving Picture Experts Group) SAOC (Spatial Audio Object Coding) standard [6].
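  The idea of letting an estimated SNR decide how much of the input power feeds the noise estimate can be illustrated with a minimal sketch. It is not the NS-WiNE algorithm of [2]: the piecewise-linear weight, the thresholds low_db and high_db, the smoothing constant, and both function names are assumptions chosen only for illustration. The sketch updates the noise estimate of one frequency bin so that low-SNR frames update it fully while high-SNR (speech-dominated) frames leave it almost unchanged.

```python
import math


def snr_weight(snr_db, low_db=0.0, high_db=10.0):
    """Hypothetical weight: 1 at or below low_db, 0 at or above high_db."""
    if snr_db <= low_db:
        return 1.0
    if snr_db >= high_db:
        return 0.0
    # Linear transition between the two assumed thresholds.
    return (high_db - snr_db) / (high_db - low_db)


def update_noise_estimate(noise_power, input_power, smoothing=0.9,
                          low_db=0.0, high_db=10.0):
    """One recursive update of the noise power estimate for a single bin."""
    # Estimated SNR of this frame relative to the current noise estimate.
    snr = max(input_power, 1e-12) / max(noise_power, 1e-12)
    snr_db = 10.0 * math.log10(snr)
    w = snr_weight(snr_db, low_db, high_db)
    # Weighted input: the input power counts fully when the SNR is low and
    # is replaced by the current estimate when the SNR is high.
    weighted_power = w * input_power + (1.0 - w) * noise_power
    # First-order recursive averaging of the weighted power.
    return smoothing * noise_power + (1.0 - smoothing) * weighted_power
```

  In this sketch a frame whose power is far above the current estimate is treated as speech and does not inflate the noise estimate, whereas a frame at or below the estimate is tracked at the full averaging rate.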
  NS-WiNE enabled the world’s first mobile phone handset with a 3GPP-endorsed noise suppressor, which recorded nearly 30M shipments. It has also been used in a voice recorder with the number-one world market share, which marked nearly 3M shipments. It further evolved to suppress auto-focusing noise in mobile phone handsets and zooming noise in digital cameras, serving as an essential component therein. NS-WiNE received the 2002 IEICE Best Paper Award, the 2010 Advanced Technology Award, "The Prize of the Sankei Shimbun," the 2013 Electrical Science and Engineering Promotion Award, and the Local Commendation for Invention of Kanto 2011, and forms one chapter of a book [7].
  The award recipients also developed a 2-channel noise cancellation technology (a noise canceller) that simultaneously provides low speech distortion and small residual noise at low SNRs. Speech distortion after noise cancellation was reduced by adding, to the conventional structure, adaptive control of the coefficient update based on an SNR estimated with a pilot filter (PF) [8]. Nonnegligible crosstalk, that is, speech components leaking into the noise input, can be handled by the CCPF (Cross-Coupled PF) structure, which has an adaptive filter dedicated to crosstalk [9]. Nevertheless, when the input-signal SNR varies over a wide range, the accuracy of SNR estimation degrades. The awardees therefore developed a generalized CCPF (GCCPF) structure, which controls the coefficient adaptation of the pilot filters using the power ratio of the primary input signal to the auxiliary input signal as an approximate SNR, and reduced both the residual noise and the speech distortion by more than 90% [10]. A speech recognition system equipped with a GCCPF noise canceller improved the correct speech recognition rate by as much as 65% in an adverse noise environment. A GCCPF noise canceller installed in a childcare robot that plays with multiple children successfully demonstrated its performance at EXPO 2005 Aichi for six months and verified, for the first time in the world, that speech recognition in an exhibition hall is feasible. The GCCPF noise canceller was further implemented in a credit-card-sized single-chip speech dialogue module and enabled a dialogue robot that fits on the palm. GCCPF received the 2005 Workshop Award for Excellence from the Japanese Society for Artificial Intelligence and was included in a book [9] as one chapter.
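  The principle of controlling coefficient adaptation with a primary-to-auxiliary power ratio can be sketched as follows. This is a deliberately simplified illustration, not the GCCPF structure of [10]: it uses a single NLMS adaptive filter with no pilot filters and no crosstalk filter, and the mapping in step_size, its thresholds, the averaging constant, and all names are assumptions. The step size is large when the power ratio is low (noise dominates, so adaptation is safe) and small when the ratio is high (speech dominates, so adaptation would distort speech).

```python
import math


def step_size(primary_power, reference_power, mu_max=0.5, mu_min=0.01,
              low_db=-10.0, high_db=10.0):
    """Hypothetical mapping from the power ratio (approximate SNR) to a step size."""
    eps = 1e-12
    ratio_db = 10.0 * math.log10(max(primary_power, eps) / max(reference_power, eps))
    if ratio_db <= low_db:
        return mu_max
    if ratio_db >= high_db:
        return mu_min
    frac = (ratio_db - low_db) / (high_db - low_db)
    return mu_max + frac * (mu_min - mu_max)


def cancel_noise(primary, reference, taps=4, averaging=0.9):
    """Two-input NLMS noise canceller with a power-ratio-controlled step size."""
    weights = [0.0] * taps
    buffer = [0.0] * taps          # most recent reference samples, newest first
    primary_power = 0.0
    reference_power = 0.0
    output = []
    for d, x in zip(primary, reference):
        buffer = [x] + buffer[:-1]
        noise_replica = sum(w * b for w, b in zip(weights, buffer))
        e = d - noise_replica      # canceller output (enhanced signal)
        # Short-term powers of both inputs give the approximate SNR.
        primary_power = averaging * primary_power + (1.0 - averaging) * d * d
        reference_power = averaging * reference_power + (1.0 - averaging) * x * x
        mu = step_size(primary_power, reference_power)
        norm = sum(b * b for b in buffer) + 1e-8
        weights = [w + mu * e * b / norm for w, b in zip(weights, buffer)]
        output.append(e)
    return output
```

  With a noise-only primary input that is a scaled copy of the reference, the output of this sketch decays toward zero as the filter converges; when the primary power greatly exceeds the reference power, the step size drops to mu_min and the coefficients are nearly frozen.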
  The low-distortion noise suppressors and cancellers developed by the recipients have demonstrated outstanding performance through deployment in multiple products and a long-term demonstration in a real environment at EXPO 2005, and are widely recognized through several awards and contributions to international standards. Thus, the awardees’ achievements are significant and deserve the IEICE Achievement Award.

Fig.1 Noise Suppressor

Fig.2 Noise Canceller

References

[1] "Minimum performance requirements for noise suppresser application to the AMR speech encoder," 3GPP TS 06.77 V8.1.1, Apr. 2001.
[2] M. Kato, A. Sugiyama, and M. Serizawa, "Noise suppression with high speech quality based on weighted noise estimation and MMSE STSA," IEICE Trans. Fund., vol. E85-A, no. 7, pp. 1710-1718, Jul. 2002.
[3] M. Kato and A. Sugiyama, "A low-complexity noise suppressor with nonuniform subbands and a frequency-domain highpass filter," Proc. ICASSP2006, pp. 473-476, May 2006.
[4] "TSG SA WG4 status report at TSG SA#17," TSGS#17-020431, Sep. 2002.
[5] K. Yamato, A. Sugiyama, and M. Kato, "Implementation of a multipurpose noise suppressor based on a novel scalable framework," Proc. ICASSP2007, pp. 337-340, Apr. 2007.
[6] ISO/IEC 23003-2:2010, Information technology - MPEG audio technologies - Part 2: Spatial Audio Object Coding (SAOC), Oct. 2010.
[7] J. Benesty, S. Makino, and J. Chen, Eds., "Speech enhancement," A. Sugiyama, M. Kato, and M. Serizawa, Chap. 6, "Single-microphone noise suppression for 3G handsets based on weighted noise estimation," Springer, Berlin, pp. 115-134, Mar. 2005.
[8] S. Ikeda and A. Sugiyama, "An adaptive noise canceler with low signal-distortion for speech codecs," IEEE Transactions on Signal Processing, vol. 47, no. 3, pp. 665-674, Mar. 1999.
[9] E. Hänsler and G. Schmidt, Eds., "Speech and audio processing in adverse environments," A. Sugiyama, Chap. 7, "Low distortion noise cancellers - Revival of a classical technique," Springer, Berlin, pp. 229-264, Aug. 2008.
[10] M. Sato, A. Sugiyama, and S. Ohnaka, "An adaptive noise canceller with low signal-distortion based on variable stepsize subfilters for human-robot communication," IEICE Trans. Fund., vol. E88-A, no. 8, pp. 2055-2061, Aug. 2005.

Achievement Award

Akihiko SUGIYAMA, Masanori KATO, Miki SATO
