Committee |
Date Time |
Place |
Paper Title / Authors |
Abstract |
Paper # |
SP, IPSJ-MUS, IPSJ-SLP |
2022-06-17 13:00 |
Online |
Online |
A Study of Speech Recognition Result Correction Using BERT for Speech Translation Tadashi Ogura, Masakiyo Fujimoto, Peng Shen, Xugang Lu, Hisashi Kawai (NICT) SP2022-4 |
Speech translation (ST) technology consists of automatic speech recognition (ASR) and machine translation technologies. ... |
SP2022-4 pp.10-13 |
SP, IPSJ-SLP (Joint) |
2018-07-26 16:45 |
Shizuoka |
Sago-Royal-Hotel (Hamamatsu) |
Single channel noisy speech recognition based on combination of noisy speech and enhanced speech Masakiyo Fujimoto, Hisashi Kawai (NICT) SP2018-19 |
In many cases, single channel speech enhancement seriously deteriorates speech recognition accuracy due to the influence... |
SP2018-19 pp.15-20 |
SP, IPSJ-SLP (Joint) |
2017-07-27 16:45 |
Miyagi |
Akiu Resort Hotel Crescent |
Noise robust speech recognition based on factored deep convolutional neural networks Masakiyo Fujimoto (NICT) SP2017-18 |
|
SP2017-18 pp.15-20 |
SP, IPSJ-SLP (Joint) |
2015-07-17 11:10 |
Nagano |
Katakura Suwako Hotel |
[Invited Talk]
Aspects of feature extraction in DNN acoustic models Takuya Yoshioka, Marc Delcroix, Masakiyo Fujimoto, Tomohiro Nakatani (NTT) SP2015-46 |
Since the advent of acoustic models based on deep neural networks (DNNs), a vast amount of efforts have been made to fur... |
SP2015-46 pp.61-65 |
SP, IPSJ-MUS |
2014-05-24 11:30 |
Tokyo |
|
[Research Introduction] A spectrogram-patch-input DNN model for detection and classification of acoustic events robust to speech overlapping scenarios Miquel Espi, Masakiyo Fujimoto, Yotaro Kubo, Tomohiro Nakatani (NTT) SP2014-17 |
This paper presents an acoustic event detection and classification method that learns features from spectrogram patches ... |
SP2014-17 pp.171-176 |
SP |
2013-01-30 15:45 |
Kyoto |
Doshisha Univ. |
A Study on Speaker Recognition Based on Decomposition of Periodic and Aperiodic Components Yuki Ishikawa, Masafumi Nishida (Doshisha Univ.), Masakiyo Fujimoto (NTT), Seiichi Yamamoto (Doshisha Univ.) SP2012-102 |
In conventional researches, mel-frequency cepstral coefficients (MFCC) are widely used for a feature parameter which app... |
SP2012-102 pp.25-30 |
EA |
2012-12-14 10:40 |
Tokyo |
National Institute of Informatics |
Under-Determined Audio Source Separation Based on MAP Spectral Estimation Using Log-Spectral Prior Yasuaki Iwata (Nagoya Univ.), Tomohiro Nakatani, Masakiyo Fujimoto, Takuya Yoshioka (NTT CS Labs), Hirofumi Saito (Nagoya Univ.) EA2012-114 |
Assuming speech to be non-stationary Gaussian process, maximum likelihood spectral estimation has been studied as an eff... |
EA2012-114 pp.29-34 |
SP, NLC, IPSJ-SLP |
2011-12-20 09:00 |
Tokyo |
|
Simultaneous application of speaker adaptation and noise mixture model estimation for noise suppression Masakiyo Fujimoto, Shinji Watanabe, Tomohiro Nakatani (NTT) NLC2011-46 SP2011-91 |
In this paper, we propose a joint processing method for a model-based noise suppression that simultaneously achieves spe... |
NLC2011-46 SP2011-91 pp.113-118 |
EA, SIP, SP |
2011-05-12 10:50 |
Osaka |
Ritsumeikan Univ. |
A Robust On-line Estimation Method of Noise Mixture Model for Noise Suppression Masakiyo Fujimoto, Tomohiro Nakatani, Shinji Watanabe (NTT) EA2011-2 SIP2011-2 SP2011-2 |
In this paper, we propose a robust on-line estimation method of noise mixture model for the statistical model-based nois... |
EA2011-2 SIP2011-2 SP2011-2 pp.7-12 |
EA, SIP, SP |
2011-05-12 15:00 |
Osaka |
Ritsumeikan Univ. |
[Invited Talk]
Microphone array speech processing techniques for conversation scene analysis Shoko Araki, Masakiyo Fujimoto, Takuya Yoshioka, Takaaki Hori, Tomohiro Nakatani (NTT) EA2011-15 SIP2011-15 SP2011-15 |
Recognition and understanding of conversation scenes has recently been tackled to achieve a variety of tasks such as aut... |
EA2011-15 SIP2011-15 SP2011-15 pp.83-88 |
NLC, SP (Joint) |
2010-12-20 16:30 |
Tokyo |
National Olympics Memorial Youth Center |
Noise suppression method based on noise bias-residual decomposition and optimization Masakiyo Fujimoto, Shinji Watanabe, Tomohiro Nakatani (NTT Corp.) NLC2010-18 SP2010-91 |
In this paper, we propose a non-stationary noise estimation method based on bias-residual component decomposition, and s... |
NLC2010-18 SP2010-91 pp.43-48 |
SP |
2010-06-17 14:00 |
Fukuoka |
Kyushu University |
[Tutorial Invited Lecture]
The Fundamentals and Recent Progress of Voice Activity Detection Masakiyo Fujimoto (NTT Corp.) SP2010-23 |
This paper describes the fundamentals and recent progress of voice activity detection (VAD). First topics are explanati... |
SP2010-23 pp.7-12 |
SP, NLC |
2008-12-09 10:50 |
Tokyo |
Waseda Univ. |
Noisy speech recognition using integrated method of statistical model-based voice activity detection and noise suppression Masakiyo Fujimoto, Kentaro Ishizuka, Tomohiro Nakatani (NTT Corporation) NLC2008-26 SP2008-81 |
This paper addresses robust front-end processing for automatic speech recognition in noise. The proposed method integrat... |
NLC2008-26 SP2008-81 pp.13-18 |
SP, NLC |
2008-12-09 11:40 |
Tokyo |
Waseda Univ. |
Speaker diarization of multi-party conversations based on audio and visual information integration Kentaro Ishizuka, Shoko Araki, Kazuhiro Otsuka, Masakiyo Fujimoto, Tomohiro Nakatani (NTT) NLC2008-28 SP2008-83 |
This paper proposes a speaker diarization method, which detects “who spoke when” in multi-party conversations, based on ... |
NLC2008-28 SP2008-83 pp.25-30 |
PRMU, MVE |
2008-11-27 11:40 |
Osaka |
Osaka Univ. |
A realtime Multimodal System toward Multiparty Conversation Scene Analysis -- Integrating Face Pose Tracking and Speaker Diarization using Multimodal Omnidirectional Sensors -- Kazuhiro Otsuka, Shoko Araki, Kentaro Ishizuka, Masakiyo Fujimoto, Junji Yamato (NTT) PRMU2008-119 MVE2008-68 |
|
PRMU2008-119 MVE2008-68 pp.55-62 |
EA |
2008-07-18 14:45 |
Nara |
|
Speaker diarization for meetings by integrating speech presence probability estimation and time-frequency domain direction of arrival estimation Shoko Araki, Masakiyo Fujimoto, Kentaro Ishizuka, Tomohiro Nakatani, Hiroshi Sawada, Shoji Makino (NTT) EA2008-40 |
This paper presents a meeting diarization system that estimates who spoke when in a meeting. Our proposed system is real... |
EA2008-40 pp.19-24 |
SP |
2008-07-17 - 2008-07-19 |
Iwate |
Iwate Prefectural Univ. |
An evaluation and an examination of integration method of statistical model-based voice activity detection and noise suppression Masakiyo Fujimoto, Kentaro Ishizuka, Tomohiro Nakatani (NTT CS Labs.) SP2008-45 |
This paper addresses robust front-end processing for automatic speech recognition (ASR) in noisy environments. Usually... |
SP2008-45 pp.13-18 |