EA, US, SP, SIP, IPSJ-SLP 2021-03-03
Online Online [Poster Presentation] Comparison of speech intelligibility results between laboratory and crowd-sourcing experiments
Ayako Yamamoto, Toshio Irino (Wakayama Univ.), Kenichi Arai, Shoko Araki, Atunori Ogawa, Keisuke Kinoshita, Tomohiro Nakatani (NTT) EA2020-73 SIP2020-104 SP2020-38
EA, US, SP, SIP, IPSJ-SLP 2021-03-04
Online Online Evaluation of Attention Fusion based Audio-Visual Target Speaker Extraction on Real Recordings
Hiroshi Sato, Tsubasa Ochiai, Keisuke Kinoshita, Marc Delcroix, Tomohiro Nakatani, Shoko Araki (NTT) EA2020-88 SIP2020-119 SP2020-53
SIS, ITE-BCT 2020-10-02
Online Online [Invited Talk] Target speech extraction in speech mixtures with SpeakerBeam
Marc Delcroix (NTT), Katerina Zmolikova (BUT), Keisuke Kinoshita, Tsubasa Ochiai, Tomohiro Nakatani, Shoko Araki (NTT) SIS2020-26
SIP 2020-08-27
Online Online [Invited Talk] Recent advances in conversational speech recognition -- source separation, diarization, and end-to-end speech recognition --
Keisuke Kinoshita, Marc Delcroix (NTT), Thilo von Neumann (PUB), Tomohiro Nakatani, Shoko Araki (NTT) SIP2020-29
SP, EA, SIP 2020-03-02
Okinawa Okinawa Industry Support Center
(Cancelled but technical report was issued)
[Invited Talk] Target speech extraction in speech mixtures with SpeakerBeam
Marc Delcroix (NTT), Katerina Zmolikova (BUT), Keisuke Kinoshita, Tsubasa Ochiai, Tomohiro Nakatani, Shoko Araki (NTT) EA2019-105 SIP2019-107 SP2019-54
(Joint)
Okinawa   [Poster Presentation] Intelligibility of speech with additive bubble noise and enhancement under hearing impairment simulation
Narumi Ohashi, Naoko Yomura, Katsuhiko Yamamoto (Wakayama Univ.), Shoko Araki, Keisuke Kinoshita, Tomohiro Nakatani (NTT), Toshio Irino (Wakayama Univ.) EA2017-116 SIP2017-125 SP2017-99
Subjective experiments were performed to develop speech intelligibility (SI) prediction metrics for both hearing-impaire...
EA 2014-10-24
Tokyo Central Research Laboratory, Hitachi, Ltd. [Invited Talk] Speech enhancement techniques in multi-speaker spontaneous speech recognition for conversation scene analysis
Shoko Araki, Takaaki Hori, Tomohiro Nakatani (NTT) EA2014-25
This paper illustrates speech enhancement techniques for multi-speaker distant-talk speech recognition, where a conversa...
EA 2013-10-11
Kyoto NTT CS Lab. Source number estimation under reverberant and underdetermined conditions based on clustering of source activity sequences
Nobutaka Ito, Ingrid Jafari, Shoko Araki, Tomohiro Nakatani (NTT) EA2013-66
SP, EA, SIP 2013-05-16
Okayama   Permutation-free clustering-based source separation based on time-varying mixture weights
Nobutaka Ito, Shoko Araki, Tomohiro Nakatani (NTT) EA2013-2 SIP2013-2 SP2013-2
To avoid the permutation problem in clustering-based source separation, we introduce a mixture model with time-varying, ...
Yamagata Hotel Takinoyu (Yamagata Pref.) [Invited Talk] Research on Meeting Analysis and Its Perspective
Takaaki Hori, Shoko Araki, Kazuhiro Otsuka, Tomohiro Nakatani, Atsushi Nakamura, Junji Yamato (NTT) SP2012-52
EA 2011-11-18
Kumamoto Kumamoto Univ. Underdetermined BSS in noisy environments with new analytical update rule for TDOA inference
Takuro Maruyama (Tsukuba Univ), Shoko Araki, Tomohiro Nakatani (NTT), Shigeki Miyabe, Takeshi Yamada, Shoji Makino (Tsukuba Univ), Atsushi Nakamura (NTT) EA2011-86
EA, SIP, SP 2011-05-12
Osaka Ritsumeikan Univ. [Invited Talk] Microphone array speech processing techniques for conversation scene analysis
Shoko Araki, Masakiyo Fujimoto, Takuya Yoshioka, Takaaki Hori, Tomohiro Nakatani (NTT) EA2011-15 SIP2011-15 SP2011-15
Recognition and understanding of conversation scenes has recently been tackled to achieve a variety of tasks such as aut...
EA 2010-12-10
Ibaraki Univ. of Tsukuba [Invited Talk] Convolutive blind source separation using time-frequency masks
Hiroshi Sawada, Shoko Araki (NTT) EA2010-104
A blind source separation method for convolutive mixtures is presented. The method is based on time-frequency masks and...
SP, NLC 2008-12-09
Tokyo Waseda Univ. Speaker diarization of multi-party conversations based on audio and visual information integration
Kentaro Ishizuka, Shoko Araki, Kazuhiro Otsuka, Masakiyo Fujimoto, Tomohiro Nakatani (NTT) NLC2008-28 SP2008-83
This paper proposes a speaker diarization method, which detects “who spoke when” in multi-party conversations, based on ...
PRMU, MVE 2008-11-27
Osaka Osaka Univ. A realtime Multimodal System toward Multiparty Conversation Scene Analysis -- Integrating Face Pose Tracking and Speaker Diarization using Multimodal Omnidirectional Sensors --
Kazuhiro Otsuka, Shoko Araki, Kentaro Ishizuka, Masakiyo Fujimoto, Junji Yamato (NTT) PRMU2008-119 MVE2008-68
EA 2008-07-18
Nara   Speaker diarization for meetings by integrating speech presence probability estimation and time-frequency domain direction of arrival estimation
Shoko Araki, Masakiyo Fujimoto, Kentaro Ishizuka, Tomohiro Nakatani, Hiroshi Sawada, Shoji Makino (NTT) EA2008-40
This paper presents a meeting diarization system that estimates who spoke when in a meeting. Our proposed system is real...
