Committee | Date Time | Place | Paper Title / Authors | Abstract | Paper #
SIS |
2024-03-14 14:00 |
Kanagawa |
Kanagawa Institute of Technology (Primary: On-site, Secondary: Online) |
Consideration on divisions and combinations of Learning Data for Speaker Diarization in Multiple Speakers Kaito Uemura, Keiichi Horio (Kyushu Institute of Technology) SIS2023-48 |
Today, the importance of a speech segment detection technique called speaker diarization is increasing, mainly in the fi... [more] |
SIS2023-48 pp.17-20 |
MIKA (3rd) |
2023-10-11 14:30 |
Okinawa |
Okinawa Jichikaikan (Primary: On-site, Secondary: Online) |
[Poster Presentation]
Voice Recognition AR System Using Edge Computing Taito Baba, Ryo Midorikawa, Takumi Senaha, Toma Uruizaka, Takuya Asaka (Tokyo Metropolitan Univ.) |
People with unilateral hearing loss struggle with sound localization and voice recognition. Additionally, users of noise... [more] |
|
PRMU, IBISML, IPSJ-CVIM [detail] |
2023-03-03 16:50 |
Hokkaido |
Future University Hakodate (Primary: On-site, Secondary: Online) |
Parallel-Data-Free Japanese Singer Conversion using CycleGAN Considering Perceptual Loss in Singing Phoneme Sequences Kanade Gemmoto, Nobutaka Shimada, Tadashi Matsuo (Ritsumeikan Univ) PRMU2022-114 IBISML2022-121 |
This paper proposes a one-to-one Japanese Singing Voice Conversion (SVC) method without using parallel data. Our method... [more] |
PRMU2022-114 IBISML2022-121 pp.293-298 |
LOIS, ICM |
2023-01-19 17:15 |
Fukuoka |
Kitakyushu International Conference Center (Primary: On-site, Secondary: Online) |
Consider Crowdsourcing Support for Automated Minute Taking Shun Kuroiwa, Kazumu Nakahira, Takahiro Koita (Doshisha Univ.) ICM2022-40 LOIS2022-40 |
Minutes of meetings require a huge amount of cost to record all the conversations in a meeting. In recent years, many res... [more] |
ICM2022-40 LOIS2022-40 pp.54-58 |
EA, US (Joint) |
2022-12-22 16:50 |
Hiroshima |
Satellite Campus Hiroshima |
[Poster Presentation]
Data augmentation method for machine learning on speech data Tsubasa Maruyama (Tokyo Tech), Tsutomu Ikegami (AIST), Toshio Endo (Tokyo Tech), Takahiro Hirofuchi (AIST) EA2022-68 |
In machine learning, data augmentation is a method to enhance the number and diversity of data by adding transformations... [more] |
EA2022-68 pp.42-48 |
NS, SR, RCS, SeMI, RCC (Joint) |
2022-07-13 14:50 |
Ishikawa |
The Kanazawa Theatre + Online (Primary: On-site, Secondary: Online) |
Investigation of noise removal using U-Net and voice recognition performance improvement
-- for train running noise -- Jian Lin, Shota Sano, Yuusuke Kawakita, Tsuyoshi Miyazaki, Hiroshi Tanaka (KAIT) SeMI2022-26 |
A method for converting noisy sound into images to remove the noise has been proposed. We are attempting to remove train... [more] |
SeMI2022-26 pp.34-39 |
SeMI, IPSJ-MBL, IPSJ-DPS, IPSJ-ITS |
2021-05-27 09:50 |
Online |
Online |
Investigation and Evaluation Experiment of Noise Removal for Voice Recognition in Specific Noisy Environment Shota Sano, Fumitaka Murakami, Yuusuke Kawakita, Tsuyoshi Miyazaki, Hiroshi Tanaka (KAIT) SeMI2021-2 |
In this manuscript, the noise removal performance and speech recognition accuracy are described when noise is removed by ... [more] |
SeMI2021-2 pp.5-10 |
MVE, IMQ, IE, CQ (Joint) [detail] |
2021-03-01 11:35 |
Online |
Online |
Examination of voice input and treatment detection by video analysis for endoscopic findings creation support Chihiro Takigami, Mai Fujie (Chiba Univ.), Yuichiro Yoshimura (Toyama Univ.), Toshiya Nakaguchi (Chiba Univ.) IMQ2020-12 IE2020-52 MVE2020-44 |
Currently, endoscopy findings are mainly recorded by manual entry after the examination, which is burdensome due to the ... [more] |
IMQ2020-12 IE2020-52 MVE2020-44 pp.13-16 |
ICM, LOIS |
2021-01-21 15:30 |
Online |
Online |
[Encouragement Talk]
Proposal and Usability Testing of the Voice Command System for End-User Koya Hidetaka, Makoto Komiyama, Akira Kataoka, Haruo Oishi (NTT) ICM2020-41 LOIS2020-29 |
Voice User Interfaces (VUIs) are increasingly being used in a variety of fields due to the dramatic improvement of voice r... [more] |
ICM2020-41 LOIS2020-29 pp.39-44 |
WIT, IPSJ-AAC |
2020-03-13 15:20 |
Ibaraki |
Tsukuba University of Technology (Cancelled but technical report was issued) |
Effects of speech reverberation on location recognition Yuji Sone, Sawako Nakajima, Kazutaka Mitobe (Akita Univ.) WIT2019-47 |
In recent years, Internet-related services have evolved greatly, and lots of personal experiences can be recorded and st... [more] |
WIT2019-47 pp.51-54 |
SeMI |
2020-01-31 09:00 |
Kagawa |
|
Proposal and Initial Evaluation of Refrigerator Door Identification Method Using Door Opening and Closing Sound Yudai Mitsukude, Kenta Hayashi, Shigemi Ishida, Yutaka Arakawa, Akira Fukuda (Kyushu Univ.) SeMI2019-115 |
In recent years, the development of sensing technology has enabled sensing in various situations. IoT products have been ... [more] |
SeMI2019-115 pp.63-68 |
HCS |
2020-01-26 16:40 |
Oita |
Room407, J:COM HorutoHall OITA (Oita) |
Children's emotion recognition based on vocal cues
-- A review of the literature on vocal emotion recognition -- Naomi Watanabe, Tessei Kobayashi (NTT) HCS2019-82 |
(To be available after the conference date) [more] |
HCS2019-82 pp.163-166 |
HCGSYMPO (2nd) |
2019-12-11 - 2019-12-13 |
Hiroshima |
Hiroshima-ken Joho Plaza (Hiroshima) |
Crosslingual Emotion Recognition using English and Japanese Speech Data Yuta Nirasawa, Atom Scotto, Ryota Sakuma, Yuki Hujita, Keiich Zempo (Tsukuba Univ.) |
Since research in Speech Emotion Recognition (SER) is performed with mostly English data, applying these models to Japan... [more] |
|
HCGSYMPO (2nd) |
2019-12-11 - 2019-12-13 |
Hiroshima |
Hiroshima-ken Joho Plaza (Hiroshima) |
English-use support system using multimodal interface Kaito Aoki, Fumihiko Ishida (NIT, Toyama Col.) |
In recent years, English has come to be used frequently due to globalization. In actual English communication, even if you study Engl... [more] |
|
SRW, SeMI, CNR (Joint) |
2019-11-05 16:10 |
Tokyo |
Kozo Keisaku Engineering Inc. |
[Invited Talk]
Robot AI Platform x "unibo" Toshikazu Kanaoka (Fujitsu) |
Fujitsu developed "robot AI platform" as a service platform providing natural communication between people and robots. T... [more] |
|
KBSE |
2019-03-02 12:30 |
Kyoto |
Doshisha University Kambaikan |
Toward Automatic Generation of Meeting Minutes by Visualization of Voice Information Yusuke Nakamura, Takeshi Nakase, Satoshi Yajima, Yutaka Matsuno (Nihon Univ.) KBSE2018-54 |
Today, the minutes at meetings in organizations such as companies are important. However, it is impossible to see the mi... [more] |
KBSE2018-54 pp.1-5 |
NLC, IPSJ-IFAT |
2019-02-07 15:30 |
Kyoto |
Ryukoku University Omiya Campus |
[Invited Talk]
ForeSight Voice Mining, a voice mining system for contact centers Kazuhiro Arai (NTT-TX) NLC2018-40 |
This paper describes ForeSight Voice Mining that NTT TechnoCross Corp. provides for contact centers. ForeSight Voice Min... [more] |
NLC2018-40 pp.27-32 |
HCGSYMPO (2nd) |
|
Mie |
Sinfonia Technology Hibiki Hall Ise |
Mood Improvement by Multiple Personality Assistant Agent in Speech Recognition Failure Takehiro Hondo, Ippei Naganuma, Kazuki Kobayashi (Shinshu Univ.) |
This paper proposes a method to create a mood in a human agent speech interaction and investigates the created mood by a... [more] |
|
SIS, ITE-BCT |
2018-10-25 15:40 |
Kyoto |
Kyoto University Clock Tower Centennial Hall |
[Tutorial Lecture]
Development and Application of Voice User Interface for Device Operation with High Usability Noboru Hayasaka (OECU) SIS2018-15 |
Speech recognition frameworks in smart speakers and smartphones require a network connection, which increases the burd... [more] |
SIS2018-15 pp.57-62 |
SIP, EA, SP, MI (Joint) [detail] |
2018-03-19 10:25 |
Okinawa |
|
Non-parallel and Many-to-Many Voice Conversion Using Variational Autoencoder Conditioned by Phonetic Posteriorgrams and d-vectors Yuki Saito (NTT/Univ. of Tokyo), Yusuke Ijima, Kyosuke Nishida (NTT), Shinnosuke Takamichi (Univ. of Tokyo) EA2017-105 SIP2017-114 SP2017-88 |
This paper proposes novel frameworks for non-parallel and many-to-many voice conversion (VC) using variational autoencod... [more] |
EA2017-105 SIP2017-114 SP2017-88 pp.21-26 |