Fri, Jun 18 AM 09:20 - 09:30 |
(1) |
09:20-09:30 |
|
Fri, Jun 18 AM 09:30 - 10:40 |
(2) |
09:30-10:40 |
|
|
10:40-10:50 |
Break ( 10 min. ) |
Fri, Jun 18 AM 10:50 - 12:00 |
(3) |
10:50-12:00 |
|
|
12:00-13:00 |
Break ( 60 min. ) |
Fri, Jun 18 PM 13:00 - 15:00 |
(4) |
13:00-15:00 |
|
(5) |
13:00-15:00 |
|
(6) |
13:00-15:00 |
|
(7) |
13:00-15:00 |
|
(8) |
13:00-15:00 |
|
(9) |
13:00-15:00 |
|
(10) |
13:00-15:00 |
|
(11) SP |
13:00-15:00 |
Tools and practice for supporting recommended protocol for acoustic recording of speech data for high usability
-- Application of cascaded all-pass filters with randomized center frequencies and phase polarities -- |
Hideki Kawahara (Wakayama Univ.), Kohei Yatabe (Waseda Univ.), Ken-Ichi Sakakibara (Health Sci. Univ. Hokkaido), Mitsunori Mizumachi (Kyushu Inst. Tech.), Masanori Morise (Meiji Univ.), Hideki Banno (Meijo Univ.), Toshio Irino (Wakayama Univ.) |
(12) |
13:00-15:00 |
|
(13) |
13:00-15:00 |
|
(14) |
13:00-15:00 |
|
(15) |
13:00-15:00 |
|
(16) |
13:00-15:00 |
|
(17) |
13:00-15:00 |
|
(18) |
13:00-15:00 |
|
(19) SP |
13:00-15:00 |
F0 estimation of speech based on l2-norm regularized TV-CAR analysis |
Keiichi Funaki (Univ. of the Ryukyus) |
Fri, Jun 18 PM 15:00 - 17:00 |
(20) |
15:00-17:00 |
|
(21) |
15:00-17:00 |
|
(22) |
15:00-17:00 |
|
(23) |
15:00-17:00 |
|
(24) |
15:00-17:00 |
|
(25) |
15:00-17:00 |
|
(26) |
15:00-17:00 |
|
(27) |
15:00-17:00 |
|
(28) |
15:00-17:00 |
|
(29) |
15:00-17:00 |
|
(30) SP |
15:00-17:00 |
A Beginner's Introduction to Sound Programming for Digital Stomp Boxes |
Naofumi Aoki (Hokkaido Univ.) |
(31) SP |
15:00-17:00 |
Protection method with audio processing against Audio Adversarial Example |
Taisei Yamamoto, Yuya Tarutani, Yukinobu Fukusima, Tokumi Yokohira (Okayama Univ.) |
(32) SP |
15:00-17:00 |
Speech Intelligibility Experiments using crowdsourcing
-- from designing Web page to Data screening -- |
Ayako Yamamoto, Toshio Irino (Wakayama Univ.), Kenichi Arai, Shoko Araki, Atsunori Ogawa, Keisuke Kinoshita, Tomohiro Nakatani (NTT) |
(33) |
15:00-17:00 |
|
(34) SP |
15:00-17:00 |
[Poster Presentation]
Scream detection based on deep learning using time-sequential spectral and cepstral features |
Takahiro Fukumori (Ritsumeikan Univ.) |
|
17:00-17:10 |
Break ( 10 min. ) |
Fri, Jun 18 PM 17:10 - 18:20 |
(35) SP |
17:10-18:20 |
[Invited Talk]
Spoken Dialogue System for Android ERICA
-- A Multimodal Turing Test Challenge -- |
Koji Inoue (Kyoto Univ.) |
Sat, Jun 19 AM 09:30 - 10:40 |
(36) SP |
09:30-10:40 |
[Invited Talk]
Toward a Unification of Various Speech Processing Tasks Based on End-to-End Neural networks |
Shinji Watanabe (CMU) |
|
10:40-10:50 |
Break ( 10 min. ) |
Sat, Jun 19 AM 10:50 - 12:00 |
(37) |
10:50-12:00 |
|
|
12:00-13:00 |
Break ( 60 min. ) |
Sat, Jun 19 PM 13:00 - 15:00 |
(38) |
13:00-15:00 |
|
(39) |
13:00-15:00 |
|
(40) |
13:00-15:00 |
|
(41) |
13:00-15:00 |
|
(42) |
13:00-15:00 |
|
(43) SP |
13:00-15:00 |
Creating of Japanese Phoneme Balanced Sentences for Speech Synthesis |
Yuko Takai, Naofumi Aoki, Yoshinori Dobashi (Hokkaido Univ.) |
(44) SP |
13:00-15:00 |
Verifying the Method to Generate Stage Data for Rhythm Game Using Machine Learning |
Atsuhito Udo, Naofumi Aoki, Yoshinori Dobashi (Hokkaido Univ.) |
(45) SP |
13:00-15:00 |
Low Loss Machine Learning for Digital Modeling of Distortion Stomp Boxes. |
Yuto Matsunaga, Naofumi Aoki, Yoshinori Dobashi (Hokkaido Univ.), Tetsuya Kojima (NITTC) |
(46) SP |
13:00-15:00 |
A Study on Error Correction for Improving the Accuracy of Acoustic Models |
Saki Anazawa, Naofumi Aoki, Yoshinori Dobashi (Hokkaido Univ.) |
(47) |
13:00-15:00 |
|
(48) SP |
13:00-15:00 |
A Research Related to the Fricative Sound Determination in Digital Pattern Playback |
Hiroki Otake, Naofumi Aoki, Kosei Ozeki, Yoshinori Dobashi (Hokkaido Univ.) |
(49) |
13:00-15:00 |
|
(50) |
13:00-15:00 |
|
(51) SP |
13:00-15:00 |
Study on the background cancellation system for speech privacy |
Jiangning Huang, Akinori Ito (Tohoku Univ.) |
Sat, Jun 19 PM 15:00 - 17:00 |
(52) |
15:00-17:00 |
|
(53) SP |
15:00-17:00 |
Simulation of Body-conducted Speech and Synthesis of One's Own Voice with a Sound-proof Earmuff and Bone-conduction Microphones |
Chen Ruiyan, Nishimura Tazuko, Minematsu Nobuaki, Saito Daisuke (UTokyo) |
(54) SP |
15:00-17:00 |
How logical properties in speech are processed in the brain
-- Digital Linguistics -- |
Kumon Tokumaru (Writer) |
(55) |
15:00-17:00 |
|
(56) SP |
15:00-17:00 |
Investigation on fine-tuning with image classification networks for deep neural network-based musical instrument classification |
Yuki Shiroma, Yuma Kinoshita, Sayaka Shiota, Hitoshi Kiya (TMU) |
(57) SP |
15:00-17:00 |
Dynamic Display of Guidelines in Interactive Speech Synthesizer |
Daiki Goto (Hokkai Gakuen Univ.), Naofumi Aoki, Keisuke Ai (Hokkaido Univ.), Kunitoshi Motoki (Hokkai Gakuen Univ.) |
(58) SP |
15:00-17:00 |
Preliminary study on synthesizing relaxing voices
-- from a perspective of recognized/evoked emotions and acoustic features -- |
Yuki Watanabe, Shuichi Sakamoto (Tohoku Univ.), Takayuki Hoshi, Yoshiki Nagatani, Manabu Nakano (Pixie Dust Technologies) |
(59) SP |
15:00-17:00 |
Unseen speaker's Voice Conversion by FaderNetVC with Speaker Feature Extractor |
Takumi Isako, Takuya Kishida, Toru Nakashika (UEC) |
(60) |
15:00-17:00 |
|
(61) SP |
15:00-17:00 |
Development of ultrasonic signal classification system using deep learning |
Kosei Ozeki, Naofumi Aoki, Yoshinori Dobashi (Hokkaido Univ.), Kenichi Ikeda, Hiroshi Yasuda (SST) |
(62) SP |
15:00-17:00 |
Source Separation for Asynchronous Recordings of Conversation Using Time-Frequency Masking and Independent Vector Analysis |
Haruki Nammoku, Kouei Yamaoka, Yukoh Wakabayashi, Nobutaka Ono (TMU) |
(63) SP |
15:00-17:00 |
Neural speech synthesis using local phrase dependency structure information |
Nobuyoshi Kaiki, Sakriani Sakti, Satoshi Nakamura (NAIST) |
(64) |
15:00-17:00 |
|
|
17:00-17:10 |
Break ( 10 min. ) |
Sat, Jun 19 PM 17:10 - 18:20 |
(65) |
17:10-18:20 |
|
Sat, Jun 19 PM 18:20 - 18:30 |
(66) |
18:20-18:30 |
|