Thu, Jul 16 PM 13:30 - 14:30 |
(1) SP |
13:30-14:00 |
Acoustic data-driven pronunciation lexicon for non-native speech recognition |
Satoshi Tsujioka (NAIST), Liang Lu (University of Edinburgh), Sakriani Sakti, Graham Neubig, Tomoki Toda, Satoshi Nakamura (NAIST) |
(2) SP |
14:00-14:30 |
A spoken term detection method matching at a frame level |
Ryota Konno, Kazunori Kojima (IPU), Shi-wook Lee (AIST), Kazuyo Tanaka (Univ. of Tsukuba), Yoshiaki Itoh (IPU) |
Thu, Jul 16 PM 15:10 - 15:40 |
(3) SP |
15:10-15:40 |
A study on discriminative approach for estimation of the divergence between distributions and its application to language identification |
Yosuke Kashiwagi, Congying Zhang, Daisuke Saito, Nobuaki Minematsu (Tokyo Univ.) |
Thu, Jul 16 PM 16:20 - 18:20 |
(4) SP |
16:20-16:50 |
Sequence Discriminative Training for Low-Rank Deep Neural Networks |
Yuuki Tachioka (Mitsubishi Electric), Shinji Watanabe, Jonathan Le Roux, John Hershey (MERL) |
(5) SP |
16:50-17:20 |
A Feature-Space Adaptation Technique using Regression Tree-based Multiple Transformation Matrices |
Hiroki Kanagawa, Yuuki Tachioka (Mitsubishi Electric Corp.), Shinji Watanabe (MERL), Jun Ishii (Mitsubishi Electric Corp.) |
(6) SP |
17:20-17:50 |
Experimental evaluation of network size effect in speaker adaptive trained DNNs embedding linear transformation networks |
Tsubasa Ochiai (Doshisha Univ./NICT), Shigeki Matsuda (Doshisha Univ.), Hideyuki Watanabe, Xugang Lu, Hisashi Kawai (NICT), Shigeru Katagiri (Doshisha Univ.) |
(7) SP |
17:50-18:20 |
Speaker Adaptation Technique for Speech Recognition using a Feature Augmentation Framework |
Hiroshi Fujimura, Takashi Masuko (TOSHIBA) |
Fri, Jul 17 AM 09:00 - 10:00 |
(8) SP |
09:00-09:30 |
Spoken Language Identification based on Language Modeling of Tandem-MLP Features |
Ryo Masumura, Taichi Asami, Hirokazu Masataki, Sumitaka Sakauchi (NTT) |
(9) SP |
09:30-10:00 |
Multiple Feed-forward Deep Neural Networks for Statistical Parametric Speech Synthesis |
Shinji Takaki (NII), SangJin Kim (Naver Labs), Junichi Yamagishi (NII), JongJin Kim (Naver Labs) |
Fri, Jul 17 AM 10:10 - 12:10 |
(10) SP |
10:10-11:10 |
[Invited Talk]
Image feature extraction and transfer learning using deep convolutional neural networks |
Hideki Nakayama (Univ. of Tokyo) |
(11) SP |
11:10-12:10 |
[Invited Talk]
Aspects of feature extraction in DNN acoustic models |
Takuya Yoshioka, Marc Delcroix, Masakiyo Fujimoto, Tomohiro Nakatani (NTT) |
Fri, Jul 17 PM 13:10 - 14:40 |
(12) SP |
13:10-13:40 |
A study on effectiveness of pop noise for speaker verification |
Shiori Nakano, Ryosuke Nakanishi, Sayaka Shiota, Hitoshi Kiya (Tokyo Metro. Univ.) |
(13) SP |
13:40-14:10 |
Voice liveness detection based on frequency characteristics for speaker verification |
Sayaka Shiota (Tokyo Metro. Univ.), Fernando Villavicencio, Junichi Yamagishi, Nobutaka Ono, Isao Echizen (NII), Tomoko Matsui (ISM) |
(14) SP |
14:10-14:40 |
Investigation of privacy-preserving sounds to degrade automatic speaker verification performance |
Kei Hashimoto (NITECH), Junichi Yamagishi, Isao Echizen (NII) |