Sat, May 24 AM 08:50 - 09:00 |
(1) SP |
08:50-09:00 |
"Ongaku" Symposium 2014: The 2nd Symposium on Any Topics Related to Acoustics, Audition and Natural Language |
Hirokazu Kameoka (Univ. of Tokyo/NTT), Eriko Aiba (UEC), Yasunori Ohishi (NTT), Tetsuro Kitahara (Nihon Univ.), Tatsuya Kitamura (Konan Univ.), Shoei Sato (NHK), Masahito Togami (Hitachi), Tomoki Toda (NAIST), Kazuyoshi Yoshii (Kyoto Univ.) |
Sat, May 24 AM 09:00 - 09:45 |
(2) SP |
09:00-09:45 |
[Invited Talk]
Speaker adaptation technologies for speech synthesis and its application to assistive technology |
Junichi Yamagishi (NII) |
Sat, May 24 AM 09:45 - 10:30 |
(3) |
09:45-10:30 |
|
Sat, May 24 AM 10:30 - 11:15 |
(4) SP |
10:30-11:15 |
[Invited Talk]
Infinite data analysis and Bayesian nonparametrics for audio signal processing |
Masahiro Nakano (NTT) |
Sat, May 24 AM 11:15 - 15:30 |
Sat, May 24 PM 15:30 - 16:15 |
(5) SP |
15:30-16:15 |
[Invited Talk]
From multimodal spatial hearing to engineering applications to cope with severe disasters
-- Our recent research results on spatial acoustic information sciences -- |
Yo-iti Suzuki, Shuichi Sakamoto (Tohoku Univ.) |
Sat, May 24 PM 16:15 - 17:00 |
(6) |
16:15-17:00 |
|
Sat, May 24 PM 17:00 - 17:45 |
(7) |
17:00-17:45 |
|
Sun, May 25 AM 09:00 - 09:45 |
(8) SP |
09:00-09:45 |
[Invited Talk]
Behavioral neurosciences of vocal control and learning
-- using the songbird as a model system -- |
Ryosuke O. Tachibana (Univ. of Tokyo) |
Sun, May 25 AM 09:45 - 10:30 |
(9) SP |
09:45-10:30 |
[Invited Talk]
Machine Translation
-- Why couldn't we do it? Why are we starting to be able to now? -- |
Graham Neubig (NAIST) |
Sun, May 25 AM 10:30 - 11:15 |
(10) SP |
10:30-11:15 |
[Invited Talk]
Applications and Advances of Deep Learning for Automatic Speech Recognition |
Yotaro Kubo (Amazon) |
Sun, May 25 AM 11:15 - 15:30 |
Sun, May 25 PM 15:30 - 16:15 |
(11) SP |
15:30-16:15 |
[Invited Talk]
R&D of Music Information Retrieval Technology and Issues for its Deployment to Practical Applications |
Keiichiro Hoashi (KDDI Labs) |
Sun, May 25 PM 16:15 - 17:00 |
(12) SP |
16:15-17:00 |
[Invited Talk]
What Higher-Order Statistics Tell Us?
-- Acoustic Signal Processing Based on Unsupervised Learning -- |
Hiroshi Saruwatari (Univ. of Tokyo) |
Sun, May 25 PM 17:00 - 17:45 |
(13) |
17:00-17:45 |
|
Sun, May 25 PM 17:45 - 18:00 |
Sat, May 24 AM 11:30 - 15:30 |
(14) |
11:30-15:30 |
|
(15) |
11:30-15:30 |
|
(16) |
11:30-15:30 |
|
(17) |
11:30-15:30 |
|
(18) |
11:30-15:30 |
|
(19) |
11:30-15:30 |
|
(20) |
11:30-15:30 |
|
(21) |
11:30-15:30 |
|
(22) |
11:30-15:30 |
|
(23) |
11:30-15:30 |
|
(24) |
11:30-15:30 |
|
(25) SP |
11:30-15:30 |
A Consideration of Evaluation Measurements in Spoken Term Detection |
Satoshi Oshima, Yoshiaki Itoh (Iwate Prefectural Univ.) |
(26) SP |
11:30-15:30 |
Robustness of Speaker Identification Using Pseudo Pitch Synchronized Phase Information |
Yuta Kawakami, Longbiao Wang (Nagaoka Univ. of Tech.), Atsuhiko Kai (Shizuoka Univ.), Seiichi Nakagawa (Toyohashi Univ. of Tech.) |
(27) SP |
11:30-15:30 |
Visualization of World Englishes pronunciations from a speaker's self-centered viewpoint using attributes of accent, gender, and age |
Yuji Kawase, Nobuaki Minematsu, Daisuke Saito, Keikichi Hirose (UTokyo), Han-Ping Shen (NCKU) |
(28) |
11:30-15:30 |
|
(29) SP |
11:30-15:30 |
Native language recognition using machine learning |
Ryota Sakagami, Kouki Takeshita, Longbiao Wang, Masahiro Iwahashi (Nagaoka Univ. of Tech.) |
(30) SP |
11:30-15:30 |
Language recognition in reverberant environments |
Kouki Takeshita, Ryota Sakagami, Longbiao Wang, Masahiro Iwahashi (Nagaoka Univ. of Tech.) |
(31) SP |
11:30-15:30 |
Discriminative training of acoustic models for system combination |
Yuuki Tachioka (Mitsubishi Electric), Shinji Watanabe, Jonathan Le Roux, John R. Hershey (MERL) |
(32) SP |
11:30-15:30 |
Distant-talking Speech Recognition with Asynchronous Speech Recording |
Shunta Teraoka, Yuma Ueda (Shizuoka Univ.), Longbiao Wang (Nagaoka Univ. of Tech.), Atsuhiko Kai, Taku Fukushima (Shizuoka Univ.) |
(33) |
11:30-15:30 |
|
(34) |
11:30-15:30 |
|
(35) SP |
11:30-15:30 |
[Research Introduction] A spectrogram-patch-input DNN model for detection and classification of acoustic events robust to speech overlapping scenarios |
Miquel Espi, Masakiyo Fujimoto, Yotaro Kubo, Tomohiro Nakatani (NTT) |
(36) SP |
11:30-15:30 |
Development of environmental sound collection system using smart devices based on crowd-sourcing approach |
Sunao Hara, Akinori Kasai, Masanobu Abe (Okayama Univ.), Noboru Sonehara (NII) |
(37) SP |
11:30-15:30 |
ROCKON: Environmental sound collection and recognition system using smartphones |
Minori Matsuyama, Takahiko Tsuda, Ryuichi Nisimura, Hideki Kawahara (Wakayama Univ.), Junnosuke Yamada (NTT), Toshio Irino (Wakayama Univ.) |
(38) |
11:30-15:30 |
|
(39) |
11:30-15:30 |
|
(40) |
11:30-15:30 |
|
(41) |
11:30-15:30 |
|
(42) SP |
11:30-15:30 |
Underdetermined Blind Separation of Moving Sources Based on Probabilistic Modeling |
Takuya Higuchi, Norihiro Takamune, Tomohiko Nakamura (Univ. of Tokyo), Hirokazu Kameoka (Univ. of Tokyo/NTT) |
(43) SP |
11:30-15:30 |
Psychometric functions for across-frequency gap detection |
Yousuke Kikuchi, Takako Mitsudo, Nobuyuki Hirose, Shuji Mori (Kyushu Univ.) |
(44) SP |
11:30-15:30 |
Deriving the Salience Level of a Target Sound using a Tapping Technique Method |
Shunsuke Kidani, Hsin-I Liao, Makoto Yoneya, Makio Kashino, Shigeto Furukawa (NTT) |
(45) SP |
11:30-15:30 |
Perception of stop consonants at the beginning of binaurally fused words |
Hitomi Kondo, Yousuke Kikuchi, Takako Mitsudo, Nobuyuki Hirose, Shuji Mori (Kyushu Univ.) |
(46) SP |
11:30-15:30 |
Effect of interaural time difference for localization of spatially segregated sound |
Daisuke Morikawa (JAIST) |
(47) SP |
11:30-15:30 |
Acquisition and retention of perceptual cue for size judgment using whispered speech |
Koudai Yamamoto, Toshio Irino, Ryuichi Nisimura, Hideki Kawahara (Wakayama Univ.) |
Sun, May 25 AM 11:30 - 15:30 |
(48) |
11:30-15:30 |
|
(49) |
11:30-15:30 |
|
(50) |
11:30-15:30 |
|
(51) |
11:30-15:30 |
|
(52) |
11:30-15:30 |
|
(53) |
11:30-15:30 |
|
(54) |
11:30-15:30 |
|
(55) |
11:30-15:30 |
|
(56) |
11:30-15:30 |
|
(57) |
11:30-15:30 |
|
(58) SP |
11:30-15:30 |
Analysis of the Relationship between Pitch and Formant Frequencies in Voice Register Transition |
Yasufumi Uezu, Takahiro Furukawa, Tokihiko Kaburagi (Kyushu Univ.) |
(59) SP |
11:30-15:30 |
Statistical bandwidth extension using sub-band basis spectrum model |
Yamato Ohtani, Masatsune Tamura, Masahiro Morita, Masami Akamine (Toshiba) |
(60) SP |
11:30-15:30 |
Text-to-speech prosody synthesis based on probabilistic model for F0 contour |
Kento Kadowaki, Tatsuma Ishihara, Nobukatsu Hojo (Univ. of Tokyo), Hirokazu Kameoka (Univ. of Tokyo/NTT) |
(61) SP |
11:30-15:30 |
Evaluation of singing voice similarity based on "acoustic singing-structure" |
Shun Kojima, Takeshi Saitou, Masato Miyoshi (Kanazawa Univ.) |
(62) SP |
11:30-15:30 |
Statistical approach to perceived age control of singing voice |
Kazuhiro Kobayashi, Tomoki Toda (NAIST), Tomoyasu Nakano, Masataka Goto (AIST), Graham Neubig, Sakriani Sakti, Satoshi Nakamura (NAIST) |
(63) SP |
11:30-15:30 |
A portable application for assistance of vocal sound training by overtone analysis |
Iori Sugahara, Takayuki Itoh (Ochanomizu Univ.) |
(64) SP |
11:30-15:30 |
An Evaluation of a Hybrid Approach to Electrolaryngeal Speech Enhancement Based on Noise Reduction and Statistical Excitation Prediction |
Kou Tanaka, Tomoki Toda, Graham Neubig, Sakriani Sakti, Satoshi Nakamura (NAIST) |
(65) SP |
11:30-15:30 |
Design of voice-enabled web test system for eliminating users' impatience |
Chihiro Tafuji, Ryuichi Nisimura, Hideki Kawahara, Toshio Irino (Wakayama Univ.) |
(66) SP |
11:30-15:30 |
A joint restricted Boltzmann machine for dictionary learning in sparse-representation-based voice conversion |
Toru Nakashika, Tetsuya Takiguchi, Yasuo Ariki (Kobe Univ.) |
(67) SP |
11:30-15:30 |
Speech waveform generation on subband domain |
Nobuyuki Nishizawa, Tsuneo Kato (KDDI R&D Labs) |
(68) SP |
11:30-15:30 |
A Kana Protocol Recommendation Method for Switch Input Speech Synthesis Systems |
Fuming Fang, Takahiro Shinozaki, Takao Kobayashi (Tokyo Tech) |
(69) SP |
11:30-15:30 |
Current situations and issues of open-source high-quality speech synthesis system WORLD |
Masanori Morise (Univ. of Yamanashi) |
(70) SP |
11:30-15:30 |
The Acoustic Feature of the Loudspeaker which used the Reinforced Corrugated Fibreboard for the Enclosure Material |
Takuto Isoyama, Yukio Mori (Salesian Polytechnic), Yoshiaki Kiyama |
(71) SP |
11:30-15:30 |
Spot-forming method by using two shotgun microphones |
Motoyuki Suzuki, Takeshi Honjo (Osaka Inst. of Tech.) |
(72) SP |
11:30-15:30 |
Signal processing of ultrasound for osteoporosis diagnosis
-- Modeling, time domain analysis, and frequency domain analysis -- |
Yoshiki Nagatani (KCCT), Ryosuke O. Tachibana (Univ. of Tokyo) |
(73) SP |
11:30-15:30 |
Modulation transfer function based robust method of voice activity detection for noisy reverberant environments
-- Utilization of subband SNR estimation -- |
Shota Morita, Masashi Unoki (JAIST), Xugang Lu (NICT), Masato Akagi (JAIST) |
(74) SP |
11:30-15:30 |
Systematic study on kawaii products (The seventeenth report)
-- Basic study for Kawaii sound -- |
Michiko Ohkura, Ryo Kanno (Shibaura Inst. Tech.) |
(75) SP |
11:30-15:30 |
The basic mechanisms for perception of simultaneity, stream segregation, and temporal order for auditory stimuli |
Satoshi Okazaki, Makoto Ichikawa (Chiba Univ.) |
(76) |
11:30-15:30 |
|
(77) |
11:30-15:30 |
|
(78) SP |
11:30-15:30 |
[Research Introduction] Adaptive adjustment of local temporal structure in song of Bengalese finches |
Ryosuke O. Tachibana, Neal A. Hessler, Kazuo Okanoya (Univ. of Tokyo) |
(79) SP |
11:30-15:30 |
Modulation of the Temporal Dynamics of Microsaccades with the Presentation of Salient Sounds |
Makoto Yoneya, Hsin-I Liao, Shunsuke Kidani, Shigeto Furukawa (NTT), Makio Kashino (NTT/Tokyo Tech) |