BEYOND HMM --- Workshop on statistical modeling approach for speech recognition

Co-organized by the Speech Committee (SP) of the Institute of Electronics, Information and Communication Engineers (IEICE) / the Acoustical Society of Japan (ASJ), the Special Interest Group for Spoken Language Processing (SLP) of the Information Processing Society of Japan (IPSJ), and the Advanced Telecommunications Research Institute (ATR). Co-sponsored by the IEEE Signal Processing Society Japan Chapter.

WORKSHOP THEME

The Hidden Markov Model (HMM) has been widely used for speech recognition. However, its performance degrades significantly in adverse environments and when speech is highly spontaneous. Most attempts to overcome these problems are effective only in special applications. Consequently, there is a strong demand for a new modeling scheme beyond HMMs that is based on a comprehensive understanding of the inner nature of speech.

One promising approach is to employ statistical methods that make full use of large speech corpora and large computational power. In this workshop, we will discuss the exploratory research being pursued in this direction and attempt to foster scientific insight into statistical modeling for speech recognition.

PROGRAM

09:20 Greetings    Satoshi Nakamura (ATR)

09:30-10:45 Oral Session

   1. (Invited) "Production models for speech recognition",
Erik McDermott (NTT)

   2. (Invited) "Robust acoustic modeling for speech recognition",
Koichi Shinoda (Tokyo Institute of Technology)

   3. (Invited) "Design and Implementation of HMM/BN Acoustic Models",
Konstantin Markov and Satoshi Nakamura (ATR)

11:00-11:50 Oral Session

   4. (Invited) "SVMs, Score-Spaces and Maximum Margin Statistical Models",
Mark J.F. Gales (Cambridge University)

11:50-13:20 Lunch

13:20-15:00 Oral Session

   5. (Invited) "What HMMs Can't Do: A Graphical Model Perspective",
Jeff Bilmes (Univ. Washington)

   6. (Invited) "Minimum Bayes Risk Estimation and Decoding in Large Vocabulary Automatic Speech Recognition",
William Byrne (Cambridge University)

15:15-16:05 Poster Abstract Presentation

16:15-17:15 Poster Session

   7. "Asynchronous Articulatory Feature Recognition using Dynamic Bayesian Networks",
Mirjam Wester, Joe Frankel, and Simon King (University of Edinburgh, Great Britain)

   8. "Dynamic Bayesian Networks for Acoustic and Language Modeling",
Khalid Daoudi (IRIT-UPS, France)

   9. "Reformulating the HMM as a Trajectory Model",
Keiichi Tokuda, Heiga Zen, Tadashi Kitamura (Nagoya Institute of Technology)

   10. "Speech recognition method based on trajectories generated by Kalman filters",
Yasuhiro Minami (NTT)

   11. "Robustness of acoustic model topology determined by Variational Bayesian Estimation and Clustering for speech recognition for different speech data sets",
Shinji Watanabe and Atsushi Nakamura (NTT)

   12. "Variational Bayesian Based Topology Training and Mixture Component Splitting for Acoustic Modeling",
Takatoshi Jitsuhiro, Satoshi Nakamura (ATR)

   13. "Mixtures of Probabilistic Principal Component Analyzers in Speech Recognition",
Mike Schuster (NTT)

   14. "Aggregate A Posteriori Linear Regression Adaptation of Hidden Markov Models",
Jen-Tzung Chien and Chih-Hsien Huang (National Cheng Kung University, Taiwan)

   15. "Speaker recognition without feature extraction process",
Tomoko Matsui and Kunio Tanabe (The Institute of Statistical Mathematics)

   16. "Speaker Recognition using a Non-parametric Speaker Model Representation and Earth Mover's Distance",
Yoshiaki Umeda, Shingo Kuroiwa, Satoru Tsuge, Fuji Ren (Univ. of Tokushima)

17:15-17:45 Discussions

IMPORTANT DEADLINES

One page (A4) abstract ................ : September 15, 2004
Notification of Acceptance/Rejection .. : October 1, 2004
Six-page, camera-ready paper .......... : November 15, 2004
Note: The abstract should be sent to shinoda@cs.titech.ac.jp along with title and author information.

REGISTRATION FEE

There is no registration fee for this workshop. However, individuals interested in attending are requested to notify the secretariat of their intention to participate by December 1, 2004.

WORKSHOP LOCATION

The workshop will be held at the Advanced Telecommunications Research Institute (ATR).
ATR Web Site: http://www.atr.co.jp/html/access/access.html

HOTEL INFORMATION

The Keihanna Plaza Hotel is located directly in front of the workshop venue (ATR) and should be convenient for all attendees. Please see http://hotel.keihanna-plaza.co.jp/ for more information, and send e-mail to information@hotel.keihanna-plaza.co.jp for reservations. Many other hotels and inns are available in downtown Kyoto.

OFFICIAL LANGUAGE

The official language of the workshop will be English. All abstracts and papers submitted should be written in English.

SCIENTIFIC COMMITTEE

Satoshi Nakamura
Koichi Shinoda
Technical Committee of Speech (IEICE)
Technical Committee of Spoken Language Processing (IPSJ)

FOR FURTHER INFORMATION

IEICE Workshop Secretariat
Prof. Koichi Shinoda
Department of Computer Science
Tokyo Institute of Technology
E-mail: shinoda@cs.titech.ac.jp
Workshop Web Site: http://www.ieice.or.jp/iss/sp/eng/workshop/beyondHMM.html