Presentation 2004/12/14
Development of Speech Corpus and Speech Recognition System for Indonesian Language
Sakriani SAKTI, Paulus HUTAGAOL, Airy Akhmad ARMAN, Satoshi NAKAMURA
Abstract(in Japanese) (See Japanese page)
Abstract(in English) In this paper, we describe our efforts in developing an Indonesian speech corpus and speech recognition system. Difficulties arise in developing an Indonesian speech corpus because Indonesian is, for most people, a second language after their own ethnic native language. In developing the speech recognition system, obtaining utterances segmented according to labels, as a starting point for training speech models, is also one of the main issues. Initialization with uniform segmentation does not give sufficient performance. Here, we used an English speech recognizer to set the initial segmentation of Indonesian utterances. This method improves performance significantly, by up to 40% absolute.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Indonesian speech recognition system / speech corpus development / uniform segmentation / new language segmentation utterance with other different language
Paper # NLC2004-71,SP2004-111
Date of Issue

Conference Information
Committee NLC
Conference Date 2004/12/14 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Natural Language Understanding and Models of Communication (NLC)
Language ENG
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Development of Speech Corpus and Speech Recognition System for Indonesian Language
Sub Title (in English)
Keyword(1) Indonesian speech recognition system
Keyword(2) speech corpus development
Keyword(3) uniform segmentation
Keyword(4) new language segmentation utterance with other different language
1st Author's Name Sakriani SAKTI
1st Author's Affiliation Spoken Language Translation Research Laboratories, ATR
2nd Author's Name Paulus HUTAGAOL
2nd Author's Affiliation R&D Division, PT Telekomunikasi Indonesia
3rd Author's Name Airy Akhmad ARMAN
3rd Author's Affiliation Electrical Engineering Department, Bandung Institute of Technology
4th Author's Name Satoshi NAKAMURA
4th Author's Affiliation Spoken Language Translation Research Laboratories, ATR
Date 2004/12/14
Paper # NLC2004-71,SP2004-111
Volume (vol) vol.104
Number (no) 539
Page pp.-
#Pages 6
Date of Issue