Presentation 2010-11-19
Acoustic separation between linguistic and extra-linguistic information in speech and its significant importance to enable speech communication
Nobuaki MINEMATSU
Abstract(in Japanese) (See Japanese page)
Abstract(in English) The source-filter model, which was derived from observations of speech production, has been widely used to separate speech features into two parts: vocal source characteristics and vocal tract characteristics. However, the latter, often called spectral envelopes, convey both linguistic and extra-linguistic information, which are intrinsically independent of each other. This is why a speaker-independent acoustic model of a given linguistic content for ASR is often built statistically by collecting utterances of that content from a large number of speakers. In the first part of this paper, after reviewing infants' vocal imitation in language acquisition, the vocal imitation observed in severely impaired autistic individuals who have difficulty with speech communication, and the vocal imitation of animals, we argue for the importance of deriving an acoustic model that can separate the acoustic features carrying linguistic information from those carrying extra-linguistic information. We also argue that acoustic modeling with incomplete separation is suited not to realizing speech communication ability on machines but only to realizing impersonation ability. Further, we explain that, with only incomplete separation, speech communication would be difficult even for humans. In the final part of this paper, we introduce the structural representation of speech, which we proposed in order to realize this information separation for creating human-like machines, and show some experimental results obtained with the proposed representation.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) source-filter model / linguistic and extra-linguistic information / spectral envelope / speech communication / vocal imitation and impersonation / autism / speech structure / transform-invariance based on f-divergence
Paper # SP2010-78
Date of Issue

Conference Information
Committee SP
Conference Date 2010/11/11 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Speech (SP)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Acoustic separation between linguistic and extra-linguistic information in speech and its significant importance to enable speech communication
Sub Title (in English)
Keyword(1) source-filter model
Keyword(2) linguistic and extra-linguistic information
Keyword(3) spectral envelope
Keyword(4) speech communication
Keyword(5) vocal imitation and impersonation
Keyword(6) autism
Keyword(7) speech structure
Keyword(8) transform-invariance based on f-divergence
1st Author's Name Nobuaki MINEMATSU
1st Author's Affiliation Graduate School of Information Science and Technology, The University of Tokyo
Date 2010-11-19
Paper # SP2010-78
Volume (vol) vol.110
Number (no) 297
Page pp.-
#Pages 6