Presentation | 2010-11-19 Acoustic separation between linguistic and extra-linguistic information in speech and its significant importance to enable speech communication Nobuaki MINEMATSU |
---|---|
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | The source-filter model, which was derived from observations of speech production, has been widely used to separate speech features into two parts: vocal source characteristics and vocal tract characteristics. However, the latter characteristics, often called spectral envelopes, carry both linguistic and extra-linguistic information, which are intrinsically independent of each other. This is why a speaker-independent acoustic model of a given linguistic content for ASR is often built statistically by collecting utterances of that content from a large number of speakers. In the first part of this paper, after reviewing infants' vocal imitation in language acquisition, the vocal imitation observed in severely impaired autistic individuals who have difficulty with speech communication, and vocal imitation in animals, we argue for the importance of deriving acoustic modeling that can separate the acoustic features carrying linguistic information from those carrying extra-linguistic information. We also argue that acoustic modeling with incomplete separation is suited not to realizing speech communication ability on machines but only to realizing impersonation ability on machines. Further, we argue that, with only incomplete separation, speech communication would be difficult even for humans. In the final part of this paper, we introduce the structural representation of speech, which we proposed to realize this information separation for creating human-like machines, and show some experimental results obtained with the proposed representation. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | source-filter model / linguistic and extra-linguistic information / spectral envelope / speech communication / vocal imitation and impersonation / autism / speech structure / transform-invariance based on f-divergence |
Paper # | SP2010-78 |
Date of Issue |
Conference Information | |
Committee | SP |
---|---|
Conference Date | 2010/11/11 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Speech (SP) |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Acoustic separation between linguistic and extra-linguistic information in speech and its significant importance to enable speech communication |
Sub Title (in English) | |
Keyword(1) | source-filter model |
Keyword(2) | linguistic and extra-linguistic information |
Keyword(3) | spectral envelope |
Keyword(4) | speech communication |
Keyword(5) | vocal imitation and impersonation |
Keyword(6) | autism |
Keyword(7) | speech structure |
Keyword(8) | transform-invariance based on f-divergence |
1st Author's Name | Nobuaki MINEMATSU |
1st Author's Affiliation | Graduate School of Information Science and Technology, The University of Tokyo |
Date | 2010-11-19 |
Paper # | SP2010-78 |
Volume (vol) | vol.110 |
Number (no) | 297 |
Page | pp.- |
#Pages | 6 |
Date of Issue |