Presentation 2014/12/8
A METRIC FOR EVALUATING SPEECH RECOGNITION ACCURACY BASED ON HUMAN PERCEPTION
Nobuyasu Itoh, Gakuto Kurata, Ryuki Tachibana, Masafumi Nishimura
Abstract(in Japanese) (See Japanese page)
Abstract(in English) Word error rate and character error rate are the usual metrics for evaluating the accuracy of speech recognition. They are objective, naturally defined, and helpful for comparing recognition methods fairly. However, they do not necessarily reflect the overall performance of a recognition system or the usefulness of its results. To address this problem, we study and propose a metric that replicates the scores human annotators assign to recognition results based on their perception. The features we use are the numbers of insertion, deletion, and substitution errors in characters and in syllables. In addition, we studied the numbers of consecutive errors, the misrecognized keywords, and the locations of errors. We built models using linear regression and random forests, predicted the human-perceived scores, and compared the predictions with the actual scores using Spearman's rank correlation. In our experiments, the correlation of the human-perceived scores with character error rate is 0.456, while their correlation with the scores predicted by a random forest using 10 features is 0.715. The latter is close to the average correlation between the scores of the human subjects, 0.765, which suggests that human-perceived scores can be predicted from these features. The most important features for the prediction are the numbers of substitution errors and consecutive errors.
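As a rough illustration of two ingredients named in the abstract, the sketch below counts character-level substitution, deletion, and insertion errors with a minimum-edit alignment and computes Spearman's rank correlation between two score lists. It is not the authors' implementation: the function names (error_counts, character_error_rate, spearman) are hypothetical, ties in the alignment are broken in favour of substitution as an assumption, and the regression and random forest models are omitted.

```python
from typing import List, Sequence, Tuple


def error_counts(ref: Sequence, hyp: Sequence) -> Tuple[int, int, int]:
    """Return (substitutions, deletions, insertions) of a minimum-edit alignment.

    Assumption: when several alignments have the same total cost,
    substitution is preferred over deletion, and deletion over insertion.
    """
    n, m = len(ref), len(hyp)
    # cell = (total_edits, substitutions, deletions, insertions)
    cost = [[(0, 0, 0, 0)] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = (i, 0, i, 0)      # delete every reference symbol
    for j in range(1, m + 1):
        cost[0][j] = (j, 0, 0, j)      # insert every hypothesis symbol
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if ref[i - 1] == hyp[j - 1]:
                cost[i][j] = cost[i - 1][j - 1]   # match, no new error
                continue
            t, s, d, a = cost[i - 1][j - 1]
            sub = (t + 1, s + 1, d, a)
            t, s, d, a = cost[i - 1][j]
            dele = (t + 1, s, d + 1, a)
            t, s, d, a = cost[i][j - 1]
            ins = (t + 1, s, d, a + 1)
            # min() returns the first candidate with the lowest total.
            cost[i][j] = min((sub, dele, ins), key=lambda c: c[0])
    _, subs, dels, inss = cost[n][m]
    return subs, dels, inss


def character_error_rate(ref: str, hyp: str) -> float:
    """(S + D + I) divided by the reference length."""
    return sum(error_counts(ref, hyp)) / len(ref)


def _average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = _average_ranks(x), _average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5
```

In a setup like the one the abstract describes, the error counts would be among the input features of a regression model, and spearman would compare the model's predicted scores with the human-assigned ones.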
Keyword(in Japanese) (See Japanese page)
Keyword(in English)
Paper # Vol.2014-SLP-104 No.11
Date of Issue

Conference Information
Committee SP
Conference Date 2014/12/8 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Speech (SP)
Language ENG
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) A METRIC FOR EVALUATING SPEECH RECOGNITION ACCURACY BASED ON HUMAN PERCEPTION
Sub Title (in English)
Keyword(1)
1st Author's Name Nobuyasu Itoh
1st Author's Affiliation IBM Research
2nd Author's Name Gakuto Kurata
2nd Author's Affiliation IBM Research
3rd Author's Name Ryuki Tachibana
3rd Author's Affiliation IBM Research
4th Author's Name Masafumi Nishimura
4th Author's Affiliation Graduate School of Informatics, Shizuoka University
Date 2014/12/8
Paper # Vol.2014-SLP-104 No.11
Volume (vol) vol.114
Number (no) 365
Page pp.-
#Pages 4
Date of Issue