Presentation | 2014/12/8 A METRIC FOR EVALUATING SPEECH RECOGNITION ACCURACY BASED ON HUMAN PERCEPTION Nobuyasu Itoh, Gakuto Kurata, Ryuki Tachibana, Masafumi Nishimura |
---|---|
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | Word error rate and character error rate are the usual metrics for evaluating the accuracy of speech recognition. They are objective, naturally defined, and useful for comparing recognition methods fairly. However, they do not necessarily reflect the overall performance of a recognition system or the usefulness of its results. To address this problem, we propose a metric that reproduces the scores human annotators assign to recognition results based on their perception. The features we use are the numbers of insertion, deletion, and substitution errors at the character and syllable levels. In addition, we studied the numbers of consecutive errors, the misrecognized keywords, and the locations of errors. We built models using linear regression and random forests, predicted human-perceived scores, and compared them with the actual scores using Spearman's rank correlation. In our experiments, the correlation of the human-perceived scores with character error rate is 0.456, while their correlation with the scores predicted by a random forest using 10 features is 0.715. The latter is close to the average correlation between the scores of the human subjects, 0.765, which suggests that the human-perceived scores can be predicted from these features. The most important features for the prediction are the numbers of substitution errors and consecutive errors. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | |
Paper # | Vol.2014-SLP-104 No.11 |
Date of Issue |
Conference Information | |
Committee | SP |
---|---|
Conference Date | 2014/12/8 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Speech (SP) |
---|---|
Language | ENG |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | A METRIC FOR EVALUATING SPEECH RECOGNITION ACCURACY BASED ON HUMAN PERCEPTION |
Sub Title (in English) | |
Keyword(1) | |
1st Author's Name | Nobuyasu Itoh |
1st Author's Affiliation | IBM Research |
2nd Author's Name | Gakuto Kurata |
2nd Author's Affiliation | IBM Research |
3rd Author's Name | Ryuki Tachibana |
3rd Author's Affiliation | IBM Research |
4th Author's Name | Masafumi Nishimura |
4th Author's Affiliation | Graduate School of Informatics, Shizuoka University |
Date | 2014/12/8 |
Paper # | Vol.2014-SLP-104 No.11 |
Volume (vol) | vol.114 |
Number (no) | 365 |
Page | pp.- |
#Pages | 4 |
Date of Issue |
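
The abstract relies on two computations: counting insertion, deletion, and substitution errors, and comparing scores with Spearman's rank correlation. The following is a minimal sketch of both, not the authors' implementation. It assumes a standard Levenshtein alignment over characters and average ranks for tied values; the record does not state which alignment or tooling the paper used. The function names `error_counts`, `average_ranks`, and `spearman` are hypothetical.

```python
from typing import List, Sequence, Tuple


def error_counts(ref: Sequence[str], hyp: Sequence[str]) -> Tuple[int, int, int]:
    """Return (substitutions, deletions, insertions) from one minimum-cost alignment."""
    n, m = len(ref), len(hyp)
    # cost[i][j] = edit distance between ref[:i] and hyp[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            cost[i][j] = min(diag, cost[i - 1][j] + 1, cost[i][j - 1] + 1)

    # Backtrace from the bottom-right corner, classifying each step.
    subs = dels = ins = 0
    i, j = n, m
    while i > 0 or j > 0:
        mismatch = i > 0 and j > 0 and ref[i - 1] != hyp[j - 1]
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + mismatch:
            subs += mismatch          # a match adds 0, a substitution adds 1
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            dels += 1                 # reference symbol missing from hypothesis
            i -= 1
        else:
            ins += 1                  # extra symbol in hypothesis
            j -= 1
    return subs, dels, ins


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1
        for k in order[pos:end + 1]:
            ranks[k] = avg
        pos = end + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rank correlation: Pearson correlation of the rank vectors.

    Assumes len(x) == len(y) and that neither input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


if __name__ == "__main__":
    # Toy strings for illustration only; not data from the paper.
    print(error_counts("kitten", "sitting"))
```

In a setup like the one the abstract describes, counts of this kind would serve as regression features, and `spearman` would compare the predicted scores with the human-assigned ones. When several alignments have the same minimum cost, the split between error types depends on the backtrace order, so the counts from this sketch are one valid choice and not a unique answer.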