Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Chang, Pao-Chung | en_US |
dc.contributor.author | Chen, Sin-Horng | en_US |
dc.contributor.author | Juang, Biing-Hwang | en_US |
dc.date.accessioned | 2014-12-08T15:04:28Z | - |
dc.date.available | 2014-12-08T15:04:28Z | - |
dc.date.issued | 1993-07-01 | en_US |
dc.identifier.issn | 1063-6676 | en_US |
dc.identifier.uri | http://dx.doi.org/10.1109/89.232616 | en_US |
dc.identifier.uri | http://hdl.handle.net/11536/2962 | - |
dc.description.abstract | In a traditional speech recognition system, the distance score between a test token and a reference pattern is obtained by simply averaging the distortion sequence that results from matching the two patterns through a dynamic programming procedure. The final decision is made by choosing the reference pattern with the minimal average distance score. If we view the distortion sequence as a form of observed features, a decision rule based on a specific discriminant function designed for the distortion sequence obviously will perform better than one based on the simple average distortion. We, therefore, suggest in this paper a linear discriminant function of the form $\Delta = \sum_{i=1}^{T} \omega_i d_i$ to compute the distance score $\Delta$ instead of a direct average $\Delta = \frac{1}{T} \sum_{i=1}^{T} d_i$. Several adaptive algorithms are proposed in this paper to learn the discriminant weighting function. These include one heuristic method, two methods based on the error propagation algorithm [1], [2], and one method based on the generalized probabilistic descent (GPD) algorithm [3]. We study these methods in a speaker-independent speech recognition task involving utterances of the highly confusible English E-set (b, c, d, e, g, p, t, v, z). The results show that the best performance is obtained with the GPD method, which achieved 78.1% accuracy, compared to 67.6% with the traditional unweighted average method. Besides the experimental comparisons, an analytical discussion of the various training algorithms is also provided. | en_US |
dc.language.iso | en_US | en_US |
dc.title | Discriminative Analysis of Distortion Sequences in Speech Recognition | en_US |
dc.type | Article | en_US |
dc.identifier.doi | 10.1109/89.232616 | en_US |
dc.identifier.journal | IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING | en_US |
dc.citation.volume | 1 | en_US |
dc.citation.issue | 3 | en_US |
dc.citation.spage | 326 | en_US |
dc.citation.epage | 333 | en_US |
dc.contributor.department | 電信工程研究所 | zh_TW |
dc.contributor.department | Institute of Communications Engineering | en_US |
dc.identifier.wosnumber | WOS:000207078600007 | - |
dc.citation.woscount | 6 | - |
Appears in Collections: | Articles |
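
The two scoring rules named in the abstract can be compared with a minimal Python sketch. It is an illustration only, not the authors' implementation: the function names, the fixed sequence length, and the sample values are assumptions, and the training procedures for the weights (heuristic, error propagation, GPD) are not shown because the abstract does not specify them.

```python
from typing import Sequence


def average_score(d: Sequence[float]) -> float:
    """Unweighted baseline: Delta = (1/T) * sum of d_i."""
    return sum(d) / len(d)


def weighted_score(d: Sequence[float], w: Sequence[float]) -> float:
    """Linear discriminant form: Delta = sum of w_i * d_i."""
    if len(d) != len(w):
        raise ValueError("d and w must have the same length T")
    return sum(wi * di for wi, di in zip(w, d))


# Hypothetical distortion sequence of length T = 4 (made-up values).
d = [0.2, 0.9, 0.4, 0.5]

# With uniform weights w_i = 1/T the weighted form reduces to the average.
uniform = [1 / len(d)] * len(d)

# Hypothetical non-uniform weights that emphasise the second frame.
emphasis = [0.1, 0.6, 0.2, 0.1]

if __name__ == "__main__":
    print(average_score(d))
    print(weighted_score(d, uniform))
    print(weighted_score(d, emphasis))
```

Setting every weight to 1/T recovers the direct average, so the weighted form is a strict generalization of the baseline; any gain comes from how the weights are learned.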