Title: ISOLATED MANDARIN SYLLABLE RECOGNITION USING SEGMENTAL FEATURES
Authors: CHANG, S
CHEN, SH
Institute of Communications Engineering
Center for Telecommunications Research
Keywords: SPEECH RECOGNITION; ACOUSTIC SEGMENTS; MANDARIN BASE SYLLABLES
Publication date: 1-Feb-1995
Abstract: A segment-based speech recognition scheme is proposed. The basic idea is to model explicitly the correlation among successive frames of the speech signal by using features that represent the contours of spectral parameters. The speech signal of an utterance is regarded as a template formed by directly concatenating a sequence of acoustic segments. Each constituent acoustic segment is of variable length and is represented by a fixed-dimensional feature vector formed from the coefficients of discrete orthonormal polynomial expansions that approximate its spectral parameter contours. In training, an automatic algorithm generates several segment-based reference templates for each syllable class. In testing, a frame-based dynamic programming procedure computes the matching score between the test utterance and each reference template. The performance of the proposed scheme was examined by simulations of multispeaker speech recognition for 408 highly confusable isolated Mandarin base syllables. A recognition rate of 81.1% was achieved using 5-segment, 8-reference-template models with cepstral and delta-cepstral coefficients as recognition features. This is 4.5% higher than the rate of a well-modelled 12-state, 5-mixture CHMM method using cepstral, delta-cepstral, and delta-delta-cepstral coefficients.
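The segment representation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: this record does not state the polynomial order or how the basis is constructed, so the order of 2, the Gram-Schmidt construction, and the names `orthonormal_basis` and `segment_features` are all assumptions made for illustration. The point it shows is that segments of different lengths map to feature vectors of the same dimension.

```python
import math

def orthonormal_basis(n_frames, order):
    """Discrete polynomials of degree 0..order, orthonormal over n_frames points.

    Assumption: built by Gram-Schmidt on monomials; the paper may use another
    construction.
    """
    # Map frame indices to [0, 1]; a one-frame segment sits at 0.
    xs = [i / (n_frames - 1) if n_frames > 1 else 0.0 for i in range(n_frames)]
    basis = []
    for k in range(min(order + 1, n_frames)):
        v = [x ** k for x in xs]
        # Remove the components along the lower-order polynomials.
        for b in basis:
            proj = sum(vi * bi for vi, bi in zip(v, b))
            v = [vi - proj * bi for vi, bi in zip(v, b)]
        norm = math.sqrt(sum(vi * vi for vi in v))
        basis.append([vi / norm for vi in v])
    return basis

def segment_features(frames, order=2):
    """Map a variable-length segment to a fixed-dimensional vector.

    frames: list of per-frame parameter vectors (for example, cepstra).
    Returns (order + 1) * dim expansion coefficients, one group per parameter.
    """
    n, dim = len(frames), len(frames[0])
    basis = orthonormal_basis(n, order)
    feats = []
    for d in range(dim):
        contour = [f[d] for f in frames]
        for k in range(order + 1):
            if k < len(basis):
                feats.append(sum(c * b for c, b in zip(contour, basis[k])))
            else:
                feats.append(0.0)  # too few frames to fit this order
    return feats

# Hypothetical segments of 5 and 9 frames, each with 2 parameters per frame.
short_seg = [[float(i), 1.0] for i in range(5)]
long_seg = [[float(i), 1.0] for i in range(9)]
short_feats = segment_features(short_seg)
long_feats = segment_features(long_seg)
```

Both `short_feats` and `long_feats` have six entries (three coefficients for each of the two parameters), even though the segments differ in length. A template in this sketch would then be a short sequence of such vectors, one per segment.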
URI: http://dx.doi.org/10.1049/ip-vis:19951648
http://hdl.handle.net/11536/2072
ISSN: 1350-245X
DOI: 10.1049/ip-vis:19951648
Journal: IEE PROCEEDINGS-VISION IMAGE AND SIGNAL PROCESSING
Volume: 142
Issue: 1
Start page: 59
End page: 64
Appears in Collections: Articles


Files in This Item:

  1. A1995QL50500011.pdf