Title: TONE RECOGNITION OF CONTINUOUS MANDARIN SPEECH ASSISTED WITH PROSODIC INFORMATION
Authors: WANG, YR
CHEN, SH
Institute of Communications Engineering
Center for Telecommunications Research
Issue Date: 1-Nov-1994
Abstract: In this paper, a simple recurrent neural network (SRNN) is employed to model the prosody of continuous Mandarin speech to assist tone recognition. For each syllable in continuous speech, several acoustic features carrying prosodic information are extracted and taken as inputs to the SRNN. If proper linguistic features extracted from the context of the syllable are set as output targets, the SRNN can learn to represent the prosodic state of the utterance at the syllable using its hidden nodes. Outputs of the hidden nodes then serve as additional recognition features to assist recognition of the tone of the syllable. The performance of the proposed tone recognition approach was examined by simulation on a multilayer perceptron (MLP)-based speaker-dependent tone recognition task. The recognition rate was improved from 91.38% to 93.10%. The SRNN prosodic model is further analyzed to exploit the linguistic meaning of prosodic states. By vector quantizing the outputs of the hidden nodes of the SRNN, a finite-state automaton that roughly represents the mechanism of human prosody pronunciation can be obtained.
URI: http://dx.doi.org/10.1121/1.411274
http://hdl.handle.net/11536/2248
ISSN: 0001-4966
DOI: 10.1121/1.411274
Journal: JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA
Volume: 96
Issue: 5
Start Page: 2637
End Page: 2645
Appears in Collections: Journal Articles


Files in This Item:

  1. A1994PQ01800002.pdf

If the file is a zip archive, download and extract it, then open index.html in the extracted folder with a browser to view the full text.
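
The data flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes an Elman-style recurrent layer with tanh units, and it uses random weights as a stand-in for a trained SRNN, because training against linguistic output targets is omitted here. The feature dimensions, the toy codebook, and the function names (`srnn_hidden_states`, `nearest_codeword`, `transition_counts`) are all hypothetical. The sketch shows three steps: computing one hidden-state vector per syllable from prosodic features, appending those hidden outputs to the tone features, and vector quantizing the hidden outputs so that transitions between quantized states can be counted.

```python
import math
import random
from collections import Counter


def srnn_hidden_states(features, w_in, w_rec, bias):
    """Run an Elman-style recurrent layer over per-syllable feature vectors.

    Returns one hidden-state vector per syllable.
    """
    n_hidden = len(bias)
    h = [0.0] * n_hidden  # initial hidden state
    states = []
    for x in features:
        # New state depends on the current input and the previous state.
        h = [
            math.tanh(
                sum(w_in[j][i] * x[i] for i in range(len(x)))
                + sum(w_rec[j][k] * h[k] for k in range(n_hidden))
                + bias[j]
            )
            for j in range(n_hidden)
        ]
        states.append(h)
    return states


def nearest_codeword(vec, codebook):
    """Index of the codeword with the smallest squared Euclidean distance."""
    dists = [sum((a - b) ** 2 for a, b in zip(vec, c)) for c in codebook]
    return dists.index(min(dists))


def transition_counts(state_ids):
    """Count transitions between consecutive quantized states."""
    return Counter(zip(state_ids, state_ids[1:]))


# Illustrative data only: random weights stand in for a trained SRNN.
rng = random.Random(0)
N_IN, N_HIDDEN, N_SYL = 3, 4, 6
w_in = [[rng.uniform(-1, 1) for _ in range(N_IN)] for _ in range(N_HIDDEN)]
w_rec = [[rng.uniform(-1, 1) for _ in range(N_HIDDEN)] for _ in range(N_HIDDEN)]
bias = [0.0] * N_HIDDEN

# Hypothetical prosodic features (three per syllable) and tone features (two per syllable).
prosodic = [[rng.uniform(-1, 1) for _ in range(N_IN)] for _ in range(N_SYL)]
tone = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(N_SYL)]

hidden = srnn_hidden_states(prosodic, w_in, w_rec, bias)

# Hidden outputs are appended to the tone features as additional recognizer inputs.
augmented = [t + h for t, h in zip(tone, hidden)]

# Toy vector quantization: the first two hidden vectors serve as the codebook.
codebook = hidden[:2]
state_ids = [nearest_codeword(h, codebook) for h in hidden]

# Transition counts between quantized states give a rough finite-state view.
fsa = transition_counts(state_ids)
```

In practice the codebook would be trained (for example with k-means) on hidden outputs collected from a trained network, and the transition counts would be normalized into probabilities before being interpreted as an automaton.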