Title: 以韻律輔助之中文語音辨認系統之實現
An Implementation of Prosody-Assisted Mandarin Speech Recognition System
Authors: 劉銘傑
Liu, Ming-Chieh
陳信宏
Chen, Sin-Horng
Institute of Communications Engineering
Keywords: Prosody-assisted ASR; Prosody modeling; Prosody-hierarchy model
Issue Date: 2010
Abstract: This thesis presents a new method for integrating prosodic information into Mandarin large-vocabulary continuous speech recognition. Unlike earlier approaches that exploit only a few simple prosodic cues, it uses the previously proposed PLM algorithm to train 12 prosodic models automatically from a large speech corpus without manual prosodic labeling. These models are incorporated into a two-stage ASR system: the first stage is a conventional HMM recognizer that produces a word lattice, and the second stage rescores that lattice with the prosodic models to obtain a more accurate word sequence. The second stage also decodes additional information, namely part-of-speech (POS) tags, the punctuation marks (PM) following each word, and two types of prosodic tags that can be used to construct the hierarchical prosodic structure of the test speech. Experiments were conducted on the TCC300 corpus, which consists of long, read-aloud paragraph utterances. The baseline uses a factored language model (LM) that describes the relationships among words, POS, and PM. With all prosodic information added, the system achieved word, character, and base-syllable error rates of 20.1%, 13.6%, and 9.4%, which correspond to absolute error reductions of 4.1%, 4.0%, and 2.4% (16.9%, 22.6%, and 20.6% relative) over the baseline. Error analysis showed that many word segmentation errors and tone recognition errors were corrected.
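The relation between the absolute and relative error reductions quoted in the abstract can be checked with a short calculation. The sketch below is only an illustration: the function names are hypothetical, and it assumes that the baseline error rate equals the final error rate plus the absolute reduction, which the abstract implies but does not state. Because the inputs are already rounded to one decimal place, the recomputed relative reductions may differ from the reported ones by a few tenths of a point.

```python
# Error rates (%) after adding prosodic information, paired with the
# absolute reductions (percentage points) quoted in the abstract.
reported = {
    "word": (20.1, 4.1),
    "character": (13.6, 4.0),
    "base-syllable": (9.4, 2.4),
}


def implied_baseline(final_rate, absolute_reduction):
    """Baseline error rate implied by the final rate and the absolute reduction.

    Assumption: baseline = final rate + absolute reduction.
    """
    return final_rate + absolute_reduction


def relative_reduction(final_rate, absolute_reduction):
    """Relative error reduction, in percent of the implied baseline."""
    baseline = implied_baseline(final_rate, absolute_reduction)
    return 100.0 * absolute_reduction / baseline


for unit, (final_rate, reduction) in reported.items():
    base = implied_baseline(final_rate, reduction)
    rel = relative_reduction(final_rate, reduction)
    print(f"{unit}: implied baseline {base:.1f}%, relative reduction {rel:.1f}%")
```

For the word error rate this gives an implied baseline of 24.2% and a relative reduction of about 16.9%, matching the reported figure.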
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT079813522
http://hdl.handle.net/11536/47008
Appears in Collections: Thesis


Files in This Item:

  1. 352201.pdf
