Title: 台語語音辨識與文字處理之研究
Studies on Taiwanese Speech Recognition and Text Analysis
Authors: 王文德
Wen-De Wang
Sin-Horng Chen
Keywords: Taiwanese speech recognition; hidden Markov model; Taiwanese text analysis; word tagging
Publication date: 2003
Abstract: This thesis studies two problems in Taiwanese language processing: speech recognition and text analysis. For speech recognition, a set of initial and final sub-syllable hidden Markov models (HMMs) is constructed for continuous base-syllable recognition. In the multi-speaker condition, syllable accuracy rates of 47.33% and 42.67% were obtained for female and male speech, respectively. For text analysis, words are segmented with a lexicon under the longest-word-first principle. Because the written form of Taiwanese is not standardized, the same word may be written in several ways, which causes ambiguity in word segmentation (word tagging). To address this, a syllable-to-character mapping table is built and used to expand the written representations of the words in the lexicon. Experimental results confirmed that the proposed method improves word segmentation performance.
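
The text-analysis idea summarized in the abstract (longest-word-first segmentation against a lexicon whose entries are expanded through a syllable-to-character mapping table) can be illustrated with a minimal Python sketch. This is not the thesis's implementation. The function names, the data layout (one syllable per character), and the toy lexicon entry 佮意 / 甲意 are assumptions chosen only for illustration.

```python
from itertools import product


def expand_lexicon(lexicon, syllable_to_chars):
    """Return every written variant of each lexicon word.

    lexicon: dict mapping a word to its list of syllables
             (assumed here: exactly one syllable per character).
    syllable_to_chars: dict mapping a syllable to the characters
             that may be used to write it (hypothetical table).
    """
    expanded = set()
    for word, syllables in lexicon.items():
        expanded.add(word)
        # For each position, collect the candidate characters; fall back
        # to the original character when the syllable is not in the table.
        options = [syllable_to_chars.get(s, [ch]) for s, ch in zip(syllables, word)]
        for combo in product(*options):
            expanded.add("".join(combo))
    return expanded


def segment(text, words):
    """Greedy longest-word-first segmentation; unknown characters
    are emitted as single-character tokens."""
    max_len = max((len(w) for w in words), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink.
        for n in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + n]
            if n == 1 or piece in words:
                tokens.append(piece)
                i += n
                break
    return tokens


# Toy data (illustrative assumption, not taken from the thesis).
lexicon = {"佮意": ["kah", "i"]}
syllable_to_chars = {"kah": ["佮", "甲"], "i": ["意"]}

plain = segment("我甲意你", set(lexicon))
expanded = segment("我甲意你", expand_lexicon(lexicon, syllable_to_chars))
print(plain)
print(expanded)
```

In this toy case the unexpanded lexicon breaks the variant spelling 甲意 into single characters, while the expanded lexicon recovers it as one word. A real system would also need to bound the number of generated variants, since the Cartesian product grows quickly for long words.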


  1. 356401.pdf
