Title: An Integrated Speech Recognition Method Based on VQ-Distortion Measure and Discrete HMM
Authors: Jiann-Jinn Lin (林建進)
Chia-Hoang Lee (李嘉晃)
Institute of Computer Science and Engineering
Keywords: Speech Recognition; Vector Quantization; Codebook; Hidden Markov Model (HMM)
Issue Date: 1994
Abstract: In this thesis, we propose an integrated method for isolated Mandarin word recognition. The proposed recognizer consists of a word-specific vector quantization (VQ) preprocessor followed by a discrete hidden Markov model (HMM) postprocessor. The preprocessor produces a list of candidate words. When its decision logic selects only one candidate, that word is taken as the final recognized word. Otherwise, the candidate list is passed to the postprocessor, which computes the likelihood of each candidate word and selects the word with the maximum likelihood as the final recognized word. Recognizer performance depends on several factors, including the size of the word-specific codebooks, the rule used to construct the universal codebook, and the decision thresholds of the preprocessor. These factors were studied through recognition experiments on a database of 50 Mandarin words. The results show a recognition accuracy of about 96 percent and a greatly reduced computation time compared with the traditional HMM/VQ method.
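The two-stage flow described in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the thesis's implementation: the candidate rule (keep words whose average distortion is within `ratio` times the best one), the squared Euclidean distance, the universal codebook built by concatenating the word codebooks, and every name and toy value below are assumptions introduced for the example.

```python
import math

def sq_dist(a, b):
    # Squared Euclidean distance (assumed distortion measure).
    return sum((x - y) ** 2 for x, y in zip(a, b))

def avg_distortion(frames, codebook):
    # Average distance from each frame to its nearest codevector.
    return sum(min(sq_dist(f, c) for c in codebook) for f in frames) / len(frames)

def preselect(frames, word_codebooks, ratio=1.5):
    # Hypothetical decision rule: keep words within `ratio` of the best distortion.
    d = {w: avg_distortion(frames, cb) for w, cb in word_codebooks.items()}
    best = min(d.values())
    return sorted(w for w, v in d.items() if v <= ratio * best)

def quantize(frames, universal_codebook):
    # Map each frame to the index of its nearest universal codevector.
    idx = range(len(universal_codebook))
    return [min(idx, key=lambda k: sq_dist(f, universal_codebook[k])) for f in frames]

def forward_loglik(symbols, pi, A, B):
    # Scaled forward algorithm: log P(symbols | discrete HMM).
    n = len(pi)
    alpha = [pi[i] * B[i][symbols[0]] for i in range(n)]
    loglik = 0.0
    for t in range(len(symbols)):
        if t > 0:
            alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][symbols[t]]
                     for j in range(n)]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")
        loglik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return loglik

def recognize(frames, word_codebooks, universal_codebook, hmms, ratio=1.5):
    candidates = preselect(frames, word_codebooks, ratio)
    if len(candidates) == 1:
        # Single candidate: the preprocessor decides alone.
        return candidates[0]
    # Several candidates: the HMM postprocessor picks the most likely one.
    symbols = quantize(frames, universal_codebook)
    return max(candidates, key=lambda w: forward_loglik(symbols, *hmms[w]))

# Toy data (made up for illustration only).
word_codebooks = {
    "a": [(0.0, 0.0), (1.0, 0.0)],
    "b": [(0.0, 1.0), (1.0, 1.0)],
    "c": [(5.0, 5.0), (6.0, 5.0)],
}
universal_codebook = [cv for cb in word_codebooks.values() for cv in cb]
A = [[0.5, 0.5], [0.0, 1.0]]  # two-state left-to-right transitions
hmms = {
    "a": ([1.0, 0.0], A, [[0.6, 0.1, 0.1, 0.1, 0.05, 0.05],
                          [0.1, 0.6, 0.1, 0.1, 0.05, 0.05]]),
    "b": ([1.0, 0.0], A, [[0.1, 0.1, 0.6, 0.1, 0.05, 0.05],
                          [0.1, 0.1, 0.1, 0.6, 0.05, 0.05]]),
    "c": ([1.0, 0.0], A, [[0.05, 0.05, 0.1, 0.1, 0.6, 0.1],
                          [0.05, 0.05, 0.1, 0.1, 0.1, 0.6]]),
}
clear_utterance = [(5.1, 5.0), (6.0, 5.1)]
ambiguous_utterance = [(0.0, 0.4), (1.0, 0.4)]
```

With this toy data, `recognize(clear_utterance, word_codebooks, universal_codebook, hmms)` is settled by the preprocessor alone, while `recognize(ambiguous_utterance, word_codebooks, universal_codebook, hmms, ratio=3.0)` leaves two candidates and falls through to the HMM stage. Skipping the HMM stage for unambiguous inputs, and scoring only the shortlisted words otherwise, is where the reduction in computation would come from.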
URI: http://140.113.39.130/cdrfb3/record/nctu/#NT830394037
http://hdl.handle.net/11536/59059
Appears in Collections: Theses