Full metadata record
DC Field | Value | Language
dc.contributor.author | 顧原竹 | en_US
dc.contributor.author | Ku, Yuan-Chu | en_US
dc.contributor.author | 簡仁宗 | en_US
dc.contributor.author | Chien, Jen-Tzung | en_US
dc.date.accessioned | 2014-12-12T02:42:47Z | -
dc.date.available | 2014-12-12T02:42:47Z | -
dc.date.issued | 2013 | en_US
dc.identifier.uri | http://140.113.39.130/cdrfb3/record/nctu/#GT070160261 | en_US
dc.identifier.uri | http://hdl.handle.net/11536/75226 | -
dc.description.abstract | This thesis proposes a Bayesian learning approach for constructing recurrent neural network (RNN) language models and applies it to large-vocabulary continuous speech recognition. The goal is to address the model regularization problem of RNN language models: by compensating for the uncertainty of the model parameters, the robustness and recognition performance of the speech recognition system are improved. Our approach represents the randomness of the network parameters with a Gaussian prior, whose hyperparameters, i.e. the regularization parameters, are estimated by maximizing the marginal likelihood. The RNN parameters are then obtained by maximum a posteriori (MAP) estimation, in which the posterior probability depends on these regularization parameters; the negative logarithm of the posterior is equivalent to a regularized cross-entropy error function, so the algorithm yields a regularized RNN model. However, implementing the method requires the Hessian matrix of second derivatives with respect to the model parameters, which is formed from a large number of outer products of high-dimensional gradient vectors. We propose a rapid approximation that selects a small number of salient outer-product terms for the Hessian computation, greatly reducing the cost of realizing the Bayesian neural network model. Preliminary experiments on the Wall Street Journal large-vocabulary continuous speech corpus, the Penn Treebank, and the 1-Billion-Word benchmark show that rapid Bayesian learning effectively improves the perplexity and recognition accuracy of RNN language models. | zh_TW
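The regularized objective described in the abstract can be written compactly. The LaTeX sketch below is only an illustration under assumed notation (w for the RNN parameters, alpha for the Gaussian prior precision acting as the regularization parameter, E(w) for the cross-entropy error over the training data D); it is not the exact formulation from the thesis.

```latex
% Illustrative formulation (assumed notation, not copied from the thesis):
% Gaussian prior on the RNN parameters, MAP objective as a regularized
% cross-entropy error, and marginal-likelihood estimation of the hyperparameter.
\begin{align}
  p(\mathbf{w} \mid \alpha) &= \mathcal{N}\!\left(\mathbf{w} \mid \mathbf{0}, \alpha^{-1}\mathbf{I}\right) \\
  -\log p(\mathbf{w} \mid \mathcal{D}, \alpha) &= E(\mathbf{w}) + \tfrac{\alpha}{2}\,\mathbf{w}^{\top}\mathbf{w} + \mathrm{const} \\
  \hat{\alpha} &= \arg\max_{\alpha}\, p(\mathcal{D} \mid \alpha) \quad \text{(type-2 maximum likelihood)}
\end{align}
```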
dc.description.abstract | This study presents a Bayesian framework for constructing the recurrent neural network language model (RNN-LM) for speech recognition. Our idea is to regularize the RNN-LM by compensating for the uncertainty of the estimated model parameters, which is represented by a Gaussian prior. The objective function in the Bayesian RNN is the negative logarithm of the posterior distribution, or equivalently a regularized cross-entropy error function. The regularized model is constructed not only by training the parameters according to the maximum a posteriori criterion but also by estimating the Gaussian hyperparameters according to the type-2 maximum likelihood method. A Hessian matrix is calculated to implement the Bayesian RNN. A critical issue in the Bayesian RNN-LM, however, is the heavy computation of this Hessian matrix, which is formed as the sum of a large number of outer products of high-dimensional gradient vectors. We present a rapid approximation that reduces the redundancy due to the curse of dimensionality and speeds up the calculation by summing only a small set of salient outer products. Experiments on the Wall Street Journal, Penn Treebank, and 1B Word Benchmark corpora show that rapid Bayesian learning for the RNN-LM consistently improves perplexity and word error rate in comparison with the standard RNN-LM. | en_US
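To make the Hessian computation concrete, the following Python sketch accumulates a Hessian from per-sample gradient outer products and compares it with a rapid approximation that keeps only a small set of salient terms. The function names, the use of gradient norm as the saliency measure, and the top-k selection are illustrative assumptions, not the exact criterion developed in the thesis.

```python
import numpy as np

def full_hessian(grads):
    """Outer-product Hessian: sum of g g^T over all per-sample gradients."""
    d = grads.shape[1]
    H = np.zeros((d, d))
    for g in grads:
        H += np.outer(g, g)
    return H

def approx_hessian(grads, top_k):
    """Rapid approximation: keep only the top_k gradients with the largest norm
    (a hypothetical saliency criterion) and sum their outer products."""
    norms = np.linalg.norm(grads, axis=1)
    salient = grads[np.argsort(norms)[-top_k:]]
    return sum(np.outer(g, g) for g in salient)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    grads = rng.normal(size=(1000, 50))        # 1000 per-sample gradients, 50 parameters
    H_full = full_hessian(grads)
    H_fast = approx_hessian(grads, top_k=100)  # only 10% of the outer products
    rel_err = np.linalg.norm(H_full - H_fast) / np.linalg.norm(H_full)
    print(f"relative error of rapid approximation: {rel_err:.3f}")
```

For the high-dimensional gradients of an RNN-LM, accumulating only a fraction of the d-by-d outer products is the kind of saving the abstract refers to; the accuracy of the approximation depends on how concentrated the saliency is across samples.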
dc.language.iso | en_US | en_US
dc.subject | 貝氏學習 (Bayesian learning) | zh_TW
dc.subject | 遞迴式類神經網路 (recurrent neural network) | zh_TW
dc.subject | 海森矩陣 (Hessian matrix) | zh_TW
dc.subject | 快速近似法 (rapid approximation) | zh_TW
dc.subject | 語言模型 (language model) | zh_TW
dc.subject | 語音辨識 (speech recognition) | zh_TW
dc.subject | Bayesian learning | en_US
dc.subject | recurrent neural network | en_US
dc.subject | Hessian matrix | en_US
dc.subject | rapid approximation | en_US
dc.subject | language model | en_US
dc.subject | speech recognition | en_US
dc.title | 貝氏遞迴式類神經網路於語言模型之建立 (Construction of language models with Bayesian recurrent neural networks) | zh_TW
dc.title | Bayesian recurrent neural networks for language modeling | en_US
dc.type | Thesis | en_US
dc.contributor.department | 電信工程研究所 (Institute of Communications Engineering) | zh_TW
Appears in Collections: Thesis