Title: | Bayesian Recurrent Neural Network for Language Modeling |
Authors: | Chien, Jen-Tzung; Ku, Yuan-Chu; College of Electrical and Computer Engineering |
Keywords: | Bayesian learning; Hessian matrix; language model; rapid approximation; recurrent neural network |
Issue Date: | Feb-2016 |
Abstract: | A language model (LM) gives the probability of a word sequence and provides the solution to word prediction in a variety of information systems. A recurrent neural network (RNN) is powerful for learning the large-span dynamics of a word sequence in a continuous space. However, training an RNN-LM is an ill-posed problem because of the large number of parameters induced by a large dictionary size and a high-dimensional hidden layer. This paper presents a Bayesian approach to regularize the RNN-LM and applies it to continuous speech recognition. We aim to penalize an overly complex RNN-LM by compensating for the uncertainty of the estimated model parameters, which is represented by a Gaussian prior. The objective function of the Bayesian classification network is formed as a regularized cross-entropy error function. The regularized model is constructed not only by calculating the regularized parameters according to the maximum a posteriori criterion but also by estimating the Gaussian hyperparameter through maximization of the marginal likelihood. A rapid approximation to the Hessian matrix, based on selecting a small set of salient outer products, is developed to implement the Bayesian RNN-LM (BRNN-LM). The proposed BRNN-LM achieves a sparser model than the RNN-LM. Experiments on different corpora show robust system performance when the rapid BRNN-LM is applied under different conditions. |
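The abstract's regularized objective can be illustrated with a minimal sketch: under a zero-mean Gaussian prior on the weights, the MAP criterion adds a weighted L2 penalty to the cross-entropy error. This is a generic illustration of that objective, not the paper's implementation; the function name and argument shapes are assumptions for the example.

```python
import numpy as np

def regularized_cross_entropy(probs, targets, weights, alpha):
    """Cross-entropy error plus a Gaussian-prior (L2) penalty,
    as arises in MAP estimation of network parameters.

    probs   : (T, V) predicted word probabilities per time step
    targets : (T,)   index of the correct word at each step
    weights : (P,)   flattened model parameters
    alpha   : precision hyperparameter of the Gaussian prior
    """
    # Negative log-likelihood of the observed word sequence.
    ce = -np.sum(np.log(probs[np.arange(len(targets)), targets]))
    # Penalty from the zero-mean Gaussian prior over the weights.
    penalty = 0.5 * alpha * np.dot(weights, weights)
    return ce + penalty
```

In the paper's framework the hyperparameter `alpha` is not fixed by hand but estimated by maximizing the marginal likelihood, which requires the (approximated) Hessian of this objective.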
URI: | http://dx.doi.org/10.1109/TNNLS.2015.2499302 http://hdl.handle.net/11536/133534 |
ISSN: | 2162-237X |
DOI: | 10.1109/TNNLS.2015.2499302 |
Journal: | IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS |
Volume: | 27 |
Issue: | 2 |
Start Page: | 361 |
End Page: | 374 |
Appears in Collections: | Articles |