Title: Bayesian Recurrent Neural Network for Language Modeling
Authors: Chien, Jen-Tzung
Ku, Yuan-Chu
College of Electrical and Computer Engineering
Keywords: Bayesian learning; Hessian matrix; language model; rapid approximation; recurrent neural network
Issue Date: Feb-2016
Abstract: A language model (LM) calculates the probability of a word sequence and provides the basis for word prediction in a variety of information systems. A recurrent neural network (RNN) is powerful for learning the large-span dynamics of a word sequence in continuous space. However, training an RNN-LM is an ill-posed problem because of the large number of parameters induced by a large dictionary and a high-dimensional hidden layer. This paper presents a Bayesian approach to regularizing the RNN-LM and applies it to continuous speech recognition. We penalize an overly complex RNN-LM by compensating for the uncertainty of the estimated model parameters, which is represented by a Gaussian prior. The objective function in a Bayesian classification network is formed as a regularized cross-entropy error function. The regularized model is constructed not only by calculating the regularized parameters according to the maximum a posteriori (MAP) criterion but also by estimating the Gaussian hyperparameter through maximization of the marginal likelihood. A rapid approximation to the Hessian matrix is developed to implement the Bayesian RNN-LM (BRNN-LM) by selecting a small set of salient outer-products. The proposed BRNN-LM yields a sparser model than the standard RNN-LM. Experiments on different corpora show that the rapid BRNN-LM achieves robust system performance under a variety of conditions.
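As a hedged sketch of the Bayesian regularization the abstract describes (the notation below is assumed for illustration and is not taken from the paper itself): writing the RNN-LM parameters as \mathbf{w}, the cross-entropy error as E_{\mathrm{CE}}(\mathbf{w}), and placing a zero-mean isotropic Gaussian prior p(\mathbf{w} \mid \alpha) = \mathcal{N}(\mathbf{w} \mid \mathbf{0}, \alpha^{-1}\mathbf{I}) on the parameters, the MAP criterion amounts to minimizing a regularized error of the form

    % Regularized cross-entropy objective under a Gaussian prior
    % (illustrative notation; alpha, w, and E_CE are assumed symbols).
    \[
      \tilde{E}(\mathbf{w})
        = E_{\mathrm{CE}}(\mathbf{w})
        + \frac{\alpha}{2}\,\mathbf{w}^{\top}\mathbf{w}
    \]

The hyperparameter \alpha would then be re-estimated by maximizing the marginal likelihood (evidence), whose Laplace approximation involves the Hessian \mathbf{H} = \nabla\nabla E_{\mathrm{CE}}(\mathbf{w}_{\mathrm{MAP}}); the rapid approximation mentioned in the abstract retains only a small set of salient outer-product terms of this Hessian rather than computing it in full.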
URI: http://dx.doi.org/10.1109/TNNLS.2015.2499302
http://hdl.handle.net/11536/133534
ISSN: 2162-237X
DOI: 10.1109/TNNLS.2015.2499302
Journal: IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
Volume: 27
Issue: 2
Start Page: 361
End Page: 374
Appears in Collections: Articles