Title: Bayesian Recurrent Neural Network for Language Modeling
Authors: Chien, Jen-Tzung
Ku, Yuan-Chu
College of Electrical and Computer Engineering
Keywords: Bayesian learning; Hessian matrix; language model; rapid approximation; recurrent neural network
Issue Date: Feb-2016
Abstract: A language model (LM) computes the probability of a word sequence, which provides the solution to word prediction for a variety of information systems. A recurrent neural network (RNN) is a powerful tool for learning the large-span dynamics of a word sequence in the continuous space. However, training the RNN-LM is an ill-posed problem because a large dictionary size and a high-dimensional hidden layer introduce too many parameters. This paper presents a Bayesian approach that regularizes the RNN-LM and applies it to continuous speech recognition. We aim to penalize an overly complex RNN-LM by compensating for the uncertainty of the estimated model parameters, which is represented by a Gaussian prior. The objective function in a Bayesian classification network is formed as the regularized cross-entropy error function. The regularized model is constructed not only by calculating the regularized parameters according to the maximum a posteriori criterion but also by estimating the Gaussian hyperparameter through maximization of the marginal likelihood. A rapid approximation to the Hessian matrix is developed to implement the Bayesian RNN-LM (BRNN-LM) by selecting a small set of salient outer-products. The proposed BRNN-LM achieves a sparser model than the RNN-LM. Experiments on different corpora show the robustness of system performance when the rapid BRNN-LM is applied under different conditions.
URI: http://dx.doi.org/10.1109/TNNLS.2015.2499302
http://hdl.handle.net/11536/133534
ISSN: 2162-237X
DOI: 10.1109/TNNLS.2015.2499302
Journal: IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
Volume: 27
Issue: 2
Start Page: 361
End Page: 374
Appears in Collections: Articles
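
Note: The abstract describes the training objective as a cross-entropy error regularized by a Gaussian prior on the parameters. The following is a minimal sketch of that kind of objective, not the authors' implementation. It assumes a zero-mean isotropic Gaussian prior with a single precision hyperparameter, here called `alpha`; the function names and the flat `weights` list are hypothetical, and the RNN itself, the marginal-likelihood estimate of the hyperparameter, and the Hessian approximation are all omitted.

```python
import math


def softmax(logits):
    """Convert raw scores over the vocabulary into probabilities."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def regularized_cross_entropy(logits_seq, targets, weights, alpha):
    """Cross-entropy over a word sequence plus a Gaussian-prior penalty.

    logits_seq: one list of vocabulary scores per position (assumed to be
                produced by some language model, not shown here)
    targets:    index of the observed next word at each position
    weights:    flat list of model parameters (hypothetical layout)
    alpha:      assumed precision of a zero-mean Gaussian prior
    """
    data_term = 0.0
    for logits, target in zip(logits_seq, targets):
        probs = softmax(logits)
        data_term -= math.log(probs[target])
    # A Gaussian prior contributes (alpha / 2) * ||w||^2 to the negative
    # log-posterior, up to a constant that does not depend on the weights.
    prior_term = 0.5 * alpha * sum(w * w for w in weights)
    return data_term + prior_term


if __name__ == "__main__":
    loss = regularized_cross_entropy([[0.0, 0.0]], [0], [1.0, 2.0], alpha=2.0)
    print(loss)
```

Minimizing this quantity with respect to the weights corresponds to a maximum a posteriori estimate under the stated prior; setting `alpha` to zero recovers the plain cross-entropy error.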