Full metadata record
DC Field: Value [Language]
dc.contributor.author: Chien, Jen-Tzung [en_US]
dc.contributor.author: Ku, Yuan-Chu [en_US]
dc.date.accessioned: 2017-04-21T06:56:03Z [-]
dc.date.available: 2017-04-21T06:56:03Z [-]
dc.date.issued: 2016-02 [en_US]
dc.identifier.issn: 2162-237X [en_US]
dc.identifier.uri: http://dx.doi.org/10.1109/TNNLS.2015.2499302 [en_US]
dc.identifier.uri: http://hdl.handle.net/11536/133534 [-]
dc.description.abstract: A language model (LM) estimates the probability of a word sequence and thereby supports word prediction in a variety of information systems. A recurrent neural network (RNN) is a powerful means of learning the large-span dynamics of a word sequence in a continuous space. However, training an RNN-LM is an ill-posed problem because the large dictionary size and the high-dimensional hidden layer introduce too many parameters. This paper presents a Bayesian approach that regularizes the RNN-LM and applies it to continuous speech recognition. We penalize an overly complex RNN-LM by compensating for the uncertainty of the estimated model parameters, which is represented by a Gaussian prior. The objective function of the Bayesian classification network is formed as a regularized cross-entropy error function. The regularized model is constructed not only by computing the regularized parameters according to the maximum a posteriori criterion but also by estimating the Gaussian hyperparameter through maximization of the marginal likelihood. A rapid approximation to the Hessian matrix, obtained by selecting a small set of salient outer products, is developed to implement the Bayesian RNN-LM (BRNN-LM). The proposed BRNN-LM yields a sparser model than the RNN-LM. Experiments on different corpora show that the rapid BRNN-LM delivers robust system performance under different conditions. (A minimal sketch of the regularized objective appears after this record.) [en_US]
dc.language.iso: en_US [en_US]
dc.subject: Bayesian learning [en_US]
dc.subject: Hessian matrix [en_US]
dc.subject: language model [en_US]
dc.subject: rapid approximation [en_US]
dc.subject: recurrent neural network [en_US]
dc.title: Bayesian Recurrent Neural Network for Language Modeling [en_US]
dc.identifier.doi: 10.1109/TNNLS.2015.2499302 [en_US]
dc.identifier.journal: IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS [en_US]
dc.citation.volume: 27 [en_US]
dc.citation.issue: 2 [en_US]
dc.citation.spage: 361 [en_US]
dc.citation.epage: 374 [en_US]
dc.contributor.department: 電機學院 [zh_TW]
dc.contributor.department: College of Electrical and Computer Engineering [en_US]
dc.identifier.wosnumber: WOS:000372020500014 [en_US]
Appears in Collections: Articles
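
The abstract describes a MAP-regularized cross-entropy objective with a Gaussian prior over the RNN-LM parameters and an evidence-based re-estimation of the prior hyperparameter from a Hessian approximation built from outer products. The sketch below is illustrative only and is not the authors' implementation: the toy vocabulary and hidden sizes, the tanh RNN, the isotropic prior with precision alpha, and the diagonal Gauss-Newton (outer-product) approximation restricted to the output layer are simplifying assumptions; the paper's rapid approximation instead selects a small set of salient outer products.

# Minimal sketch (not the authors' code): MAP-regularized RNN-LM objective and an
# evidence-style update of the Gaussian prior precision. All sizes and the
# restriction of the Hessian approximation to the output layer are assumptions.
import numpy as np

rng = np.random.default_rng(0)
V, H = 20, 16                      # toy vocabulary size and hidden size
Wxh = rng.normal(0, 0.1, (H, V))   # input-to-hidden weights
Whh = rng.normal(0, 0.1, (H, H))   # recurrent weights
Who = rng.normal(0, 0.1, (V, H))   # hidden-to-output weights
alpha = 1.0                        # Gaussian prior precision (hyperparameter)

def forward(tokens):
    """Run the RNN over a token sequence; return per-step hidden states and
    softmax distributions over the next token."""
    h = np.zeros(H)
    hs, ps = [], []
    for t in tokens[:-1]:
        x = np.zeros(V); x[t] = 1.0
        h = np.tanh(Wxh @ x + Whh @ h)
        logits = Who @ h
        p = np.exp(logits - logits.max()); p /= p.sum()
        hs.append(h); ps.append(p)
    return hs, ps

def map_objective(tokens):
    """Regularized cross-entropy: negative log-likelihood of the next tokens
    plus the Gaussian-prior penalty (alpha/2)*||w||^2 (MAP criterion)."""
    _, ps = forward(tokens)
    nll = -sum(np.log(p[w] + 1e-12) for p, w in zip(ps, tokens[1:]))
    penalty = 0.5 * alpha * sum((W ** 2).sum() for W in (Wxh, Whh, Who))
    return nll + penalty

def update_alpha(tokens):
    """Evidence-style re-estimation of alpha from a diagonal outer-product
    (Gauss-Newton) approximation to the Hessian of the output layer, a
    simplification of the paper's salient outer-product selection."""
    hs, ps = forward(tokens)
    # Gauss-Newton diagonal for softmax outputs: sum_t p*(1-p) * h^2
    gn_diag = sum(np.outer(p * (1.0 - p), h ** 2) for h, p in zip(hs, ps))
    gamma = np.sum(gn_diag / (gn_diag + alpha))   # effective number of parameters
    return gamma / ((Who ** 2).sum() + 1e-12)

tokens = rng.integers(0, V, size=12)              # a toy "sentence"
print("MAP objective:", map_objective(tokens))
alpha = update_alpha(tokens)
print("re-estimated alpha:", alpha)

In practice the objective would be minimized by backpropagation through time under the current alpha, and the hyperparameter update would alternate with those parameter updates; the snippet only evaluates the two quantities once to make the criterion concrete.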