Title: MARKOV RECURRENT NEURAL NETWORKS
Authors: Kuo, Che-Yu
Chien, Jen-Tzung
Department of Electrical and Computer Engineering
Keywords: Deep learning; recurrent neural network; stochastic transition; discrete latent structure
Issue Date: 1-Jan-2018
Abstract: Deep learning has achieved great success in many real-world applications. For speech and language processing, recurrent neural networks are trained to characterize sequential patterns and extract temporal information based on dynamic states that evolve through time and are stored as an internal memory. The traditional simple transition function, which uses only input-to-hidden and hidden-to-hidden weights, is insufficient. To strengthen the learning capability, it is crucial to explore the diversity of latent structure in sequential signals and learn the stochastic trajectory of signal transitions to improve sequential prediction. This paper proposes the stochastic modeling of transitions in deep sequential learning. Our idea is to enhance the latent variable representation by discovering the Markov state transitions in sequential data based on a K-state long short-term memory (LSTM) model. Such a latent state machine is capable of learning the complicated latent semantics in highly structured and heterogeneous sequential data. Gumbel-softmax is introduced to implement the stochastic learning procedure with discrete states. Experimental results on visual and text language modeling illustrate the merit of the proposed stochastic transitions in sequential prediction with a limited number of parameters.
URI: http://hdl.handle.net/11536/150841
ISSN: 2161-0363
Journal: 2018 IEEE 28TH INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING (MLSP)
Appears in Collections: Conferences Paper
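
Note on the abstract: the Gumbel-softmax step it mentions can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not give the model's parameterization, so the function names `gumbel_softmax` and `mix_states`, the temperature value, the example logits, and the idea of mixing K candidate hidden states by the sampled weights are all assumptions made for illustration only.

```python
import math
import random


def gumbel_softmax(logits, temperature=1.0, rng=random):
    """Draw a relaxed one-hot sample over K discrete states.

    Hypothetical helper: adds Gumbel(0, 1) noise to each logit and
    applies a temperature-scaled softmax.
    """
    noisy = []
    for logit in logits:
        # Clamp u away from 0 and 1 so both logarithms stay finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel = -math.log(-math.log(u))
        noisy.append((logit + gumbel) / temperature)
    # Numerically stable softmax over the perturbed logits.
    peak = max(noisy)
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def mix_states(weights, candidates):
    """Combine K candidate hidden-state vectors using the sampled weights.

    Assumption: each of the K states proposes its own hidden vector.
    """
    dim = len(candidates[0])
    return [sum(w * h[i] for w, h in zip(weights, candidates)) for i in range(dim)]


rng = random.Random(0)
logits = [2.0, 0.5, -1.0]            # unnormalized scores for K = 3 states (made up)
weights = gumbel_softmax(logits, temperature=0.5, rng=rng)
candidates = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]  # toy per-state hidden vectors
hidden = mix_states(weights, candidates)
print(weights, hidden)
```

Lower temperatures push the sampled weights toward a one-hot vector, which approximates a hard choice of state while keeping the operation differentiable in frameworks that support automatic differentiation. The most probable state under the noise follows the softmax of the logits, so the state with the largest logit is selected most often but not always.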