Title: | MARKOV RECURRENT NEURAL NETWORKS |
Authors: | Kuo, Che-Yu; Chien, Jen-Tzung; Department of Electrical and Computer Engineering |
Keywords: | Deep learning; recurrent neural network; stochastic transition; discrete latent structure |
Issue Date: | 1-Jan-2018 |
Abstract: | Deep learning has achieved great success in many real-world applications. For speech and language processing, recurrent neural networks are trained to characterize sequential patterns and extract temporal information based on dynamic states that evolve through time and are stored as an internal memory. The traditional simple transition function, which uses only input-to-hidden and hidden-to-hidden weights, is insufficient. To strengthen the learning capability, it is crucial to explore the diversity of latent structure in sequential signals and to learn the stochastic trajectory of signal transitions in order to improve sequential prediction. This paper proposes stochastic modeling of transitions in deep sequential learning. Our idea is to enhance the latent variable representation by discovering the Markov state transitions in sequential data based on a K-state long short-term memory (LSTM) model. Such a latent state machine is capable of learning the complicated latent semantics in highly structured and heterogeneous sequential data. Gumbel-softmax is introduced to implement a stochastic learning procedure with discrete states. Experimental results on visual and text language modeling illustrate the merit of the proposed stochastic transitions in sequential prediction with a limited number of parameters. |
URI: | http://hdl.handle.net/11536/150841 |
ISSN: | 2161-0363 |
Journal: | 2018 IEEE 28TH INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING (MLSP) |
Appears in Collections: | Conferences Paper |
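
The abstract mentions Gumbel-softmax as the means of learning with discrete states in a K-state model. As a rough illustration of that general technique only, and not the authors' implementation, the NumPy sketch below draws a relaxed one-hot sample over K states and uses it to weight K candidate hidden vectors. The names `gumbel_softmax` and `mix_states`, the sizes `K` and `H`, the temperature value, and the weighted-sum mixing step are all assumptions made for this example; the record does not specify how the paper combines the states.

```python
import numpy as np

def gumbel_softmax(logits, temperature, rng):
    """Draw a relaxed one-hot sample over the last axis of `logits`."""
    # Uniform noise, clipped away from 0 and 1 so both logs stay finite.
    u = np.clip(rng.random(size=logits.shape), 1e-10, 1.0 - 1e-10)
    # Gumbel(0, 1) noise via inverse transform sampling.
    gumbel_noise = -np.log(-np.log(u))
    # Perturb the logits, scale by temperature, apply a numerically stable softmax.
    y = (logits + gumbel_noise) / temperature
    y = y - y.max(axis=-1, keepdims=True)
    exp_y = np.exp(y)
    return exp_y / exp_y.sum(axis=-1, keepdims=True)

def mix_states(candidates, weights):
    """Weighted sum of K candidate hidden vectors, shape (K, H) -> (H,)."""
    return weights @ candidates

rng = np.random.default_rng(0)
K, H = 4, 8                              # hypothetical state count and hidden size
logits = rng.normal(size=K)              # stand-in for learned state scores
candidates = rng.normal(size=(K, H))     # stand-in for K candidate hidden states
z = gumbel_softmax(logits, temperature=0.5, rng=rng)
h = mix_states(candidates, z)
```

Lower temperatures push `z` toward a one-hot vector, which approximates a hard choice of a single state, while higher temperatures spread the weight across states. The relaxation is used because a hard sample is not differentiable, whereas the softmax over perturbed logits is.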