Title: Deep Bayesian Inference for Topic Models (深層貝氏推論於主題模型之研究)
Author: Lee, Chao-Hsi (李詔熙); Advisor: Chien, Jen-Tzung (簡仁宗); Department of Electrical Engineering
Keywords: Topic models; Bayesian inference; Deep learning; Dirichlet mixture model; Deep unfolding
Issue Date: 2015
Abstract: Owing to the growth of computational power, research on deep learning has advanced rapidly and has been applied widely to tasks such as image recognition, speech recognition, and information retrieval. Conventional deep learning builds on deep neural networks, which cascade multiple layers of nonlinear processing units so that high-level abstractions can be extracted from data. In this study, we pursue two further ways of bringing deep learning into topic modeling over a collection of documents. First, we look deeply into the standard topic model based on latent Dirichlet allocation (LDA), which adopts a single Dirichlet distribution as the prior over the latent topic proportions of each document. Constrained by this single-prior assumption, LDA cannot well characterize the variability and heterogeneity of topics across the documents of a real-world collection. To deal with this weakness, we propose the Dirichlet mixture allocation, in which a mixture of Dirichlet distributions serves as the prior and approximates the true distribution of the latent topics over different documents. Second, we conduct deep learning for the inference procedure of the topic model. Traditionally, the parameters of either an unsupervised or a supervised topic model are learned by variational Bayesian inference, which maximizes a lower bound on the marginal likelihood of the documents and thus optimizes the desired objective only indirectly; this indirectness limits the performance of the topic model. Through the deep unfolding framework, we mimic variational inference with a deep network structure in which each coordinate-ascent update is treated as one layer, and we directly optimize the desired objective using specialized back-propagation algorithms. Experiments show how these two deep learning machines improve document representation.
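
To make the contrast between the two priors concrete, a minimal sketch follows; the symbols (theta_d for the topic proportions of document d, K mixture components with weights pi_k and parameters alpha_k) are illustrative notation, not taken from the thesis itself.

    % LDA: one Dirichlet prior shared by all documents
    p(\theta_d) = \mathrm{Dir}(\theta_d \mid \alpha)

    % Dirichlet mixture allocation: a mixture of K Dirichlet components
    p(\theta_d) = \sum_{k=1}^{K} \pi_k \, \mathrm{Dir}(\theta_d \mid \alpha_k),
    \qquad \pi_k \ge 0, \quad \sum_{k=1}^{K} \pi_k = 1

With K > 1, heterogeneous groups of documents can follow different Dirichlet components, which is how the mixture prior captures the variability that a single shared prior cannot.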
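The deep unfolding idea can likewise be sketched in a few lines: the coordinate-ascent (mean-field) updates of variational inference for LDA are unrolled for a fixed number of iterations, each iteration playing the role of one network layer. The Python sketch below shows only the forward pass under assumed names (unfolded_e_step, n_layers); in the thesis, the model parameters would be trained by back-propagating the final objective through these unrolled layers, rather than by this plain forward computation.

    # A minimal sketch of deep unfolding for LDA's variational E-step.
    # Hypothetical names; the thesis's actual network and training loop may differ.
    import numpy as np
    from scipy.special import digamma

    def unfolded_e_step(word_ids, alpha, beta, n_layers=5):
        """Unroll mean-field updates for one document into n_layers 'layers'.

        word_ids : (N,) token indices of the document
        alpha    : (K,) Dirichlet hyperparameters (learnable when unfolded)
        beta     : (K, V) topic-word probabilities (learnable when unfolded)
        """
        N = word_ids.shape[0]
        K = alpha.shape[0]
        gamma = alpha + N / K                 # init as if responsibilities were uniform
        for _ in range(n_layers):             # each iteration = one network layer
            # E[log theta] under the variational posterior q(theta) = Dir(gamma)
            e_log_theta = digamma(gamma) - digamma(gamma.sum())
            phi = beta[:, word_ids].T * np.exp(e_log_theta)   # (N, K) responsibilities
            phi /= phi.sum(axis=1, keepdims=True)
            gamma = alpha + phi.sum(axis=0)   # update the Dirichlet parameters
        return gamma, phi

    # Toy usage: 3 topics, vocabulary of 6 words, one short document.
    rng = np.random.default_rng(0)
    beta = rng.dirichlet(np.ones(6), size=3)  # (K, V)
    gamma, phi = unfolded_e_step(np.array([0, 2, 5, 1]), alpha=np.ones(3), beta=beta)
    print(gamma)

Because every update above is differentiable, gradients of a downstream objective with respect to alpha and beta can flow through all n_layers steps, which is what replaces the indirect maximization of the variational lower bound.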
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT070250724 ; http://hdl.handle.net/11536/127585
Appears in Collections: Thesis