Title: Blog Article Recommendation Based on Matrix Factorization and Latent Dirichlet Allocation (基於矩陣分解及隱含主題模型之部落格文章推薦)
Authors: Chen, Mei-Fen (陳美棻)
Duen-Ren Liu (劉敦仁)
Institute of Information Management
Keywords: blogosphere; Matrix Factorization; Latent Dirichlet Allocation; content-based filtering; collaborative filtering
Publication Date: 2013
Abstract: Rapid advances in Internet technologies have made Web 2.0 a popular platform for social media. Among Web 2.0 applications, the blogosphere gives users a convenient place to share their preferences and express personal feelings, and most people want the newest information and articles on popular issues. However, as the number of active writers, viewers, and articles grows rapidly, it becomes hard for users to discover articles that are useful or interesting to them. In this work, we propose three approaches for recommending blog articles to users: Matrix Factorization (MF), Latent Dirichlet Allocation (LDA), and a hybrid of the two. First, many recommender systems apply collaborative filtering to historical records of the items that users have viewed, purchased, or rated. Unlike the more extensively researched explicit-feedback setting, blog data contain no direct input from users about their preferences, only records of which articles each user has or has not viewed. For an article that was not viewed, we cannot tell whether the user disliked it or simply never saw it, so we lack substantial evidence about which articles users dislike. Matrix factorization is usually applied to explicit ratings; we identify the unique properties of implicit feedback datasets (such as view or purchase records) and propose a factor model tailored to implicit-feedback recommenders. Second, we use probabilistic modeling based on LDA, which was originally proposed as a probabilistic document-topic model, to find the topic distribution of each document and compute article similarity for recommendation. Third, we develop a Hybrid MF and LDA (HML) recommendation approach that integrates the two models to better match user interests and improve recommendation accuracy. Finally, we compare the proposed approaches with traditional content-based filtering and collaborative filtering. The experimental results show that the proposed approach can effectively recommend blog articles that match users' interests.
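The three ideas in the abstract can be illustrated with a small sketch. This is not the thesis's formulation, which this record does not give. It assumes a confidence-weighted alternating least squares factorization for the view records, a document-topic matrix `theta` already produced by some LDA implementation, and a simple linear blend for the hybrid. The function names and the parameters `alpha`, `reg`, and `weight` are hypothetical.

```python
import numpy as np

def implicit_als(R, n_factors=2, alpha=40.0, reg=0.1, n_iters=10, seed=0):
    """Confidence-weighted ALS on a binary view matrix R (users x articles).

    Assumption: unviewed cells are weak negatives (confidence 1) and
    viewed cells are strong positives (confidence 1 + alpha).
    """
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    X = rng.normal(scale=0.1, size=(n_users, n_factors))  # user factors
    Y = rng.normal(scale=0.1, size=(n_items, n_factors))  # article factors
    C = 1.0 + alpha * R            # confidence weights
    P = (R > 0).astype(float)      # binary preference targets
    ridge = reg * np.eye(n_factors)
    for _ in range(n_iters):
        for u in range(n_users):   # solve each user's weighted ridge problem
            Cu = np.diag(C[u])
            X[u] = np.linalg.solve(Y.T @ Cu @ Y + ridge, Y.T @ Cu @ P[u])
        for i in range(n_items):   # solve each article's weighted ridge problem
            Ci = np.diag(C[:, i])
            Y[i] = np.linalg.solve(X.T @ Ci @ X + ridge, X.T @ Ci @ P[:, i])
    return X, Y

def topic_scores(R, theta):
    """Cosine similarity between each user's topic profile and each article.

    A user's profile is the mean topic distribution of the articles viewed.
    theta is an (articles x topics) matrix, assumed to come from LDA.
    """
    P = (R > 0).astype(float)
    counts = np.maximum(P.sum(axis=1, keepdims=True), 1.0)
    profiles = (P @ theta) / counts
    pn = profiles / np.maximum(np.linalg.norm(profiles, axis=1, keepdims=True), 1e-12)
    tn = theta / np.maximum(np.linalg.norm(theta, axis=1, keepdims=True), 1e-12)
    return pn @ tn.T

def hybrid_scores(R, theta, weight=0.5, **als_kwargs):
    """Linear blend of MF and topic scores (one possible hybrid, assumed)."""
    X, Y = implicit_als(R, **als_kwargs)
    scores = weight * (X @ Y.T) + (1.0 - weight) * topic_scores(R, theta)
    scores[R > 0] = -np.inf        # never re-recommend viewed articles
    return scores

# Toy data: 3 users x 4 articles, 2 topics (values invented for illustration).
R = np.array([[1, 1, 0, 0],
              [1, 0, 0, 0],
              [0, 0, 1, 1]])
theta = np.array([[0.9, 0.1],
                  [0.8, 0.2],
                  [0.1, 0.9],
                  [0.2, 0.8]])
scores = hybrid_scores(R, theta, weight=0.5)
print(np.argmax(scores, axis=1))   # top unviewed article per user
```

Setting `weight` to 0 reduces the blend to a purely content-based ranking, and setting it to 1 gives a purely collaborative one, which makes it easy to compare the hybrid against each component on the same data.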
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT070153411
http://hdl.handle.net/11536/74830
Appears in Collections: Thesis