Title: A Study of Personalized Web Page Ranking Algorithms
PPR: Personalized PageRank for Web Search
Authors: 林裕欽
Yu-Chin Lin
彭文志
Wen-Chih Peng
Institute of Computer Science and Engineering
Keywords: Personalized search; data mining; Web mining
Issue Date: 2004
Abstract: Given the huge number of Web pages, finding pages that are both high in quality and relevant to a user's interests has become an important research field. Many keyword-based search engines are available for this purpose. However, these search engines usually return a very large number of results, and users still spend a lot of time finding what they really want. The most powerful search engine, Google, uses the PageRank algorithm to rank search results. Although Google finds as many pages as possible, it cannot provide a personalized Web ranking. Ranking search results so that they satisfy users' interests has therefore become an increasingly important topic, and it is the problem addressed here. In this paper, we first implement a client-side module that captures user browsing behavior, and then apply data mining techniques to mine frequent browsing access patterns from that behavior. From these frequent access patterns, we propose a method to extract user interests as user preferences. We then propose a new algorithm that adjusts the ranking scores of Web pages in accordance with the user preferences mined from browsing behavior. The algorithm, referred to as PPR (Personalized PageRank), is divided into four phases. The first phase assigns initial weights to the pages in the search results based on user interests. The second phase creates virtual pages (hubs) and virtual links among the search results according to user interests. The third phase observes user click streams and incrementally adjusts the personalized ranking to reflect the user's preferences. To improve the accuracy of the ranking, the fourth phase applies collaborative filtering when queries with similar keywords are submitted by users with similar interests.
Simulation experiments show that algorithm PPR is not only very effective but also adaptive, dynamically adjusting page order to provide a personalized ranking for each user.
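As an illustration of the general idea behind the first phase (interest-based weights feeding a PageRank-style computation), the following is a minimal sketch, not the thesis's PPR algorithm. It assumes a standard interest-biased PageRank computed by power iteration; the function name `personalized_pagerank`, the `links` and `interest` inputs, and the damping value 0.85 are all hypothetical choices made for this example.

```python
def personalized_pagerank(links, interest, damping=0.85, iterations=50):
    """Interest-biased PageRank by power iteration (illustrative sketch).

    links:    dict mapping a page to the list of pages it links to
    interest: dict mapping a page to a non-negative interest weight
              (hypothetical stand-in for mined user preferences)
    """
    pages = sorted(set(links) | {q for outs in links.values() for q in outs})
    total = sum(interest.get(p, 0.0) for p in pages)

    # Teleport distribution: biased toward pages matching the user's
    # interests, or uniform when no interest data is available.
    if total > 0:
        teleport = {p: interest.get(p, 0.0) / total for p in pages}
    else:
        teleport = {p: 1.0 / len(pages) for p in pages}

    # Start from the interest-based weights.
    rank = dict(teleport)
    for _ in range(iterations):
        # Rank held by pages without outbound links is redistributed
        # through the teleport distribution so total rank stays at 1.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {
            p: (1.0 - damping + damping * dangling) * teleport[p] for p in pages
        }
        for p in pages:
            outs = links.get(p, [])
            for q in outs:
                new_rank[q] += damping * rank[p] / len(outs)
        rank = new_rank
    return rank


# Toy link graph among four search results (made-up data).
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
uniform = personalized_pagerank(links, interest={})
biased = personalized_pagerank(links, interest={"b": 1.0})
```

In this toy graph, putting all interest on page `b` raises its score relative to the uniform case, which is the kind of reordering a personalized ranking aims for. The virtual links, click-stream adjustment, and collaborative filtering phases described in the abstract are not modeled here.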
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT009217600
http://hdl.handle.net/11536/74046
Appears in Collections: Thesis


Files in This Item:

  1. 760001.pdf

If the file is a zip archive, download and extract it, then open index.html in the extracted folder with a browser to view the full text.