Title: Mining User Trajectory Patterns in Social Media
Authors: Zhu, Wen-Yuan (朱文園)
Peng, Wen-Chih (彭文志)
Chen, Ling-Jyh (陳伶志)
Institute of Computer Science and Engineering
Keywords: Social media; User movement behavior; Trajectory profile; Community structure; Trajectory pattern mining; Location promotion; Influence maximization; Propagation model; Check-in behavior
Issue Date: 2016
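Abstract: As devices with positioning capability (such as smartphones and wearable devices) become lighter and more widespread, users' location information is increasingly easy to obtain. Many social media platforms let users share trajectory data, such as travel, running, cycling, and driving trajectories. Many also let users share location information, such as check-ins and the locations embedded in photos. These geographic data represent users' movement behavior; if user trajectory patterns can be mined from them, social platforms can offer more geography-related personalized services. This dissertation focuses on mining user trajectory patterns in social media.
In the first topic, we mine communities of users with similar movement behavior from trajectory data. Identifying movement-based communities makes it possible to quickly find users with similar movement behavior and to provide services such as user recommendation and trajectory recommendation on mobile social platforms. For this problem, we design the SP-tree (Sequential Probability tree) to represent the features of a user's trajectories; an SP-tree captures not only the sequential patterns of a user's movement but also the user's next movement. We also design two algorithms, DF and BF, suited to trajectory data with different characteristics, to build each user's SP-tree efficiently. To compute the similarity between users, we define a similarity measure between SP-trees. Finally, based on the similarity between users and the characteristics of movement-based communities, we design Geo-Cluster, a greedy community-mining algorithm that finds communities of users with similar movement behavior.
In the second topic, we observe that many businesses on social media want their customers to check in at the store in order to attract more customers. From the perspective of influence maximization, a business selects a group of seed users to check in at the store so that these users influence their friends to visit, check in, and spend, and those friends in turn influence their own friends. To select the seed users that lead the most people to check in at the store, we design a model of how check-in information propagates in social media, the Location-aware Independent Cascade Model (LICM). To determine the parameters of the propagation model, we mine users' check-in records and design Gaussian-based Mobility Models (GMMs) and Distance-based Mobility Models (DMMs) to describe users' check-in behavior; with GMMs and DMMs, the parameters of the propagation model can be estimated accurately from users' check-in records. With this propagation model, we can readily select a group of seed users that leads the most people to check in at the store.
In the third topic, we argue that a check-in conveys not only when and where a user is but also what activity the user is engaged in. If the relations among location, time, and activity can be mined from check-in records, social media can provide more accurate personalized local services. We therefore focus on individual activity inference and individual mobility inference. We analyze users' check-in records and use a Bayesian network to describe the relations among location, time, and activity in check-in behavior. With this network model, both inference problems reduce to examining only 1) the activity-time relation and 2) the location-activity relation. For the activity-time relation, we propose the Order-1 Activity Transition Model (OATM) to describe the relation between activity and time in users' check-ins. For the location-activity relation, we use a Gaussian mixture model (GMM) to describe the relation between location and activity in users' check-ins. With the proposed OATM and GMM, the computation for individual activity inference and individual mobility inference is greatly simplified while still yielding good prediction results.
These three topics show that the propagation of geographic information in social media is related to users' movement behavior; this dissertation attempts to identify and describe these relations by mining trajectory patterns in social media.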
With the growing number of portable devices equipped with location-aware sensors (e.g., smartphones and wearable devices), users' positions can be obtained easily. Many social media platforms allow users to share their trajectories, which capture travel routes, biking paths, and driving traces. Many also allow users to share their locations with friends for richer social interaction. Data containing geographic information represent users' movement behavior. If we mine user trajectory patterns from these data, many localized and personalized services can be provided. In this dissertation, we present our techniques for mining user trajectory patterns in social media.
In the first work, we tackle the problem of discovering movement-based communities of users, where users in the same community have similar movement behavior. Identifying movement-based communities benefits location-based services and trajectory recommendation services. Specifically, we propose a framework for mining movement-based communities that consists of three phases: 1) constructing trajectory profiles of users, 2) deriving similarities between trajectory profiles, and 3) discovering movement-based communities. In the first phase, we design a data structure, called the Sequential Probability tree (SP-tree), as a user trajectory profile. SP-trees not only capture sequential patterns but also indicate the transition probabilities of movements. Moreover, we propose two algorithms, BF (breadth-first) and DF (depth-first), to construct SP-trees as user profiles. To measure the similarity between users' trajectory profiles, we develop a similarity function that takes SP-tree information into account. Based on the derived similarity values, we formulate an objective function to evaluate the quality of communities. Guided by this objective function, we propose a greedy algorithm, Geo-Cluster, to derive communities effectively. To evaluate the proposed algorithms, we conducted comprehensive experiments on two real datasets. The experimental results show that the proposed framework can effectively discover movement-based user communities.
In the second work, we investigate the key techniques that can help businesses promote their locations by advertising wisely through the underlying location-based social networks (LBSNs). To maximize the benefits of location promotion, we formalize it as an influence maximization problem in an LBSN: given a target location and an LBSN, which set of $k$ users (called seeds) should be advertised to initially so that they propagate the information and attract the most other users to visit the target location? Existing studies have proposed different ways to calculate the information propagation probability, i.e., how likely one user is to influence another, in the setting of a static social network. However, it is more challenging to derive the propagation probability in an LBSN, since it is heavily affected by the target location and by user mobility, both of which are dynamic and query dependent. This work proposes two kinds of user mobility models, Gaussian-based and distance-based, to capture the check-in behavior of individual LBSN users; location-aware propagation probabilities are then derived from these models. Extensive experiments on two real LBSN datasets demonstrate that our proposals reflect information propagation in LBSNs more effectively than existing static models of propagation probability.
In the third work, we observe that check-in records reflect not only when and where users are, but also what they are doing. If we can capture the relations among the location, time, and activity factors from check-in records, location-based social platforms can provide more personalized location-based services to users. Therefore, we focus on two inference problems, inferring individual activity and inferring individual mobility, based on users' check-in records. For these two problems, we analyze check-in records and use a Bayesian network to represent the relations among the location, time, and activity factors of the check-in records. Based on the proposed network model, the two inference problems can be reduced to two modules, the activity-time model and the location-activity model. For the activity-time model, we propose the Order-1 Activity Transition Model (OATM) to capture the activity-time relations of the check-in records. For the location-activity model, we use the Gaussian mixture model (GMM) to capture individual mobility features in different activities. To evaluate the proposed network model on the two inference problems, we conducted extensive experiments on two real datasets; the experimental results show that our proposed Bayesian network-based approach outperforms the state-of-the-art approaches for activity and mobility inference in LBSNs.
The three works in this dissertation show that the propagation of geographic information in social media is related to user movement behavior. By mining user trajectory patterns in social media, we try to discover and describe the relations between the two.
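The location promotion problem in the second work can be illustrated with a minimal sketch. It is not the dissertation's implementation: the abstract does not give the form of the propagation probability, so the exponential distance decay in `propagation_prob`, the planar `(x, y)` coordinates, the parameters `base` and `scale`, and every function name below are assumptions made for illustration. The sketch pairs a standard independent cascade simulation with plain greedy seed selection, where the chance that a user is activated depends on how far that user's home is from the target location.

```python
import math
import random


def propagation_prob(home, target, base=0.5, scale=10.0):
    """Hypothetical distance-based probability (an assumption, not the thesis model):
    decays exponentially with the distance between a user's home and the target."""
    return base * math.exp(-math.dist(home, target) / scale)


def simulate_cascade(graph, probs, seeds, rng):
    """One independent cascade run; returns the set of activated users.
    Each newly activated user gets one chance to activate each inactive friend v,
    succeeding with probability probs[v]."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in graph.get(u, []):
                if v not in active and rng.random() < probs[v]:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active


def expected_spread(graph, probs, seeds, runs=200, seed=0):
    """Monte Carlo estimate of the expected number of activated users."""
    rng = random.Random(seed)  # fixed seed so repeated estimates are reproducible
    total = sum(len(simulate_cascade(graph, probs, seeds, rng)) for _ in range(runs))
    return total / runs


def greedy_seeds(graph, probs, k, runs=200):
    """Pick k seeds one at a time, each time adding the user with the largest
    estimated spread when combined with the seeds chosen so far."""
    chosen = []
    for _ in range(k):
        candidates = [u for u in graph if u not in chosen]
        best = max(candidates,
                   key=lambda u: expected_spread(graph, probs, chosen + [u], runs))
        chosen.append(best)
    return chosen
```

Here `graph` maps each user to a list of friends and `probs` maps each user to the probability returned by `propagation_prob` for the chosen target location, so changing the target changes the probabilities and possibly the selected seeds.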
URI: http://etd.lib.nctu.edu.tw/cdrfb3/record/nctu/#GT070086028
http://hdl.handle.net/11536/143108
Appears in Collections: Thesis