Title: 基於時空資料之社會連結關係推論模型
Inferring Social Connection from Spatio-temporal Data
Authors: 高敏嘉
Kao, Min-Chia
彭文志
Peng, Wen-Chih
Institute of Computer Science and Engineering
Keywords: social connection;mobility relationship;spatio-temporal data;social network
Issue Date: 2015
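Abstract: With the rise of social networks, a growing number of social media services offer rich and varied ways for users to interact. By posting articles and photos, and even checking in at places they have visited, users can easily share their daily activities. Check-in data contain both time and location information, which lets us study users' mobility behavior further. We also observe that the connections between users on social networking sites are often closely tied to their mobility behavior: people who frequently appear together at the same place are more likely to be friends. This thesis therefore aims to infer users' connections on social networking sites by analyzing their mobility behavior. A common approach to the social connection inference problem is to analyze how often two users appear at the same place: the higher the co-occurrence frequency, the higher the probability that they are friends on the social networking site. However, such models risk mistaking coincidental co-occurrences for real ones. To distinguish coincidence from genuine gatherings of friends effectively, this thesis proposes a social connection inference model based on spatio-temporal data: SoC (Social Connection Inference Model). In the first stage, SoC extracts three features that describe co-occurrence behavior: location diversity, location popularity, and temporal correlation. Temporal correlation comprises two measures: stability and duration. Notably, in our experiments, stability and duration are considerably more indicative than the other features for inferring social connections. In the second stage, SoC uses a random forest classifier to combine the features produced in the first stage and train a classification model that infers the social connections between users. To evaluate the proposed method, we run experiments on a real-world dataset, Gowalla. The results show that the proposed social connection inference model achieves better accuracy than existing methods.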
Rich location-based services for mobile users, such as Facebook, Foursquare, and Gowalla, allow users to share their location information on social networks. Each piece of location information carries geographical coordinates, which enable us to investigate users’ mobility relationships in the real world. In this paper, we analyze users’ mobility relationships disclosed on social networks to infer friendship connections in the physical world. A commonly used method for tackling the link prediction problem is to measure how frequently users visit the same place at the same time. That is, people who co-locate more frequently are more likely to be socially connected. However, limiting the influence of coincidence remains a problem. Here, we investigate three features that depict co-occurrence, namely diversity, popularity, and temporal, to differentiate between coincidence and real togetherness. In particular, we propose two temporal features, stability and duration, which are unsurpassed in terms of AUC in our experiments. Based on these co-occurrence features, we adopt a random forest classification model to predict social connections between user pairs. Extensive experiments on a real-world dataset, Gowalla, show that our method can outperform state-of-the-art methods.
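The first stage described in the abstract, extracting co-occurrence features for each user pair, can be sketched roughly as follows. This is a minimal illustration and not the thesis's implementation: the abstract does not give formulas, so the check-in tuple layout, the one-hour `window`, and the concrete definitions used here (entropy for diversity, mean visitor count for popularity, time span for duration, standard deviation of gaps for stability) are all assumptions chosen for the example.

```python
from collections import defaultdict
from itertools import combinations
from math import log


def co_occurrences(checkins, window=3600):
    """Group check-ins (user, location, unix_time) into co-occurrence events.

    Two users co-occur when they check in at the same location within
    `window` seconds (an assumed threshold, not taken from the thesis).
    Returns {(user_a, user_b): [(location, time), ...]}.
    """
    by_loc = defaultdict(list)
    for user, loc, t in checkins:
        by_loc[loc].append((t, user))
    events = defaultdict(list)
    for loc, visits in by_loc.items():
        visits.sort()
        for (t1, u1), (t2, u2) in combinations(visits, 2):
            if u1 != u2 and t2 - t1 <= window:
                events[tuple(sorted((u1, u2)))].append((loc, t1))
    return events


def pair_features(events, checkins):
    """Compute illustrative per-pair features (all definitions are assumptions)."""
    visitors = defaultdict(set)
    for user, loc, _ in checkins:
        visitors[loc].add(user)
    feats = {}
    for pair, evs in events.items():
        n = len(evs)
        counts = defaultdict(int)
        for loc, _ in evs:
            counts[loc] += 1
        # Diversity: Shannon entropy of the locations where the pair met.
        diversity = -sum((c / n) * log(c / n) for c in counts.values())
        # Popularity: mean number of distinct visitors at those locations.
        popularity = sum(len(visitors[loc]) for loc, _ in evs) / n
        times = sorted(t for _, t in evs)
        # Duration: span between the first and last co-occurrence.
        duration = times[-1] - times[0]
        # Stability: standard deviation of gaps between consecutive meetings.
        gaps = [b - a for a, b in zip(times, times[1:])]
        if gaps:
            mean_gap = sum(gaps) / len(gaps)
            stability = (sum((g - mean_gap) ** 2 for g in gaps) / len(gaps)) ** 0.5
        else:
            stability = 0.0
        feats[pair] = {
            "frequency": n,
            "diversity": diversity,
            "popularity": popularity,
            "duration": duration,
            "stability": stability,
        }
    return feats
```

For the second stage, the per-pair feature dictionaries could be flattened into vectors and passed, together with known friendship labels, to an off-the-shelf random forest classifier such as scikit-learn's `RandomForestClassifier`. The abstract does not say which implementation was used.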
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT070256054
http://hdl.handle.net/11536/127387
Appears in Collections: Thesis