Title: 使用社群偵測演算法於圖片相似度網路之圖片分群
Image Clustering Using Community Detection Algorithm on Image Similarity Network
Authors: 洪大鈞
Hung, Ta-Chun
李素瑛
Lee, Suh-Yin
Computer Science Program, College of Computer Science (資訊學院資訊學程)
Keywords: data mining; visual feature; Affinity Propagation Clustering; Hierarchical Image Clustering
Issue Date: 2015
Abstract: With the spread of social networks and mobile devices, digital images are being produced faster than ever, and people take photographs more often than before. According to statistics from social networking sites such as Facebook, Instagram, and Flickr, hundreds of millions of photos are uploaded every day. These images carry a great deal of implicit information that can characterize their users, and identifying those characteristics enables further applications on social networks, such as friend-matching systems, group recommendation, and advertising and marketing.

This thesis uses image clustering to uncover the implicit information in images. The preprocessing step of image clustering is feature extraction: we extract SIFT (scale-invariant feature transform) and CLD (color layout descriptor) features from each image and compute the similarity between images. The experiments use two clustering methods, APC (affinity propagation clustering) and HIC (hierarchical image clustering); HIC is the method proposed in this thesis. HIC can produce hub nodes, that is, nodes assigned to more than one community at once because they are highly similar to several communities. APC, by contrast, can only assign each node to the single cluster it is most closely related to.

Because hundreds of millions of images are added to social networks every day, a system intended for use there must run quickly, and it must produce a relatively large number of clusters from a large collection of images while keeping accuracy high. In the experiments, HIC outperforms APC in execution time, number of clusters found, and F1 measure. After comparing the experimental results, we find HIC better suited than APC to the social network applications we have in mind.
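The difference between assigning a node to one cluster and letting it act as a hub can be illustrated with a small sketch. The abstract does not describe how HIC actually forms communities, so this is not the thesis's algorithm: the function names, the similarity scores, and the 0.8 threshold are all hypothetical. The example only contrasts a one-cluster-per-node outcome (as attributed to APC) with an overlapping outcome (as attributed to HIC).

```python
# Illustrative only: the abstract does not specify how HIC forms communities.
# The similarity scores and the 0.8 threshold below are hypothetical.


def hard_assign(similarities):
    """Return the single community with the highest similarity.

    This mimics only the one-cluster-per-node outcome attributed to APC,
    not the affinity propagation message-passing procedure itself.
    """
    return max(similarities, key=similarities.get)


def overlapping_assign(similarities, threshold=0.8):
    """Return every community whose similarity reaches the threshold.

    An image returned with more than one community behaves like a hub node.
    """
    return sorted(c for c, s in similarities.items() if s >= threshold)


# Hypothetical similarity of one image to three communities.
image_to_community = {"beach": 0.91, "sunset": 0.88, "city": 0.12}

single = hard_assign(image_to_community)
multiple = overlapping_assign(image_to_community)
is_hub = len(multiple) > 1
```

Under these made-up scores, the hard assignment keeps only the best-matching community, while the overlapping assignment keeps both communities the image strongly resembles, which is the situation the abstract calls a hub node.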
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT079979524
http://hdl.handle.net/11536/125912
Appears in Collections: Thesis