On Accelerating Multi-Layered Heterogeneous Network Representation Learning via Landmark Selection

Full metadata record

DC Field	Value	Language
dc.contributor.author	蔡政銘	zh_TW
dc.contributor.author	帥宏翰	zh_TW
dc.contributor.author	Tsai, Cheng-Ming	en_US
dc.date.accessioned	2018-01-24T07:42:41Z	-
dc.date.available	2018-01-24T07:42:41Z	-
dc.date.issued	2017	en_US
dc.identifier.uri	http://etd.lib.nctu.edu.tw/cdrfb3/record/nctu/#GT070450726	en_US
dc.identifier.uri	http://hdl.handle.net/11536/142800	-
dc.description.abstract	網路表示(graph representation)，旨在將大型資訊網路在低維度的向量空間中表示，已經在同質網路的範疇廣泛地被研究。推算得到資訊網路表示可用於數種應用，像是視覺化整個網路架構、做節點的分類問題、或是用於偵測社群。表示在這些資料分析的工作中扮演著重要的角色。由於許多不同種節點間的複雜關係，儘管異質網路蘊藏了比同質網路更多的潛在特徵，卻鮮少被研究。一種簡單的方式即將異質網路視為同質網路，將其利用現有演算法亦可得到網路表示，然而卻會造成資訊流失與計算速度緩慢的問題。因此，我們首先利用元路徑(metapath)指出異質點間有意義之路徑，以便產生網路表示時，網路上相近的點在另一空間亦為相近。此外，由於網路中節點對於網路表示之計算並非一樣重要，因此我們提出地標選擇，目的在於給節點們優先次序排列。高優先度的節點有更多的機會可以被訓練，以取得更好的表示。我們的地標選擇，將焦點鎖定在每個walk初始節點的分配。我們設計鄰居數指標(degree centrality)—一個根據相連的邊的數量，來排序節點的方法—作為決定地標的標準。我們將表示在多標籤分類法中，Micro-F1和Macro-F1的結果，用來衡量兩個方法的成效。元路徑展示了其優於同質表示方法的一面，而地標選擇則將該成效提升至更進一步的水平。	zh_TW
dc.description.abstract	Network representation, which embeds large information networks into low-dimensional vector spaces, has been widely studied for homogeneous networks. The latent representations derived from an information network can be applied to data analysis tasks such as visualizing the entire network, classifying nodes, and detecting communities, and they play a crucial role in these tasks. Heterogeneous networks contain more hidden features than homogeneous networks, yet they are less studied because of the complex relations among their many node types. One straightforward method is to treat a heterogeneous network as a homogeneous one and obtain its representation using existing algorithms; however, this approach suffers from information loss and computational inefficiency. Hence, we first use metapaths to identify meaningful paths between heterogeneous nodes, so that pairs of nodes that are close in the network are also close in the representation space. Furthermore, because nodes differ in their importance to representation learning, we propose landmark selection, which assigns the nodes a priority order. High-priority nodes are given more chances to be trained and thus obtain better representations. Our landmark selection focuses on the distribution of the starting nodes of each walk. We use degree centrality, which ranks nodes by the number of their incident edges, as the criterion for determining landmarks. The effectiveness of both methods is evaluated through multi-label classification results in terms of Micro-F1 and Macro-F1 scores. Metapaths outperform conventional homogeneous representation methods, and landmark selection improves the results further.	en_US
dc.language.iso	en_US	en_US
dc.subject	異質表示學習	zh_TW
dc.subject	網路表示	zh_TW
dc.subject	特徵學習	zh_TW
dc.subject	維度縮減	zh_TW
dc.subject	元路徑	zh_TW
dc.subject	地標選擇	zh_TW
dc.subject	表徵學習	zh_TW
dc.subject	Heterogeneous representation learning	en_US
dc.subject	Network representation	en_US
dc.subject	Feature learning	en_US
dc.subject	Dimension reduction	en_US
dc.subject	Metapath	en_US
dc.subject	Landmark selection	en_US
dc.subject	Embedding learning	en_US
dc.title	基於地標選擇之多層異質網路表示學習加速	zh_TW
dc.title	On Accelerating Multi-Layered Heterogeneous Network Representation Learning via Landmark Selection	en_US
dc.type	Thesis	en_US
dc.contributor.department	電機工程學系	zh_TW
Appears in Collections:	Thesis
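
Illustrative sketch (not part of the repository record): the abstract describes two ideas, metapath-guided walks and degree-centrality landmark selection applied to the starting nodes of each walk. The Python below is a minimal sketch of how these could fit together, not the thesis implementation. The function names, the toy author/paper graph, and the rule of allocating walk starts in proportion to degree are all assumptions made for illustration; the abstract states only that nodes are ranked by their number of linked edges and that high-priority nodes get more chances to be trained.

```python
import random


def degree_centrality(adj):
    """Return {node: number of incident edges} for an undirected adjacency dict."""
    return {node: len(neighbors) for node, neighbors in adj.items()}


def allocate_walks(adj, total_walks):
    """Assign walk starts to nodes in proportion to degree (assumed rule).

    Nodes are ordered from highest to lowest degree, ties broken by name.
    Because of rounding, the counts may not sum exactly to total_walks.
    """
    degree = degree_centrality(adj)
    total_degree = sum(degree.values())
    ranked = sorted(degree, key=lambda n: (-degree[n], n))
    return {n: round(total_walks * degree[n] / total_degree) for n in ranked}


def metapath_walk(adj, node_type, start, metapath, length, rng):
    """Random walk whose node types follow a cyclic metapath.

    metapath must begin and end with the same type, e.g.
    ["author", "paper", "author"], and start must have that type.
    The walk stops early if no neighbor of the required type exists.
    """
    pattern = metapath[:-1]  # drop the repeated last type to form the cycle
    walk = [start]
    while len(walk) < length:
        wanted = pattern[len(walk) % len(pattern)]
        candidates = [v for v in adj[walk[-1]] if node_type[v] == wanted]
        if not candidates:
            break
        walk.append(rng.choice(candidates))
    return walk


# Hypothetical toy heterogeneous network: three authors and two papers.
adj = {
    "a1": ["p1"],
    "a2": ["p1", "p2"],
    "a3": ["p2"],
    "p1": ["a1", "a2"],
    "p2": ["a2", "a3"],
}
node_type = {n: ("author" if n.startswith("a") else "paper") for n in adj}

allocation = allocate_walks(adj, total_walks=8)
rng = random.Random(0)
walk = metapath_walk(adj, node_type, "a2", ["author", "paper", "author"], 5, rng)
```

In this toy graph the highest-degree author, `a2`, receives twice as many walk starts as `a1` or `a3`, which is one possible reading of giving high-priority nodes more chances to be trained. A top-k cutoff on the ranked list would be an equally plausible alternative.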