標題: 基於MapReduce的鄰近窗格叢集查詢之研究
MapReduce-based NearestWindow Cluster Query Processing
作者: 江炫佑
黃俊龍
Chiang, Hsuan-Yu
Huang, Jiun-Long
資訊科學與工程研究所
關鍵字: 鄰近窗格叢集查詢;網格索引;空間資料查詢;MapReduce;Nearest Window Cluster Query;grid index;spatial query processing;MapReduce
公開日期: 2016
摘要: 隨著近幾年各種地理資訊系統(Geographical Information System GIS)以及地理位置服務(Location-Based Service LBS)的發展,各種空間查詢的相關應用也相繼被提出。但由於攜帶裝置的普及,使用者也大量增加的情況下,隨著時間推移會不斷產生龐大的資料,為了能夠有效率的處理這些資料,我們希望能透過分散式運算來提昇處理資料的量與時間。在本篇論文中,我們針對NearestWindow Cluster Query (NWCQ)作研究。在此種查詢中,使用者可以指定一個查詢點、區域大小以及區域內資料數量的門檻,NWCQ 會回傳指定的資料點集合,且這些點被指定的區域大小所涵蓋。原有的NWCQ演算法使用R-tree作為index,然而因為該種index的各個子節點中可能有overlap的現象,在平行處理上除了效率不彰外,也不容易實作。因此本篇論文採用grid index 的變種作為index ,基於MapReduce 框架來實作分散式版本的NWCQ 演算法,並針對資料分佈不均的情況作調整,在資料比較密集的區域與資料稀疏區域有不同的切割方式,以增進處理的效能。
With the growing development of Geographical Information System (GIS) and Location Based Services (LBS), various applications of spatial query are proposed. However, due to the wide use of mobile devices and explosive growth of users, a huge amount of data is generated with time passing. Distributed computing techniques are used to facilitate efficient query processing on such huge data. In this paper, we focus on the spatial query, called Nearest Window Cluster Query (NWCQ). By given a query point, desired window size and the amount of data objects, NWCQ returns a group of objects within a desired window range. The previous work used R-tree for spatial indexing, but R-tree might have overlapping between index nodes, which is inappropriate for distributed computing. Therefore, based on MapReduce framework, we propose a grid-based indexing algorithm to index data objects and a companion query processing algorithm for NWCQ.
URI: http://etd.lib.nctu.edu.tw/cdrfb3/record/nctu/#GT070356007
http://hdl.handle.net/11536/139900
顯示於類別:畢業論文