Title: Garbage Collection and Memory Management Optimization in R Language for IoT Gateway System (閘道器系統之R語言中記憶體回收與管理最佳化)
Authors: Liou, De-Yin (劉得印)
Chen, Tien-Fu (陳添福)
Institute of Computer Science and Engineering
Keywords: Internet of Things (IoT); IoT Gateway System; R language; Garbage collection; Parallelization; Memory management
Issue Date: 2016
Abstract: The Internet of Things (IoT) has become one of the most popular topics in recent years, driven by the rapid development of embedded devices and the spread of communication networks. As the number and variety of devices grow, the data they produce becomes larger and more complex. Centralizing this data in the cloud, however, causes unpredictable latency because network bandwidth is limited. Moving work from cloud (centralized) computing to edge computing on a local gateway is therefore becoming necessary. R is a tool commonly used for data mining in cloud computing, but it has a serious bottleneck: garbage collection. There are two reasons the R garbage collector becomes a bottleneck. First, it must process a large number of objects, which takes a long time. Second, it suffers a high rate of last-level cache (LLC) misses, which incur a serious miss penalty. These problems become more severe when the computation moves from the cloud to the edge, where hardware resources are scarce. In this thesis, we propose Partially Parallel Garbage Collection to reduce the time spent in garbage collection, and Centralized Memory Management to reduce the miss penalty caused by the high rate of LLC misses. Machine learning algorithms that spend most of their time executing R code rather than external libraries benefit from our optimizations.
URI: http://etd.lib.nctu.edu.tw/cdrfb3/record/nctu/#GT070356077
http://hdl.handle.net/11536/139349
Appears in Collections: Thesis