Title: Toward a Highly Reliable, Scalable, and Energy-Efficient MapReduce Framework
Author: CHEN YING-PING (陳穎平)
Department of Computer Science, National Chiao Tung University
Keywords: MapReduce; cloud computing; single point of failure; reliability; scalability; energy efficiency
Publication date: 2012
Abstract: MapReduce is a parallel programming model proposed by Google. It breaks a data-intensive job into many smaller map and reduce tasks and runs them in parallel on a large cluster of commodity machines, which has made it an important part of cloud computing and led many companies and organizations to adopt it. However, its master/slave architecture leaves the MapReduce server with two problems: a single point of failure and limited scalability. If the node acting as the server fails unexpectedly, the jobs it manages cannot complete. In addition, because the node's CPU and memory resources are limited, it may be unable to process read/write requests promptly when the number of incoming requests rises sharply, which delays job completion. A large-scale MapReduce cluster therefore needs a server that is both highly reliable and scalable. To address these problems, we propose a three-year research project. In the first year, we will design a hybrid redundant mechanism (HRM) that improves server reliability by combining hot-standby and cold-standby techniques: a hot-standby server takes over quickly when the original server fails, shortening downtime so that all jobs can complete, and a cold-standby server with an appropriate warm-up mechanism further raises overall reliability. In the second year, we plan to propose an adaptive hybrid server cluster (AHSC), a MapReduce server cluster whose nodes can play multiple roles, to improve both reliability and scalability. When a node fails or request processing time falls short of expectations, the remaining AHSC nodes adjust their roles so that the server keeps operating and all read/write requests complete without delaying job execution. In the third year, we will incorporate energy consumption into the AHSC design and use the M/M/C/F queueing model to propose an energy-efficient scalability adjustment mechanism (E2SAM), which dynamically adjusts the request processing capability of the MapReduce servers according to the number of incoming requests and minimizes energy consumption by switching cluster nodes between active and standby modes.
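The abstract names an M/M/C/F queueing model as the basis for E2SAM but gives no formulas. The following is a minimal sketch, not the project's actual mechanism. It assumes that F denotes a finite system capacity (so the model behaves like the textbook M/M/c/K queue), that requests arriving at a full system are rejected, and that the adjustment goal is to keep as few server nodes active as possible while holding the rejection probability under a chosen target. The function names `mmck_blocking` and `min_servers` and all parameter names are hypothetical.

```python
from math import factorial


def mmck_blocking(lam: float, mu: float, c: int, k: int) -> float:
    """Probability that an arriving request finds an M/M/c/K system full.

    lam: request arrival rate; mu: service rate of one server;
    c: number of active servers; k: total capacity (in service + waiting).
    Assumption: the abstract's "F" is treated as this finite capacity k.
    """
    if not (1 <= c <= k):
        raise ValueError("need 1 <= c <= k")
    a = lam / mu  # offered load
    weights = []  # unnormalized steady-state probabilities for n = 0..k
    for n in range(k + 1):
        if n <= c:
            weights.append(a**n / factorial(n))
        else:
            weights.append(a**n / (factorial(c) * c ** (n - c)))
    return weights[k] / sum(weights)


def min_servers(lam: float, mu: float, k: int, target: float):
    """Smallest number of active servers whose blocking probability is
    at most `target`; returns None if even k servers are not enough."""
    for c in range(1, k + 1):
        if mmck_blocking(lam, mu, c, k) <= target:
            return c
    return None
```

Under these assumptions, a controller could call `min_servers` again whenever the measured arrival rate changes and move any surplus nodes to standby mode. How E2SAM actually weighs energy against scalability is not stated in the abstract.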
Official document #: NSC101-2628-E009-024-MY3
URI: http://hdl.handle.net/11536/98785
https://www.grb.gov.tw/search/planDetail?id=2638680&docId=397202
Appears in collections: Research Plans