Title: | A Fast and Load-aware Controller Failover Mechanism for Software-Defined Networks (軟體定義網路下快速及具負載感知之控制器故障恢復機制) |
Authors: | Fang, Ko-Chih (方科植); Wang, Kuo-Chen (王國禎); Institute of Network Engineering |
Keywords: | Failover; failure detection; failure recovery; genetic algorithm; multiple controllers; SDN |
Issue Date: | 2015 |
Abstract: | Software-defined networking (SDN) is a network architecture that separates the control plane from the data plane. An SDN managed by a single controller suffers from a single point of failure (SPOF), so multi-controller architectures have been adopted. Several failover mechanisms for multiple controllers, covering both failure detection and failure recovery, have been proposed, but they have shortcomings. First, mainstream SDN controllers such as Opendaylight handle multi-controller failover with Akka, a server management mechanism borrowed from distributed file systems. Akka lets a single controller judge whether another controller has failed, so it declares failures too readily and has a high false positive rate. Second, during switch reassignment, existing failure recovery mechanisms cannot minimize switch-controller delay and balance controller load at the same time. In this thesis, we propose a Fast and Load-aware Controller Failover (FLCF) mechanism for SDNs to resolve these problems.
In failure detection, FLCF has multiple controllers monitor each controller; one detecting controller collects the failure notifications sent by the other controllers about a suspected controller and makes the final decision. In failure recovery, each controller pre-computes its recovery plan (a switch reassignment plan) and synchronizes it with the other controllers, so that when the controller is determined to have failed, the others immediately know which switches to take over. FLCF uses a genetic algorithm to derive the best switch reassignment for a failed controller. Simulation results show that FLCF outperforms related work in failover time and in the standard deviation of controller load. Under the same false positive rate in failure detection, FLCF reduces failover time by 15.7% compared with FCF-M. In failure recovery, FLCF achieves better controller load balancing than all related work except Survivor; however, Survivor yields a longer switch-controller delay than FLCF after switch reassignment. In summary, FLCF has a shorter failover time than FCF-M and, after switch reassignment, achieves lower average switch-controller delay and better controller load balancing than the related work. |
URI: | http://140.113.39.130/cdrfb3/record/nctu/#GT070256543 http://hdl.handle.net/11536/127536 |
Appears in Collections: | Thesis |
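
Illustrative sketch (not taken from the thesis): the abstract states that FLCF pre-computes a switch reassignment plan with a genetic algorithm that trades off switch-controller delay against controller load balance, but this record does not give the encoding, fitness function, or parameters. The Python sketch below shows one way such a search could look. Everything in it is an assumption made for illustration: the names `fitness` and `reassign`, the weight `alpha`, the one-point crossover, the population settings, and the sample delay and load figures.

```python
import random
import statistics

def fitness(plan, delay, base_load, switch_load, alpha=0.5):
    """Lower is better. Assumed objective: weighted sum of mean
    switch-controller delay and std. deviation of controller load."""
    loads = list(base_load)
    for s, c in enumerate(plan):
        loads[c] += switch_load[s]          # load added by switch s
    mean_delay = sum(delay[s][c] for s, c in enumerate(plan)) / len(plan)
    return alpha * mean_delay + (1 - alpha) * statistics.pstdev(loads)

def reassign(delay, base_load, switch_load, generations=200, pop_size=30,
             mutation_rate=0.1, alpha=0.5, seed=0):
    """Search for a plan: plan[s] = surviving controller that adopts switch s."""
    rng = random.Random(seed)
    n_sw, n_ctl = len(delay), len(base_load)

    def score(p):
        return fitness(p, delay, base_load, switch_load, alpha)

    # Seed the population with the nearest-controller plan plus random plans.
    nearest = [min(range(n_ctl), key=lambda c: delay[s][c]) for s in range(n_sw)]
    pop = [nearest] + [[rng.randrange(n_ctl) for _ in range(n_sw)]
                       for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=score)
        elite = pop[: pop_size // 2]        # elitism: keep the better half
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n_sw) if n_sw > 1 else 0
            child = a[:cut] + b[cut:]       # one-point crossover
            child = [rng.randrange(n_ctl) if rng.random() < mutation_rate else g
                     for g in child]        # per-gene random mutation
            children.append(child)
        pop = elite + children
    return min(pop, key=score)

# Hypothetical data: 6 orphaned switches, 3 surviving controllers.
delay = [[1, 4, 6], [2, 3, 5], [5, 1, 4], [6, 2, 3], [4, 5, 1], [3, 6, 2]]
base_load = [10, 40, 20]                    # load each survivor already carries
switch_load = [5, 5, 10, 10, 5, 5]          # load each orphaned switch adds
plan = reassign(delay, base_load, switch_load)
print(plan, fitness(plan, delay, base_load, switch_load))
```

Because the nearest-controller plan is placed in the initial population and the better half survives every generation, the returned plan never scores worse than nearest-controller assignment under this assumed objective. In practice the delay and load terms would need to be normalized to comparable scales before weighting.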