分散式計算系統中程式可靠度的分析與研究

標題:	分散式計算系統中程式可靠度的分析與研究 Program Reliability Analysis in Distributed Computing Systems
作者:	林敏勝 Min-Sheng Lin 陳登吉 Deng-Jyi Chen　資訊科學與工程研究所
關鍵字:	可靠度;分散式計算系統;分散式程式可靠度;reliability;istributed computing system; distributed program reliability
公開日期:	1993
摘要:	對於分散式計算系統的可靠度，其決定於通訊網路線的可靠度和節點的可靠度，並且和其資源（例如：程式，檔案）的分散情形也有關係。這一篇論文提出了一個演算法，此演算法是用來計算假設節點是完全可靠之下的分散式計算系統之可靠度。此演算法稱為FREA (Fast Reliability Evaluation Algorithm)，是以factoring定理為基礎，並使用了許多可靠度的化簡技巧。我們同時提出了兩個演算法用來計算假設節點不是完全可靠之下的分散式計算系統之可靠度。第一個演算法稱為SM(Symbolic Method)，是以符號為主的兩次處理計算。第二個演算法稱為FM( Factoring Method)，同時對通訊網路線和節點做factoring的動作。和其他的演算法作比較，由其結果可知我們所提的演算法對於計算大型分散式計算系統的可靠度較有效率。為了縮小此問題的大小，我們提出了對於分散式計算系統一般化的化簡方法。這些化簡方法可以很有效的縮小分散式計算系統的大小，如此也就加快了計算可靠度的時間。這篇論文也證明了此計算可靠度的問題是一個NP-hard問題，即使對於序列─平行網路，2─ 樹網路，樹狀網路，星狀網路也是NP-hard問題。我們同時也提出了兩個演算法，用來分別計算線性和環狀網路的可靠度，並且可以在polynomial 的時間之內得到答案。 The reliability of a distributed computing system depends on the reliability of its communication links and nodes, as well as on the distribution of its resources, such as programs and data files. This thesis presents an algorithm for computing the reliability of a distributed computing system with perfect nodes. The algorithm, called FREA (Fast Reliability Evaluation Algorithm), is based on the factoring theorem employing several reliability-preserving reductions techniques. We also propose two algorithms for computing the reliability of a distributed computing system with imperfect nodes. Algorithm one, called SM (Symbolic Method), is based on a symbolic approach that consists two passes of computation. Algorithm two, called FM (Factoring Method), employs a general factoring technique on both nodes and edges. Comparisons with existing methods show the effectiveness of the proposed algorithms for large distributed computing systems. In order to reduce the size of the problem, we will propose several general reduction methods for computing the reliability of distributed computing systems. These reduction methods can dramatically reduce the size of a distributed computing system, and therefore speed up the reliability computation. This thesis also shows that solving this reliability problem is NP-hard even when a distributed computing system is restricted to a series-parallel, a 2-tree, a tree, or star structure. Two polynomial-time algorithms are proposed for computing the reliability of a distributed program which runs on a linear and a ring distributed computing systems, respectively.
URI:	http://140.113.39.130/cdrfb3/record/nctu/#NT820392083 http://hdl.handle.net/11536/57894
顯示於類別：	畢業論文