Title: 分散式計算系統中程式可靠度的分析與研究
Program Reliability Analysis in Distributed Computing Systems
Authors: 林敏勝
Min-Sheng Lin
陳登吉
Deng-Jyi Chen 
資訊科學與工程研究所
Keywords: 可靠度;分散式計算系統;分散式程式可靠度;reliability;istributed computing system; distributed program reliability
Issue Date: 1993
Abstract: 對於分散式計算系統的可靠度,其決定於通訊網路線的可靠度和節點的可
靠度,並且和其資源(例如:程式,檔案)的分散情形也有關係。這一篇
論文提出了一個演算法,此演算法是用來計算假設節點是完全可靠之下的
分散式計算系統之可靠度。此演算法稱為FREA (Fast Reliability
Evaluation Algorithm),是以factoring定理為基礎,並使用了許多可靠
度的化簡技巧。我們同時提出了兩個演算法用來計算假設節點不是完全可
靠之下的分散式計算系統之可靠度。第一個演算法稱為SM(Symbolic
Method),是以符號為主的兩次處理計算。第二個演算法稱為FM(
Factoring Method),同時對通訊網路線和節點做factoring的動作。和其
他的演算法作比較,由其結果可知我們所提的演算法對於計算大型分散式
計算系統的可靠度較有效率。為了縮小此問題的大小,我們提出了對於分
散式計算系統一般化的化簡方法。這些化簡方法可以很有效的縮小分散式
計算系統的大小,如此也就加快了計算可靠度的時間。這篇論文也證明了
此計算可靠度的問題是一個NP-hard問題,即使對於序列─平行網路,2─
樹網路,樹狀網路,星狀網路也是NP-hard問題。我們同時也提出了兩個
演算法,用來分別計算線性和環狀網路的可靠度,並且可以在polynomial
的時間之內得到答案。
The reliability of a distributed computing system depends on
the reliability of its communication links and nodes, as well
as on the distribution of its resources, such as programs and
data files. This thesis presents an algorithm for computing the
reliability of a distributed computing system with perfect
nodes. The algorithm, called FREA (Fast Reliability Evaluation
Algorithm), is based on the factoring theorem employing several
reliability-preserving reductions techniques. We also propose
two algorithms for computing the reliability of a distributed
computing system with imperfect nodes. Algorithm one, called SM
(Symbolic Method), is based on a symbolic approach that
consists two passes of computation. Algorithm two, called FM
(Factoring Method), employs a general factoring technique on
both nodes and edges. Comparisons with existing methods show
the effectiveness of the proposed algorithms for large
distributed computing systems. In order to reduce the size of
the problem, we will propose several general reduction methods
for computing the reliability of distributed computing systems.
These reduction methods can dramatically reduce the size of a
distributed computing system, and therefore speed up the
reliability computation. This thesis also shows that solving
this reliability problem is NP-hard even when a distributed
computing system is restricted to a series-parallel, a 2-tree,
a tree, or star structure. Two polynomial-time algorithms are
proposed for computing the reliability of a distributed program
which runs on a linear and a ring distributed computing
systems, respectively.
URI: http://140.113.39.130/cdrfb3/record/nctu/#NT820392083
http://hdl.handle.net/11536/57894
Appears in Collections:Thesis