Title: An Evaluation Environment for Distributed File Systems (分散式檔案系統評估環境)
Authors: 許凱平
Xu, Kai-Ping
張明峰
Zhang, Ming-Feng
Institute of Computer Science and Engineering (資訊科學與工程研究所)
Keywords: distributed file systems; evaluation; trace-driven simulation; information; computer; electronic engineering; information science
Issue Date: 1994
Abstract: Distributed file systems (DFS), which let the machines in a distributed system share storage resources, are a key component of distributed systems. Many DFS designs exist, such as Sun's Network File System, the Andrew File System, Sprite, and Locus. However, there is no common performance test-bed that uses a real workload to compare different designs. Our goal is to build a trace-driven evaluation environment that can be used both to analyze the performance of distributed file systems and as a workbench for DFS design. The environment consists of three parts: a file access trace collector, a file access behavior analyzer, and a trace-driven DFS simulator. The collector runs on a system in actual use and records the file-related system calls issued by user processes. The analysis covers user processes, accesses to individual files, and file opens and file lifetimes. The simulator takes the collected trace and a file system configuration as input and evaluates the DFS being simulated. We have completed the implementation of all three parts. DFS designers can use the environment to verify a design before implementing it, and can also use it to study file access behavior. We have also written a simple DFS and used it to analyze the effects of different open/read policies, cache size, and cache block size. Earlier studies recorded only part of the file accesses; our simulations use complete file access records collected on client machines, so the results should be more accurate.
Our simulation results show that the read-ahead and open-and-read policies do reduce the response time of READ operations. The tradeoff is that they may fetch data blocks that are never needed, which increases the load on the file servers and the network. When the block size is small (2 KB), the server and network loads differ little between the read/open policies. With a large block size (64 KB), however, the differences become significant. The server load under read-ahead is about 20% higher than under plain read. Overall, the policy that combines an 8 KB block size, open-and-read, and read-ahead gives the best results in our simulations: it reduces the READ response time effectively while adding little load to the network and the servers.
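The tradeoff described in the abstract can be illustrated with a minimal trace-driven sketch. This is not the thesis's simulator: the trace format `(file_id, offset, length)`, the `file_sizes` map, the LRU client cache, and the one-block look-ahead rule are all assumptions made for illustration. The function replays READ records and counts demand misses (reads that must wait for the server) and total blocks transferred (a rough proxy for server and network load).

```python
from collections import OrderedDict


def simulate(trace, file_sizes, block_size, cache_blocks, read_ahead=False):
    """Replay READ records against an LRU block cache (illustrative only).

    trace: iterable of (file_id, offset, length) records (hypothetical format).
    file_sizes: dict mapping file_id to file size in bytes (hypothetical).
    Returns (demand_misses, blocks_transferred).
    """
    cache = OrderedDict()  # keys are (file_id, block_no), oldest first
    demand_misses = 0
    blocks_transferred = 0

    def touch(key):
        # True on a cache hit; on a miss, load the block and evict the
        # least recently used entry if the cache is over capacity.
        if key in cache:
            cache.move_to_end(key)
            return True
        cache[key] = None
        if len(cache) > cache_blocks:
            cache.popitem(last=False)
        return False

    for file_id, offset, length in trace:
        if length <= 0:
            continue
        first = offset // block_size
        last = (offset + length - 1) // block_size
        for block_no in range(first, last + 1):
            if not touch((file_id, block_no)):
                demand_misses += 1
                blocks_transferred += 1
            # Assumed policy: prefetch the next block if it lies inside the file.
            nxt = block_no + 1
            if read_ahead and nxt * block_size < file_sizes[file_id]:
                if not touch((file_id, nxt)):
                    blocks_transferred += 1
    return demand_misses, blocks_transferred


# Sequential read of a 16 KB file in 2 KB requests.
sizes = {"f": 16384}
sequential = [("f", off, 2048) for off in range(0, 16384, 2048)]
seq_plain = simulate(sequential, sizes, 2048, 8)
seq_ahead = simulate(sequential, sizes, 2048, 8, read_ahead=True)

# Two scattered reads: read-ahead fetches blocks that are never used.
scattered = [("f", 0, 2048), ("f", 8192, 2048)]
sc_plain = simulate(scattered, sizes, 2048, 8)
sc_ahead = simulate(scattered, sizes, 2048, 8, read_ahead=True)
```

On the sequential trace, read-ahead removes all but the first demand miss without transferring extra blocks; on the scattered trace it doubles the blocks transferred without saving any misses. A real evaluation would replay a collected trace and model network and server timing, which this sketch omits.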
URI: http://140.113.39.130/cdrfb3/record/nctu/#NT834393001
http://hdl.handle.net/11536/59900
Appears in Collections: Thesis