標題: 多種類流量下組合式輸入輸出佇列交換器之排程演算法
Parallel Matching Algorithms for CIOQ Switches with Multiple Traffic Classes
作者: 郭英哲
Ying-Che Kuo
李程輝
Prof. Tsern-Huei Lee
電信工程研究所
關鍵字: 組合式輸入輸出佇列交換器;最小緩衝優先及最緊急優先演算法;可變長度封包;輸出啟始並行配對演算法;權重式公平分配;配對演算法;combined input and output queued switch;least cushion first/most urgent first scheduling algorithm;variable-length packet;output initiated parallel matching algorithm;weighted fairness;matching algorithm
公開日期: 2004
摘要: 組合式輸入輸出佇列交換器(combined input and output queued (CIOQ) switch)在交換架構(switching fabric)加速(speed-up)兩倍工作下,已被證明可達到輸出佇列交換器(output queued (OQ) switch)相同的效能。使用CIOQ交換器主要優點在於提供服務品質(quality of service, QoS)的同時,可降低對記憶體讀寫頻寬需求。而CIOQ交換器能否正確仿傚OQ交換器決定在於有效率的排程演算法(scheduling algorithm),一種名為最小緩衝優先及最緊急優先(least cushion first/most urgent first, LCF/MUF)演算法被應用於兩倍加速下的CIOQ交換器,已被證明可完全仿傚OQ交換器,並適用於任何服務規範(service discipline)。但是使用LCF/MUF演算法時,複雜的緩衝值(cushion)計算,以及在輸出埠需處理封包重新排列(reordering),使得此方法難以應用於高速交換器。本論文第一部分提出一種近似性LCF/MUF排程演算法,同時估算它的權重式循環(weighted round robin, WRR)服務效能。為了易於實做達成,此近似性LCF/MUF演算法計算近似的緩衝值,亦不採行封包重新排列機制,換得的是,此CIOQ交換器失去正確仿傚OQ交換器的特性。但經由模擬結果顯示,在均勻性(uniform)、非均勻性(non-uniform)、關聯性(correlated)三種流量模型(traffic model)下,即使流量負載達到0.9,採用我們所提方法的CIOQ交換器效能可趨近於OQ交換器的效能。此外,還具有維持原有封包傳送次序的特性。另一方面,輸入輸出埠的緩衝儲存空間大小也是能否正確仿傚OQ交換器的條件之一,本論文也展現輸入輸出埠安置有限緩衝儲存空間下的效能表現。 由於大部分先前所被提出的排程演算法都應用於固定長度(fixed-length)封包的交換,因此對於可變長度(variable-length)的封包處理,須要切割(segmentation)及重組(reassembly)的步驟,同時會有不同封包片段相互插斷(interleaving)的問題。本論文第二部分提出一種封包基礎LCF/MUF (PB-LCF/MUF)演算法,此方法以整個可變長度的封包為交換單位,可避免不同封包片段相互插斷的問題發生,並省略封包切割及再重組的機構。模擬結果顯示,將PB-LCF/MUF演算法應用於5倍加速下的CIOQ交換器效能可趨近於使用權重式循環服務的OQ交換器效能。 為了達成QoS保證,交換器需要擁有數量龐大的佇列(queue),但是在考慮硬體實做成本時,佇列的數量將是取決因素之一。本論文最後部分提出一種的新穎的配對演算法,稱為輸出啟始並行配對(Output Initiated Parallel Matching, OIPM)演算法。在我們建議的交換器架構裡,封包被儲存在輸入埠緩衝記憶體;佇列則安置在輸出埠。在K種類流量環境下,輸出埠僅需安置K個佇列,遠少於輸入佇列交換器(input queued (IQ) switch)需要安置N×K個佇列的數量。除了佇列數量減少外,我們的提議具有保證不會發生挨餓(starvation)現象、維持封包傳送次序、依流量種類作權重式頻寬公平分配等特性。
Packet switching is the core technology of the Internet. Internet link speeds are increasing to beyond tens of gigabits per second to satisfy a continuing growth in traffic. This increase in link speed and the need for quality of service (QoS) for diverse applications such as voice, audio, and video drive research into new switch architectures. It has recently been shown that a combined input and output queued (CIOQ) switch with a speed-up factor of 2 can exactly emulate an output-queued (OQ) switch. The main benefit of using a CIOQ switch is to reduce memory bandwidth requirement while providing QoS guarantees since memory bandwidth sets a limit in building a large-capacity switch. The key component for exact emulation is a matching algorithm for bipartite graphs. For example, a CIOQ switch with a speedup factor of 2, which adopts the least cushion first/most urgent first (LCF/MUF) matching algorithm, can exactly emulate an OQ switch with any arbitrary service discipline. However, the complexities of cushion calculation and packet reordering required at output ports make the algorithm very difficult to be realized in a high-speed switch. In the first part of this thesis, we propose an approximate LCF/MUF algorithm and evaluate its performance for the weighted round robin (WRR) service discipline. For ease of implementation, the proposed algorithm calculates approximate cushions and does not perform reordering at the output ports. The trade-off is that it loses the property of exact emulation. It was found, via computer simulations, that the performance of a CIOQ switch with the proposed single-iteration matching algorithm is close to that of an OQ switch under uniform, non-uniform, and correlated input traffic models for offered load up to 0.9. In addition, the packet departure order can be maintained under the single-iteration algorithm. On the other hand, to exact emulate an OQ switch, the buffer size at each input and output port is assumed to be of infinite size. This assumption is obviously unrealistic in practice. The performance of the LCF/MUF algorithm with finite input and output buffers is also presented. In current Internet environment, packets are transported with different lengths. Most of previous scheduling algorithms found in open literature were designed for fixed-length packets (or cells) scheduling. Hence, segmenting packets of variable-lengths into cells and re-assembling the cells into original packets are needed before and after packets transmissions across the switching fabric because cells belonging to various packets may interleave with one another. In the second part of this thesis, we present the packet-based LCF/MUF (PB-LCF/MUF) algorithm for CIOQ switches and evaluate its performance via computer simulations. The PB-LCF/MUF algorithm delivers each packet as a whole and thus no cell interleaving of packets happened at output ports. Therefore, no re-assembly buffer is needed. Numerical results show that the performance of a CIOQ switch with a speedup factor of 5 which adopts the single-iteration PB-LCF/MUF algorithm is close to that of an OQ switch under the WRR service discipline. In order to provide QoS guarantee, a switch may have to maintain a large number of queues. However, when considering the cost of hardware implementation, the number of queues is a decisive factor. In the last part of this thesis, we propose a novel output initiated parallel matching (OIPM) algorithm for large-scale input-buffered switches. In our proposed architecture, packets are buffered at input ports while queues are maintained at output ports. The number of queues maintained at each output port is K queues for an N×N switch with K priority classes of traffic which is much smaller than N×K virtual output queues (VOQ) maintained in an input-queued switch. In addition to reducing the number of queues, our proposal guarantees starvation free, in-order delivery, and achieves weighted fairness among traffic classes.
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT008513818
http://hdl.handle.net/11536/60112
顯示於類別:畢業論文


文件中的檔案:

  1. 381801.pdf
  2. 381802.pdf
  3. 381803.pdf
  4. 381804.pdf
  5. 381805.pdf

若為 zip 檔案,請下載檔案解壓縮後,用瀏覽器開啟資料夾中的 index.html 瀏覽全文。