Title: Waterfall Execution Processor Model: Achieving Out-of-Order Execution Performance with In-Order Control Circuit Complexity
Author: CHUNG CHUNG-PING (鍾崇斌)
Department of Computer Science, National Chiao Tung University
Keywords: Processor Execution Model; Control Circuit Complexity; Instruction-Level Parallelism; Clock Frequency
Issue date: 2010
Abstract: Modern high-performance processors use an out-of-order execution model to achieve high instruction-level parallelism (ILP), but the complex control circuitry this requires limits clock speed. The major performance hurdles are a large instruction issue window, data-dependency checking circuitry whose complexity grows quadratically, and floor-planning difficulties that worsen as wire delay increases. We propose a waterfall execution model that achieves performance equivalent to out-of-order execution while using in-order control to reduce circuit complexity.
The waterfall execution model concatenates a series of single-layer, data-driven execution stages, called waterfall stages; the number of stages is arbitrary. A basic block is sent down the stages one stage at a time. Instructions that are ready execute in the current stage, and instructions that cannot yet execute there are passed on to the next stage, where results produced by earlier stages may have made them executable. Early stages are accelerated because they skip instructions that must wait, and later stages are accelerated because earlier stages have already executed some of the instructions. With this simple rule of always advancing one stage, the instruction flow is never stalled by execution time. Ideally, with a sufficient number of stages, a basic block executes in a number of cycles equal to the depth of its critical path.
Advancing semiconductor technology offers abundant circuit capacity, which encourages architectures that use more transistors, while wire delay at the same time limits circuit complexity. The appeal of the waterfall model lies in its highly regular circuit structure and its greatly simplified flow control. This three-year project will study the waterfall model and its design issues, including a formal description of the model, parameters and design considerations, implementation issues, performance calculation and evaluation, the software environment, and possible extensions. We will also develop an open-source simulator and tool chain to serve as an evaluation and development platform and to promote follow-up research in academia and industry. If proven feasible, this model would be the simplest design that delivers equivalent performance. Its modular design concept applies to both high-performance and embedded processors, and we expect it to be the best processor model for the many-core era.
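The stage-by-stage flow described in the abstract can be illustrated with a minimal Python sketch. It is not the project's simulator: the function names `waterfall_schedule` and `critical_path_depth`, the sample basic block, and the simplifications (every instruction takes one cycle, each stage has enough execution units for all ready instructions, and dependencies exist only inside the block) are assumptions made for illustration.

```python
def waterfall_schedule(deps, num_stages):
    """Send a basic block down `num_stages` waterfall stages.

    `deps` maps each instruction to the in-block instructions it depends on,
    listed in program order (hypothetical representation).
    Returns (instructions executed per stage, instructions left unexecuted).
    """
    done = set()          # results produced by earlier stages
    pending = list(deps)  # instructions still flowing down, in program order
    stages = []
    for _ in range(num_stages):
        # An instruction executes in this stage only if all of its producers
        # finished in an earlier stage; otherwise it moves on to the next one.
        ready = [i for i in pending if all(p in done for p in deps[i])]
        stages.append(ready)
        done.update(ready)
        pending = [i for i in pending if i not in done]
    return stages, pending


def critical_path_depth(deps):
    """Length of the longest dependency chain (producers precede consumers)."""
    depth = {}
    for instr in deps:
        depth[instr] = 1 + max((depth[p] for p in deps[instr]), default=0)
    return max(depth.values(), default=0)


# Hypothetical six-instruction basic block.
block = {
    "i1": [],            # r1 = load a
    "i2": [],            # r2 = load b
    "i3": ["i1", "i2"],  # r3 = r1 + r2
    "i4": ["i3"],        # r4 = r3 * 2
    "i5": [],            # r5 = load c
    "i6": ["i5", "i4"],  # r6 = r5 + r4
}

stages, leftover = waterfall_schedule(block, num_stages=4)
print(stages)                      # instructions executed in each stage
print(leftover)                    # instructions that ran out of stages
print(critical_path_depth(block))  # longest dependency chain
```

Under these assumptions the sample block drains in four stages, which equals its critical path depth (i1, i3, i4, i6). With only two stages, `i4` and `i6` remain unexecuted. The abstract does not say how a real design handles that case, so the sketch simply reports them.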
Official document #: NSC97-2221-E009-058-MY3
URI: http://hdl.handle.net/11536/100447
https://www.grb.gov.tw/search/planDetail?id=1985821&docId=324194
Appears in Collections: Research Plans