Title: | 智慧型自主指令記憶體設計 Intelligent Autonomous Instruction Memory Design |
Authors: | 王立銘 Li-Ming Wang 鍾崇斌 Chung-Ping Chung College of Computer Science, Computer Science Program |
Keywords: | dynamic branch predictor; instruction address bus; instruction memory; branch target buffer; return stack |
Issue Date: | 2005 |
Abstract: | The main concept of the Intelligent Autonomous Instruction Memory (iAIM) is to give the top-level instruction memory a “program flow tracing” capability by incorporating the dynamic branch predictor into it. With the help of the dynamic branch predictor, the instruction memory can, most of the time, determine where to fetch the next instruction without the CPU core supplying an instruction address. The purpose is to reduce instruction address traffic between the CPU and the instruction memory to a minimum. Realizing this concept may save more energy on the instruction address bus than many known instruction address bus encoding techniques. When the dynamic branch predictor is moved from the CPU into the instruction memory, additional auxiliary hardware and an efficient control bus communication protocol between the CPU and the instruction memory are essential to maintain program flow correctness and the original operation of the dynamic branch predictor. A simple iAIM design based on this concept is proposed first, followed by an enhanced design that adds a partial instruction decoder capable of decoding branch instructions and calculating their target addresses. Finally, a further enhanced design that adds both a partial instruction decoder and a return stack is proposed. Experimental results show that, compared with the conventional architecture, the three designs reduce instruction address transmissions by 97.71%, 98.49%, and 99.99%, and total bit transitions by 84.99%, 86.54%, and 92.01%, respectively. All three designs greatly outperform the T0 encoding technique, and the third design slightly outperforms the T0 DAT technique with 128 entries. |
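The two metrics reported in the abstract (instruction address transmissions and total bit transitions on the address bus) can be illustrated with a toy model. The sketch below is not the thesis design: it assumes a fixed 4-byte instruction size, replaces the dynamic branch predictor with a hypothetical "last observed target" table, and uses made-up function names (`conventional`, `self_tracking`). It only shows why a memory that tracks program flow itself needs far fewer addresses from the CPU.

```python
def bit_transitions(a, b):
    """Number of bus lines that toggle when the bus value changes from a to b."""
    return bin(a ^ b).count("1")


def conventional(trace):
    """CPU drives every fetch address onto the bus."""
    sent, flips, bus = 0, 0, 0
    for addr in trace:
        flips += bit_transitions(bus, addr)
        bus = addr
        sent += 1
    return sent, flips


def self_tracking(trace, step=4):
    """Memory predicts the next address itself; the CPU sends an address
    only when that prediction is wrong (assumed, simplified protocol)."""
    sent, flips, bus = 0, 0, 0
    last_target = {}  # hypothetical stand-in for a branch target buffer
    prev = None
    for addr in trace:
        if prev is None:
            predicted = None  # nothing to predict from on the first fetch
        else:
            predicted = last_target.get(prev, prev + step)
        if predicted != addr:
            # Misprediction: CPU transmits the correct address.
            flips += bit_transitions(bus, addr)
            bus = addr
            sent += 1
            if prev is not None:
                if addr != prev + step:
                    last_target[prev] = addr    # remember the taken target
                else:
                    last_target.pop(prev, None)  # fall-through: forget it
        prev = addr
    return sent, flips


# A four-instruction loop executed ten times, then a fall-through exit.
trace = [0x100, 0x104, 0x108, 0x10C] * 10 + [0x110]
conv_sent, conv_flips = conventional(trace)
iaim_sent, iaim_flips = self_tracking(trace)
print(conv_sent, conv_flips, iaim_sent, iaim_flips)
```

For this toy trace the conventional model transmits all 41 addresses, while the self-tracking model transmits only 3: the initial address, the first loop-back, and the loop exit. The percentages quoted in the abstract come from the thesis experiments, not from this model.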
URI: | http://140.113.39.130/cdrfb3/record/nctu/#GT009367588 http://hdl.handle.net/11536/80112 |
Appears in Collections: | Thesis |