Title: | Reducing Instruction Fetching Traffic Using Loop Buffer in Autonomous Instruction Memory Design |
Authors: | 許裕昇 Shu, Yu-Sheng; 鍾崇斌 Chung, Chung-Ping; IC Design Industry Program, College of Electrical Engineering (電機學院IC設計產業專班) |
Keywords: | low power; instruction cache; instruction fetching; loop buffer; multiplexed bus; top-level instruction memory; autonomous instruction memory |
Issue Date: | 2008 |
Abstract: | An Autonomous Instruction Memory (AIM) combines a dynamic branch predictor, in the form of a Branch Target Buffer (BTB), with the Top-Level Instruction Memory (TLIM) so that the memory can generate instruction addresses on its own. The CPU core therefore does not need to send instruction addresses to the AIM, which reduces traffic on the instruction address bus. Although address-bus traffic is greatly reduced under this architecture, the instruction content bus between the CPU core and the AIM still transfers data every clock cycle. The problem addressed here is how to reduce the traffic on the instruction content bus between the CPU core and the AIM.
To solve this problem, we add a buffer inside the CPU core that stores previously used instructions. The buffer is tagless, in order to avoid excessive additional hardware cost. By fetching instructions from this buffer instead of over the instruction content bus, the CPU core saves power on that bus.
The design comprises the additional auxiliary hardware and a communication protocol among the CPU core, the instruction memory, and the dynamic branch predictor. The protocol must control the instruction content bus efficiently while preserving correct program flow; it determines which instructions are written into the buffer and when they are reused.
Starting from these ideas, the first design uses a line buffer. The second design improves on it by integrating the loop buffer design of 田濱華. The third design uses the instruction content bus to deliver the buffer access position, which lets the buffer inside the CPU core store and reuse nested loops; this raises buffer utilization, so the number of accesses to the instruction content bus continues to fall. In addition, we evaluate the impact on overall system performance of merging the two dedicated buses into a single general-purpose bus. |
URI: | http://140.113.39.130/cdrfb3/record/nctu/#GT009395522 http://hdl.handle.net/11536/80357 |
Appears in Collections: | Thesis |