Title: | 低內容交換失誤率之轉換搜尋緩衝器與其非同步電路實作之探討 TLB with Low Miss Rate in Context Switching and Study of Implementation of Asynchronous Circuit |
Authors: | 鄭緯民 Cheng, Wei-Min; 陳昌居 Chen, Chang-Jiu; Institute of Computer Science and Engineering |
Keywords: | Context Switching; Virtual Memory; Translation Lookaside Buffer; Asynchronous Circuit; Balsa HDL |
Issue Date: | 2008 |
Abstract: | Embedded processors are widely used in embedded systems and handheld devices, so low power consumption, reliability, and robustness have become critical issues for these processors. Asynchronous circuits may be one of the best ways to address these issues, which makes them well suited to implementing such processors.
Embedded processors execute a wide variety of tasks. Recently, many embedded systems and handheld devices have begun to run very complex operating systems, such as embedded Linux or Windows Mobile. Supporting the virtual memory mechanism of a modern operating system requires translation from virtual addresses to physical addresses, and this translation is widely considered a critical factor in memory system performance. To improve address translation performance, almost all contemporary processors include a Translation Lookaside Buffer (TLB). In this work, we propose an alternative TLB architecture with a low context-switch miss rate for asynchronous embedded processors. To distinguish address spaces, we adopt a heuristic TLB banking design in place of a per-entry address space identifier (ASID), and we use a simple prefetching mechanism to reduce some of the possible compulsory misses. Because the architecture is designed for asynchronous embedded processors, all operations are kept very simple.
Finally, we implemented the TLB controller for the proposed architecture in Balsa HDL. Because the communication channels were arranged carefully, the implementation could be verified fairly easily with assumed random input patterns. Although this simple approach is workable for the TLB controller, it is neither feasible nor reasonable for verifying the whole asynchronous embedded processor that we are currently working on, so we also suggest a hardware/software co-design and cross-verification flow for future work. The gate-level netlist was generated with the Balsa tools, and the equivalent gate count of the implementation was estimated. The result shows that an implementation modeled in Balsa HDL is not cheap: the total equivalent gate count is 688,560. We nevertheless explain why we design asynchronous circuits with such a high-level asynchronous HDL. Most importantly, it is needed for larger asynchronous designs in the future. |
URI: | http://140.113.39.130/cdrfb3/record/nctu/#GT008817802 http://hdl.handle.net/11536/60890 |
Appears in Collections: | Thesis |
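The banking idea summarized in the abstract (one TLB bank per address space instead of an ASID tag on every entry) can be illustrated with a small behavioral model. This is only a sketch under stated assumptions, not the thesis's design: the class name `BankedTLB`, the round-robin bank reassignment, the FIFO replacement within a bank, and the next-page prefetch are all hypothetical choices made for illustration, since the abstract does not specify these policies.

```python
from collections import OrderedDict


class BankedTLB:
    """Toy model: address spaces are separated by bank, not by per-entry tags.

    All policies here (round-robin bank reuse, FIFO within a bank,
    next-page prefetch) are assumptions made for illustration.
    """

    def __init__(self, num_banks, entries_per_bank, page_table_lookup):
        self.entries_per_bank = entries_per_bank
        self.banks = [OrderedDict() for _ in range(num_banks)]
        self.owner = [None] * num_banks   # address space that owns each bank
        self.next_victim = 0              # next bank to reassign when all are owned
        self.active = None                # index of the bank currently in use
        self.space = None                 # address space currently running
        self.page_table_lookup = page_table_lookup  # (space, vpn) -> pfn or None
        self.misses = 0

    def context_switch(self, space):
        """Select the bank for `space`; flush a bank only if it must be reassigned."""
        self.space = space
        if space in self.owner:
            # The bank survived earlier switches, so its entries are still valid.
            self.active = self.owner.index(space)
            return
        if None in self.owner:
            bank = self.owner.index(None)
        else:
            bank = self.next_victim
            self.next_victim = (bank + 1) % len(self.banks)
        self.banks[bank].clear()
        self.owner[bank] = space
        self.active = bank

    def _fill(self, vpn, pfn):
        bank = self.banks[self.active]
        if vpn not in bank and len(bank) >= self.entries_per_bank:
            bank.popitem(last=False)      # evict the oldest entry (FIFO)
        bank[vpn] = pfn

    def translate(self, vpn):
        """Return the physical frame for `vpn`, walking the page table on a miss."""
        bank = self.banks[self.active]
        if vpn in bank:
            return bank[vpn]
        self.misses += 1
        pfn = self.page_table_lookup(self.space, vpn)
        if pfn is None:
            raise KeyError(f"no mapping for page {vpn} in address space {self.space}")
        self._fill(vpn, pfn)
        # Assumed prefetch policy: also load the next virtual page if it is mapped.
        next_pfn = self.page_table_lookup(self.space, vpn + 1)
        if next_pfn is not None:
            self._fill(vpn + 1, next_pfn)
        return pfn


# Usage: two address spaces share a two-bank TLB.
page_tables = {"A": {0: 100, 1: 101}, "B": {0: 200}}
tlb = BankedTLB(2, 4, lambda space, vpn: page_tables[space].get(vpn))
tlb.context_switch("A")
tlb.translate(0)          # miss; page 1 is prefetched
tlb.translate(1)          # hit thanks to the prefetch
tlb.context_switch("B")
tlb.translate(0)          # miss in B's own bank
tlb.context_switch("A")
tlb.translate(0)          # hit: A's bank was not flushed by the switch
print(tlb.misses)
```

In this model a context switch costs nothing when the incoming address space still owns a bank, which is the effect the abstract attributes to banking. A real controller would additionally need invalidation when page tables change, which this sketch omits.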