Title: | Generating and Exploiting Reconfigurable Custom Functional Unit in Application Specific VLIW Processors |
Author: | 王惠珊 Wang, Hui-Shan; 單智君 Shann, Jyh-Jiun; Institute of Computer Science and Engineering |
Keywords: | reconfigurable; custom functional unit; instruction scheduling algorithms |
Issue Date: | 2008 |
Abstract: | To improve the performance of application-specific processors, a customized accelerator, the reconfigurable custom functional unit (RCFU), can be attached to a very long instruction word (VLIW) processor architecture. The technique generates the RCFU from frequently occurring operation segments and collapses the operation segments that can execute on the RCFU into customized instructions. At compile time, instruction scheduling then exploits instruction-level parallelism to improve performance. In this research, we propose both a method for generating an RCFU tightly coupled to a VLIW processor and an algorithm for exploiting the processor augmented with the RCFU. We assume that the functional units (FUs) in the processor pipeline and the RCFU can execute simultaneously, and we integrate customized-instruction mapping and instruction scheduling, which previous work treated as two separate steps, into a single phase to obtain greater performance gains and higher hardware utilization. We compare processors with and without an RCFU. Overall, when our exploitation algorithm is used, our RCFU generation method achieves a substantial average speedup over previous generation methods. Furthermore, our algorithm for exploiting an existing RCFU also achieves a clear average speedup over conventional methods that perform customized-instruction identification and instruction scheduling separately. |
URI: | http://140.113.39.130/cdrfb3/record/nctu/#GT079655623 http://hdl.handle.net/11536/43430 |
Appears in Collections: | Thesis |