Title: | A Predicate-Aware Modulo Scheduling for Coarse Grained Reconfigurable Architectures (應用於粗粒可重組式架構之敘詞感知模數排程) |
Authors: | 江俊賓 Jiang, Jun-Bin; 單智君 Shann, Jyh-Jiun; IC Design Industry Program, College of Electrical and Computer Engineering |
Keywords: | Reconfigurable; Modulo scheduling; Coarse grained Reconfigurable Architecture |
Publication date: | 2011 |
Abstract: | Coarse-grained reconfigurable architectures (CGRAs) were introduced to balance efficiency and flexibility: they exploit the instruction-level parallelism of a program without giving up flexibility. Exposing that parallelism is the compiler's job, however, and it is a hard problem. Modulo scheduling has been widely adopted for CGRAs in recent years because, as a software pipelining technique, it finds more operations that can execute in parallel by overlapping the iterations of a loop. Even so, we observe that hardware resource utilization remains limited because conditionally executed operations account for 37.8% of all operations. In this research we propose a predicate-aware modulo scheduling that maps disjoint predicated operations, which never execute at the same time, onto the same processing element (PE) so that they share that hardware resource. This reduces resource requirements and in turn improves performance. We also propose the corresponding architecture so that schedules with resource sharing execute correctly. In addition, we design a mapping heuristic that assigns a weight to each cost value when selecting the hardware resource for an operation, which further improves execution performance. Experimental results on the selected benchmarks show that both resource sharing alone and resource sharing combined with the weighted-cost selection reduce the initiation interval of a loop by 12% to 25.2% compared with the original greedy-based related work, and that the reduction is still 18% when that related work is given more hardware resources than our approach. |
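The resource-sharing idea in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation. It assumes that each operation carries at most one guarding predicate, that two operations are disjoint only when they test the same predicate with opposite polarity, and that the mapping cost is a weighted sum of per-PE cost terms. The names `Op`, `ModuloReservationTable`, `pick_pe`, and the cost terms passed in are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass(frozen=True)
class Op:
    name: str
    # (predicate name, polarity), or None for an unconditional operation.
    pred: Optional[Tuple[str, bool]] = None


def disjoint(a: Op, b: Op) -> bool:
    """Assumed rule: same predicate, opposite polarity, so never both active."""
    return (a.pred is not None and b.pred is not None
            and a.pred[0] == b.pred[0] and a.pred[1] != b.pred[1])


class ModuloReservationTable:
    """One slot per (PE, cycle mod II); a slot may hold mutually disjoint ops."""

    def __init__(self, num_pes: int, ii: int):
        self.num_pes = num_pes
        self.ii = ii
        self.slots = {(pe, t): [] for pe in range(num_pes) for t in range(ii)}

    def can_place(self, op: Op, pe: int, cycle: int) -> bool:
        occupants = self.slots[(pe, cycle % self.ii)]
        # An empty slot is always free; otherwise every occupant must be disjoint.
        return all(disjoint(op, other) for other in occupants)

    def place(self, op: Op, pe: int, cycle: int) -> None:
        if not self.can_place(op, pe, cycle):
            raise ValueError(f"{op.name} conflicts on PE {pe} at cycle {cycle}")
        self.slots[(pe, cycle % self.ii)].append(op)


CostFn = Callable[[Op, int, int], float]


def pick_pe(op: Op, cycle: int, mrt: ModuloReservationTable,
            costs: Dict[str, CostFn], weights: Dict[str, float]) -> Optional[int]:
    """Return the legal PE with the lowest weighted cost, or None if none is legal."""
    candidates = [pe for pe in range(mrt.num_pes) if mrt.can_place(op, pe, cycle)]
    if not candidates:
        return None
    return min(candidates,
               key=lambda pe: sum(weights[k] * costs[k](op, pe, cycle) for k in costs))
```

In this sketch an operation guarded by `p` and another guarded by `not p` can occupy the same PE in the same modulo slot, while an unconditional operation cannot share a slot at all. When no PE is legal for any candidate cycle, a modulo scheduler would typically retry with a larger initiation interval.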
URI: | http://140.113.39.130/cdrfb3/record/nctu/#GT079495533 http://hdl.handle.net/11536/41022 |
Appears in Collections: | Thesis |