Title: Dynamic Optimization and Power Management on Multi-Core Based Systems (多核心系統上動態優化及耗電管理的研究)
Author: Hsu Wei Chung (徐慰中)
Department of Computer Science, National Chiao Tung University
Keywords: Multi-core processor; helper thread; speculative thread; dynamic optimization system; compiler; virtual machines
Issue Date: 2010
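Abstract: Multi-core, multi-threaded processors have become the market mainstream. Besides fast core-to-core communication and convenient synchronization, these processors have another defining characteristic: resource sharing. For example, several cores can share one L3 cache, and many other relatively expensive on-chip resources can also be shared among cores. Resource sharing improves the performance of multi-core processors and opens many new opportunities for optimization. For instance, a helper thread or a speculative thread can bring needed data into the shared cache early, effectively reducing the time spent fetching data. However, these new optimization opportunities, together with the resource contention that sharing causes, tie the hands of traditional compiler optimization. On-chip resource contention is usually dynamic, so the conditions the compiler predicted may be very different at run time. On multi-core, multi-threaded processors, many optimizations should therefore be decided according to actual run-time conditions. Making effective use of the resources on a multi-core, multi-threaded chip is a major challenge for systems research, and it requires cooperation among hardware, the operating system, and the compiler. From the perspective of code optimization, we plan to extend ADORE, the dynamic optimization system we have designed and developed over the past several years, to meet the optimization challenges of multi-core, multi-threaded processors. ADORE was originally designed and developed for Intel Itanium machines. Later, at the request of Sun Microsystems, it was ported to the UltraSparc IV+ on the SPARC platform. The UltraSparc IV+ is an early multi-core processor. ADORE/SPARC can automatically generate helper threads while the program runs; a helper thread can execute on another core and prefetch the data the main thread needs into the shared cache. On both commercial platforms, Itanium and SPARC, ADORE's dynamic optimization shows clear performance improvement. To handle more applications on multi-core, multi-threaded processors (such as OpenMP and MPI programs), we further designed and implemented the COBRA system, which monitors every thread of a running program. Based on overall system resource usage and each thread's execution data, it formulates the most effective performance optimization or power-saving strategy. With the single-core dynamic optimization system ADORE as the foundation and COBRA on multi-core platforms as a complement, we plan to study adaptive monitoring, dynamic execution profiling, and dynamic code optimization on multi-core, multi-threaded processors. Items of particular interest include (A) how to make applications originally written for a single core benefit from a multi-core environment, (B) how to let applications that are already multi-threaded reach their full potential in the new multi-core environment, and (C) how computer architecture, compilers, and virtual machines can support a more effective and more reliable dynamic optimization management system on multi-core, multi-threaded processors.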
Summary: As raising clock rates becomes increasingly difficult due to power and other constraints, many modern high-performance microprocessor chips are moving to Multi-core and/or Chip Multi-Threading (CMT) designs. The common features of these processors are resource sharing and low communication and synchronization latency. While such features create many opportunities for new performance optimizations (such as helper thread and speculative thread generation), they also pose a significant challenge to compilers, because the effectiveness of many such optimizations depends heavily on conditions at runtime. Very often, the conditions manifested at runtime differ greatly from what the static compiler assumed, and the performance of its optimized code suffers accordingly. With optimizations deployed dynamically at runtime, applications can avoid such constraints and adapt better to changing execution conditions. Efficient and effective management of the shared resources in multi-core systems has emerged as one of the key challenges for systems research. We propose to extend ADORE, our existing dynamic compiler framework that has successfully targeted application binaries on single-core processors, to the new Multi-core/CMT processors. ADORE has been implemented on two commercially available platforms: ADORE/Itanium and ADORE/Sparc. Both systems are fully functional and have shown significant performance improvement on real-world applications. ADORE/Sparc can also generate helper threads on the fly to assist the execution of the main thread on SPARC multi-core processors with shared caches (e.g. Panther). In order to handle a larger class of applications, such as parallel programs written with OpenMP, on existing and future Multi-core/CMT processors, we have further designed and implemented the COBRA framework, which adds a layer on top of ADORE to handle applications running on multiple cores.
COBRA continuously collects and accumulates execution profiles from application binaries, and uses off-line analyses and optimizations to improve the adaptability of applications. Because it focuses on parallel and multi-threaded applications, it needs to aggregate and correlate the execution profiles collected from different threads running on different cores to extract the information needed for dynamic optimization. With the integrated COBRA/ADORE framework, we propose to study adaptive monitoring, profiling, and dynamic optimization on the latest and future Multi-core/CMT processors. In particular, we propose to research A) strategies to optimize single-core applications running on Multi-core/CMT processors, B) strategies to adapt multi-core applications running on Multi-core/CMT processors, and C) the architectural and compiler support (such as compile-time annotation) needed for adaptive dynamic optimization systems.
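The profile aggregation step mentioned in the summary can be illustrated with a minimal Python sketch. This is not COBRA's implementation: the sample format (thread id, address of a load instruction, cache-miss count), the function name `aggregate_profiles`, and the miss threshold are all hypothetical, chosen only to show how samples gathered from different threads might be merged to find loads that miss often overall.

```python
from collections import defaultdict

def aggregate_profiles(samples, miss_threshold=100):
    """Merge per-thread samples and return frequently missing loads.

    samples: iterable of (thread_id, load_pc, miss_count) tuples.
    This tuple format is a hypothetical stand-in for real
    hardware-counter samples.
    """
    total_misses = defaultdict(int)  # load_pc -> misses summed over all threads
    threads_seen = defaultdict(set)  # load_pc -> ids of threads that hit this load
    for thread_id, load_pc, miss_count in samples:
        total_misses[load_pc] += miss_count
        threads_seen[load_pc].add(thread_id)
    # Keep loads whose combined misses reach the threshold, worst first.
    hot = [
        (pc, total_misses[pc], len(threads_seen[pc]))
        for pc in total_misses
        if total_misses[pc] >= miss_threshold
    ]
    hot.sort(key=lambda entry: entry[1], reverse=True)
    return hot

# Illustrative, made-up samples from three threads.
samples = [
    (0, 0x4000, 80),
    (1, 0x4000, 70),
    (0, 0x4010, 30),
    (2, 0x4020, 120),
]
hot_loads = aggregate_profiles(samples)
```

In this sketch, a load that stays under the threshold in each individual thread can still be flagged once the threads' counts are combined, which is the kind of cross-thread correlation the summary describes.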
Official Document #: NSC99-2218-E009-008-MY2
URI: http://hdl.handle.net/11536/99872
https://www.grb.gov.tw/search/planDetail?id=2016270&docId=330255
Appears in Collections: Research Plans