Title: | Adaptive Granularity and Coordinated Management for Timely Prefetching in CMPs (具可自我調整及協調功能之多核心記憶體預取系統) |
Authors: | Chang, Chia-Jung (張家融); Chen, Tien-Fu (陳添福); Institute of Computer Science and Engineering |
Keywords: | prefetch; throttle; cache; multicore; CMP; thread correlation |
Issue Date: | 2013 |
Abstract: | In recent years, many studies have examined hardware prefetching as a way to hide memory access latency and improve system performance. Hardware prefetching has costs, however: prefetched data can evict other data from the memory hierarchy and can cause severe performance degradation.
The adaptive, coordinated multicore prefetching system proposed in this thesis has two parts. The first is prefetch management, which collects information on how the L1 cache is affected by prefetching and uses it to throttle the prefetcher. The second is fine-grained prefetching, which collects timing information for each stream in the prefetcher and uses it to build a prefetch policy suited to each stream. Coordinating the two methods yields a prefetcher with better timing and reduces the pollution and contention side effects at every cache level. On the SPLASH-2 and PARSEC benchmarks, the proposed method improves system performance by 7% on average (up to 24%) relative to an unmanaged prefetching system.
Over the last decade, a variety of hardware prefetching techniques have been proposed to improve system performance. However, untimely prefetching may pollute caches and result in significant performance degradation. This thesis proposes Adaptive Granularity Prefetching (AGP), which consists of coarse-grained Prefetch Allocation Management (PAM) and a fine-grained prefetching mechanism, to provide a better caching environment for parallel applications. PAM adjusts the prefetch degree and chooses the prefetch location, aiming to minimize the influence of the prefetcher on each core. Fine-grained prefetching focuses on the streams detected by the L1 prefetcher in each core and uses their timing information to derive an aggressiveness relative to the current one. By combining mechanisms of different granularities, AGP lets the prefetcher issue more timely requests, which reduces cache pollution and contention. Across a variety of SPLASH-2 and PARSEC benchmarks, our approach achieves a performance improvement of 7% (up to 24%) on a 4-core system compared with a static prefetcher configuration. |
URI: | http://140.113.39.130/cdrfb3/record/nctu/#GT070056048 http://hdl.handle.net/11536/73372 |
Appears in Collections: | Thesis |
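The coarse-grained throttling idea in the abstract, where the prefetch degree is adjusted from observed cache side effects, can be illustrated with a minimal sketch. It is not the PAM algorithm from the thesis: the counters, the thresholds, the degree ladder, and the names `IntervalStats` and `DegreeThrottle` are all assumptions made for illustration, and the record does not say which statistics or policies the thesis actually uses.

```python
from dataclasses import dataclass

# Hypothetical ladder of prefetch degrees (cache lines fetched ahead per trigger).
DEGREES = (0, 1, 2, 4)


@dataclass
class IntervalStats:
    """Counters sampled over one interval; all field names are illustrative."""
    prefetches_issued: int
    prefetches_used: int    # prefetched lines later hit by a demand access
    pollution_misses: int   # demand misses on lines evicted by a prefetch
    demand_misses: int


class DegreeThrottle:
    """Feedback throttle for one core; thresholds are assumed, not from the thesis."""

    def __init__(self, level=2, low_accuracy=0.40, high_accuracy=0.75,
                 max_pollution=0.10):
        self.level = level
        self.low_accuracy = low_accuracy
        self.high_accuracy = high_accuracy
        self.max_pollution = max_pollution

    def update(self, stats):
        """Consume one interval of counters and return the next prefetch degree."""
        if stats.prefetches_issued:
            accuracy = stats.prefetches_used / stats.prefetches_issued
        else:
            # Nothing was issued, so there is nothing to judge: allow a re-probe.
            accuracy = 1.0
        if stats.demand_misses:
            pollution = stats.pollution_misses / stats.demand_misses
        else:
            pollution = 0.0

        if pollution > self.max_pollution or accuracy < self.low_accuracy:
            self.level = max(self.level - 1, 0)                 # throttle down
        elif accuracy >= self.high_accuracy:
            self.level = min(self.level + 1, len(DEGREES) - 1)  # throttle up
        # Otherwise hold the current level.
        return DEGREES[self.level]
```

In this sketch each core would own one `DegreeThrottle` and call `update` once per sampling interval. A fine-grained, per-stream variant could keep one such object per detected stream and feed it timeliness counters instead, which is again only an assumption about how the two granularities might be combined.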