Full metadata record
DC Field | Value | Language
dc.contributor.author | 游本永 | en_US
dc.contributor.author | Yu, Pen-Yung | en_US
dc.contributor.author | 游逸平 | en_US
dc.contributor.author | You, Yi-Ping | en_US
dc.date.accessioned | 2014-12-12T02:39:18Z | -
dc.date.available | 2014-12-12T02:39:18Z | -
dc.date.issued | 2013 | en_US
dc.identifier.uri | http://140.113.39.130/cdrfb3/record/nctu/#GT070056013 | en_US
dc.identifier.uri | http://hdl.handle.net/11536/73916 | -
dc.description.abstract | 圖形處理單元具有大量的運算處理器,這些運算處理器是以單指令流多資料流的方式執行,因此圖形處理單元能處理每秒兆個浮點運算,運算量是中央處理器的數十甚至數百倍。通用圖形處理器依靠大量的執行緒來隱藏會花費400~800時序的off-chip記憶體延遲,然而能平行執行的執行緒數量會特別受到執行緒使用的暫存器數量影響,因此在這篇論文中,我們提出了降低暫存器壓力以最佳化線程級並行處理的架構,這個架構的目的就是要降低執行緒使用的暫存器數目,以增加線程級並行處理。在這個架構中包含了兩個降低暫存器使用量的方法,第一個是暫存器的重算,第二個是溢出暫存器至on-chip記憶體。實驗結果顯示這個架構是有效果的,平均減少了5.7%的執行時間,最多能減少27%。 | zh_TW
dc.description.abstract | Graphics processing units (GPUs) are equipped with enormous numbers of arithmetic processors running in a single-instruction, multiple-data fashion, producing a throughput of trillions of floating-point operations per second, which is tens or even hundreds of times higher than the throughput of central processing units. GPUs rely on massive hardware multithreading to hide off-chip memory latencies, which are approximately 400–800 cycles. However, the number of parallel threads running on GPUs is highly restricted by the resource requirements of each thread, especially the register requirement. In this thesis, we propose a thread-level parallelism-aware register-pressure reduction framework to reduce the register usage of threads on GPGPUs, thereby increasing the thread-level parallelism. This framework includes two register-pressure reduction methods: (1) register rematerialization and (2) spilling registers to on-chip memory. The experimental results demonstrate that the proposed framework was effective in improving the performance of OpenCL kernel programs by a maximum of 27% and an average of 5.7%. | en_US
dc.language.iso | en_US | en_US
dc.subject | 編譯器最佳化 | zh_TW
dc.subject | 暫存器配置 | zh_TW
dc.subject | 線程級並行處理 | zh_TW
dc.subject | 圖形處理單元 | zh_TW
dc.subject | OpenCL | zh_TW
dc.subject | CUDA | zh_TW
dc.subject | Compiler optimization | en_US
dc.subject | register allocation | en_US
dc.subject | thread-level parallelism | en_US
dc.subject | GPU | en_US
dc.subject | OpenCL | en_US
dc.subject | CUDA | en_US
dc.title | 在NVIDIA圖形處理器上管理暫存器以增加線程級並行處理 | zh_TW
dc.title | Increasing Thread-Level Parallelism with Register Resource Management for NVIDIA GPUs | en_US
dc.type | Thesis | en_US
dc.contributor.department | 資訊科學與工程研究所 | zh_TW
Appears in Collections: Thesis
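
The abstract's central point, that per-thread register usage limits how many threads can be resident at once, can be illustrated with a rough calculation. The sketch below is not taken from the thesis. The hardware figures (32,768 registers and 48 resident warps per multiprocessor) and the function names are illustrative assumptions, and real GPUs also apply register-allocation granularity, shared-memory limits, and block-count limits that are ignored here.

```python
WARP_SIZE = 32  # threads per warp on NVIDIA GPUs


def resident_warps(regs_per_thread, regs_per_sm=32768, max_warps_per_sm=48):
    """Warps that fit on one multiprocessor if registers are the only limit.

    regs_per_sm and max_warps_per_sm are assumed, illustrative values.
    """
    if regs_per_thread <= 0:
        raise ValueError("regs_per_thread must be positive")
    # Each warp needs regs_per_thread registers for each of its 32 threads.
    warps_by_registers = regs_per_sm // (regs_per_thread * WARP_SIZE)
    return min(max_warps_per_sm, warps_by_registers)


def occupancy(regs_per_thread, regs_per_sm=32768, max_warps_per_sm=48):
    """Fraction of the hardware warp limit that the register budget allows."""
    warps = resident_warps(regs_per_thread, regs_per_sm, max_warps_per_sm)
    return warps / max_warps_per_sm


if __name__ == "__main__":
    for regs in (63, 32, 21):
        print(regs, resident_warps(regs), round(occupancy(regs), 2))
```

Under these assumptions, lowering a kernel from 32 to 21 registers per thread raises the resident warps from 32 to 48. That is the kind of gain in thread-level parallelism that register rematerialization or spilling to on-chip memory aims for, provided the extra instructions or memory accesses cost less than the latency they help hide.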