Increasing Thread-Level Parallelism with Register Resource Management for NVIDIA GPUs

Full metadata record

DC Field	Value	Language
dc.contributor.author	游本永	en_US
dc.contributor.author	Yu, Pen-Yung	en_US
dc.contributor.author	游逸平	en_US
dc.contributor.author	You, Yi-Ping	en_US
dc.date.accessioned	2014-12-12T02:39:18Z	-
dc.date.available	2014-12-12T02:39:18Z	-
dc.date.issued	2013	en_US
dc.identifier.uri	http://140.113.39.130/cdrfb3/record/nctu/#GT070056013	en_US
dc.identifier.uri	http://hdl.handle.net/11536/73916	-
dc.description.abstract	圖形處理單元具有大量的運算處理器，這些運算處理器是以單指令流多資料流的方式執行，因此圖形處理單元能處理每秒兆個浮點運算，運算量是中央處理器的數十甚至數百倍。通用圖形處理器依靠大量的執行緒來隱藏會花費400~800時序的off-chip記憶體延遲，然而能平行執行的執行緒數量會特別受到執行緒使用的暫存器數量影響，因此在這篇論文中，我們提出了降低暫存器壓力以最佳化線程級並行處理的架構，這個架構的目的就是要降低執行緒使用的暫存器數目，以增加線程級並行處理。在這個架構中包含了兩個降低暫存器使用量的方法，第一個是暫存器的重算，第二個是溢出暫存器至on-chip記憶體。實驗結果顯示這個架構是有效果的，平均減少了5.7%的執行時間，最多能減少27%。	zh_TW
dc.description.abstract	Graphics processing units (GPUs) are equipped with a large number of arithmetic processors that run in a single-instruction, multiple-data (SIMD) fashion, delivering a throughput of trillions of floating-point operations per second, which is tens or even hundreds of times higher than that of central processing units. GPUs rely on massive hardware multithreading to hide off-chip memory latencies of approximately 400–800 cycles. However, the number of threads that can run in parallel on a GPU is tightly restricted by the resource requirements of each thread, especially its register requirement. In this thesis, we propose a thread-level-parallelism-aware register-pressure reduction framework that reduces the register usage of threads on GPGPUs, thereby increasing thread-level parallelism. The framework includes two register-pressure reduction methods: (1) register rematerialization and (2) spilling registers to on-chip memory. The experimental results demonstrate that the proposed framework is effective, reducing the execution time of OpenCL kernel programs by 5.7% on average and by up to 27%.	en_US
dc.language.iso	en_US	en_US
dc.subject	編譯器最佳化	zh_TW
dc.subject	暫存器配置	zh_TW
dc.subject	線程級並行處理	zh_TW
dc.subject	圖形處理單元	zh_TW
dc.subject	OpenCL	zh_TW
dc.subject	CUDA	zh_TW
dc.subject	Compiler optimization	en_US
dc.subject	register allocation	en_US
dc.subject	thread-level parallelism	en_US
dc.subject	GPU	en_US
dc.subject	OpenCL	en_US
dc.subject	CUDA	en_US
dc.title	在NVIDIA圖形處理器上管理暫存器以增加線程級並行處理	zh_TW
dc.title	Increasing Thread-Level Parallelism with Register Resource Management for NVIDIA GPUs	en_US
dc.type	Thesis	en_US
dc.contributor.department	資訊科學與工程研究所	zh_TW
Appears in Collections:	Theses
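
The abstract's central point, that per-thread register usage limits how many threads can be resident at once, can be illustrated with a rough occupancy estimate. The sketch below is not taken from the thesis. The function name `resident_threads` and the default limits (32768 registers, 1536 threads, and 8 blocks per multiprocessor) are illustrative assumptions, and the model ignores register-allocation granularity and shared-memory limits.

```python
def resident_threads(regs_per_thread, threads_per_block,
                     regs_per_sm=32768,        # assumed register file size
                     max_threads_per_sm=1536,  # assumed thread limit
                     max_blocks_per_sm=8):     # assumed block limit
    """Estimate how many threads fit on one multiprocessor at once.

    Simplified model: a block is resident only if registers for all of
    its threads fit, so the block count is bounded by the register file,
    the thread limit, and the block limit, whichever is smallest.
    Both arguments must be positive integers.
    """
    regs_per_block = regs_per_thread * threads_per_block
    blocks_by_regs = regs_per_sm // regs_per_block
    blocks_by_threads = max_threads_per_sm // threads_per_block
    blocks = min(blocks_by_regs, blocks_by_threads, max_blocks_per_sm)
    return blocks * threads_per_block


if __name__ == "__main__":
    # Compare a few hypothetical per-thread register counts.
    for regs in (63, 32, 20):
        print(regs, resident_threads(regs, threads_per_block=256))
```

Under these assumed limits, lowering a kernel's register count from 32 to 20 per thread raises the estimate from four resident blocks of 256 threads to six, which is the kind of gain in thread-level parallelism that register-pressure reduction aims for.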