VecRA: A Vector-Aware Register Allocator for GPU Shader Processors

doi:10.1145/2961026

Full metadata record

DC Field	Value	Language
dc.contributor.author	You, Yi-Ping	en_US
dc.contributor.author	Chen, Szu-Chien	en_US
dc.date.accessioned	2017-04-21T06:55:23Z	-
dc.date.available	2017-04-21T06:55:23Z	-
dc.date.issued	2016-08	en_US
dc.identifier.issn	1539-9087	en_US
dc.identifier.uri	http://dx.doi.org/10.1145/2961026	en_US
dc.identifier.uri	http://hdl.handle.net/11536/134282	-
dc.description.abstract	Graphics processing units (GPUs) are now widely used in embedded systems for manipulating computer graphics and even for general-purpose computation. However, many embedded systems have to manage highly restricted hardware resources in order to achieve high performance or energy efficiency. The number of registers is one of the common limiting factors in an embedded GPU design. Programs that run with a low number of registers may suffer from high register pressure if register allocation is not properly designed, especially on a GPU in which a register is divided into four elements and each element can be accessed separately, because allocating a register for a vector-type variable that does not contain values in all elements wastes register spaces. In this article, we present a vector-aware register allocation framework to improve register utilization on shader architectures. The framework involves two major components: (1) element-based register allocation that allocates registers based on the element requirement of variables and (2) register packing that rearranges elements of registers in order to increase the number of contiguous free elements, thereby keeping more live variables in registers. Experimental results on a cycle-approximate simulator showed that the proposed framework decreased 92% of register spills in total and made 91.7% of 14 common shader programs spill free. These results indicate an opportunity for energy management of the space that is used for storing spilled variables, with the framework improving the performance by a geometric mean of 8.3%, 16.3%, and 29.2% for general shader processors in which variables are spilled to memory with 5-, 10-, and 20-cycle access latencies, respectively. Furthermore, the reduction in the register requirement of programs enabled another 11 programs with high register pressure to be runnable on a lightweight GPU.	en_US
dc.language.iso	en_US	en_US
dc.subject	Register allocation	en_US
dc.subject	shader processors	en_US
dc.subject	register packing	en_US
dc.title	VecRA: A Vector-Aware Register Allocator for GPU Shader Processors	en_US
dc.identifier.doi	10.1145/2961026	en_US
dc.identifier.journal	ACM TRANSACTIONS ON EMBEDDED COMPUTING SYSTEMS	en_US
dc.citation.volume	15	en_US
dc.citation.issue	3	en_US
dc.contributor.department	資訊工程學系	zh_TW
dc.contributor.department	Department of Computer Science	en_US
dc.identifier.wosnumber	WOS:000384247100003	en_US
Appears in Collections:	Journal Articles
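
Illustrative note on the abstract: the two components it names, element-based allocation and register packing, can be pictured with a toy model. The sketch below is not the VecRA algorithm from the article. It is a minimal first-fit illustration that takes from the abstract only the fact that each register has four separately accessible elements. The class `RegisterFile`, its methods, and the variable names are hypothetical. A real allocator would also have to emit move instructions and rewrite element selectors when packing, which this sketch omits.

```python
from dataclasses import dataclass, field

# From the abstract: a register is divided into four elements.
ELEMENTS_PER_REGISTER = 4


@dataclass
class RegisterFile:
    """Toy register file (hypothetical), tracking which variable owns each element."""
    num_registers: int
    # slots[r][e] is the name of the variable in element e of register r, or None.
    slots: list = field(init=False)

    def __post_init__(self):
        self.slots = [[None] * ELEMENTS_PER_REGISTER
                      for _ in range(self.num_registers)]

    def allocate(self, name, width):
        """Element-based allocation: first fit of `width` contiguous free elements.

        Returns (register, first_element), or None if the variable would spill.
        """
        for r, reg in enumerate(self.slots):
            for start in range(ELEMENTS_PER_REGISTER - width + 1):
                span = range(start, start + width)
                if all(reg[e] is None for e in span):
                    for e in span:
                        reg[e] = name
                    return (r, start)
        return None

    def free(self, name):
        """Release every element held by `name` (its live range has ended)."""
        for reg in self.slots:
            for e in range(ELEMENTS_PER_REGISTER):
                if reg[e] == name:
                    reg[e] = None

    def pack(self):
        """Register packing: slide live elements toward element 0 in each register,
        so that the free elements become contiguous."""
        for reg in self.slots:
            live = [v for v in reg if v is not None]
            reg[:] = live + [None] * (ELEMENTS_PER_REGISTER - len(live))


if __name__ == "__main__":
    rf = RegisterFile(num_registers=1)
    rf.allocate("a", 1)
    rf.allocate("b", 1)
    rf.allocate("c", 1)
    rf.free("b")                   # leaves a hole between a and c
    print(rf.allocate("d", 2))     # no two contiguous free elements yet
    rf.pack()                      # a and c become adjacent
    print(rf.allocate("d", 2))     # the two-element variable now fits
```

In this toy run the two-element variable `d` cannot be placed until packing closes the hole left by `b`, which is the effect the abstract describes as increasing the number of contiguous free elements.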