Full metadata record
DC Field | Value | Language
dc.contributor.author | Yu, Chia-Lin | en_US
dc.contributor.author | Tsao, Shiao-Li | en_US
dc.date.accessioned | 2020-02-02T23:54:36Z | -
dc.date.available | 2020-02-02T23:54:36Z | -
dc.date.issued | 2020-02-01 | en_US
dc.identifier.issn | 1045-9219 | en_US
dc.identifier.uri | http://dx.doi.org/10.1109/TPDS.2019.2937295 | en_US
dc.identifier.uri | http://hdl.handle.net/11536/153558 | -
dc.description.abstract | The performance of an OpenCL program is strongly influenced by both hardware and software attributes. To achieve superior performance, developers may leverage automatic performance tuning techniques to determine the optimal parameters on the target device. Although existing approaches have shown promising tuning results in their target scenarios, other requirements such as efficiency, portability, and usability should also be considered because of the rapid growth of heterogeneous computing applications and platforms. In this paper, we re-examine the workgroup size tuning problem and propose a novel approach to meet the aforementioned requirements. We abstract the architectural details into a set of hardware parameters so that the proposed approach can be applied without the presence of target devices, which makes it more accessible to developers. The proposed approach is evaluated on 20 OpenCL kernels and six devices, including both CPUs and GPUs. Experimental results demonstrate that, with negligible overhead, our approach filters out 88.6 percent of the possible workgroup sizes on average. Among all the workgroup size candidates, the best- and worst-performing candidates can achieve average performance of 95.5 and 92.1 percent, respectively, compared with the optimal workgroup size. | en_US
dc.language.iso | en_US | en_US
dc.subject | Tuning | en_US
dc.subject | Performance evaluation | en_US
dc.subject | Kernel | en_US
dc.subject | Hardware | en_US
dc.subject | Indexes | en_US
dc.subject | Computational modeling | en_US
dc.subject | Graphics processing units | en_US
dc.subject | OpenCL | en_US
dc.subject | workgroup size selection | en_US
dc.subject | automatic performance tuning | en_US
dc.subject | microbenchmarking | en_US
dc.title | Efficient and Portable Workgroup Size Tuning | en_US
dc.type | Article | en_US
dc.identifier.doi | 10.1109/TPDS.2019.2937295 | en_US
dc.identifier.journal | IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS | en_US
dc.citation.volume | 31 | en_US
dc.citation.issue | 2 | en_US
dc.citation.spage | 455 | en_US
dc.citation.epage | 469 | en_US
dc.contributor.department | 資訊工程學系 | zh_TW
dc.contributor.department | Department of Computer Science | en_US
dc.identifier.wosnumber | WOS:000507919800016 | en_US
dc.citation.woscount | 0 | en_US
Appears in Collections: Journal Articles