Title: 轉譯NESL至有標記之C語言
Translating NESL to C with Annotations
Author: 陳敬憲
楊武
Chen, Ching-Hsien
Yang, Wuu
Institute of Computer Science and Engineering
Keywords: NESL; GPU; GPGPU; nested data parallelism
Issue Date: 2017
Abstract: NESL is a functional language that supports nested data parallelism. Although it was proposed in the 1990s, it remains an active subject of GPU research. As a high-level functional language, NESL lets developers express their ideas directly, but whether a program maps well onto the machine's execution model and achieves high performance depends on the compilation system. General-purpose GPUs accelerate large regular computations well, but irregular nested computations do not fit their execution model easily. In 2016, Huang and Yang proposed the partial flattening technique, which translates C programs with annotations on parallel loops into CUDA programs and supports irregular nested parallelism on GPUs. Partial flattening makes irregular nested parallel programming on GPU devices easier and outperforms existing NESL compilers on several benchmarks. We build a translator, NESL2C, that translates NESL programs into annotated C programs, which the partial flattening translator can then translate into CUDA programs to achieve nested data parallelism on GPUs. Experimental results for our current implementation show that the translated C code can suffer from the overhead of memory allocations on GPU devices in some cases. Compared with the existing NESL interpreter targeting CPUs, our translated programs are on average 48 times slower on Quicksort and 1.79 times slower on Maximum Clique Enumeration, but on average 2 times faster on dot product and 29.2 times faster on Quickhull.
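To illustrate the distinction the abstract draws between regular and irregular nested parallelism, the following is a minimal C sketch. It is not the thesis's generated code. The function names `dot_product` and `segment_sums`, the flat values-plus-offsets layout, and the comments marking candidate parallel loops are all assumptions made for illustration, since the actual annotation syntax is not given in this record.

```c
#include <assert.h>
#include <stddef.h>

/* Regular case: every iteration does the same amount of work.
   Hypothetical: an annotation would mark this loop as a parallel reduction. */
double dot_product(const double *a, const double *b, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}

/* Irregular nested case: a sequence of sequences stored flat.
   Segment s occupies values[offsets[s]] .. values[offsets[s + 1] - 1],
   so offsets must hold num_segments + 1 entries. */
void segment_sums(const double *values, const size_t *offsets,
                  size_t num_segments, double *out)
{
    assert(offsets != NULL && out != NULL);

    /* Hypothetical: outer loop annotated as parallel over segments. */
    for (size_t s = 0; s < num_segments; s++) {
        double sum = 0.0;
        /* Hypothetical: inner loop also annotated as parallel. Its trip
           count differs per segment, which is the source of irregularity. */
        for (size_t i = offsets[s]; i < offsets[s + 1]; i++)
            sum += values[i];
        out[s] = sum;
    }
}
```

In the second function the inner loop length depends on the segment, so a naive one-thread-per-segment mapping would leave GPU threads unevenly loaded. This is the kind of loop nest that nested data parallelism is meant to handle.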
URI: http://etd.lib.nctu.edu.tw/cdrfb3/record/nctu/#GT070456069
http://hdl.handle.net/11536/142758
Appears in Collections: Thesis