Title: | Improvement of SIMD Code Generation in a Dynamic Binary Translator (在一個動態轉譯引擎中優化SIMD指令之生成) |
Authors: | Fu, Sheng-Yu (傅勝余); Hsu, Wei-Chung (徐慰中); Institute of Computer Science and Engineering |
Keywords: | Emulator; QEMU |
Issue Date: | 2013 |
Abstract: | Modern processors are increasingly enhanced with SIMD instructions. For example, the MMX, SSE, and AVX instructions in the x86 architecture and the Neon instruction set in the ARM architecture are all SIMD instructions. Using these instructions can significantly increase application performance, so application binaries are likely to contain a growing fraction of SIMD instructions. However, SIMD instruction translation has not attracted much attention in Dynamic Binary Translation (DBT). For example, in the popular QEMU system emulator, guest SIMD instructions are often emulated with a sequence of scalar instructions even when the host machine has SIMD instructions that support such parallel computation, leaving large potential for performance improvement.
In this thesis, we propose two approaches to improve the translation of SIMD instructions in QEMU's DBT: one leverages the existing helper function implementation in QEMU, and the other uses a newly introduced vector IR (Intermediate Representation). Both approaches have been implemented in QEMU with an ARM frontend and an x86-64 backend. In our experiments, QEMU with the vector IR is 1.01 to 5.55 times faster than the original QEMU on the SPEC2006 CFP benchmarks and 7.61 times faster than the original QEMU on the Linpack benchmark. |
URI: | http://140.113.39.130/cdrfb3/record/nctu/#GT070156049 http://hdl.handle.net/11536/74892 |
Appears in Collections: | Thesis |