Title: 在一個動態轉譯引擎中優化SIMD指令之生成
Improvement of SIMD Code Generation in a Dynamic Binary Translator
Authors: 傅勝余
Fu, Sheng-Yu
徐慰中
Hsu, Wei-Chung
Institute of Computer Science and Engineering
Keywords: Emulator; QEMU
Issue date: 2013
Abstract: Modern processors are increasingly enhanced with SIMD instructions. For example, the MMX, SSE, and AVX instructions in the x86 architecture and the Neon instruction set in the ARM architecture are all SIMD instructions. Using these instructions can significantly increase application performance, so application binaries are likely to contain a growing fraction of SIMD instructions. However, SIMD instruction translation has not attracted much attention in Dynamic Binary Translation (DBT). For example, in the popular QEMU system emulator, guest SIMD instructions are often emulated with a sequence of scalar instructions even when the host machine has SIMD instructions that could support the same parallel computation, leaving a large potential for performance improvement. In this thesis, we propose two approaches to improve SIMD instruction translation in QEMU's DBT: one leverages the existing helper function implementation in QEMU, and the other uses a newly introduced vector IR (Intermediate Representation). Both approaches have been implemented in QEMU with an ARM frontend and an x86-64 backend. In our experiments, QEMU with the vector IR is 1.01 to 5.55 times faster than the original QEMU on the SPEC2006 CFP benchmarks and 7.61 times faster than the original QEMU on the Linpack benchmark.
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT070156049
http://hdl.handle.net/11536/74892
Appears in collections: Theses