Title: Supporting Advanced Vector Extensions (AVX) on HBT-86
Authors: Liu, Shang-Wen (劉尚雯)
Shann, Jyh-Jiun (單智君)
Institute of Computer Science and Engineering
Keywords: Binary Translation; Hybrid Binary Translation; AVX ISA; LLVM; SIMD
Publication Date: 2016
Abstract: HBT-86 is a retargetable hybrid binary translation system built on the LLVM compiler infrastructure. It accepts source binaries containing x86-32 and x86-64 integer instructions, x87 floating-point instructions, and a subset of the SIMD (single instruction, multiple data) Streaming SIMD Extensions (SSE) family, and it generates target binaries that run on x86-32, x86-64, and ARM platforms.
After SSE, Intel introduced Advanced Vector Extensions (AVX), a 256-bit instruction set extension of SSE. HBT-86 did not yet support AVX, so it could not translate executables that contain AVX instructions. The main goal of this thesis is therefore to design and implement AVX support in HBT-86. Our approach translates LLVM MC (Machine Code) instructions into LLVM intermediate representation (IR) and emulates the behavior of AVX instructions and the associated register state at the LLVM IR level. We also extend the SSE emulation in HBT-86, improve compatibility between SSE and AVX, and extend system call support.
In the experiments, we run several benchmarks dominated by AVX instructions and compare HBT-86 against Bochs, an emulator written in C++ that emulates every binary instruction in software and supports x86-32 and x86-64 source and target executables. When translating x86-32 source binaries for an x86-64 target, HBT-86 is 11.02 times as fast as Bochs on the integer AVX benchmarks and 14.35 times as fast on the floating-point AVX benchmarks. Relative to native executables, the execution time of the translated code is 4.16 times the native time for the integer benchmarks and 3.34 times for the floating-point benchmarks. For x86-32 source binaries targeting ARM, the emulation experiments show that HBT-86 successfully translates the AVX benchmarks into native ARM code and executes them on an ARM platform.
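The abstract describes emulating AVX instruction behavior and register state, and improving SSE/AVX compatibility. The sketch below illustrates that idea and is not the HBT-86 implementation, which operates on LLVM IR. It models each 256-bit YMM register as eight single-precision lanes, with the XMM register aliased to the low four lanes. All names (`VectorRegs`, `vaddps_ymm`, `vaddps_xmm`, `addps_xmm`) are hypothetical. The one architectural rule it relies on is that a VEX-encoded 128-bit instruction zeroes the upper 128 bits of its destination, while a legacy SSE instruction leaves them unchanged.

```python
import struct
from dataclasses import dataclass, field

LANES_YMM = 8  # eight 32-bit floats in a 256-bit YMM register
LANES_XMM = 4  # four 32-bit floats in the aliased 128-bit XMM register


def f32(x: float) -> float:
    """Round a Python float to single precision (raises OverflowError if out of range)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


@dataclass
class VectorRegs:
    """Hypothetical emulated register file: XMMn is the low half of YMMn."""
    ymm: list = field(
        default_factory=lambda: [[0.0] * LANES_YMM for _ in range(16)]
    )


def vaddps_ymm(regs: VectorRegs, dst: int, a: int, b: int) -> None:
    """256-bit VADDPS: add all eight lanes of YMMa and YMMb into YMMdst."""
    regs.ymm[dst] = [f32(x + y) for x, y in zip(regs.ymm[a], regs.ymm[b])]


def vaddps_xmm(regs: VectorRegs, dst: int, a: int, b: int) -> None:
    """VEX-encoded 128-bit VADDPS: add the low four lanes, zero the upper four."""
    low = [
        f32(x + y)
        for x, y in zip(regs.ymm[a][:LANES_XMM], regs.ymm[b][:LANES_XMM])
    ]
    regs.ymm[dst] = low + [0.0] * (LANES_YMM - LANES_XMM)


def addps_xmm(regs: VectorRegs, dst: int, src: int) -> None:
    """Legacy SSE ADDPS: dst += src on the low four lanes, upper lanes preserved."""
    low = [
        f32(x + y)
        for x, y in zip(regs.ymm[dst][:LANES_XMM], regs.ymm[src][:LANES_XMM])
    ]
    regs.ymm[dst] = low + regs.ymm[dst][LANES_XMM:]
```

Keeping a single backing store per register, rather than separate XMM and YMM arrays, means a legacy SSE write and a later AVX read always see the same state, which is one simple way to keep mixed SSE/AVX code consistent.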
URI: http://etd.lib.nctu.edu.tw/cdrfb3/record/nctu/#GT070356085
http://hdl.handle.net/11536/140061
Appears in Collections: Thesis