Title: OS Supports of an LLVM-Based Retargetable Hybrid Binary Translator for X86 ISA
Original title: 以LLVM 為基礎開發支援X86指令集架構的可重定目標之混和型二元碼轉譯器的作業系統支援
Authors: Liu, I-Chun (劉億峻)
Shann, Jyh-Jiun (單智君)
Institute of Computer Science and Engineering
Keywords: Hybrid Binary Translator; X86 ISA; LLVM; LLVM MC toolkit; sub-registers; status flags; stack emulation; System Call Emulation; Fast System Call Emulation
Issue Date: 2012
Abstract: A binary translator (BT) translates an executable from a source instruction set architecture (ISA) to a target ISA, for example from ARM to X86. Binary translators currently fall into three categories, each with its own strengths and weaknesses. A static binary translator (SBT) translates the source executable before execution and can therefore apply aggressive optimizations, which gives the target executable a performance advantage. However, an SBT cannot fully handle problems that only surface at runtime, such as the code discovery problem and the code location problem. A dynamic binary translator (DBT) translates the source executable at runtime, so it handles the code discovery and code location problems easily. Because translation time counts toward execution time, though, a DBT cannot afford aggressive optimizations and performs worse than an SBT. A hybrid binary translator (HBT) combines the merits of both: it first translates the source executable statically and then, if the target executable raises an exception at runtime, switches to its dynamic translator to handle it. An HBT thus offers both good performance and easy handling of the code discovery and code location problems.
HBT is a brand-new binary translation technique, and MC2LLVM, an LLVM-based retargetable HBT, is the only example in the literature. MC2LLVM currently supports only ARM as a source ISA. In this thesis, we extend MC2LLVM to support the X86 source ISA and emulate the execution behavior of an X86 executable under the Linux operating system. Compared with ARM, X86 is considerably more complex: its instructions are variable-length, and there are many more of them to implement. The execution behavior of an X86 executable is also more complicated, for example because of its multiple memory addressing modes and its fast system call design, so adding X86 support is not easy. Because many applications are developed for X86, this support is valuable.
In our X86-32 to X86-64 translation experiments, the target executables produced by our HBT run 1.6 times faster than QEMU on most programs of the EEMBC benchmark.
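The hybrid scheme described in the abstract (translate statically first, fall back to dynamic translation when execution reaches code the static pass missed) can be illustrated with a toy model. This is only a sketch under stated assumptions and not MC2LLVM's implementation: the class `HybridTranslator`, its fields, and the representation of a program as a map from block address to successor address are all hypothetical.

```python
# Toy model of hybrid binary translation. All names are hypothetical
# and do not reflect MC2LLVM's actual design.
from typing import Callable, Dict, List, Set

EXIT = -1  # sentinel successor address meaning "program ends"

# A translated block, when run, returns the next source address.
Block = Callable[[], int]


class HybridTranslator:
    def __init__(self, source: Dict[int, int], discovered: Set[int]) -> None:
        # `source` maps each block's source address to its successor.
        # `discovered` holds the addresses the static pass managed to find.
        self.source = source
        # Static phase: translate every discovered block ahead of time.
        self.cache: Dict[int, Block] = {
            addr: self._translate(addr) for addr in discovered
        }
        self.dynamic_translations = 0

    def _translate(self, addr: int) -> Block:
        # Stand-in for real translation: capture the successor address.
        successor = self.source[addr]
        return lambda: successor

    def run(self, entry: int) -> List[int]:
        trace: List[int] = []
        pc = entry
        while pc != EXIT:
            block = self.cache.get(pc)
            if block is None:
                # The static pass missed this address (a code discovery
                # miss), so translate it at runtime and cache the result.
                block = self._translate(pc)
                self.cache[pc] = block
                self.dynamic_translations += 1
            trace.append(pc)
            pc = block()
        return trace
```

In this model, a block missing from `discovered` triggers exactly one runtime translation; later runs reuse the cached block, which is the reason a hybrid design can keep most of the static translator's speed.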
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT070056010
http://hdl.handle.net/11536/72072
Appears in Collections: Thesis