Title: Patchouli: A Knowledge-Based Translation Framework for x86 Binary to Web Applications
Original title: 广藿香: 将x86可执行码转成网路应用程式的智慧型二元码转译框架
Authors: Liu, Hong-Wei (刘弘威)
Shann, Jyh-Jiun (单智君)
Institute of Computer Science and Engineering
Keywords: binary translation; web application; virtualization
Issue date: 2016
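Abstract: Computer networks are no longer merely function-oriented tools; they are also an important driving force of modern civilization. The main reason is that information spreads over networks far faster than through traditional media, and the volume of data in circulation far exceeds what traditional storage devices hold. To take advantage of the portability of the network, many servers rely on web applications to assist with or deliver the services they provide. Web applications today are developed mostly in scripting languages. Programs written in scripting languages are highly portable, but their poor execution performance is widely criticized. Compared with traditional desktop applications, web applications are also fewer in number and variety. These shortcomings directly or indirectly affect the quality and quantity of the services each server can provide.
To improve the performance and increase the number of web applications, Google proposed the Native Client (NaCl) project in 2008, which uses sandbox technology so that native code can run safely and quickly directly in the browser. In 2010, Google introduced the Portable Native Client (PNaCl) project, which uses an intermediate representation (IR) as the distribution medium to improve program portability. This study combines binary translation with sandboxing so that traditional desktop x86 executables can run safely and efficiently on the web through the PNaCl system. While retaining the performance advantage of native code and the efficient distribution offered by the Internet, we also design a binary translation framework that uses profiling data collected from clients around the world to continually improve translation quality.
To build this framework, this study modifies an existing binary translator, integrates it with the PNaCl system, and designs a profiling management mechanism. The design and implementation of the framework consist of three main parts. First, we improve the binary translator to raise the quality and portability of the translated code. Second, we modify the emulation runtime between the translated code and the host operating system so that the translated code runs correctly on the PNaCl platform. Third, we develop a new Profiling-Data Manager to analyze and use the large volume of profiling data collected from clients.
Experimental results show that, for an existing x86 executable, the translated code produced by the modified x86 binary translator is on average about 30% of the size of that produced by the original translator. In addition, a nearly overhead-free profiling mechanism lets us obtain complete information about the memory regions accessed by memory instructions during dynamic execution; the same executable can then be retranslated, further reducing the translated code to 21% on average of the size produced by the original translator. In terms of performance, compared with the source code running on the native platform, the retranslated x86 binary, once converted to nexe, runs under emulation on the NaCl system in the browser with a performance loss of only 50%.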
Computer networks are not only tools but also an important force in advancing modern civilization. The two main reasons are that information spreads over networks much faster than through traditional media, and that far more data circulates through them. To exploit the portability of computer networks, many servers use web applications to assist with or deliver their services. Most web applications are developed in scripting languages and therefore tend to be highly portable and compatible. However, scripting languages are dynamic languages and thus tend to perform poorly. Moreover, compared with desktop applications, web applications lack variety. These drawbacks reduce the quantity and quality of the services that servers can provide.
To enhance the performance and increase the variety of web applications, Google Inc. released a project named Native Client (NaCl) in 2008, a sandboxing technology for running native code in a web browser. In 2010, Google released another project, Portable Native Client (PNaCl), an architecture-independent version of NaCl that offers higher portability. In this thesis, we integrate a binary translator with the PNaCl system to translate legacy x86 executables into web applications. At the same time, the high portability and usability of web applications inspire us to develop a new profiling strategy that collects and analyzes profiling data from clients worldwide to further improve translation quality.
To build our framework, we modify an existing binary translator, SBT86, integrate it with the PNaCl system, and design a global profiling strategy. The implementation is divided into three parts. First, to enhance the quality and portability of the translated code, we modify the translation functions and helper functions of SBT86. Second, to execute translated code on the PNaCl system, we modify the interface between the host operating system and the translated code. Third, we design a Profiling-Data Manager to analyze the large volume of profiling data.
Our experiments show that our modifications to the original x86 binary translator reduce the size of the translated code of legacy x86 executables by 70% on average. Meanwhile, we collect memory-related profiling data with lightweight overhead for each user. Our framework then uses the profiling data to retranslate the executable with aggressive optimization. As a result, the size of the translated code is reduced by 79% on average relative to the original translator, and emulating the legacy x86 executable on the NaCl system loses only 50% performance compared with executing the source executable on the native platform.
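As a rough illustration of the third part of the framework, the sketch below shows one way a profiling-data manager might merge memory-access records gathered from many clients. The record layout (an instruction address mapped to a set of region names), the region labels, and the function names are all assumptions made for illustration; the abstract does not specify how the Profiling-Data Manager stores or analyzes its data.

```python
from collections import defaultdict

def merge_profiles(client_profiles):
    """Union, per memory instruction, the regions observed by all clients.

    Each profile is a hypothetical dict: instruction address -> set of
    region names (for example "stack", "heap", "global").
    """
    merged = defaultdict(set)
    for profile in client_profiles:
        for insn_addr, regions in profile.items():
            merged[insn_addr].update(regions)
    return dict(merged)

def single_region_instructions(merged):
    """Return instructions that every client saw touching exactly one region."""
    return {addr: next(iter(regions))
            for addr, regions in merged.items()
            if len(regions) == 1}

# Two hypothetical client reports for the same executable.
client_a = {0x401000: {"stack"}, 0x401010: {"heap"}}
client_b = {0x401000: {"stack"}, 0x401010: {"global"}}

merged = merge_profiles([client_a, client_b])
candidates = single_region_instructions(merged)
```

Under this assumed policy, an instruction seen in only one region across all clients is a candidate for more aggressive retranslation, while an instruction seen in several regions keeps its conservative translation.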
URI: http://etd.lib.nctu.edu.tw/cdrfb3/record/nctu/#GT070156100
http://hdl.handle.net/11536/140144
Appears in Collections: Thesis