利用指令序列特性壓縮程式碼

標題:	利用指令序列特性壓縮程式碼 Code Compression Using Instruction Sequencing Characteristics
作者:	林光彬 Kuang-Bin Kelvin Lin 鍾崇斌 Chung-Ping Chung 資訊科學與工程研究所
關鍵字:	程式壓縮;嵌入式系統;指令集;資料相依性;Code Compression;Embedded System;Instruction Set Architecture;Data Dependency
公開日期:	2002
摘要:	壓縮的原理，是在資料中找出重複出現的資料串，並將重複出現較多的資料串以較小的編碼表示。程式壓縮利用資料壓縮原理，將程式中重複出現的指令（串）存入字典中，並以較小的編碼替換重複出現較多的指令（串）。然而程式壓縮和資料壓縮不同在於程式壓縮必須考慮程式執行流程。因程式執行流程會在不同的基本區塊中切換，所以壓縮及解壓縮機制必須能即時指出所要執行的基本區塊。因此在資料壓縮中的方法會因這個因素而造成資料的不連續性，使得程式壓縮效果不好。本論文探討各種程式壓縮技術，並提供一個利用資料相依性的壓縮方法來提高壓縮率，在此論文中討論四個研究項目： (a). MIPS 指令集對程式壓縮的影響。指令編碼的方式及編碼效能直接影響到程式本身的大小。我們發現以 MIPS 指令集編譯的程式中，實際所含資訊量約只有80%。我們利用 MIPS 指令中運算元和運算子的關係，重新對運算元編碼。將某些運算子及該運算子常用的運算元當作新的指令而給予新的編碼。使得壓縮時，我們可將該運算元中常用的運算子省略。 (b). 利用不同的資料相依性，發展程式壓縮技術。觀察中發現，程式中有大量的運算元，是直接來自上幾道指令中的運算元。所以我們將運算元和運算元之間的關係，用對照標籤來表示，則可減少運算元的數量。我們將指令序列分成運算子序列、對照標籤序列及運算元序列，分別加以編碼壓縮。藉由運算元的減少及對照標籤的共享可使程式壓縮率比現有的方法更小。 (c). 根據所發展的壓縮方法設計其解壓縮硬體。我們設計一個專為運算元關係程式壓縮法的硬體來提供即時壓縮程式的執行。這個硬體包含三個字典及一個指令重組緩衝區。三個字典（運算子字典、運算子字典及對照標籤字典）分別儲存指令序列的運算子，運算元及該指令中所有運算元之間對照標籤。解壓縮時，重組緩衝區根據運算子所需的運算元個數，擷取對照標籤，並參考前幾道指令的運算元來還原所有的運算元。 (d). 不同程式壓縮方法及比較。我們實作所提出來的方法，並比較文獻中所提及的程式壓縮方法與我們所提出的方法之不同處。發現我們所提出的方法，其程式壓縮在SPEC95及Mediabench標竿程式中所得到的壓縮率較小於目前所提出的方法，其壓縮率可到達33.6%和33.8%。本論文主要的貢獻在於提出一個利用運算元和運算元之間的關係來減少程式中指令的資訊量，藉以達到程式壓縮的目的。經由實驗的結果發現不同指令間的暫存器（目的暫存器和來源暫存器）存取樣式，大致是相同的，而立即值運算元（記憶體存取位址、跳躍位址）的相關性則較少。利用這種運算元關係，使得程式壓縮的技術更進一步，提供更小、更佳的嵌入式系統作業環境。 Compression techniques achieve data size reduction by replacing the most repetitive data streams with the smallest codewords. Code compression differs from data compression in that code compression techniques rely on not only pattern repetitions but also execution model. Since control flow switches between instruction sequences, there must be a mechanism to identify the start of each compressed instruction sequence caused by the branches and by the change of address space. This dissertation discusses the differences among code compression techniques and proposes a new technique utilizing data dependencies to improve compression ratio. There are four topics discussed in this dissertation: (a). The influence of MIPS instruction set on code compression is discussed. We use MIPS instruction set as examples and find that the amount of useful information contained in a MIPS program is about 80%. We utilize the relationships between opcodes and operands to reduce the amount of information in a program. Some opcodes are likely to use the specific operands. We encode such cases as new instructions to reduce the number of operands. (b). A new code compression technique utilizing data dependencies is proposed. A great amount of destination operands has great potential to be used as the sources in the subsequent instructions. The mapping tag, a small bit stream, is used to map the latter occurrences of an operand to the former one to eliminate dependent operands. The instruction sequences are decomposed into opcode sequences, mapping tag sequences and residual operand sequences and are compressed independently. The result benefits from the reduced number of operands and the sharing of the mapping tags. (c). A hardware decompression engine is designed for the proposed compression technique. Three dictionaries store unique opcode sequences, operand sequences and mapping tag sequences, respectively, and an instruction assembly buffer assembles the instructions according to the opcode, and the meanings of mapping tags used to refer to the operands history. (d). Comparisons among different code compression techniques are discussed. We find that the proposed method performs superior to the existing methods. The compression ratio of 33.6% and 33.8 are reported for SPEC95 and Mediabench benchmark programs, both compiled for MIPS ISA. The contribution of this dissertation is to propose a new code compression method to reduce the compression ratio significantly utilizing the dependencies between operands. We find that the dependencies between registers in different instruction sequences are quite similar, while the dependencies between immediate operands (including load/store address, branch address, etc.) are different. Using operand dependencies, the proposed code compression improves the compression ratio and provides a better operating environment for the embedded systems.
URI:	http://140.113.39.130/cdrfb3/record/nctu/#NT910392001 http://hdl.handle.net/11536/70072
Appears in Collections:	Thesis