Title: 在超純量 Java 處理機上之堆疊運算間的多折疊群平行辨識
PARALLEL IDENTIFICATION OF MULTIPLE FOLDING GROUPS AMONG STACK OPERATIONS ON SUPERSCALAR JAVA PROCESSORS
Authors: 吳志宏
Chih-Hung Wu
單智君
Dr. Jean Jyh-Jiun Shann
Institute of Computer Science and Engineering
Keywords: Java Virtual Machine; bytecode folding groups; data forwarding; tag assignment; parallel stack data dependency checking
Issue Date: 2001
Abstract: Because the Java Virtual Machine (JVM) is a stack-based architecture and most Java bytecodes must execute sequentially through the top of the operand stack, a Java processor with a traditional stack architecture offers very little opportunity to execute bytecodes in parallel. In this thesis, we propose a mechanism for superscalar Java processors that identifies, in parallel, multiple folding groups among stack operations that can be executed out of order; combined with data forwarding, it exploits instruction-level parallelism (ILP) efficiently. The data held in the operand stack is partitioned into three parts, which set the priority order for operand fetch, and a design for parallel checking of data dependencies among stack operations is proposed. After dependency checking, a tag assignment process folds the bytecodes for storage and execution. Parallel identification speeds up the checking step, and producing multiple folding groups increases the opportunities for out-of-order execution, so a superscalar Java processor can execute more bytecodes per clock cycle than a traditional stack-based Java processor. Simulation results show that the proposed superscalar processor architecture achieves an average speedup of 3 to 3.2 over the Sun picoJava-II processor.
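To make the notion of a folding group concrete, the following is a minimal software sketch of bytecode folding in Python. It is not the thesis's hardware mechanism: it walks the bytecodes sequentially rather than checking dependencies in parallel, and the function name `fold`, the `$t` tag format, the tuple encoding of bytecodes, and the `move` pseudo-operation are all assumptions made for illustration. It handles only a handful of integer bytecodes and refuses one hazard case instead of resolving it.

```python
# Illustrative only: a sequential, software view of bytecode folding.
# All names and encodings here are hypothetical, not taken from the thesis.

OPERATORS = {"iadd", "isub", "imul"}


def fold(bytecodes):
    """Fold stack bytecodes into register-style (op, src1, src2, dst) tuples."""
    stack = []    # symbolic operand stack: holds operand names, not values
    folded = []   # resulting folded operations, in program order
    temps = 0     # counter for temporary tags such as "$t0"
    for op, *args in bytecodes:
        if op == "iload":                 # producer: push a local variable
            stack.append(args[0])
        elif op == "iconst":              # producer: push a constant
            stack.append(f"#{args[0]}")
        elif op in OPERATORS:             # operator: pop two, push a tagged result
            rhs, lhs = stack.pop(), stack.pop()
            tag = f"$t{temps}"
            temps += 1
            folded.append((op, lhs, rhs, tag))
            stack.append(tag)
        elif op == "istore":              # consumer: pop one into a local
            src, dst = stack.pop(), args[0]
            if dst in stack:
                # An older value of dst is still pending on the stack;
                # this sketch does not handle that hazard.
                raise NotImplementedError("pending read of overwritten local")
            if folded and folded[-1][3] == src:
                # The value was produced by the immediately preceding
                # operation: fold the store into its destination.
                prev_op, lhs, rhs, _ = folded[-1]
                folded[-1] = (prev_op, lhs, rhs, dst)
            else:
                folded.append(("move", src, None, dst))
        else:
            raise ValueError(f"unsupported bytecode: {op}")
    return folded


program = [("iload", "a"), ("iload", "b"), ("iadd",), ("istore", "c")]
print(fold(program))  # [('iadd', 'a', 'b', 'c')]
```

Under these assumptions, a sequence such as `iload a; iload b; iadd; istore x; iload c; iload d; imul; istore y` yields two folded operations that share no operands, which is the kind of independence an out-of-order core could exploit.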
URI: http://140.113.39.130/cdrfb3/record/nctu/#NT900392098
http://hdl.handle.net/11536/68506
Appears in Collections: Thesis