Title: An Embedded DSP-based System Controller for Real-Time MPEG2 Applications
Original title: 以DSP為基礎應用於MPEG2即時系統之控制器之設計
Authors: Jaw, Ming-Yang (趙銘陽)
Chen-Yi Lee (李鎮宜)
Institute of Electronics (電子研究所)
Keywords: real-time system; programmable; vector operation; processor; pipelining; MPEG2 codec; MPEG2; CODEC; vector operation unit; RISC; finite state machine
Issue date: 1996
Abstract: This thesis proposes a programmable, DSP-based system controller for real-time MPEG2 MP@ML codec applications. Dedicated hardware implements the computation-intensive operations, such as ME, DCT/IDCT, VLC/VLD, and MS, while an internal RISC-style processor handles the more variable, irregular operations. The instruction set is accordingly divided into two parts: vector instructions and non-vector instructions. A single vector instruction carries out a sequence of macroblock-wise operations, so its execution necessarily takes more than one clock cycle. In contrast, each non-vector instruction performs a single operation and executes in one clock cycle. The architecture of the system controller is likewise divided into a master processing unit and a vector processing unit. The master processing unit is responsible for instruction fetch and decode for all instructions, and for execution and write-back of non-vector instructions. The vector processing unit is the execution unit for vector instructions. The master processing unit is a pipelined processor with four stages: instruction fetch, instruction decode, execute, and write-back. The control signals for every stage are generated in the instruction decode stage and passed down the pipeline to the following stages, which makes them simple and fast to produce. Under normal conditions, every non-vector instruction therefore completes in one cycle, except for jump instructions and other instructions that change the program counter.
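The split between single-cycle non-vector instructions and multi-cycle vector instructions can be illustrated with a small cycle-level simulation. This is only a sketch under stated assumptions, not the thesis design: the opcode names, the cycle counts in `VECTOR_CYCLES`, the `LocalController` class, the `run` function, and the rule that an instruction stalls while its controller is busy are all hypothetical, since the abstract gives no encodings or timings.

```python
from dataclasses import dataclass

# Hypothetical vector opcodes and durations; the abstract gives no cycle
# counts, so these numbers are illustrative only.
VECTOR_CYCLES = {"DCT": 4, "ME": 6}


@dataclass
class LocalController:
    """Two-state machine (idle/busy) for one vector instruction."""
    name: str
    remaining: int = 0

    @property
    def busy(self):
        return self.remaining > 0

    def start(self, cycles):
        self.remaining = cycles

    def tick(self):
        if self.remaining > 0:
            self.remaining -= 1


def run(program):
    """Issue at most one instruction per cycle.

    Returns (total_cycles, log), where each log entry is
    (cycle, issued_opcode_or_None, vector_controllers_active_this_cycle).
    """
    controllers = {op: LocalController(op) for op in VECTOR_CYCLES}
    log, pc, cycle = [], 0, 0
    while pc < len(program) or any(c.busy for c in controllers.values()):
        issued = None
        if pc < len(program):
            op = program[pc]
            if op in controllers:
                # Vector instruction: hand off to its local controller.
                # Assumption: stall if that controller is still busy.
                if not controllers[op].busy:
                    controllers[op].start(VECTOR_CYCLES[op])
                    issued, pc = op, pc + 1
            else:
                # Non-vector instruction: completes in this cycle.
                issued, pc = op, pc + 1
        active = sorted(c.name for c in controllers.values() if c.busy)
        log.append((cycle, issued, active))
        for c in controllers.values():
            c.tick()
        cycle += 1
    return cycle, log


cycles, log = run(["DCT", "ADD", "ME", "SUB"])
```

In this run, `log[3]` records the non-vector `SUB` issuing in the same cycle in which both the `DCT` and `ME` controllers are active, which is the kind of overlap the abstract describes.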
The proposed architecture gathers all the data used in processing one MPEG2 macroblock into a centralized internal memory buffer. This reduces the total buffer size that would otherwise be needed if data were held separately in each operation unit, and allows the internal memory buffer to be allocated and used more efficiently. The vector processing unit consists of local controllers, one for each vector instruction, and each local controller is implemented as a finite state machine. As a result, different vector instructions can execute simultaneously; that is, more than one vector instruction may be executing in the same clock cycle. Vector and non-vector instructions may also execute in the same cycle. Because every vector instruction loads source data from, and stores destination data to, the internal memory buffer, bus usage is managed by a variable-length priority queue: the entry at the front of the queue is granted the data bus and carries out its transaction with the internal memory buffer. Simulation with the COMPASS 0.6 μm 5 V cell library shows that the system can reach an operating frequency of 100 MHz and can easily be integrated with other modules, such as ME, DCT/IDCT, and VLC/VLD.
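The bus arbitration described above can be sketched in software as a priority queue whose front entry is granted the bus. This is a minimal illustration, not the hardware design: the `BusArbiter` class, the unit names, the numeric priorities (lower value wins), and the first-come-first-served tie-break are assumptions, since the abstract does not say how priorities are assigned.

```python
import heapq
from itertools import count


class BusArbiter:
    """Variable-length priority queue of pending bus requests."""

    def __init__(self):
        self._heap = []
        self._seq = count()  # assumed tie-break: earlier request first

    def request(self, unit, priority):
        """Queue a request; a lower priority value is served first."""
        heapq.heappush(self._heap, (priority, next(self._seq), unit))

    def grant(self):
        """Give the bus to the front entry, or return None if idle."""
        if not self._heap:
            return None
        _, _, unit = heapq.heappop(self._heap)
        return unit

    def __len__(self):
        return len(self._heap)


# Hypothetical units and priorities, for illustration only.
arbiter = BusArbiter()
arbiter.request("VLC", 2)
arbiter.request("ME", 0)
arbiter.request("DCT", 1)
arbiter.request("ME", 0)
grants = [arbiter.grant() for _ in range(5)]
```

Here the two `ME` requests are served first, then `DCT`, then `VLC`, and the fifth grant is `None` because the queue has emptied.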
URI: http://140.113.39.130/cdrfb3/record/nctu/#NT850428076
http://hdl.handle.net/11536/61947
Appears in Collections: Thesis