Title: | 用於基頻處理的數位訊號處理資料路徑之設計與實現 Design and Implementation of DSP Datapath for Baseband Processing |
Authors: | 張育銘 Yu-Ming Chang; 任建葳 Dr. Chein-Wei Jen; Institute of Electronics |
Keywords: | coprocessing;datapath;processor interface;baseband processing;dataflow;AMBA;ARM coprocessor interface;prototype |
Issue Date: | 2002 |
Abstract: | Under power and cost constraints, a microprocessor alone cannot meet the computation requirements of most multimedia applications. Attaching hardware accelerators or coprocessing datapaths is a common way to boost performance. In this thesis, we study various DSP accelerators and classify them from two viewpoints: the datapath design and its interface to the processor. We propose a novel, streamlined coprocessing datapath for baseband processing in communication systems, built around a pure dataflow computing engine, and wrap it in two different host interfaces, AMBA and the ARM922T coprocessor interface, to evaluate their suitability. The computing resources allocated to our datapath are similar to those of the ADI ADSP-2181, namely an ALU, a multiplier, and a barrel shifter, but all redundant components and unnecessary restrictions on the programming model are removed. Following VLIW design principles, we rely extensively on software techniques to reduce hardware complexity. We have also developed a complete software tool that automatically translates an input floating-point SDFG algorithm description into microcode for the dataflow engine.
DSP-lite, our first implementation of the proposed datapath, integrates a DMA controller and an AMBA AHB interface. In our simulations it delivers almost twice the performance of the ADSP-2181 core, measured in cycle count, with similar computing resources. We have implemented the DSP-lite core in a 0.35 μm 1P4M CMOS technology; the core size is 2.8 mm × 2.8 mm, and it consumes about 122 mW at an operating frequency of 100 MHz. In contrast, our experiments show that the fine-grained ARM coprocessor interface is over-designed for, and a poor match to, our coarse-grained dataflow engine: apart from generating data addresses, the ARM processor cannot run concurrently with its coprocessor, so host execution time is still spent as overhead, and data accesses add considerable overhead of their own.
Achieving high utilization of both cores, and thus lowering the cost of the processor interface, requires more parallelism in the coprocessor and considerable software effort to coordinate the host and the coprocessing datapath. Finally, we use an FFT coprocessor example to demonstrate a better way to use the coprocessor interface. |
URI: | http://140.113.39.130/cdrfb3/record/nctu/#NT910428107 http://hdl.handle.net/11536/70436 |
Appears in Collections: | Thesis |
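
The implementation figures quoted in the abstract (2.8 mm × 2.8 mm core, 100 MHz, about 122 mW) can be turned into a few derived metrics for quick comparison with other cores. The sketch below is only a back-of-the-envelope calculation: it assumes the 122 mW figure is the average power at 100 MHz and that the quoted dimensions cover the whole core. The function and variable names are illustrative and do not come from the thesis.

```python
# Figures quoted in the abstract for the DSP-lite core.
CORE_WIDTH_MM = 2.8
CORE_HEIGHT_MM = 2.8
CLOCK_HZ = 100e6   # 100 MHz operating frequency
POWER_W = 122e-3   # about 122 mW, assumed to be average power at 100 MHz


def derived_metrics(width_mm, height_mm, clock_hz, power_w):
    """Return (area in mm^2, energy per cycle in nJ, power density in mW/mm^2)."""
    area_mm2 = width_mm * height_mm
    energy_per_cycle_nj = power_w / clock_hz * 1e9
    power_density_mw_mm2 = power_w * 1e3 / area_mm2
    return area_mm2, energy_per_cycle_nj, power_density_mw_mm2


area, energy_nj, density = derived_metrics(
    CORE_WIDTH_MM, CORE_HEIGHT_MM, CLOCK_HZ, POWER_W
)
print(f"core area:         {area:.2f} mm^2")
print(f"energy per cycle:  {energy_nj:.2f} nJ")
print(f"power density:     {density:.2f} mW/mm^2")
```

Under these assumptions the core occupies about 7.84 mm², spends roughly 1.22 nJ per clock cycle, and dissipates about 15.6 mW/mm².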