Title: 高效能之管線式傅立葉轉換處理器之設計與實現
Design and Implementation of High-Effective Pipelined Processors for Discrete-Time Fourier Transform Applications
Author: 余遠渠
Yu Yuan-Chu
林進燈
Lin Chin-Teng
Institute of Electrical and Control Engineering
Keywords: Effective Pipeline Processor; DTMF; MIMO-OFDM Wireless LAN; long-length based FFT/IFFT computations; FFT/IFFT/2D-DCT computations; next-generation mobile applications
Issue Date: 2007
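Abstract: This thesis designs high-performance pipelined processors for the Fourier transform. Four real-time applications serve as examples for the proposed designs: a dual-tone multi-frequency (DTMF) detector for high-channel-density VoP applications; multiple-input multiple-output orthogonal frequency-division multiplexing (MIMO-OFDM) wireless LANs; long-length FFT computation for handheld digital video broadcasting systems; and FFT/IFFT/2-D DCT computation for next-generation mobile multimedia applications. For these four distinctly different applications, the thesis proposes six hardware-oriented designs that aim for the most effective pipelined processor architecture, evaluated by throughput, computation latency, computational complexity, hardware cost, and hardware utilization. For the DTMF detector, the thesis adopts a reduced input-sequence architecture, distributed memory, and an improved recursive transform based on Chebyshev polynomials to achieve a low computation-cycle count and high power efficiency. The resulting single DTMF detector core processes twice the data at the same operating speed and within the same computation time. For 2×2 and 4×4 MIMO-OFDM wireless LANs, the thesis proposes two high-performance FFT/IFFT processors: the radix-2/8 multiple-path delay feedback (R28MDF) architecture and the radix-2/8 multiple-path delay commutator (R28MDC) architecture. Based on a retrenched radix-8 FFT unit (R8-FFT) combined with the MAW technique, both architectures reach 100% butterfly utilization and a throughput high enough to meet the requirements of 2×2 and 4×4 MIMO-OFDM wireless LANs. For long-length FFT computation, the thesis proposes two new architectures, the radix-4² single-path delay feedback architecture and the radix-4³ single-path delay feedback architecture, which use the smaller radix-4 structure to reach the low computational complexity of the higher radix-16 and radix-64 algorithms. Comparison with several existing pipelined processors shows that the proposed architectures reach the highest hardware utilization at the lowest hardware cost, and therefore meet the demands of high-performance applications. Finally, building on the radix-4² single-path delay feedback architecture together with segment shift registers and overturn shift registers, a "triple-mode processor" is constructed to support 256-point FFT/IFFT computation and 2-D DCT computation. Comparison with several existing pipelined processors again shows that the proposed architecture reaches the highest hardware utilization at the lowest hardware cost. All six processors in this thesis were implemented and verified in a TSMC 0.13 µm CMOS process. The implementation results and rigorous comparisons show that the proposed RDFT, R28MDF/R28MDC, R42SDF/R43SDF, and triple-mode processors achieve high processing performance for DTMF detection, MIMO-OFDM wireless LANs, long-length FFT computation, and next-generation mobile multimedia applications.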
In this thesis, the design and implementation of effective pipeline processors for the Fourier transform are presented. Four real-time applications are considered: a dual-tone multi-frequency (DTMF) detector for high-channel-density voice over packet (VoP); multiple-input multiple-output orthogonal frequency-division multiplexing (MIMO-OFDM) wireless LAN (WLAN) systems; long-length FFT/IFFT computations in the digital video broadcasting-handheld (DVB-T) standard; and FFT/IFFT/2-D DCT computations in next-generation mobile multimedia applications. For these four applications, six hardware-oriented designs are proposed and evaluated in terms of throughput, computation latency, computational complexity, hardware cost, and hardware utilization. For DTMF detection, a low-computation-cycle, power-efficient recursive DFT/IDFT (RDFT) processor is proposed that combines input strength reduction, the Chebyshev polynomial, and register splitting. With this architecture, the throughput rate and channel density of the DTMF detector in high-channel-density VoP applications can be doubled without increasing the operating frequency. Two FFT/IFFT processors are proposed for MIMO-OFDM WLAN systems: one based on radix-2/8 multiple-path delay feedback (R28MDF) for 2×2 systems, and one based on radix-2/8 multiple-path delay commutator (R28MDC) for 4×4 systems. By applying the retrenched 8-point FFT (R8-FFT) unit together with the proposed multiplication-after-write (MAW) method, the R28MDF and R28MDC architectures achieve 100% butterfly utilization and an adequate throughput rate with few hardware resources for the 2×2 and 4×4 MIMO-OFDM applications, respectively.
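As a point of reference for the recursive DFT approach to DTMF detection, the following is a minimal sketch of the textbook Goertzel recursion, not the RDFT architecture proposed in the thesis. Its update s[n] = x[n] + 2cos(w)·s[n-1] - s[n-2] is the same three-term recurrence that generates Chebyshev polynomials in cos(w). The function names, the 8000 Hz sampling rate, and the 205-sample block length are assumptions chosen for illustration.

```python
import math

# Standard DTMF row and column frequencies in Hz, with the keypad layout.
DTMF_LOW = (697, 770, 852, 941)
DTMF_HIGH = (1209, 1336, 1477, 1633)
KEYS = ("123A", "456B", "789C", "*0#D")


def goertzel_power(samples, freq, fs):
    """Squared spectral magnitude of `samples` at `freq` via a 2nd-order recursion."""
    w = 2.0 * math.pi * freq / fs
    coeff = 2.0 * math.cos(w)  # the only multiplier needed inside the loop
    s1 = s2 = 0.0
    for x in samples:
        s0 = x + coeff * s1 - s2  # Chebyshev-type three-term recurrence
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


def detect_key(samples, fs=8000):
    """Pick the strongest row tone and column tone and map them to a key."""
    low = max(DTMF_LOW, key=lambda f: goertzel_power(samples, f, fs))
    high = max(DTMF_HIGH, key=lambda f: goertzel_power(samples, f, fs))
    return KEYS[DTMF_LOW.index(low)][DTMF_HIGH.index(high)]


def make_tone(f_low, f_high, n=205, fs=8000):
    """Hypothetical test signal: the sum of two unit-amplitude sinusoids."""
    return [
        math.sin(2 * math.pi * f_low * i / fs) + math.sin(2 * math.pi * f_high * i / fs)
        for i in range(n)
    ]
```

Calling `detect_key(make_tone(770, 1336))` evaluates eight frequencies with one real multiplication per sample each, which is why a recursive formulation suits a detector that only needs a handful of bins rather than a full FFT. A practical detector would also apply energy thresholds and twist checks, which this sketch omits.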
For long-length FFT/IFFT computations, two novel designs, radix-4² single-path delay feedback (R42SDF) and radix-4³ single-path delay feedback (R43SDF), combine the low computational complexity of the radix-16 and radix-64 algorithms with the low hardware requirement of the radix-4 algorithm. They achieve the smallest hardware cost and the highest hardware utilization among the tested architectures, and thus the highest efficiency. Based on the R42SDF architecture with segment shift register (SSR) and overturn shift register (OSR) structures, the proposed triple-mode processor supports both 256-point FFT/IFFT and 8×8 2-D DCT computations. It also has the smallest hardware requirement and the largest hardware utilization among the tested architectures for FFT/IFFT computation, and thus the highest cost efficiency. All six processors were implemented in a TSMC 0.13 µm CMOS process. The comprehensive comparisons and implementation results show that the proposed RDFT, R28MDF/R28MDC, R42SDF/R43SDF, and triple-mode designs are highly effective for DTMF, MIMO-OFDM WLAN, DVB-T, and next-generation mobile applications.
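To make the radix-4 building block concrete, here is a minimal software sketch of a plain recursive radix-4 decimation-in-time FFT, checked against a direct DFT. It is a generic textbook algorithm, not the R42SDF or R43SDF pipeline, and it does not reproduce the twiddle-factor rearrangement that gives radix-4² and radix-4³ their radix-16 and radix-64 complexity. The function names are assumptions for illustration.

```python
import cmath


def dft(x):
    """Direct O(N^2) DFT, used only as a reference."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def fft_radix4(x):
    """Recursive radix-4 decimation-in-time FFT; len(x) must be a power of 4."""
    n = len(x)
    if n == 1:
        return list(x)
    if n % 4:
        raise ValueError("length must be a power of 4")
    q = n // 4
    # Four interleaved sub-sequences, each transformed at a quarter of the length.
    subs = [fft_radix4(x[r::4]) for r in range(4)]
    out = [0j] * n
    for k in range(q):
        # Apply the twiddle factor W_N^(r*k) to each sub-transform output.
        t = [subs[r][k] * cmath.exp(-2j * cmath.pi * r * k / n) for r in range(4)]
        # Radix-4 butterfly: a 4-point DFT whose factors are only 1, -j, -1, j.
        for m in range(4):
            out[k + m * q] = sum(t[r] * (-1j) ** (r * m) for r in range(4))
    return out
```

A 16-point transform passes through two radix-4 levels and a 256-point transform through four, which is the stage structure that a pipelined single-path delay feedback design would map onto hardware, one butterfly per stage.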
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT009212801
http://hdl.handle.net/11536/69346
Appears in Collections: Thesis


Files in This Item:

  1. 280101.pdf

If the file is a zip archive, download and extract it, then open index.html in the extracted folder with a browser to view the full text.