Title: | 高速渦輪碼晶片之設計與實作 Design and Implementation for high-throughput Turbo Decoder Chip |
Authors: | Tang, Cheng-Hao (唐正浩); Chang, Hsie-Chia (張錫嘉); Institute of Electronics |
Keywords: | High speed; High throughput; Turbo codes |
Issue Date: | 2006 |
Abstract: | This thesis presents two high-throughput decoder designs for turbo codes. The first is a Max-Log-MAP decoder for the soft-input soft-output (SISO) trellis decoding in a turbo code. High throughput is achieved with a two-dimensional add-compare-select (ACS) unit on the radix-4x4 trellis structure, which offsets the hardware complexity that high-radix designs normally incur and yields a highly parallel, area-efficient decoder. Retiming the data path further reduces the critical path delay of the ACS operation. Implemented as a 0.13 μm 1P8M CMOS chip, the decoder occupies 1.96 mm² and contains 220K gates. Timing estimated under a 1.08 V supply at the worst-case corner shows that the test chip can reach a maximum throughput of 952 MS/s.
Because a conventional turbo decoder is composed of two SISO maximum a posteriori (MAP) component decoders, the second design is a high-throughput turbo decoder. It introduces an inter-block permutation interleaver, which exchanges information among different data blocks to improve error-correcting capability and overcomes the long decoding latency caused by the conventional block interleaver. Instead of a complex control mechanism for the interleaver, a butterfly network is proposed whose hardware structure directly implements the inter-block permutation. Based on this interleaver, 32 Max-Log-MAP decoders and a short block length are used to increase decoding throughput considerably without performance degradation. Each component decoder is built from retimed radix-2x2 ACS units to keep the hardware cost modest. Implemented as a 0.13 μm 1P8M CMOS chip, the turbo decoder achieves a 1.06 Gb/s throughput with 8 decoding iterations in a 17.81 mm² silicon area containing 2.67 gates. |
URI: | http://140.113.39.130/cdrfb3/record/nctu/#GT009311668 http://hdl.handle.net/11536/78140 |
Appears in Collections: | Thesis |
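The abstract attributes the throughput gain to high-radix add-compare-select (ACS) units in the Max-Log-MAP decoders. The following is a minimal sketch of the general idea only, not the thesis architecture: in the max-log domain, two consecutive trellis stages can be merged into one wider stage, so a single ACS step advances the state metrics by two trellis steps. The 8-state trellis, the successor rule in `random_stage`, the metric range, and all function names are assumptions made for illustration; the two-dimensional radix-4x4 unit itself is not described in this record.

```python
import random

NEG_INF = float("-inf")  # marks a transition that does not exist in the trellis


def acs_step(alpha, gamma):
    """One add-compare-select step: alpha'[j] = max_i(alpha[i] + gamma[i][j])."""
    n = len(alpha)
    return [max(alpha[i] + gamma[i][j] for i in range(n)) for j in range(n)]


def merge_branches(gamma_a, gamma_b):
    """Collapse two consecutive stages into one (a max-plus matrix product)."""
    n = len(gamma_a)
    return [
        [max(gamma_a[i][k] + gamma_b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def random_stage(n, rng):
    """Hypothetical branch metrics: state i has successors (2i) % n and (2i + 1) % n."""
    gamma = [[NEG_INF] * n for _ in range(n)]
    for i in range(n):
        for bit in (0, 1):
            gamma[i][(2 * i + bit) % n] = float(rng.randint(-8, 7))
    return gamma


rng = random.Random(0)
n_states = 8  # assumed trellis size, chosen only for the example
alpha0 = [0.0] + [NEG_INF] * (n_states - 1)  # start in state 0
g1 = random_stage(n_states, rng)
g2 = random_stage(n_states, rng)

# Two radix-2 steps versus one step over the merged (higher-radix) stage.
two_small_steps = acs_step(acs_step(alpha0, g1), g2)
one_merged_step = acs_step(alpha0, merge_branches(g1, g2))
assert two_small_steps == one_merged_step
```

The merged stage has more branches entering each state, so each ACS unit compares more candidates. That added comparison work is the hardware cost that a high-radix design has to keep under control.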