Title: Turbo Codes with Reciprocal Dual Trellis: Algorithm and Implementation
Author: Lin, Chen-Yang (林振揚)
Chang, Hsie-Chia (張錫嘉)
Department of Electronics Engineering, Institute of Electronics
Keywords: Turbo Codes; VLSI; high speed; low power
Issue Date: 2015
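Abstract: This dissertation investigates design methods and hardware implementation for high-code-rate turbo decoders. Turbo decoders that support high-code-rate codewords usually suffer from reduced decoding speed and large hardware complexity. However, for a convolutional code of rate k/(k+1), the corresponding reciprocal dual code has rate 1/(k+1), and when k is greater than 1 the reciprocal dual trellis has lower hardware complexity than the conventional decoding trellis. Nevertheless, no turbo decoder chip based on this trellis has yet been designed. This dissertation therefore proposes turbo decoder architectures based on the reciprocal dual trellis that achieve high speed and low complexity. Besides the architectures, we also propose a posteriori probability (APP) equations suited to hardware implementation. To make the decoder applicable to a wider range of operating code rates, a reconfigurable reciprocal dual trellis generation method for periodically punctured convolutional codes is also proposed. Based on the reciprocal dual trellis algorithm, we first design a turbo decoder of code rate k/(k+2) using the radix-2 reciprocal dual trellis. This initial architecture uses k parallel APP computation units, so the decoding speed increases as the operating code rate rises. Next, the dissertation proposes two parallel computation architectures based on the reciprocal dual trellis to further raise decoding speed. First, a quadratic permutation polynomial (QPP) interleaver is used to design an architecture with multiple parallel soft-in soft-out (SISO) decoders. Second, we merge two radix-2 reciprocal dual trellis sections into a radix-4 decoding trellis and develop a reciprocal dual trellis architecture that can compute 2k APP values in parallel. Based on the statistical properties of the computations in the radix-4 reciprocal dual trellis, we also develop a low-complexity radix-4 decoding architecture. Although parallel architectures increase decoding speed, they also increase the hardware complexity of the turbo decoder. To address this, we propose a time-multiplexed decoding schedule that halves the number of APP computation units without loss of decoding speed or error-correcting capability, thereby reducing hardware complexity. In addition, we propose a hybrid trellis architecture that retains the reciprocal dual trellis function and also supports the conventional radix-4 trellis, in order to raise decoding speed at low code rates and to support non-periodically punctured codes. Using the methods above, we design three turbo decoders. The first uses the radix-2 reciprocal dual trellis, supports four code rates, and reaches a throughput of 101 Mb/s at code rate 4/5. The second combines a QPP interleaver with two parallel SISO decoders; implemented and measured in a 40 nm process, it reaches a maximum throughput of 535 Mb/s with a block length of 4096 bits and 6 decoding iterations. Finally, we use the time-multiplexed decoding schedule with the radix-4 reciprocal dual trellis to raise decoding speed; because half of the APP computation units are saved, the area of the original SISO decoder is reduced by 15%, so that this turbo decoder, using a single SISO decoder, reaches 425 Mb/s with 600k logic gates and 152 kb of random access memory. The implementation results show that the proposed turbo decoders are suitable for communication systems requiring high code rates and high speed.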
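The complexity argument for k > 1 can be illustrated with a simple branch count. The sketch below is not taken from the dissertation: it assumes that both trellises have 2^memory states, that the conventional rate-k/(k+1) trellis has 2^k branches leaving each state, and that the rate-1/(k+1) reciprocal dual trellis has 2. The names `branches_per_section` and `compare_trellises`, and the parameter `memory`, are hypothetical.

```python
def branches_per_section(memory: int, inputs_per_section: int) -> int:
    """Branches in one trellis section: 2^memory states, each with
    2^inputs_per_section outgoing branches (assumed model)."""
    return (2 ** memory) * (2 ** inputs_per_section)


def compare_trellises(k: int, memory: int) -> tuple[int, int]:
    """Return (conventional, reciprocal_dual) branch counts per section.

    Assumption: the conventional rate-k/(k+1) trellis takes k input bits
    per section, while the rate-1/(k+1) reciprocal dual trellis takes 1.
    """
    conventional = branches_per_section(memory, k)
    reciprocal_dual = branches_per_section(memory, 1)
    return conventional, reciprocal_dual


if __name__ == "__main__":
    for k in (1, 2, 4, 8):
        conv, dual = compare_trellises(k, memory=3)
        print(f"k={k}: conventional={conv}, reciprocal dual={dual}")
```

Under these assumptions the two counts coincide at k = 1 and the conventional count doubles with each increment of k, which is consistent with the stated advantage of the reciprocal dual trellis for k > 1.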
This dissertation investigates turbo decoders for high-code-rate applications, from algorithms to architectures and VLSI implementations. Traditional turbo decoders handling high-code-rate schemes usually suffer from reduced decoding speed and increased hardware complexity. For a convolutional code of rate k/(k+1), the corresponding reciprocal dual code of rate 1/(k+1) has a smaller code space when k>1, which leads to a simplified trellis for the high-rate code. In this dissertation, several architectures based on the reciprocal dual trellis are proposed in pursuit of high-speed, low-complexity turbo decoders. In addition to the architectures, we develop a trellis generation algorithm that allows decoders to configure the reciprocal dual trellis for periodically punctured convolutional codes. Moreover, hardware-friendly extrinsic equations are derived to lower circuit complexity. The radix-2 reciprocal dual trellis is applied to a turbo decoder for turbo code rate k/(k+2). In this architecture, k parallel extrinsic units operate within one trellis section, so the decoding speed increases as the operating code rate rises. Furthermore, we present two architectures to enhance the decoding speed. First, an architecture is proposed that exploits the contention-free property of the quadratic permutation polynomial (QPP) interleaver together with parallel SISO decoders using the reciprocal dual trellis. Second, a radix-4 trellis structure obtained by merging two consecutive radix-2 reciprocal dual trellis sections is developed, in which 2k extrinsic units can be employed and activated simultaneously. We further develop a simplified trellis structure based on the statistical properties of the computation in the radix-4 reciprocal dual trellis. However, employing parallel processors results in considerable hardware complexity within the turbo decoder. To mitigate this, a time-multiplexing decoding schedule is proposed. The method halves the number of extrinsic units in the SISO decoder without any degradation of throughput or error-correcting ability. In addition, a hybrid trellis architecture consisting of the reciprocal dual trellis and the conventional radix-4 trellis is proposed to decode non-periodically punctured turbo codes and to enhance throughput at low code rates. Using the proposed methods, three multiple-code-rate turbo decoders are designed with the reciprocal dual trellis. The first turbo decoder, which supports four code rates with a single radix-2 SISO decoder, reaches a maximum throughput of 101 Mb/s at rate 4/5. The second turbo decoder is designed with the QPP interleaver and two parallel SISO decoders. Implemented in a 40 nm process, it reaches a throughput of 535 Mb/s when decoding a block size of 4096 with 6 iterations. The third design applies the radix-4 reciprocal dual trellis and exploits the time-multiplexing decoding schedule, leading to a 15% hardware reduction of the SISO decoder. This turbo decoder, employing one SISO processor, achieves 425 Mb/s with 600 k-gates and 152 kb of SRAM. The implementation results show that the proposed turbo decoders with the reciprocal dual trellis are suitable for communication systems requiring high code rates.
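The contention-free property of the QPP interleaver mentioned in the abstract can be checked numerically. The following is a minimal sketch, not the dissertation's implementation: it uses the standard QPP form pi(i) = (f1*i + f2*i^2) mod K, and the coefficients f1 = 31 and f2 = 64 are illustrative assumptions chosen only because, for K a power of two, an odd f1 and an even f2 yield a valid permutation. The function names are hypothetical.

```python
def qpp_interleave(K: int, f1: int, f2: int) -> list[int]:
    """Interleaved address for each index i: (f1*i + f2*i^2) mod K."""
    return [(f1 * i + f2 * i * i) % K for i in range(K)]


def is_contention_free(pi: list[int], num_siso: int) -> bool:
    """Check that parallel SISO decoders never hit the same memory bank.

    The block is split into num_siso windows of W = K / num_siso symbols.
    At step j, decoder t reads address pi[j + t*W]; the access is
    contention-free if all of these addresses fall in distinct banks,
    where the bank of an address is address // W.
    """
    K = len(pi)
    if K % num_siso != 0:
        raise ValueError("block size must be divisible by num_siso")
    W = K // num_siso
    for j in range(W):
        banks = {pi[j + t * W] // W for t in range(num_siso)}
        if len(banks) != num_siso:
            return False
    return True


# Assumed parameters: block size 4096 as in the abstract; f1, f2 illustrative.
K, F1, F2 = 4096, 31, 64
pi = qpp_interleave(K, F1, F2)

if __name__ == "__main__":
    print("valid permutation:", sorted(pi) == list(range(K)))
    print("contention-free for 2 SISO decoders:", is_contention_free(pi, 2))
```

With two windows, as in the second decoder described above, each SISO decoder can read its interleaved address from a separate memory bank in the same cycle, so no arbitration logic is needed between the banks.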
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT079811833
http://hdl.handle.net/11536/125990
Appears in Collections:Thesis