# National Chiao Tung University

# Department of Electronics Engineering & Institute of Electronics

Doctoral Dissertation

#### **Study on Low-Complexity OFDM-Based Wireless Baseband Transceiver**

#### (以正交頻分多工技術為基礎的低複雜度無線基頻收發器之研究)

Student: Hsuan-Yu Liu (劉軒宇)

Advisor: Dr. Chen-Yi Lee (李鎮宜)

A Dissertation Submitted to the Department of Electronics Engineering & Institute of Electronics, College of Electrical and Computer Engineering, National Chiao Tung University, in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy in Electronics Engineering

July 2006

Hsinchu, Taiwan, Republic of China

#### Abstract (translated from the Chinese)

In this dissertation we propose an OFDM-based wireless baseband transceiver that combines a low-complexity synchronizer with a low-complexity channel equalizer. In existing 54Mb/s wireless local area network (WLAN) designs, the OFDM baseband processor consumes considerable power, above about 200mW, which is more than 35% of the physical layer (PHY) design power. When the system advances to 480Mb/s ultra-wide band (UWB), power consumption rises further with speed, so low-power design becomes a key technique for high-speed baseband transceivers. In existing OFDM transceivers, the synchronizer and channel equalizer need substantial hardware complexity to obtain the signal offsets between transmitter and receiver and the wireless channel fading, and together they account for roughly 75% or more of the gate count and power of the OFDM design. Designing a low-complexity synchronizer and channel equalizer is therefore the research focus of our low-power OFDM transceiver.

First, for the 54Mb/s WLAN application, we propose a synchronizer comprising an auto-correlator (AC) that uses only high-power signal samples and a matched filter (MF) that uses only high-power coefficients. Unlike earlier synchronizers, the proposed design uses only the high-power part of the original signal or coefficients to reduce the amount of signal and computation, so the number of multiplications and the register size in the synchronizer can be reduced. The proposed design removes 16.3% of the multiplications of the whole baseband transceiver in each packet (equivalent to 3160 multiplications). In the channel equalizer, to reduce the additional SNR needed to reach 10% packet error rate (PER) under multipath fading, and to combat the time-variant channel caused by the Doppler effect at low hardware complexity, we adopt frequency-domain minimum mean-square-error channel equalization (FD-MMSE EQ) and propose decision-directed channel tracking (DDCT). The adopted equalization effectively combats multipath channels and reduces the performance loss of the widely used direct-division equalizer. The proposed DDCT uses 2 complex multipliers to track channel variation. It reduces the mean-square-error of channel estimation by 5dB~15dB in indoor multipath channels with Doppler effect.

For the 480Mb/s UWB application, we propose a synchronizer comprising a sub-sampling-based AC and an MF that needs no moving average. The proposed design not only reduces the number of multiplications but also reduces the hardware complexity of the parallel architecture required by UWB. Compared with a general synchronizer that uses 4-fold parallelism at a 132MHz clock to reach a 528Msamples/s processing rate, the proposed design needs only 1-fold parallelism and 1/4 of the computation at the same 132MHz clock. The synchronizer therefore needs only 37.6% of the gate count and 43.3% of the power of the 4-parallel design. In the channel equalizer, to eliminate the power-hungry complex multipliers and dividers, we propose a channel equalizer without complex multipliers or dividers, in which adders and subtractors fully replace them. Compared with a general equalizer, the proposed design needs only 48.6% of the gate count and 40.4% of the power, and adds only 0.3dB of SNR loss at the standard 8% PER.

Finally, based on the proposed low-complexity designs, baseband transceivers for OFDM-based WLAN, LDPC-COFDM-based UWB, and multi-band OFDM-based UWB have been designed and tested in 0.18μm and 0.13μm CMOS processes. They reach 6~480Mb/s data rates, achieve SNR performance 6.45~9.7dB better than the system requirements in the AWGN channel, and also meet the system performance requirements in 11~121-tap multipath channels. The OFDM transceiver for 54Mb/s WLAN consumes 68mW. The OFDM transceiver for 480Mb/s UWB consumes 162mW in 0.18μm CMOS and 31.2mW in 0.13μm CMOS. In the UWB design, the proposed low-complexity synchronizer and channel equalizer reduce the gate count of the whole OFDM transceiver by 45.3% and its power by 65.1%.

#### **Abstract**

In this thesis we propose Orthogonal Frequency Division Multiplexing (OFDM)-based baseband transceivers comprising low-complexity synchronizer and channel equalizer schemes. In the 54Mb/s Wireless Local Area Network (WLAN), existing OFDM baseband chips consume more than 200mW, which is more than 35% of the power of the existing Physical Layer (PHY) system. When the system migrates to 480Mb/s Ultra-Wide Band (UWB), the baseband power grows further with the higher circuit speed. Hence low-power design becomes the key technique for a high-speed baseband transceiver. In an OFDM transceiver, the synchronizer and channel equalizer require high hardware complexity to acquire the signal offsets between transmitter and receiver and to compensate the wireless channel fading, and therefore occupy roughly 75% or more of the gate count and power of the OFDM design. Hence our research focuses on low-complexity synchronizer and channel equalizer schemes.

In this thesis, we first propose a synchronizer comprising a high-power-signal-used auto-correlator (AC) and a high-power-coefficient-used matched filter (MF) for the OFDM-based WLAN system.
Unlike existing synchronization algorithms, the proposed synchronizer uses only the high-power part of the signal and coefficients, which reduces the amount of signal used and the computation. Hence the number of multiplications and the register size can be efficiently reduced. This removes 16.3% of the complex multiplications (3160 multiplications) of the WLAN OFDM baseband transceiver. To reduce the additional SNR required for 10% PER in multipath channels, and to handle the time-variant channel caused by the Doppler effect at low cost, we employ frequency-domain minimum mean-square-error channel equalization (FD-MMSE EQ) and propose decision-directed channel tracking (DDCT). The employed FD-MMSE EQ efficiently reduces the SNR loss caused by the conventional direct-division equalization scheme. The proposed DDCT, comprising only 2 complex multipliers, tracks the channel variation. In the indoor multipath channel with Doppler effect, the proposed DDCT reduces the mean-square-error of channel estimation by 5dB~15dB.

For 480Mb/s OFDM-based UWB, we propose a synchronizer comprising a sub-sampling-based AC and a moving-average-free MF. The proposed synchronizer needs not only fewer multiplications but also lower hardware cost. A general synchronizer needs 4-parallelism to achieve a 528Msamples/s throughput rate at a 132MHz clock rate. The proposed synchronizer needs only 1-parallelism and 1/4 of the computations, and therefore only 37.6% of the gate count and 43.3% of the power of the general synchronizer. We then propose a divider-and-multiplier-free channel equalizer in which the original complex divider and multipliers are completely replaced by adders and subtractors. It needs only 48.6% of the gate count and 40.4% of the power of a general OFDM channel equalizer, and the added SNR loss at the typical 8% packet error rate is only 0.3dB.
Baseband transceivers comprising the proposed low-complexity designs are implemented for OFDM-based WLAN, low-density-parity-check (LDPC)-COFDM-based UWB, and multi-band (MB)-OFDM-based UWB systems in 0.18μm and 0.13μm CMOS processes. They achieve 6~480Mb/s data rates and SNR 6.45~9.7dB better than the system performance requirements. The proposed OFDM transceiver for 54Mb/s WLAN consumes 68mW in 0.18μm CMOS. The proposed OFDM transceivers for 480Mb/s UWB consume 162mW in 0.18μm CMOS and 31.2mW in 0.13μm CMOS. The proposed low-complexity synchronizer and equalizer reduce the gate count of the UWB OFDM transceiver by 45.3% and its power by 65.1%.

#### Acknowledgements

The nine years I spent in the Si2 Lab, from my junior-year undergraduate project to my doctoral graduation, are the best memories of my student life. I am deeply grateful to my advisor, Dr. Chen-Yi Lee. He created an excellent research environment that gave us the chance to think about research directions at the system level and from every angle. His graceful way of dealing with people has always been a model for us students to follow and learn from. He cares about his students' lives and constantly gave us advice and encouragement on research and on life, so that we could grow steadily in a warm and inspiring environment. I thank the senior members and classmates of the lab: 許騰尹, 許騰仁, 鍾經哲, 張錫嘉, 方信雄, 李有山, 謝百舉, 彭文孝, 陳寶龍, 王中正, 林淑慈, 蔡尚峰, 劉益全, 曾順得, 載元邦, 張偉信, 洪健仁, 陳黎峰, 林昱偉, and 林建青. They often offered helpful advice when I lost my direction and constantly cared about my life, giving me the moral support to keep striving. I thank the junior students who did research with me: 陳俊吉, 魏弘國, 俞壹馨, 游瑞元, 張瑋哲, 陳林宏, 廖婉君, and 楊美慧. With their help, my research results could be this fruitful.

During my doctoral years I witnessed the lab's research and development grow and mature, as everyone gradually built key modules into complete systems. The training of the doctoral program broadened my way of thinking and my vision. I also thank my family for their constant support, which allowed me to complete my doctoral studies. Finally, I thank the oral defense committee members for their guidance and valuable comments.

# **CONTENTS**

- Chapter 1: Introduction
- 1-1 Thesis Motivation
- 1-1-1 Motivation of Low-Complexity Synchronizer for WLAN System
- 1-1-2 Motivation of Low-Complexity Channel Equalizer for WLAN System
- 1-1-3 Motivation of Low-Complexity Synchronizer for UWB System
- 1-1-4 Motivation of Low-Complexity Channel Equalizer for UWB System
- 1-2 Thesis Outline
- Chapter 2: OFDM Systems and Channel Model
- 2-1 OFDM-Based WLAN System
- 2-2 MB-OFDM-Based UWB System
- 2-3 LDPC-COFDM-Based UWB System
- 2-4 Channel Models of WLAN and UWB systems
- 2-4-1 Multipath Channel and
  Doppler Effect of WLAN and UWB systems
- 2-4-2 Baseband to Passband Conversion and RF Filtering
- 2-4-3 Carrier Frequency Offset and Carrier Phase Noise
- 2-4-4 Sampling Clock Offset
- 2-4-5 AWGN Channel
- Chapter 3: Low-Complexity Design for OFDM-Based WLAN System
- 3-1 Low-Complexity Synchronization for OFDM-Based WLAN System
- 3-1-1 Synchronization Block Diagram for OFDM-Based WLAN System
- 3-1-2 General Auto-Correlation-Based PD
- 3-1-3 Proposed High-Power-Signal-Used Auto-Correlation for PD
- 3-1-4 General Auto-Correlation-Based CFO Estimation
- 3-1-5 Proposed High-Power-Signal-Used Auto-Correlation-Based CFO Estimation
- 3-1-6 General Matched-Filter-Based FWD
- 3-1-7 Proposed High-Power-Coefficient-Used Matched-Filter for FWD
- 3-2 Low-Complexity Channel Estimation for WLAN System
- 3-2-1 Channel Equalization with Phase Error Tracking
- 3-2-2 Employed MMSE Channel Equalization for OFDM
- 3-2-3 Proposed Decision-Directed Channel Tracking
- 3-2-4 Proposed Weighted-Average Phase Error Tracking with Pilot Pre-compensation
- 3-3 Performance Analysis of Low-Complexity Design for OFDM-Based WLAN System
- 3-3-1 Performance of the Proposed Low-Complexity Auto-Correlation
- 3-3-2 Performance of the Proposed Low-Complexity Matched Filter
- 3-3-3 Performance of the Proposed DDCT and MMSE EQ
- 3-3-4 Performance of the Proposed WAPET
- 3-4 Floating-Point PER for OFDM-Based WLAN System
- Chapter 4: Low-Complexity Design for OFDM-Based UWB System
- 4-1 Low-Complexity Synchronization for OFDM-Based UWB System
- 4-1-1 Synchronization Block Diagram of OFDM-Based UWB System
- 4-1-2 The Proposed Data-Partition-Based Auto-Correlation
- 4-1-3 Data-Partition-Based and Moving-Average-Free Matched Filter
- 4-1-4 The Proposed Dynamic-Threshold Design
- 4-2 Low-Complexity Channel Equalization for UWB System
- 4-2-1 Basic Divider-Based Channel Equalization with WAPET
- 4-2-2 The Proposed Divider-and-Multiplier-Free Channel Equalizer
- 4-3 Performance Analysis of Low-Complexity Designs for OFDM-Based UWB System
- 4-3-1 Performance of the Proposed Sub-sampling-Based Auto-Correlation and Matched Filter
- 4-3-2 FER and PER Analysis of the Proposed Dynamic-threshold Design for UWB
- 4-3-3 CE MSE and PER Analysis of the Proposed Divider-free Channel Equalization
- 4-4 System Performance of LDPC-COFDM-Based UWB System and MB-OFDM-Based UWB System
- 4-4-1 PER of LDPC-COFDM-Based UWB System
- 4-4-2 PER of MB-OFDM-Based UWB System
- Chapter 5: Hardware Architecture and Baseband Chip Design
- 5-1 Hardware Design for OFDM-Based WLAN System
- 5-1-1 Fixed-point Performance of WLAN System
- 5-1-2 Hardware Architecture of the Proposed Designs for WLAN System
- 5-1-3 Complexity Analysis of the Proposed Designs for WLAN System
- 5-1-4 Proposed Baseband Chip for OFDM-Based WLAN System
- 5-2 Hardware Design for OFDM-Based UWB System
- 5-2-1 Fixed-point Performance of OFDM-Based UWB Systems
- 5-2-2 Hardware Architecture of the Proposed Designs for OFDM-Based UWB Systems
- 5-2-3 Complexity and Power Analysis of the Proposed Designs for OFDM-Based UWB Systems
- 5-2-4 Proposed Baseband Chip for LDPC-COFDM-Based UWB System
- 5-2-5 Proposed Baseband Chip for MB-OFDM-Based UWB System
- Chapter 6: Conclusions and Future Work
- References
- Appendix A: Supplementary of OFDM-Based System SPEC
- A-1: System Parameter Derivation
- A-2: Power Spectrum Density Requirement
- A-3: RF Band Allocation, Spreading Scheme, and Overcoming Jamming Technique of MB-OFDM System
- A-4: Conversion Scheme from Zero-Pad to Cyclic-Prefix in OFDM-Based UWB Systems

# **LIST OF FIGURES**

- Figure 1-1: Power of 54Mb/s WLAN designs
- Figure 1-2: Percentages of hardware complexity of OFDM transceiver for WLAN system
- Figure 1-3: FER and PER of existing approaches for OFDM-based WLAN system
- Figure 1-4: Computation of the general auto-correlation and matched filter
- Figure 1-5: Time-variant channel frequency response for WLAN system
- Figure 1-6: FER and PER of LDPC-COFDM-based UWB system
- Figure 1-7: FER and PER of LDPC-COFDM-based UWB system
- Figure 2-1: Packet format of IEEE 802.11a system
- Figure 2-2: Data OFDM symbol format of IEEE 802.11a system
- Figure 2-3: Block diagram of IEEE 802.11a WLAN system
- Figure 2-4: Block diagram of IEEE 802.11a baseband system
- Figure 2-5: Block diagram of baseband and RF for MB-OFDM-based UWB system
- Figure 2-6: Packet format of MB-OFDM-based UWB system
- Figure 2-7: OFDM symbol format of MB-OFDM-based UWB system
- Figure 2-8: Block diagram of MB-OFDM baseband system
- Figure 2-9: Packet format of LDPC-COFDM-based UWB system
- Figure 2-10: Block diagram of MB-OFDM baseband system
- Figure 2-11: Block diagram of the channel model for single-band wireless system
- Figure 2-12: Block diagram of the channel model for multipath-band wireless system
- Figure 2-13: Examples of impulse response of IEEE multipath channel
- Figure 2-14: Examples of frequency response of IEEE multipath channel
- Figure 2-15: Examples of CM channel impulse response
- Figure 2-16: Examples of CM channel frequency response
- Figure 2-17: A simple example to up-convert the signal from baseband to passband
- Figure 2-18: Received baseband signal power with RF filtering in MB-OFDM system
- Figure 2-19: A LPF of UWB RF receiver
- Figure 2-20: Block diagram with CFO
- Figure 2-21: Signal distortion with CFO of 20ppm and 0.4ppm of 5GHz
- Figure 2-22: Phase noise spectrum of CFO model
- Figure 2-23: An example of oversampled signal with clock offset
- Figure 2-24: Frequency-domain signal distortion caused by 20ppm clock frequency offset
- Figure 2-25: SER with the proposed AWGN generation method
- Figure 3-1: Block diagram of baseband synchronization for OFDM-based WLAN system
- Figure 3-2: Preamble format of IEEE 802.11a system
- Figure 3-3: Example of the auto-correlation power of the packet detection
- Figure 3-4: Short symbol power and noise power with 3dB SNR
- Figure 3-5: Auto-correlation power of high-power and low-power signal
- Figure 3-6: An example of phase error after CFO estimation in WLAN system
- Figure 3-7: Phase error of CFO estimation with all signal, high-power signal, and low-power signal
- Figure 3-8: An example of matched filter power
- Figure 3-9: The long symbol power and AWGN power in average 3dB SNR
- Figure 3-10: Example of the matched filter power of high-power and low-power coefficients
- Figure 3-11: Example of estimated CFR and true CFR with 50ns RMS and 10dB SNR
- Figure 3-12: Example of detected phase error in OFDM-Based WLAN system
- Figure 3-13: Data MSE of perfect EQ schemes
- Figure 3-14: Example of time-variant CFR with 50Hz Doppler frequency
- Figure 3-15: Block diagrams of DDCT with feedforward and feedback compensation
- Figure 3-16: Incorrect phase detection when the phase error exceeds $\pm \pi$
- Figure 3-17: Weighting Factors of PET designs
- Figure 3-18: Block diagram of the proposed non-linear WAPET
- Figure 3-19: Tracked (a) mean and (b) slope of the phase error caused by 0.1ppm residual CFO and 40ppm SCO during 30 OFDM symbols
- Figure 3-20: Tracked phase error of WAPET with and without pilot pre-compensation
- Figure 3-21: FER of the proposed low-complexity PD for OFDM-based WLAN system
- Figure 3-22: RMSE of the proposed low-complexity CFO estimation for OFDM-based WLAN system
- Figure 3-23: 6Mb/s PER of the proposed low-complexity auto-correlation
- Figure 3-24: 54Mb/s PER of the proposed low-complexity auto-correlation
- Figure 3-24-2: FER and CFO estimation error with $\omega = 8$
- Figure 3-24-3: CFO estimation RMSE and range for OFDM-based WLAN system
- Figure 3-25: CFO estimation RMSE and range for OFDM-based WLAN system
- Figure 3-26: FER of the proposed low-complexity MF for OFDM-based WLAN system
- Figure 3-27: 6Mb/s PER of the proposed low-complexity MF in AWGN channel
- Figure 3-28: 6Mb/s PER of the proposed low-complexity MF in IEEE multipath channel with 50ns RMS delay spread
- Figure 3-29: Mean Square Error of CE and CT schemes with 0Hz Doppler frequency
- Figure 3-30: Mean Square Error of CE and CT schemes with 50Hz Doppler frequency
- Figure 3-31: PER of 6Mb/s in IEEE multipath channel with 0Hz Doppler frequency
- Figure 3-32: PER of 6Mb/s in IEEE multipath channel with 50Hz Doppler frequency
- Figure 3-33: PER of 54Mb/s in IEEE multipath channel with 50Hz Doppler frequency
- Figure 3-34: 6Mb/s PER of the proposed WAPET
- Figure 3-35: 54Mb/s PER of the proposed WAPET
- Figure 3-36: PER of perfect synchronization in AWGN channel
- Figure 3-37: PER of the proposed design in AWGN channel
- Figure 3-38: PER of perfect synchronization in multipath channel
- Figure 3-39: PER of the proposed design in multipath channel
- Figure 3-40: SNR loss for 10% PER of OFDM-based WLAN system
- Figure 4-1: Preamble structure of OFDM-based UWB system
- Figure 4-2: Data flow of synchronization for OFDM-based UWB system
- Figure 4-3: Auto-correlation power used for packet detection in UWB system
- Figure 4-4: Auto-correlation phase used for CFO estimation in UWB system
- Figure 4-5: Received signal power used for band detection
- Figure 4-6: Sum of two continuous auto-correlation results for PTD of UWB system
- Figure 4-7: Normalized auto-correlation power in (a) better channel and (b) worse channel
- Figure 4-8: Matched-filter power in (a) better channel and (b) worse channel
- Figure 4-9: Example of estimated CFR and true CFR with 5ns RMS and 10dB SNR in 528MHz UWB system
- Figure 4-10: Example of detected phase error in 528MHz UWB system
- Figure 4-11: Tracked phase error caused by (a) 1ppm residual CFO and (b) 40ppm SCO during OFDM symbols
- Figure 4-12: Block diagram of the general channel equalizer
- Figure 4-13: Example of (a) constellation and (b) phase probability of the received QPSK symbols
- Figure 4-14: Block diagram of the proposed divider-and-multiplier-free channel equalizer
- Figure 4-15: FER with different ω values of the proposed auto-correlation and matched filter
- Figure 4-16: CFO RMSE with different ω values of the proposed auto-correlation
- Figure 4-17: PER of 120Mb/s with different ω values of the proposed design
- Figure 4-18: PER of 480Mb/s with different ω values of the proposed design
- Figure 4-19: PER of 480Mb/s with different ω values of the proposed design
- Figure 4-20: FER with different threshold of PTD in the multipath channel
- Figure 4-21: PER with different threshold of PTD in 120Mb/s data rate
- Figure 4-22: CE MSE of the proposed channel equalizer
- Figure 4-23: PER of proposed channel equalizer in 480Mb/s for MB-OFDM
- Figure 4-24: PER of LDPC-COFDM-Based UWB system in AWGN channel
- Figure 4-25: PER of LDPC-COFDM-Based UWB system in multipath channel
- Figure 4-26: SNR loss for 8% PER of OFDM-based WLAN system in AWGN channel
- Figure 4-27: PER vs. transmission distances of LDPC-COFDM system
- Figure 4-28: PER of MB-OFDM-Based UWB system in AWGN channel
- Figure 4-29: PER of MB-OFDM-Based UWB system in CM multipath channel
- Figure 4-30 (a): PER vs. transmission distance of 200Mb/s and 480Mb/s
- Figure 4-30 (b): PER vs.
  transmission distance of 110Mb/s
- Figure 5-1: 54Mb/s PER with different ADC/DAC wordlength in AWGN channel
- Figure 5-2: 54Mb/s PER with different ADC/DAC wordlength in multipath channel
- Figure 5-3: Block diagram of the proposed fixed-point WLAN baseband system
- Figure 5-4: PER of fixed-point WLAN design in AWGN channel
- Figure 5-5: PER of fixed-point WLAN design in multipath channel with RMS=50ns
- Figure 5-6: Architecture of the auto-correlators for OFDM-based WLAN system
- Figure 5-7: Design of high-power signal selector and gate
- Figure 5-8: Signal behavior of the auto-correlators
- Figure 5-9: Architecture of the matched filters for OFDM-based WLAN system
- Figure 5-10: Architecture of the proposed channel equalizer
- Figure 5-11: Architecture of the proposed WAPET
- Figure 5-12: System architecture of the proposed IEEE 802.11a baseband processor
- Figure 5-13: Chip microphoto of OFDM-based WLAN baseband processor
- Figure 5-14: Percentages of hardware complexity of OFDM transceiver for WLAN system
- Figure 5-15: 480Mb/s PER with different wordlength setting for LDPC-COFDM system
- Figure 5-16: 480Mb/s PER with different wordlength setting for MB-OFDM system
- Figure 5-17: Block diagram of the fixed-point baseband design for UWB system
- Figure 5-18: PER of fixed-point LDPC-COFDM-based UWB system
- Figure 5-19: PER of fixed-point MB-OFDM-based UWB system
- Figure 5-20: PER vs. transmission distances of fixed-point LDPC-COFDM-based UWB system
- Figure 5-21: PER vs.
  transmission distances of fixed-point MB-OFDM-based UWB system
- Figure 5-22: Architecture of auto-correlators for 528MS/s OFDM-based UWB systems
- Figure 5-23: Architecture of matched filters for 528MS/s OFDM-based UWB systems
- Figure 5-24: Example of signal in the three kinds of the matched filter
- Figure 5-25: Architectures of the channel equalizers for UWB system
- Figure 5-26: Architecture of the logarithm-based arc-tangent design
- Figure 5-27: System architecture of the proposed baseband processor for OFDM-based UWB systems
- Figure 5-28: Chip microphoto of the proposed LDPC-COFDM-based UWB baseband processor
- Figure 5-29: Percentage of gate-count and receiver power of the OFDM transceiver for LDPC-COFDM-based UWB
- Figure 5-30: Stages of clock buffers of OFDM transceiver
- Figure 5-31: Chip microphoto of MB-OFDM OFDM transceiver
- Figure 5-32: Power percentage of MB-OFDM UWB baseband transceiver
- Figure A-1: Power spectrum mask of IEEE 802.11a WLAN system
- Figure A-3: Band location of IEEE 802.15.3a OFDM-based UWB system
- Figure A-4: An example of MB hopping of MB-OFDM-based UWB system
- Figure A-5: An example of jamming happening in UWB bands
- Figure A-6: An example of received OFDM symbols with jamming
- Figure A-7: The transmitted OFDM signal with cyclic prefix
- Figure A-8: The transmitted OFDM signal with zero pad

# **LIST OF TABLES**

- Table 2-1: Data rate parameters for IEEE 802.11a WLAN system
- Table 2-2: OFDM signal parameters for IEEE 802.11a WLAN system
- Table 2-3: Central frequencies of RF
  carriers for IEEE 802.11a WLAN system
- Table 2-4: Required SNR for 10% PER of IEEE 802.11a system
- Table 2-5: CFO and SCO specification of IEEE 802.11a system
- Table 2-6: Power consumption of PHY layer of IEEE 802.15.3a system
- Table 2-7: Data rate parameters of MB-OFDM-based UWB system
- Table 2-8: OFDM signal parameters of MB-OFDM-based UWB system
- Table 2-9: Performance requirement of MB-OFDM-based UWB system
- Table 2-10: Main parameter of LDPC-COFDM-based UWB system
- Table 2-11: Data rate parameter of LDPC-COFDM-based UWB system
- Table 2-12: Required Eb/N0 for 8% PER of MB-OFDM-based UWB system
- Table 2-13: Characteristic of Multipath channel for 54Mb/s WLAN and 480Mb/s UWB systems
- Table 2-14: Specification of carrier phase noise
- Table 3-1: Hardware complexity of CE and EQ designs
- Table 3-2: Summary of SNR loss variation of the proposed design
- Table 3-3: SNR for 10% PER of OFDM-based WLAN system in AWGN channel
- Table 3-4: SNR for 10% PER of OFDM-based WLAN system in time-variant IEEE multipath channel
- Table 4-1: Summary of SNR loss variation of the proposed design
- Table 4-2: SNR for 8% PER of LDPC-COFDM System in AWGN channel
- Table 4-3: SNR for 8% PER of LDPC-COFDM System in multipath channel
- Table 4-4: Transmission distance for 8% PER of LDPC-COFDM System
- Table 4-5: SNR for 8% PER of MB-OFDM-Based UWB system in AWGN channel
- Table 4-6: SNR for 8% PER of MB-OFDM-Based UWB system in CM channels
- Table 4-7: Transmission distance (meters) for 8% PER of MB-OFDM System
- Table 5-1: SNR for 10% PER of 54Mb/s WLAN with different wordlengths
- Table 5-2: Power consumption of DAC and ADC for WLAN system
- Table 5-3: PAPR of the proposed fixed-point design for WLAN system
- Table 5-4: SNR for 10% PER of fixed-point WLAN design in AWGN channel
- Table 5-5: SNR for 10% PER of fixed-point WLAN design with RMS=50ns
- Table 5-6: SNR for 10% PER of fixed-point WLAN processors in AWGN channel
- Table 5-7: Coefficient index of the matched filter for OFDM-based WLAN system
- Table 5-8: Design complexity of the baseband receiver in each packet for OFDM-based WLAN system
- Table 5-9: Hardware cost of CE and EQ designs
- Table 5-10: Chip summary of OFDM-based WLAN baseband processor
- Table 5-11: Hardware complexity of OFDM-based WLAN baseband processor
- Table 5-12: Comparison of baseband power consumption for OFDM-based WLAN system
- Table 5-13: SNR for 8% PER of 480Mb/s UWB with different wordlengths
- Table 5-14: SNR for 8% PER of fixed-point LDPC-COFDM-based UWB system
- Table 5-15: SNR for 8% PER of fixed-point MB-OFDM-based UWB system
- Table 5-16: Transmission distances for 8% PER of fixed-point LDPC-COFDM-based UWB system
- Table 5-17: Transmission distances for 8% PER of fixed-point MB-OFDM-based UWB system
- Table 5-18: PAPR of OFDM-based UWB systems
- Table 5-19: Design complexity of a baseband receiver in each packet for OFDM-based UWB system
- Table 5-20: Hardware complexity of the proposed synchronizer and general 4-parallelism synchronizer
- Table 5-21: Hardware complexity of the proposed divider-and-multiplier-free channel equalizer and the general divider-based channel equalizer with 4-parallelism
- Table 5-21-2: Complexity of the proposed design, magnitude-and-phase-based design, and divider-and-multiplier-based design
- Table 5-22: Chip summary of the proposed LDPC-COFDM-based UWB baseband processor
- Table 5-23: Hardware complexity of the LDPC-COFDM baseband chip
- Table 5-24: Finite state machine of the proposed multi-stage gated-clock control
- Table 5-25: Chip summary of MB-OFDM OFDM transceiver
- Table 6-1: Complexity reduction and SNR loss changing of the proposed design
- Table 6-2: Chip performance and OFDM hardware complexity
- Table A-1: Dominated parameters of OFDM-based WLAN system
- Table A-2: Key parameters of LDPC-COFDM-based UWB system
- Table A-3: Key parameters of MB-OFDM-based UWB system
- Table A-4: Band location of IEEE 802.15.3a OFDM-based UWB system

## Chapter 1:

### Introduction

#### 1-1 Thesis Motivation

Orthogonal frequency division multiplexing (OFDM) is widely applied in high-speed wireless local area networks (WLAN) such as the IEEE 802.11a, Hiperlan/2, and IEEE 802.11g standards. The OFDM technique is also considered for 480Mb/s high-speed wireless transmission in the ultra-wide band (UWB) system [15, 16]. OFDM offers high channel utilization, efficiently achieving a high data rate (480Mb/s for UWB) in a limited bandwidth (only 528MHz for UWB), and it provides robustness against wireless multipath channels (multi-tap Rayleigh fading channels), which enhances system performance (low bit-error-rate and packet-error-rate). However, not only the transmission quality but also the power consumption determines the competitiveness of wireless products. Low-power design reduces how often portable wireless products such as WLAN cards or wireless USB devices must be charged and extends their operating lifetime. With the OFDM design, however, existing baseband processors consume high power, as shown in Figure 1-1.
The baseband (BB) receiver power is 35%~53% of the whole physical layer (PHY), which comprises the baseband design, ADC/DAC, and analog RF transceiver. Hence developing a low-power OFDM transceiver becomes the main concern for a low-power wireless transceiver. When the system migrates from 54Mb/s WLAN to the 480Mb/s UWB system, the low-power concern becomes more important.

Figure 1-1: Power of 54Mb/s WLAN designs

To keep the design power low, the multi-band OFDM (MB-OFDM)-based UWB proposals [15, 16] suggest a power budget for the overall PHY: only 323mW and 236mW in 0.13µm and 90nm CMOS technologies, respectively. In the baseband design, high-parallelism architectures and high clock rates are needed to sustain a throughput of at least 528Msamples/s [1, 2, 5, 30, 41, 45]. With the increased hardware complexity and clock rates, the UWB baseband design power grows rapidly.

In reference [3], the OFDM modules occupy 83% of the gate count and 55% of the power of the WLAN baseband chip, which comprises the OFDM transceiver and FEC codec. In the OFDM transceiver, not only the well-known FFT/IFFT design but also the synchronizer and channel equalizer consume high power. The percentage breakdown of the OFDM receiver is shown in Figure 1-2: the synchronizer (Sync.) and the channel equalizer combined with phase error tracking (PET) together occupy 75% of the gate count and 80% of the OFDM power [3]. In another existing OFDM transceiver, the channel equalizer alone occupies 62.6% of the gate count [43].

Figure 1-2: Percentages of hardware complexity of OFDM transceiver for WLAN system

Hence our research focuses on reducing the power and hardware of the synchronizer and channel equalizer.
However, the synchronizer, which detects the signal timing and carrier frequency offset (CFO), and the channel equalizer, which resolves the multipath fading and compensates the residual errors caused by CFO and sampling clock offset (SCO), dominate the system performance, such as the packet error rate (PER), of WLAN and WPAN systems. Hence the challenge of a low-power synchronizer and channel equalizer is to keep the system performance while efficiently reducing the design power.

#### 1-1-1 Motivation of Low-Complexity Synchronizer for WLAN System

In the WLAN system, the main performance requirement is that the PER be lower than 10% to keep the transceiver quality. The signal-to-noise ratio (SNR) for 10% PER is therefore highlighted as the performance summary [3, 18, 25]. In the baseband system, the synchronizer detects the valid packet and the correct symbol timing. It should keep the frame error rate (FER) low enough that it does not degrade the system PER. In the IEEE 802.11a WLAN system, the minimum SNR values for 10% PER belong to the 6Mb/s data rate mode [18, 25]. Therefore the FER must be less than the PER of the 6Mb/s mode. In the general synchronization scheme [24, 27, 28, 29, 41], the full signal of each FFT symbol in the preamble is used, which results in high performance. The FER of the general synchronization scheme and the PER of existing baseband chips [3, 18, 25] are shown in Figure 1-3. The general synchronization makes the FER much lower than the PER curves. So the general synchronization is actually over-designed, and the performance margin between the SNR for 10% FER and the SNR for 10% PER is extended to 2.6dB by the general synchronization.
Channel Condition: AWGN channel, CFO=40ppm+phase noise, SCO=40ppm

Figure 1-3: FER and PER of existing approaches for OFDM-based WLAN system

To reduce the synchronizer complexity while keeping the system performance, we propose a low-complexity synchronizer with a high-power-signal-used (HPSU) auto-correlator and a high-power-coefficient-used (HPCU) matched filter. The kernel of the synchronizer is the auto-correlator and the matched filter (i.e. cross-correlation), which detect the peak power of the signal correlation to find the correct timing [13, 27, 31]. The main computation of the auto-correlator and matched filter is shown in Figure 1-4.

Figure 1-4: Computation of the general auto-correlation and matched filter

As shown in Figure 1-4, each circuit output requires several complex multiplications. The multiplications and the registers that store the multiplication inputs dominate the synchronizer complexity [28]. In the proposed synchronizer, only part of the signal of each FFT symbol is used, which reduces the multiplications and the register size. To achieve a good trade-off between high system performance and low design complexity, the amount of the partial signal is decided according to system simulation results. The proposed synchronization scheme can eliminate 74% of the multiplications of a general synchronizer, and the additional SNR loss for 10% PER is limited to 0.1dB (54Mb/s) ~ 1.3dB (6Mb/s).

#### 1-1-2 Motivation of Low-Complexity Channel Equalizer for WLAN System

In OFDM systems, accurate channel estimation (CE) with least-square (LS) equalization (EQ) is widely discussed in existing publications for high performance [27, 52]. However, the LS EQ limits the performance of accurate CE, and hence we employ a low-complexity frequency-domain minimum mean-square-error (FD-MMSE) EQ to release the performance bound limited by LS EQ.
Based on a comparison of simulation performance and design complexity, the employed FD-MMSE EQ achieves a larger performance improvement with lower design complexity than existing accurate CE approaches.

In addition, in the WLAN system a Doppler frequency of 50Hz is assumed for 5km/hr human mobility. The Doppler effect therefore causes channel variation. The time-variant channel frequency response (CFR) is shown in Figure 1-5. The channel magnitude varies by 5dB during 1.2ms, which equals the packet length at the 6Mb/s rate. In this case channel tracking (CT) is needed to enhance channel estimation (CE) accuracy and reduce the system PER.

Channel Condition: RMS delay = 50ns, Doppler frequency = 50Hz

Figure 1-5: Time-variant channel frequency response for WLAN system.

In the existing silicon-proven approach, a CT improves the CE mean-square error (MSE) by 2.5dB ~ 3dB with 18 parallel complex multipliers [24]. To achieve higher performance and lower hardware cost, we propose a decision-directed channel tracking (DDCT) scheme. It efficiently reduces the channel error and tracks the channel variation from the error vectors of the de-mapping error. The proposed DDCT can improve the CE MSE by 6dB~27dB with only two complex multipliers. For the system PER performance, it can reduce the SNR for 10% PER by 0.9 ~ 1.5dB. The required gate count is only 60% of the existing WLAN approach [24, 43].

#### 1-1-3 Motivation of Low-Complexity Synchronizer for UWB System

As in the WLAN system, the general synchronizer is also over-designed in the UWB system. Both the large performance margin in FER and the high tolerance to CFO motivate the low-power synchronizer design. The FER of the general synchronization scheme and the PER of our developed low-density parity-check (LDPC)-COFDM system are shown in Figure 1-6. The performance margin between the SNR for 8% FER and the SNR for the typical 8% PER is increased to 6.2dB by the general synchronization scheme.
And the large performance margin allows the complexity of a synchronizer to be reduced. Besides, in the UWB system the tolerance of CFO estimation error can be larger than that in the WLAN system. The SNR loss for the typical 8% PER versus CFO estimation error is shown in Figure 1-7. Since the subcarrier spacing of UWB (4.125MHz) is equal to 13.2 times that of WLAN (312.5KHz), the tolerance to inter-carrier interference (ICI), which is caused by CFO, is naturally higher in UWB systems. And the PER of the UWB system is less sensitive to CFO estimation error. for WLAN system and 5ppm for UWB system. This high tolerance also allows the complexity of a synchronizer to be reduced.

Channel Condition: AWGN channel, CFO=40ppm+phase noise, SCO=40ppm

Figure 1-6: FER and PER of LDPC-COFDM-based UWB system

Figure 1-7: FER and PER of LDPC-COFDM-based UWB system

In the UWB system, a parallel architecture is generally used to achieve high throughput rates with low clock rates [30, 41]. However, the parallel architecture increases the power consumption and hardware cost. To achieve high performance and low hardware complexity, we propose a sub-sampling-based auto-correlator and a moving-average-free matched filter. The proposed design not only reduces the computation of the synchronizer but also reduces the hardware cost of the parallel architecture. The proposed synchronizer can eliminate 75% of the complex multiplications of the general approach. It needs only 38% of the gate count and 43% of the power of a general 528MS/s 4-parallelism synchronizer in 0.18µm CMOS process. This is equivalent to a 27% power reduction of the whole OFDM transceiver, with <0.2dB additional SNR loss for 8% PER.

#### 1-1-4 Motivation of Low-Complexity Channel Equalizer for UWB System

With DDCT, the proposed channel equalizer occupies 50% of the gate count and 58% of the power of our OFDM-based WLAN transceiver [3]. And the complex divider and complex multipliers occupy a total of 90% of the power of the equalizer.
When the system migrates to UWB, the packet length is less than 1/10 of that in WLAN, and the channel variation within one packet is also smaller, so DDCT is not required. However, the complex divider and complex multipliers still occupy 60% of the equalizer power. For the QPSK-OFDM-based UWB system [1, 15, 16], we propose a divider-and-multiplier-free channel equalizer. It uses addition/subtraction of symbol phases instead of complex multiplication/division of symbols. The proposed equalizer scheme needs only 52% of the gate count and 48% of the power of a divider-and-multiplier-based equalizer in 0.18µm CMOS process. This is equivalent to a 33% power reduction of the OFDM transceiver, with 0.3dB additional SNR loss for 8% PER.

#### 1-2 Thesis Outline

In this thesis, the system specifications and simulation channel models for the OFDM-based WLAN system, the proposed LDPC-COFDM-based UWB system, and the MB-OFDM-based UWB system are introduced in chapter 2. The proposed low-complexity synchronizer, low-complexity channel equalizer, design performance analysis, and system simulation results for the OFDM-based WLAN system are introduced in chapter 3. The proposed low-complexity synchronizer, low-complexity channel equalizer, design performance analysis, and system simulation results for the OFDM-based UWB systems are introduced in chapter 4. The fixed-point simulation, design architectures, hardware complexity and power analysis, and the baseband chip designs for OFDM-based WLAN, LDPC-COFDM-based UWB, and MB-OFDM-based UWB in 0.18µm and 0.13µm CMOS process are introduced in chapter 5.

## Chapter 2:
### OFDM Systems and Channel Model

In this chapter, the system specifications of the OFDM-based WLAN and UWB systems are introduced. The introduced systems comprise the IEEE 802.11a WLAN system, the multi-band OFDM (MB-OFDM)-based UWB system considered for the IEEE 802.15.3a standard [15, 16], and the proposed low-density parity-check (LDPC)-COFDM-based UWB system [1].
And the system parameters, signal format, and system block diagram are briefly shown. After the specifications, the practical channel model of wireless systems is introduced. The important non-ideal impacts comprise the multipath fading channel, Doppler effect, carrier frequency offset (CFO), carrier phase noise (CPN), sampling clock offset (SCO), RF filtering, and AWGN. The channel parameters for the WLAN and UWB systems and the signal distortion caused by these impacts are also shown.

#### 2-1 OFDM-Based WLAN System

The IEEE 802.11a standard was established in December 1999. It is the earliest IEEE WLAN system using the OFDM technique, for a 54Mb/s data rate and 0~100 meter wireless communication. The OFDM technique, which separates the transmitted signal into several subcarriers, is robust against the serious inter-symbol interference (ISI) distortion of complex channels in high-data-rate and long-distance transmission environments. Hence it has become a popular solution in high-speed or broadcasting systems such as WLAN, UWB, and DVB applications. The IEEE 802.11a system provides a maximum data rate of 54Mb/s with a 20MHz signal bandwidth located in the 5.15~5.825GHz RF band. The main parameters of IEEE 802.11a (system data rates, OFDM signal durations, and RF bands) are listed in Table 2-1~Table 2-3.
Table 2-1: Data rate parameters for IEEE 802.11a WLAN system

| Data rate (Mb/s) | Signal bandwidth (MHz) | Constellation | Coding rate (R) | Coded bits per subcarrier ($N_{CBPC}$) | Data bits per OFDM symbol ($N_{DBPS}$) |
|------------------|------------------------|---------------|-----------------|----------------------------------------|----------------------------------------|
| 6 | 20 | BPSK | 1/2 | 1 | 24 |
| 9 | 20 | BPSK | 3/4 | 1 | 36 |
| 12 | 20 | QPSK | 1/2 | 2 | 48 |
| 18 | 20 | QPSK | 3/4 | 2 | 72 |
| 24 | 20 | 16-QAM | 1/2 | 4 | 96 |
| 36 | 20 | 16-QAM | 3/4 | 4 | 144 |
| 48 | 20 | 64-QAM | 2/3 | 6 | 192 |
| 54 | 20 | 64-QAM | 3/4 | 6 | 216 |

As listed in Table 2-1, the data rates from 6Mb/s $\sim$ 54Mb/s are generated with BPSK, QPSK, 16-QAM, and 64-QAM constellations and FEC coding rates of $1/2 \sim 3/4$ in the 20MHz-bandwidth WLAN system. As listed in Table 2-2, the 64-point fast Fourier transform (FFT)-based OFDM technique is used for baseband modulation. The OFDM symbol period is 4.0 $\mu$s. It consists of the FFT symbol (3.2 $\mu$s) and the guard interval (0.8 $\mu$s). The 0.8 $\mu$s guard interval (GI) is used to resolve the possible ISI over a transmission distance of 0~100 meters.
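The relation between the entries of Table 2-1 can be checked with a short calculation. The following is a minimal sketch, assuming that each data rate is the number of data bits per OFDM symbol divided by the 4.0 $\mu$s symbol period, and that $N_{DBPS}$ is the product of the 48 data subcarriers (Table 2-2), $N_{CBPC}$, and the coding rate. The function and variable names are illustrative only and are not taken from the thesis.

```python
from fractions import Fraction

N_SD = 48                  # data subcarriers per OFDM symbol (Table 2-2)
T_OFDM_US = Fraction(4)    # OFDM symbol period in microseconds

# nominal rate (Mb/s) -> (coded bits per subcarrier, coding rate), from Table 2-1
MODES = {
    6: (1, Fraction(1, 2)), 9: (1, Fraction(3, 4)),
    12: (2, Fraction(1, 2)), 18: (2, Fraction(3, 4)),
    24: (4, Fraction(1, 2)), 36: (4, Fraction(3, 4)),
    48: (6, Fraction(2, 3)), 54: (6, Fraction(3, 4)),
}


def data_bits_per_symbol(n_cbpc, coding_rate):
    """N_DBPS: information bits carried by one OFDM symbol."""
    return N_SD * n_cbpc * coding_rate


def data_rate_mbps(n_cbpc, coding_rate):
    """Bits per microsecond, which is numerically equal to Mb/s."""
    return data_bits_per_symbol(n_cbpc, coding_rate) / T_OFDM_US


for nominal, (n_cbpc, rate) in MODES.items():
    print(nominal, data_bits_per_symbol(n_cbpc, rate), data_rate_mbps(n_cbpc, rate))
```

Under these assumptions every row reproduces both the $N_{DBPS}$ column and the nominal data rate of Table 2-1.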
Table 2-2: OFDM signal parameters for IEEE 802.11a WLAN system

| Parameter | Value |
|--------------------------------------------|------------|
| FFT size (N) | 64-point |
| FFT symbol duration (T<sub>FFT</sub>) | 3.2µs |
| Guard-interval duration (T<sub>GI</sub>) | 0.8µs |
| OFDM symbol period (T<sub>OFDM</sub>) | 4.0µs |
| Data subcarriers per OFDM symbol | 48 |
| Pilot subcarriers per OFDM symbol | 4 |
| Total baseband bandwidth | 20MHz |
| Subcarrier spacing | 312.5KHz |
| Used baseband bandwidth | 16.5625MHz |

Table 2-3: Central frequencies of RF carriers for IEEE 802.11a WLAN system

| Band | Carrier Frequency (MHz) |
|-----------------------------|------------------------------------------------|
| Lower and Middle U-NII Band | 5180, 5200, 5220, 5240, 5260, 5280, 5300, 5320 |
| Higher U-NII Band | 5745, 5765, 5785, 5805 |

As listed in Table 2-2, the number of data subcarriers ($N_{SD}$) in each OFDM symbol is 48. That means 48 constellation-mapped symbols are transmitted during each 4.0 $\mu$s OFDM symbol. Besides, there are 4 pilot subcarriers in each OFDM symbol. The pilot subcarriers can be used by the error-tracking designs of the baseband synchronization in the receiver. As listed in Table 2-3, in the lower and middle U-NII band there are 8 RF carriers from 5180MHz to 5320MHz used for the IEEE 802.11a standard. In the higher U-NII band, there are 4 carriers from 5745MHz to 5805MHz. The spacing between two neighboring carriers is 20MHz. According to Table 2-1 and Table 2-2, the data rates are related to the signal bandwidth, the number of used data subcarriers, the constellation method, and the FEC coding rate. The data rate calculation of OFDM-based systems is discussed in Appendix A-1.

The IEEE 802.11a system uses packet-based burst transmission. The packet-based signal format is shown in Figure 2-1. The packet mainly consists of the preamble, the header, and the data OFDM symbols.
The preamble and header assist the baseband receiver in correctly recovering the following data OFDM symbols. The preamble, which consists of short and long symbols, can be used for automatic gain control (AGC), synchronization, and channel estimation. The header contains the important system parameters such as the data rate and data length. These system parameters help the receiver demodulate the received signal correctly. The data part, which carries the transmitted data, consists of several OFDM symbols. In the IEEE 802.11a system, the typical transmitted data amount is 1000 bytes.

Figure 2-1: Packet format of IEEE 802.11a system

When the data rate is 54Mb/s, $N_{DBPS}$ is equal to 216 as listed in Table 2-1. So 38 OFDM symbols are needed to transmit the 1000 data bytes (ceil(8000/216)=38), and the length of the data OFDM symbols is $38\times4 = 152\mu s$. The data OFDM symbol format is shown in Figure 2-2. It consists of the 3.2µs FFT symbol and the 0.8µs guard interval. The guard interval is the cyclic prefix of the FFT symbol. It is used to handle the ISI mainly caused by the multipath fading channel, and it makes the received signal suitable for the frequency-domain channel equalizer of the receiver.

Figure 2-2: Data OFDM symbol format of IEEE 802.11a system

For the system performance, the standard requirement of IEEE 802.11a is that the packet error rate (PER) be lower than 10%, where the typical transmitted data amount per packet is 1000 data bytes. The maximum SNR value allowed to achieve 10% PER is listed in Table 2-4 [18, 25]. That means the SNR needed to achieve 10% PER cannot exceed these constraints. The specification of the CFO and SCO effects of the IEEE 802.11a system is listed in Table 2-5. For the RF frequencies from 5180MHz to 5805MHz listed in Table 2-3, the ±20ppm CFO is equal to at most ±116.1KHz. The ±20ppm SCO causes drift of the time-domain signal and rotation of the frequency-domain signal.
Hence time-domain synchronization and frequency-domain phase error tracking (PET) are needed in the OFDM system [9, 10, 11, 12, 13, 14]. The other system requirement, the power spectrum density, is discussed in Appendix A-2.

Table 2-4: Required SNR for 10% PER of IEEE 802.11a system

| Data Rate (Mb/s) | SNR for 10% PER (dB) |
|------------------|----------------------|
| 6 | 9.7 |
| 9 | 10.7 |
| 12 | 12.7 |
| 18 | 14.7 |
| 24 | 17.7 |
| 36 | 21.7 |
| 48 | 25.7 |
| 54 | 26.7 |

Table 2-5: CFO and SCO specification of IEEE 802.11a system

| Effect | Range |
|---------------------------------|--------|
| Carrier Frequency Offset (CFO) | ±20ppm |
| Sampling Clock Offset (SCO) | ±20ppm |

The block diagram of the WLAN system is shown in Figure 2-3. The media access control (MAC), which links to the software part, is used to allocate channel bands and to control the transmission and reception commands. The baseband design is used to generate the correct signal format and to resolve the channel effects. It links to the RF through the digital-to-analog converter (DAC) and analog-to-digital converter (ADC). The RF is used to up-convert and down-convert the signal between the 20MHz baseband and the 5180MHz~5805MHz passband. It also amplifies and filters the signal to satisfy the system requirements.

Figure 2-3: Block diagram of IEEE 802.11a WLAN system

To meet the system requirements, the baseband design comprises the complete transmitter, including the FEC encoder and QAM-OFDM modulator, and the receiver, including the synchronization, QAM-OFDM demodulator, and FEC decoder. The block diagram of the IEEE 802.11a baseband system is shown in Figure 2-4. As shown in Figure 2-4, the transmitter consists of the FEC encoder, QAM mapping, IFFT, guard-interval insertion, preamble insertion, shaping filter, and peak-to-average power ratio (PAPR) clipping. They are used to satisfy the system specification containing data rates, signal format, and power spectrum mask.
The peak-to-average power ratio (PAPR), which harms the linearity of the RF power amplifier (PA), is also suppressed by the clipping design.

Figure 2-4: Block diagram of IEEE 802.11a baseband system

The data is then sent to the channel model. The channel model consists of the timing-drift model, multipath channel model, AWGN model, CFO model, and VGA model. They are used to simulate the RF and wireless channel effects. The non-ideal impacts comprise SCO, multipath fading, CFO, RF thermal noise, and the variable gain amplification (VGA) effect of the RF receiver. The channel signal is then sent to the baseband receiver.

The baseband receiver is used to estimate and then compensate the RF and channel effects, so that the data can be demodulated and sent to the MAC. Initially, the AGC is used to correct the RF VGA gain. It makes the swing level of the RF output signal match the ADC range. Then the timing synchronization, sampling phase control, and frequency synchronization are used to detect the packets distorted by the channel effects and to resolve the SCO and CFO distortion. After synchronization, most of the channel effects on the received signal are eliminated. The data is then passed through the blocks from the FFT to the de-scrambler to finish the demodulation. Based on these baseband blocks, the data modulation and demodulation of the IEEE 802.11a baseband system can be completed correctly.

#### 2-2 MB-OFDM-Based UWB System

The UWB system has been promoted by industry since 2002. It provides a 480Mb/s high data rate in a 528MHz wide bandwidth over a 0~10 meter transmission distance. The high-speed UWB system is being standardized by the IEEE 802.15.3a working group. In the standardization, two techniques are considered for 480Mb/s high-speed UWB systems. The first one is the direct sequence (DS)-UWB system. This system is promoted by Motorola and Freescale.
Its advantage is the use of the direct sequence spread spectrum (DSSS) technique to resolve the large jamming power from a satellite, which causes a -8dB signal-to-interference ratio (SIR) [15]. The DS-UWB system also has the feature of low design complexity, so it is a good choice for reducing the time to market. The industry has developed many mature DS-UWB systems and related products.

The second main choice for the UWB system is the multi-band (MB)-OFDM UWB system. It is promoted by the MB-OFDM alliance (MBOA), which mainly consists of Intel, TI, and many wireless IC manufacturers. This system consists of time-frequency-interleaved (TFI) RF hopping and the OFDM technique. Since OFDM-based WLAN systems such as IEEE 802.11a and ETSI HiperLAN/2, OFDM has been widely used to resolve the complex multipath fading problem for high-performance and high-speed transmission. In the UWB system it is used to resolve the multipath fading with 0~15ns RMS delay spread in the 528MHz wide bandwidth. It can also resolve the large jamming with the MB technique. Different from DS-UWB, the challenge of the OFDM circuit is the power problem, so low power is the main concern of the OFDM-based UWB system. The predicted power of the PHY layer in 130nm and 90nm CMOS process is listed in Table 2-6. As listed in Table 2-6, the power at 480Mb/s is 236mW and 323mW in 90nm and 0.13µm CMOS process, respectively. That is only 23%~32% of the 54Mb/s WLAN design in 0.25µm CMOS process [23, 44]. Since the power tends to increase with data rates and signal bandwidth, the low-power design becomes the main concern of the OFDM-based UWB system.
Table 2-6: Power consumption of PHY layer of IEEE 802.15.3a system

| Process | Data Rate (Mb/s) | Transmit | Receive | CCA (Signal Detect) | Power Save (Deep Sleep Mode) |
|---------|------------------|----------|---------|---------------------|------------------------------|
| 90 nm | 110 | 93 mW | 155 mW | 94 mW | 15 µW |
| 90 nm | 200 | 93 mW | 169 mW | 94 mW | 15 µW |
| 90 nm | 480 | 145 mW | 236 mW | 94 mW | 15 µW |
| 130 nm | 110 | 117 mW | 205 mW | 117 mW | 18 µW |
| 130 nm | 200 | 117 mW | 227 mW | 117 mW | 18 µW |
| 130 nm | 480 | 180 mW | 323 mW | 117 mW | 18 µW |

Different from the OFDM-based WLAN system, the MB-OFDM system uses the MB hopping technique and the spreading method to resolve the large jamming from the satellite. The block diagram of the baseband and RF design with the MB control signal is shown in Figure 2-5. The baseband transmitter feeds the band ID forward to the RF transmitter. The synchronization of the baseband receiver detects the band boundary and feeds the band control signal back to the RF receiver design. Based on accurate band detection, the correct baseband signal can be received. The IEEE 802.15.3a working group defined 5 band groups for MB hopping. The RF band allocation, baseband spreading scheme, and jamming-mitigation technique of the MB-OFDM system are supplemented in Appendix A-3.

Figure 2-5: Block diagram of baseband and RF for MB-OFDM-based UWB system

The data rate parameters and OFDM signal parameters are listed in Table 2-7 and Table 2-8. With a fixed QPSK constellation and FEC coding rates of $1/3 \sim 3/4$, data rates of $53.3 \sim 480$ Mb/s can be provided in the 528MHz-bandwidth UWB system. As listed in Table 2-8, the OFDM symbol duration is 312.5ns and the number of used data subcarriers in each OFDM symbol is 100. The derivation of the system parameters is also discussed in Appendix A-1.
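The data rates of Table 2-7 can be reproduced from the quantities named above. The following is a minimal sketch, assuming that each OFDM symbol carries 100 QPSK data subcarriers (2 coded bits each), that the spreading factor S divides the number of distinct coded bits per symbol, and that the symbol period is 312.5ns. The function name `mb_ofdm_rate_mbps` is hypothetical.

```python
from fractions import Fraction

N_SD = 100                      # data subcarriers per OFDM symbol
BITS_PER_SUBCARRIER = 2         # QPSK
T_OFDM_NS = Fraction(625, 2)    # OFDM symbol period, 312.5 ns


def mb_ofdm_rate_mbps(coding_rate, spreading_factor):
    """Information bits per symbol divided by the symbol period."""
    info_bits = Fraction(N_SD * BITS_PER_SUBCARRIER) * coding_rate / spreading_factor
    return info_bits / T_OFDM_NS * 1000   # bits per ns -> Mb/s


# (coding rate, spreading factor) pairs taken from Table 2-7
for r, s in [(Fraction(1, 3), 4), (Fraction(11, 32), 2), (Fraction(3, 4), 1)]:
    print(r, s, float(mb_ofdm_rate_mbps(r, s)))
```

With these assumptions the 3/4-rate, unspread mode gives exactly 480Mb/s, and the 1/3-rate mode with a spreading factor of 4 gives about 53.3Mb/s.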
Table 2-7: Data rate parameters of MB-OFDM-based UWB system

| Data rate (Mb/s) | Signal bandwidth (MHz) | Constellation | FEC coding rate (R) | Spreading factor (S) |
|------------------|------------------------|---------------|---------------------|----------------------|
| 53.3 | 528 | QPSK | 1/3 | 4 |
| 80 | 528 | QPSK | 1/2 | 4 |
| 110 | 528 | QPSK | 11/32 | 2 |
| 160 | 528 | QPSK | 1/2 | 2 |
| 200 | 528 | QPSK | 5/8 | 2 |
| 320 | 528 | QPSK | 1/2 | 1 |
| 400 | 528 | QPSK | 5/8 | 1 |
| 480 | 528 | QPSK | 3/4 | 1 |

The packet format of the MB-OFDM system is shown in Figure 2-6. As shown in Figure 2-6, the preamble consists of 30 OFDM symbols used for AGC, synchronization, and channel estimation. The header includes 200 bits of baseband parameters and MAC control signals. The whole packet consists of OFDM symbols. The OFDM symbol format is shown in Figure 2-7. As shown in Figure 2-7, the OFDM symbol consists of the FFT symbol and the zero pad (ZP). Similar to the guard interval of the IEEE 802.11a system, the pre-ZP is used to increase the distance between two FFT symbols and thereby decrease the ISI distortion. The post-ZP is used for RF hopping after the FFT symbol has been transmitted. However, the guard interval should be a cyclic prefix in the OFDM system. The detailed scheme that makes the ZP act as a cyclic prefix is supplemented in Appendix A-4.
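The symbol timing described above can be related to the 528MHz sampling rate. The following is a minimal sketch, assuming a 128-point FFT symbol followed by 37 zero-pad samples per OFDM symbol. These sample counts are not stated in this section and are taken here as assumptions; they reproduce the durations and subcarrier spacing listed in Table 2-8.

```python
from fractions import Fraction

FS_MHZ = 528     # sampling rate in MHz
N_FFT = 128      # FFT symbol length in samples (assumed)
N_ZP = 37        # zero-pad length in samples (assumed)


def duration_ns(num_samples):
    """Duration of num_samples at the 528 MHz sampling rate, in ns."""
    return Fraction(num_samples * 1000, FS_MHZ)


fft_ns = duration_ns(N_FFT)             # FFT symbol duration
zp_ns = duration_ns(N_ZP)               # zero-pad duration
symbol_ns = duration_ns(N_FFT + N_ZP)   # OFDM symbol period
spacing_mhz = Fraction(FS_MHZ, N_FFT)   # subcarrier spacing

print(float(fft_ns), float(zp_ns), float(symbol_ns), float(spacing_mhz))
```

Under these assumptions the OFDM symbol period is exactly 312.5ns and the subcarrier spacing is 4.125MHz.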
Table 2-8: OFDM signal parameters of MB-OFDM-based UWB system

| Parameter | Value |
|-----------------------------------|------------|
| FFT symbol duration | 242.42ns |
| Zero pad duration | 70.08ns |
| OFDM symbol period | 312.5ns |
| Data subcarriers per OFDM symbol | 100 |
| Pilot subcarriers per OFDM symbol | 12 |
| Subcarrier spacing | 4.125MHz |
| Total baseband bandwidth | 528MHz |
| Used baseband bandwidth | 466.125MHz |

Figure 2-6: Packet format of MB-OFDM-based UWB system

Figure 2-7: OFDM symbol format of MB-OFDM-based UWB system

According to the specification of the MB-OFDM system [16], the Eb/N0 constraint and required transmission distance for 8% PER are listed in Table 2-9. The equivalent SNR values are also listed.

Table 2-9: Performance requirement of MB-OFDM-based UWB system

| Data Rate (Mb/s) | Transmission distance (meters) | Eb/N0 for 8% PER (dB) | SNR for 8% PER (dB) |
|------------------|--------------------------------|-----------------------|---------------------|
| 110 | 10 | 12.9 | 7.1 |
| 200 | 4 | 18.3 | 15.1 |
| 480 | 2 | 20.5 | 21.1 |

As listed in Table 2-9, the minimum transmission distances of the main data rates of 110Mb/s, 200Mb/s, and 480Mb/s are 10 meters, 4 meters, and 2 meters, respectively. The transmission distance of a higher data rate is shorter. That is because in a lower-data-rate mode the spreading method is used, so the data is more robust to channel noise. The SNR needed to achieve 8% PER can therefore be lower, and the tolerated degradation of signal power caused by path loss can be higher. Hence the lower-data-rate modes can achieve longer transmission distances. The CFO and SCO specifications are the same as in the OFDM-based WLAN system: the tolerated CFO and SCO are also ±20ppm. To satisfy the MB-OFDM-based specification, the developed block diagram is shown in Figure 2-8.
Figure 2-8: Block diagram of MB-OFDM baseband system

Compared with the OFDM-based WLAN system, the baseband design of the MB-OFDM system additionally contains the spreading/de-spreading blocks, the band control/detection blocks, and the MB channel equalization block. The transmitter (TX) RF and receiver (RX) RF blocks are also added to simulate the MB hopping effect. In the transmitter, the frequency-domain spreading is used to spread the data subcarriers into a complex-conjugate-symmetric form, and the time-domain spreading is used to duplicate the transmitted signal. When spreading is used, the system is more robust to channel noise and jamming, and the data rate is lower. At the end of the baseband transmitter, the TX band control determines the band ID and sends it to the RF. The TX RF then up-converts the transmitted signal to the passband.

In the receiver, the RX band detection is used to detect the band ID from the received signal, AGC signal, and synchronization signal. Since the RX band detection can detect the correct band ID of the received signal, the received signal can be correctly down-converted in the RX RF block. In the frequency domain, the channel frequency responses of different bands differ, so the MB channel equalization first estimates the channel response of all used bands and then compensates the received data. After channel equalization, the channel distortion is removed. The de-spreading then recovers the data that was transmitted through frequency-domain and time-domain spreading. The data is then sent through the de-QPSK, FEC decoder, and de-scrambler. Finally it is sent back to the MAC to finish the baseband data flow.

#### 2-3 LDPC-COFDM-Based UWB System

Besides the MB-OFDM system, there are many different OFDM-based systems for high-data-rate and high-performance UWB applications.
One of these systems is the LDPC-COFDM system. The LDPC-COFDM system uses LDPC coding for forward error correction. Different from convolutional encoding with Viterbi decoding, the LDPC code is a block code and does not need an interleaver. Since 2003, we have been developing the LDPC-COFDM system for UWB and future advanced applications. The main parameters of the LDPC-COFDM-based UWB system are listed in Table 2-10.

Table 2-10: Main parameter of LDPC-COFDM-based UWB system

| Parameter | Value |
|---------------------|------------------------------|
| Signal bandwidth | 528MHz |
| Data rate | 120Mb/s, 240Mb/s, 480Mb/s |
| FFT size | 128 point |
| LDPC code parameter | (600, 450) semi-regular code |

As listed in Table 2-10, the provided data rates are 120Mb/s, 240Mb/s, and 480Mb/s, and the system is based on a 128-point FFT and a (600, 450) semi-regular LDPC code. In the LDPC code, the block length is 600 and the number of message nodes is 450. That means that in each 600 bits of LDPC encoder output, 450 bits are used to transmit data and 150 bits are used for error correction in the LDPC decoder. So the FEC coding rate is 450/600 = 3/4. Based on the LDPC coding, the data rate parameters are listed in Table 2-11.

Table 2-11: Data rate parameter of LDPC-COFDM-based UWB system

| Data rate (Mb/s) | Constellation | Coding rate (R) | Spreading factor | Average data bits per OFDM symbol ($N_{DBPS}$) |
|------------------|---------------|-----------------|------------------|------------------------------------------------|
| 120 | QPSK | 3/4 | 4 | 37.5 |
| 240 | QPSK | 3/4 | 2 | 75 |
| 480 | QPSK | 3/4 | 1 | 150 |

With the FEC coding rate of 3/4 and the same spreading factors as the MB-OFDM system, the provided data rates are 120Mb/s, 240Mb/s, and 480Mb/s. The spreading method creates the different data rates. Similar to the MB-OFDM system, the LDPC-COFDM system also uses frequency-domain spreading methods to increase the robustness to channel noise and to jamming in the transmitted bands.
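The entries of Table 2-11 follow from the code rate and the spreading factor. The following is a minimal sketch, assuming that the LDPC-COFDM system uses 100 QPSK data subcarriers per OFDM symbol and a 312.5ns symbol period, as in the MB-OFDM system. These two values are assumptions for illustration, and the function names are hypothetical.

```python
from fractions import Fraction

BLOCK_LENGTH = 600              # LDPC codeword length in bits
MESSAGE_BITS = 450              # information bits per codeword
CODED_BITS_PER_SYMBOL = 200     # assumed: 100 data subcarriers x 2 bits (QPSK)
T_OFDM_NS = Fraction(625, 2)    # assumed: 312.5 ns OFDM symbol period

CODE_RATE = Fraction(MESSAGE_BITS, BLOCK_LENGTH)


def data_bits_per_symbol(spreading_factor):
    """Average information bits per OFDM symbol (N_DBPS)."""
    return CODED_BITS_PER_SYMBOL * CODE_RATE / spreading_factor


def data_rate_mbps(spreading_factor):
    return data_bits_per_symbol(spreading_factor) / T_OFDM_NS * 1000


for s in (4, 2, 1):
    print(s, float(data_bits_per_symbol(s)), float(data_rate_mbps(s)))
```

Under these assumptions the spreading factors 4, 2, and 1 give $N_{DBPS}$ values of 37.5, 75, and 150 and data rates of 120Mb/s, 240Mb/s, and 480Mb/s.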
The packet format of the LDPC-COFDM system is shown in Figure 2-9.

Figure 2-9: Packet format of LDPC-COFDM-based UWB system

As in the MB-OFDM system, the preamble consists of 30 OFDM symbols. The whole packet consists of OFDM symbols, and the OFDM symbol format is the same as that of the MB-OFDM system. Therefore a total of 70ns of zero pad is used to decrease the ISI distortion. After the zero pad is added back in the receiver, the circular convolution of the channel impulse response and the transmitted signal is also obtained in the receiver. The Eb/N0 constraint and required transmission distance for 8% PER of the LDPC-COFDM system are listed in Table 2-12. Similar to the MB-OFDM system, the required transmission distances of 120Mb/s, 240Mb/s, and 480Mb/s are also 10 meters, 4 meters, and 2 meters. To satisfy the system specification and performance requirement of the LDPC-COFDM system, the developed baseband block diagram is shown in Figure 2-10.

Table 2-12: Required Eb/N0 for 8% PER of MB-OFDM-based UWB system

| Data Rate (Mb/s) | Eb/N0 for 8% PER (dB) | Transmission distance (meters) |
|------------------|-----------------------|--------------------------------|
| 120 | 12.9 | 10 |
| 200 | 18.3 | 4 |
| 480 | 20.5 | 2 |

Figure 2-10: Block diagram of MB-OFDM baseband system

Since the proposed LDPC-COFDM system focuses on the combination of LDPC coding and the OFDM system, the MB hopping of the RF technique is not included. The difference from the existing WLAN and UWB systems is the use of the LDPC encoder and decoder. Based on the system block diagram, the transmitted data is encoded by the LDPC design and modulated by the QPSK-OFDM design. After the shaping and clipping designs, the transmitted signal can satisfy the PSM requirement and the RF power amplifier specification. Through the channel model, the distortion in the data caused by clock offset, multipath channel, RF effects, and AWGN is simulated.
In the receiver, the AGC and synchronization are first used to estimate and compensate the channel distortion. The received signal can then be correctly demodulated. Through the LDPC decoder and de-scrambler, the data is recovered and finally sent to the MAC.

#### 2-4 Channel Models of WLAN and UWB systems

In a wireless system, the channel model is developed to simulate the signal distortion caused by the wireless environment and the RF circuits. In the wide-band and high-speed OFDM-based wireless system, the main non-ideal channel impacts are multipath fading, multi-band hopping (for the multi-band system), Doppler effect, clock offset, frequency offset, RF filtering effect, and AWGN. The block diagrams of the channel models for the single-band wireless system and the multi-band (MB) wireless system are shown in Figure 2-11 and Figure 2-12. As shown in Figure 2-11, between the transmitter (TX) baseband design and the receiver (RX) baseband design there are 9 blocks: D/A quantize, clock offset, TX RF low-pass filter (LPF), multipath channel, CFO effect, RX RF LPF, AWGN, variable gain amplifier (VGA), and A/D quantize. They are developed to simulate the channel distortion of the baseband signal. The Doppler effect, which causes the time-variant multipath channel, is included in the multipath channel block. The band-pass filters (BPF) of the RF are converted into parts of the LPF of the RF. The VGA block is used to simulate the RF amplification effect and the AGC behavior. As shown in Figure 2-12, 4 blocks are added to the channel model: up-sampling, the baseband (BB) to passband (PB) block, the passband back to baseband block, and down-sampling.

Figure 2-11: Block diagram of the channel model for single-band wireless system

Figure 2-12: Block diagram of the channel model for multi-band wireless system

The up-sampling is used to multiply the time resolution of the simulated signal. Therefore the observed bandwidth can be increased from only the baseband to the overall passband.
Through the BB to PB block, the frequency position of the transmitted signal is shifted from the baseband to the targeted RF band. For example, in the MB-OFDM-based UWB system the center frequency is shifted from 0Hz to 3432 ~ 4488MHz of band group 1. The frequency position of the multipath response is of course also shifted to the RF bands. After the linear convolution with the channel impulse response and the phase rotation caused by CFO, the received signal is shifted back to the baseband in the PB to BB block. The band ID of the PB to BB block is controlled by the RX baseband design. In the RX baseband design, the band detection of the synchronization design detects the band boundary of the received signal. The baseband controls the band ID of the RF, so the PB to BB block is iteratively controlled by the baseband design. After the received signal is shifted back to the baseband and passed through the RX RF LPF, the time resolution of the simulation data can be decreased to reduce the simulation complexity and simulation time. Through the AWGN channel, VGA, and A/D quantize, the signal is sent to the baseband design. The main channel effects (multipath channel, Doppler effect, MB hopping, RF filtering, up-sampling, CFO effect and phase noise, clock offset, and AWGN) are introduced below.

# 2-4-1 Multipath Channel and Doppler Effect of WLAN and UWB systems

The multipath channel effect is one of the most serious channel effects and deeply degrades the system performance. In the channel model the transmitted signal is convolved with an equivalent channel impulse response (CIR). The convolution can be written as

$$Y(n) = \sum_{k=0}^{\infty} X(k)H(n-k) \tag{2-1}$$

Where Y(n) is the output of the multipath channel, X(n) is the input, and H(n) is the CIR. The typical multipath channels for the IEEE 802.11a system and the OFDM-based UWB systems are respectively the IEEE multipath channel model [19], the CM channel model [20], and the Intel channel model [32].
The characteristics of these channel models for WLAN and UWB systems are listed in Table 2-13. The channel impulse response (CIR) and channel frequency response (CFR) of these channel models are shown in Figure 2-13 and Figure 2-14.

Table 2-13: Characteristic of Multipath channel for 54Mb/s WLAN and 480Mb/s UWB systems

| Channel model | Characteristic | Parameter Example | Average Tap Number (>-46dB) |
|---------------|----------------|-------------------|-----------------------------|
| IEEE channel model [19] for WLAN | Rayleigh fading | RMS=50ns | 10.93 |
| IEEE channel model [19] for WLAN | Rayleigh fading | RMS=100ns | 20.92 |
| CM channel model [20] for UWB | Rayleigh fading | CM1 (RMS delay=5ns) | 28.83 |
| CM channel model [20] for UWB | Rayleigh fading | CM2 (RMS delay=8ns) | 38.76 |
| CM channel model [20] for UWB | Rayleigh fading | CM3 (RMS delay=15ns) | 72.83 |
| CM channel model [20] for UWB | Rayleigh fading | CM4 (RMS delay=25ns) | 121.72 |
| Intel channel model [32] for UWB | Saleh-Valenzuela model [40] and Rayleigh fading | RMS=5ns | 18.4 |
| Intel channel model [32] for UWB | Saleh-Valenzuela model [40] and Rayleigh fading | RMS=15ns | 45.8 |

Figure 2-13: Examples of impulse response of IEEE multipath channel

As shown in Figure 2-13, as the root-mean-square (RMS) delay spread of the CIR becomes higher, the maximum delay spread and the tap number of the CIR also become higher. Assume the input of the channel model is one FFT symbol; the channel output can then be derived as

$$\begin{aligned}
Y(n) &= \sum_{k=0}^{N-1} X(k)H(n-k) \\
&= \sum_{m=0}^{N+L-2} \sum_{k=0}^{N-1} X(k)H(m-k)\,\delta(n-m) \\
&= \sum_{m=0}^{N+G-1} \sum_{k=0}^{N-1} X(k)H(m-k)\,\delta(n-m) + \sum_{m=N+G}^{N+L-2} \sum_{k=0}^{N-1} X(k)H(m-k)\,\delta(n-m) \\
&= Y_{OFDM}(n) + Y_{ISI}(n)
\end{aligned} \tag{2-2}$$

Figure 2-14: Examples of frequency response of IEEE multipath channel

Where N is the length of the FFT symbol, L is the length of the CIR, G is the length of the guard interval or zero pad, $Y_{OFDM}(n)$ is the signal within the OFDM symbol (including intra-symbol interference), and $Y_{ISI}(n)$ is the inter-symbol interference imposed on the later OFDM symbols.
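The split in (2-2) can be checked numerically with a small sketch. The FFT length, guard length, input symbol, and the two impulse responses below are hypothetical values chosen only for illustration; they are not taken from the channel models in Table 2-13.

```python
def channel_output(x, h):
    """Linear convolution of the input x with the CIR h, as in (2-1)."""
    y = [0.0] * (len(x) + len(h) - 1)
    for k, xk in enumerate(x):
        for tap, hk in enumerate(h):
            y[k + tap] += xk * hk
    return y


def split_ofdm_isi(y, n_fft, guard):
    """Split y as in (2-2): samples 0..N+G-1 stay inside the symbol slot,
    samples N+G..N+L-2 spill into the next OFDM symbol."""
    boundary = n_fft + guard
    return y[:boundary], y[boundary:]


N, G = 8, 2                                 # hypothetical FFT and guard lengths
x = [1.0] * N                               # hypothetical time-domain FFT symbol

h_short = [1.0, 0.5, 0.25]                  # L = 3, so L - 2 < G
h_long = [1.0, 0.5, 0.25, 0.125, 0.0625]    # L = 5, so L - 2 >= G

_, isi_short = split_ofdm_isi(channel_output(x, h_short), N, G)
_, isi_long = split_ofdm_isi(channel_output(x, h_long), N, G)

print("ISI samples, short CIR:", len(isi_short))
print("ISI samples, long CIR: ", len(isi_long))
```

With the short CIR the whole channel output fits inside the N+G samples of the symbol slot, while the long CIR leaves a tail of L-2-G+1 samples that overlaps the following symbol.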
So when $L-2 \geq G$, $Y_{ISI}(n)$ exists and causes interference to the later OFDM symbols. In the frequency domain, the complex multipath channel causes frequency-selective fading. Since the transmitted signal is convolved with the CIR in the time domain, the frequency-domain signal is multiplied by the CFR. That means the parts of the CFR with low magnitude strongly reduce the power of the transmitted signal. When the transmitted signal power is reduced, the SNR becomes lower and the system performance is degraded. As shown in Figure 2-14, as the RMS delay spread is increased from 25ns to 100ns, the frequency-selective fading becomes deeper, from -3.3dB to -12dB.

To analyze the multipath channel for the UWB system, the CM channel is used as an example. The CIR and CFR of the CM channel for UWB are shown in Figure 2-15 and Figure 2-16. As shown in Figure 2-15, the CM channels CM1 ~ CM4 have maximum delay spreads of 105ns ~ 360ns. The CIR duration exceeds the zero-pad duration, so it causes serious ISI in the baseband signal. As shown in Figure 2-16, the deepest frequency-selective fading of the CM1 ~ CM4 channels is -5.5dB ~ -20dB. The complexity of the CM channels is ordered as CM4 > CM3 > CM2 > CM1. Through the multipath channel the signal is distorted, and channel estimation and equalization are needed to remove the distortion. They are used to resolve the serious ISI and to recover the frequency-domain signal from the frequency-selective fading.

To add a practical time-variant characteristic to the applied channel model, the Doppler effect can also be modeled according to Jakes' Doppler spectrum and the time-variant multipath modeling method [21, 46].
First the Doppler frequency can be derived as

$$f' = f_0 \frac{v \pm v_r}{v \pm v_t}, \qquad f_D = f' - f_0 = f_0 \frac{\pm v_r \mp v_t}{v \pm v_t} \tag{2-3}$$

Figure 2-15: Examples of CM channel impulse response

Figure 2-16: Examples of CM channel frequency response

Where f' is the changed RF center frequency, $f_0$ is the original RF center frequency, $v$ is the velocity of light, $v_r$ is the velocity of the receiver, $v_t$ is the velocity of the transmitter, and $f_D$ is the resulting CFO, also called the Doppler frequency. In a WLAN system where each user moves at 5km/hr, the Doppler frequency is $5 \text{GHz} \times 2 \times 5000/3600/(3 \times 10^8) \approx 46 \text{Hz}$. Since the moving direction and the transmission direction are not always the same, the velocity in the transmission direction is multiplied by a cosine function. According to Jakes' Doppler spectrum, the practical frequency offset of each transmitted path can be approximated as

$$\Delta f = f' - f_0 = f_0 \frac{\left(\pm v_r \mp v_t\right) \times \cos \theta}{v \pm v_t \times \cos \theta} \approx f_D \times \cos \theta \tag{2-4}$$

Where $\theta$ is the angle between the direction of movement and the direction of transmission. In a multipath channel any $\theta$ is possible, so it is uniformly distributed. Then, according to the multipath channel modeling method with Doppler effect [46], a different practical frequency offset is added to each path of the multipath channel, and the CIR and CFR become time-variant. The time-variant CFR with 50Hz Doppler frequency in the WLAN system has been shown in Figure 1-5.

#### 2-4-2 Baseband to Passband Conversion and RF Filtering

For a multi-band system, the baseband to passband conversion is needed to correctly simulate the band hopping condition and the adjacent channel interference from neighboring bands.
The up-conversion of the baseband signal can be written as

$$PB(t) = BB(t) \times \exp(j2\pi f_0 t) \;\;\Longrightarrow\;\; PB(f) = BB(f - f_0) \tag{2-5}$$

Where PB(t) is the passband signal, BB(t) is the baseband signal, and $f_0$ is the center frequency of the band. However, in a baseband platform based on the C language or Matlab, the signal is a discrete-time signal, and the observable frequency range is limited to one signal bandwidth. Hence up-sampling is needed before the up-conversion to passband. A simple example of up-converting the baseband signal to the passband signal is shown in Figure 2-17.

Figure 2-17: A simple example to up-convert the signal from baseband to passband

In this example, the original baseband signal consists of 8 taps in $0\sim8$ ns, and the frequency-domain signal is limited to a 1GHz signal bandwidth (-0.5GHz $\sim$ +0.5GHz). The observable frequency range is equal to the signal bandwidth. After 4 times up-sampling, the time-domain signal consists of 8x4=32 taps in $0\sim8$ ns, and the observable frequency range is extended from $\pm0.5$ GHz to $\pm2$ GHz, because up-sampling by 4 raises the sample rate by the same factor. After the observable frequency range is extended, the up-conversion can proceed. The up-conversion of the discrete-time signal can be written as

$$PB(n) = BB(n) \cdot \exp(j2\pi n f_0/N) \tag{2-6}$$

Where PB(n) is the passband signal, BB(n) is the up-sampled baseband signal, $f_0$ is the targeted RF center frequency, and N is the up-sampling rate. After the up-conversion, the time-domain signal is multiplied by the sinusoid and the frequency-domain signal is shifted to the passband. As shown in Figure 2-17, since $f_0$ in (2-6) is set to 1GHz, the frequency-domain signal is finally shifted to the position with 1GHz center frequency.

In the channel model, RF filters need to be added to simulate the ISI effect caused by practical filters.
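As a brief aside on the discrete-time up-conversion (2-6) described above, the 1GHz example can be reproduced with a minimal sketch. It assumes that N in (2-6) denotes the sample rate after up-sampling (4GHz here), uses simple sample repetition as a stand-in for the interpolation, and takes a constant (DC) baseband signal so that the shifted spectrum is a single line; all three choices are illustrative assumptions.

```python
import cmath


def upsample_hold(x, factor):
    """Raise the sample rate by repeating each sample (a crude interpolator)."""
    return [s for s in x for _ in range(factor)]


def upconvert(bb, f0_hz, fs_hz):
    """Multiply by exp(j*2*pi*n*f0/fs), following (2-6) with N taken as fs."""
    return [s * cmath.exp(2j * cmath.pi * n * f0_hz / fs_hz)
            for n, s in enumerate(bb)]


def dft(x):
    """Plain O(n^2) DFT, sufficient for a 32-point illustration."""
    n_pts = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / n_pts)
                for n in range(n_pts))
            for k in range(n_pts)]


FS_BB = 1e9          # baseband sample rate: 8 taps in 8 ns
UP = 4               # up-sampling factor from the example
F0 = 1e9             # target center frequency from the example

bb = [1.0 + 0j] * 8                      # hypothetical DC baseband signal
pb = upconvert(upsample_hold(bb, UP), F0, FS_BB * UP)

spectrum = dft(pb)
peak_bin = max(range(len(spectrum)), key=lambda k: abs(spectrum[k]))
peak_freq_hz = peak_bin * FS_BB * UP / len(spectrum)
print("samples:", len(pb), "peak at", peak_freq_hz / 1e9, "GHz")
```

The spectral line of the DC input moves from bin 0 to the bin that corresponds to $f_0$, which would not be representable at the original 1GHz sample rate.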
Besides the ISI effect, the RF filters are also used to simulate the signal behavior of the MB hopping system. The power of the received signal with and without RF filtering simulation is shown in Figure 2-18.

Figure 2-18: Received baseband signal power with RF filtering in MB-OFDM system

From Figure 2-18, the signal power with and without RF filtering is clearly different. In the MB-OFDM system, the received signal consists of signals carried in band #1, band #2, and band #3. If the baseband receiver controls the RF to down-convert the whole signal with the band #1 center frequency, the power of the received baseband signal originally carried in band #2 and band #3 is removed by the RF LPF. That is because the signal originally carried in band #2 and band #3 falls outside the baseband after being down-converted with the band #1 center frequency. Through the RF LPF, the signal power is then reduced as shown in the upper plot of Figure 2-18. This effect can be used to find the correct band boundary and band ID in the synchronization design. If the RF filtering effect is not simulated, the signal power is not influenced whether the band control is correct or not, as shown in the lower plot of Figure 2-18. Without the RF filtering effect, the channel model simulation does not reflect the behavior of a real wireless system. An example of an LPF of the UWB RF receiver is shown in Figure 2-19.

Figure 2-19: A LPF of UWB RF receiver

To cover the 528MHz bandwidth of the UWB system, the 3dB bandwidth of the RF LPF is about 660MHz. Since the DC offset of an OFDM-based system should be as low as possible, the RX LPF has about -40dB gain at frequency = 0. Through the RF LPF, the practical signal behavior under band mismatch can be simulated and used for band control and synchronization.

#### 2-4-3 Carrier Frequency Offset and Carrier Phase Noise

The CFO is caused by the mismatch in RF local frequency between the transmitter and the receiver.
Another cause of the CFO is the Doppler frequency. The block diagram with CFO is shown in Figure 2-20.

Figure 2-20: Block diagram with CFO

As shown in Figure 2-20, the CFO ($\Delta f$) exists between the transmitter and the receiver. The baseband signal and passband signal of the transmitter can be written as

$$SB_T(t) = A \cdot \exp[j \cdot m(t)] \tag{2-7}$$

$$SP_T(t) = SB_T(t) \cdot \exp(j \cdot 2\pi f t) = A \cdot \exp[j \cdot m(t) + j \cdot 2\pi f t] \tag{2-8}$$

Where $SB_T(t)$ is the baseband signal, m(t) is the transmitted message, $SP_T(t)$ is the passband signal, and f is the local frequency of the RF transmitter. As shown in Figure 2-20, the local frequency of the RF receiver is $(f + \Delta f)$. So the received signal can be written as

$$\begin{aligned}
SP_R(t) &= A \cdot \exp\{j \cdot [m(t) + 2\pi f t]\} \\
&= A \cdot \exp\{j \cdot [m(t) + 2\pi (f + \Delta f) t - 2\pi \Delta f t]\} \\
&= A \cdot \exp\{j \cdot [m(t) + 2\pi (f + \Delta f) t + \Delta \theta]\}
\end{aligned} \tag{2-9}$$

Where $SP_R(t)$ is the passband signal of the receiver, equal to $SP_T(t)$, and $\Delta\theta$ is the phase rotation, equal to $-2\pi\Delta ft$. After down-conversion with the local frequency $f+\Delta f$, the baseband signal can be derived as

$$\begin{aligned}
SB_R(t) &= SP_R(t) \cdot \exp[-j \cdot 2\pi (f + \Delta f) t] \\
&= A \cdot \exp[j \cdot (m(t) + \Delta \theta)] = SB_T(t) \cdot \exp(-j2\pi\Delta ft)
\end{aligned} \tag{2-10}$$

Where $SB_R(t)$ is the baseband signal of the receiver. So the CFO causes the phase rotation $-2\pi\Delta ft$, which is linear in time, in the baseband. Another way to derive the CFO distortion is

$$\begin{aligned}
SB_{R}(t) &= \frac{A}{T} \int_{0}^{T} \cos(2\pi f t + m(t)) \cdot \exp\left[-2\pi j (f + \Delta f)t\right] dt \\
&= A \cdot \exp\left[j \cdot m(t)\right] \cdot \exp\left[-j2\pi \Delta f t\right] = SB_{T}(t) \cdot \exp\left[-j2\pi \Delta f t\right]
\end{aligned} \tag{2-11}$$

Therefore the CFO causes a linear phase rotation. Besides the mismatch of the TX/RX RF, another cause of CFO is the Doppler effect in a mobile system.
The 50Hz Doppler frequency for the WLAN system increases the CFO by at most $\pm 0.01$ ppm. In the frequency domain, the CFO causes an increase of the mean phase and inter-carrier interference (ICI). According to [12], the frequency-domain signal with CFO can be derived as

$$\begin{aligned}
Y(k) &= (X_k H_k) \cdot \left\{ \frac{\sin \pi \varepsilon}{N \sin(\pi \varepsilon / N)} \right\} \cdot e^{j\pi \varepsilon (N-1)/N} \\
&\quad + \sum_{\substack{l=-K \\ l \neq k}}^{K} (X_l H_l) \cdot \left\{ \frac{\sin(\pi \varepsilon)}{N \sin(\pi (l-k+\varepsilon)/N)} \right\} \cdot e^{j\pi \varepsilon (N-1)/N} \cdot e^{-j\pi (l-k)/N} \\
&= Y_S(k) + Y_{ICI}(k)
\end{aligned} \tag{2-12}$$

Where Y(k) is the frequency-domain signal in the receiver, $X_k$ is the original transmitted subcarrier signal, $H_k$ is the CFR at frequency position k, N is the FFT size, $\varepsilon$ is the CFO, $Y_S(k)$ is the received component proportional to $X_k$, and $Y_{ICI}(k)$ is the component contributed by the other subcarriers. As the CFO becomes higher, the power of the sinc-like function $\left\{\frac{\sin(\pi\varepsilon)}{N\sin(\pi(l-k+\varepsilon)/N)}\right\}$ becomes higher and the ICI becomes more serious. An example of the signal distortion caused by CFO of 20ppm and 0.4ppm in the IEEE 802.11a system is shown in Figure 2-21. When the CFO is equal to 20ppm, it causes a rapid linear phase rotation in the time domain, and in the frequency domain the resulting ICI blurs the correct signal. So frequency synchronization is needed to remove the CFO in the time domain, using the characteristic of the linear phase rotation. The CFO distortion in the frequency domain is thereby reduced. Assume the CFO is reduced to 0.4ppm after the synchronization. As shown in Figure 2-21, with 0.4ppm CFO the phase rotation in the time domain becomes small, and in the frequency domain the ±5dB power distortion seen with 20ppm CFO is reduced to about 0dB. However, the mean phase of the frequency-domain subcarriers still drifts linearly from symbol to symbol.
That is because the 0.4ppm residual CFO still causes a small phase rotation in the time domain. This makes the initial phase of each FFT symbol grow with time. So in the frequency domain, the mean phase of the subcarriers grows as the OFDM symbol number increases. Therefore, phase error tracking is needed to track and compensate the phase error in the frequency domain [9, 14].

In a practical RF circuit, the local oscillator has thermal noise, and the CFO is not constant under variation of supply voltage and temperature. Hence the carrier phase noise (CPN) needs to be considered [33, 34, 35, 36]. An example of the phase noise distribution in the frequency domain is shown in Figure 2-22. According to the CPN distribution, the CFO can be modeled with a time-variant drift. The specification of the simulated CPN, according to the state of the art, is listed in Table 2-14.

Figure 2-21: Signal distortion with CFO of 20ppm and 0.4ppm of 5GHz

Figure 2-22: Phase noise spectrum of CFO model

| | CPN @ 100kHz CFO | CPN @ 1MHz CFO |
|---------------------------|------------------|----------------|
| 5GHz WLAN [33,34] | -100dBc/Hz | -120dBc/Hz |
| 3.1 ~ 10.6GHz UWB [35,36] | -100dBc/Hz | -110dBc/Hz |

Table 2-14: Specification of carrier phase noise

#### 2-4-4 Sampling Clock Offset

Like the CFO, a clock frequency offset exists between the transmitter and the receiver because of mismatch in clock source, supply voltage, and temperature. The clock frequency offset makes the signal drift in the time domain. In a C or Matlab-based platform, the difficulty in simulating the clock offset is to produce a signal drift with a fine offset such as 20ppm, which equals 1/50000. One method is to use up-sampling to find the drifted signal. Assume the clock period T is M times the clock offset $\Delta T$. With the up-sampling method we need to up-sample by M times and then find the drifted signal.
This method can be modeled as

$$Y(n\Delta T) = \begin{cases} X(nT/M), & \text{if } n/M \text{ is an integer} \\ \text{interpolation result of } X(nT), & \text{otherwise} \end{cases}$$

$$X[n(T - \Delta T)] = Y[\Delta T(nM - n)] \tag{2-13}$$

Where $Y(n\Delta T)$ is the up-sampled signal with time resolution $\Delta T$, X(nT) is the original signal with time resolution T, and $X[n(T-\Delta T)]$ is the signal with clock offset $\Delta T$. Although the signal with clock offset can be found directly by this method, the up-sampling linearly increases the amount of data processed by the system platform, and the simulation time increases rapidly. To solve the problem, a clock offset model based on the RF filters is used. Assume an RF filter with length (2N+1)T is used; the signal with clock offset can then be derived as

$$\begin{aligned}
Y(nT) &\approx X(nT) \otimes h(nT) \Big|_{-NT \le nT < NT} \\
Y(nT - \Delta T) &= X(nT) \otimes h(nT - \Delta T) \Big|_{-NT \le nT - \Delta T < NT} \\
&= \sum_{k=-N}^{N} X(nT - kT) \cdot h(kT - \Delta T)
\end{aligned} \tag{2-14}$$

Where T is the clock period, Y(nT) is the filter output, X(nT) is the filter input, $\Delta T$ is the clock offset, and $Y(nT-\Delta T)$ is the filter output with clock offset. In (2-14), $h(nT-\Delta T)$ represents the change of the signal caused by the timing drift. We only need to evaluate $h(kT-\Delta T)$ instead of generating the large amount of up-sampled signal. An example of a signal passed through the clock offset model (2-14) is shown in Figure 2-23. To show the timing drift clearly, the signal is also drawn with 8 times oversampling. As shown in Figure 2-23, since the clock frequency offset exists, the signal with clock offset gradually moves away from the original signal.

Figure 2-23: An example of oversampled signal with clock offset

In an OFDM-based system, the clock offset causes a linear phase rotation between the subcarriers.
The linear phase rotation caused by the clock offset can be derived with the discrete Fourier transform (DFT) as

$$\begin{aligned}
Y[k, \Delta T] &= \sum_{n=0}^{N-1} Y(nT - \Delta T) \cdot e^{-j\frac{2\pi k nT}{N}} \\
&= \left[ \sum_{n'=0}^{N-1} Y[n'T] \cdot e^{-j\frac{2\pi k n'T}{N}} \right] \cdot e^{-j\frac{2\pi k \Delta T}{N}} = Y[k,0] \cdot e^{-j\frac{2\pi k \Delta T}{N}}
\end{aligned} \tag{2-15}$$

Where $Y[k, \Delta T]$ is the frequency-domain subcarrier with clock phase offset $\Delta T$, $Y(nT-\Delta T)$ is the time-domain signal with clock phase offset, and N is the FFT size. As (2-15) shows, the phase rotation $2\pi k\Delta T/N$, which is linear in the frequency position k, appears because of the clock phase offset $\Delta T$. An example of the frequency-domain signal distortion with 20ppm clock frequency offset is shown in Figure 2-24.

Figure 2-24: Frequency-domain signal distortion caused by 20ppm clock frequency offset

As shown in Figure 2-24, within each OFDM symbol there is a phase rotation that is linear in both the frequency position and the clock phase offset. The slope of the phase rotation is proportional to the clock phase offset. Since the clock phase offset is the integral of the clock frequency offset over time, the slope of the phase rotation increases as the OFDM symbol number increases. Therefore the phase-error tracking design is needed to track and compensate the phase error in the frequency domain.

#### 2-4-5 AWGN Channel

The AWGN channel is used to simulate the thermal noise of the RF circuit. It is noise with a normal distribution. In the Matlab platform, we use the normalized random function "randn()" to generate the AWGN. The function, written as Matlab code, is

$$N = 10^{((S-SNR)/10)};$$

$$AWGN = \mathrm{sqrt}(N) * [\, \mathrm{randn}(1, \mathrm{length}(AWGN)) + j*\mathrm{randn}(1, \mathrm{length}(AWGN)) \,]; \quad (2\text{-}16)$$

Where S is the signal power in dB, SNR is the signal to noise ratio in dB, and N is the noise power. Therefore the AWGN, formatted as a matrix of size 1 x length(AWGN), is generated.
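The generation step in (2-16) can be sketched outside Matlab as well. The Python version below uses `random.gauss` in place of `randn()`, and the function name `awgn` and its parameters are hypothetical. It also makes one explicit assumption about normalization: the noise power N is split equally between the real and imaginary parts (standard deviation sqrt(N/2) each), so that the total complex noise power equals N, whereas (2-16) as printed scales both parts by sqrt(N).

```python
import math
import random


def awgn(num_samples, signal_power_db, snr_db, rng):
    """Complex Gaussian noise whose total power is 10^((S - SNR)/10).

    Assumption: the power is split equally between the I and Q parts.
    """
    noise_power = 10 ** ((signal_power_db - snr_db) / 10)
    sigma = math.sqrt(noise_power / 2)
    return [complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
            for _ in range(num_samples)]


rng = random.Random(0)                      # fixed seed for repeatability
noise = awgn(20000, signal_power_db=0.0, snr_db=10.0, rng=rng)

# Measure the generated noise power and the SNR it implies for a 0 dB signal.
measured_power = sum(abs(v) ** 2 for v in noise) / len(noise)
measured_snr_db = 10 * math.log10(1.0 / measured_power)
print("measured SNR (dB):", round(measured_snr_db, 2))
```

Measuring the power of the generated sequence, as done in the last lines, is a quick sanity check before the noise is used in an error-rate simulation.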
To verify the generated AWGN, we simulate the symbol error rate (SER) of QPSK modulation with the AWGN function. The simulated result, compared with a reference result [22], is shown in Figure 2-25. As shown in Figure 2-25, the SER with the proposed AWGN generation overlaps that of the reference textbook. That means the generated AWGN can be used for system simulation and performance analysis.

Figure 2-25: SER with the proposed AWGN generation method

### Chapter 3:

## Low-Complexity Design for OFDM-Based WLAN System

In this chapter we first propose a low-complexity timing and frequency synchronization in which the design complexity is efficiently reduced while high system performance is kept. We then propose the high-performance channel equalization (EQ), channel tracking scheme, and phase error tracking (PET) design to handle the deep multipath fading, time-variant fading, and residual CFO and SCO in the OFDM-based WLAN system.

As discussed in chapter 1, the baseband design dominates the power consumption of the PHY system because of the high complexity of the OFDM technique. Hence we develop an efficient complexity-reduction scheme for synchronization. In timing and frequency synchronization the kernel designs are the auto-correlation (AC) and the matched filter (MF). The design complexity of the AC and MF is dominated by the number of complex multipliers and the memory size [2, 28, 30]. To achieve the optimal trade-off between system performance and synchronization complexity, we propose the high-power-signal-used (HPSU) AC and the high-power-coefficient-used (HPCU) MF design. In the proposed synchronization the amount of signal used is reduced by 50% ~ 75%, so the number of multiplications and the memory size of the synchronization are efficiently reduced.
To efficiently reduce the equalization error and track the time-variant channel, we propose the decision-directed channel tracking (DDCT) and employ the frequency-domain minimum mean-square-error (FD-MMSE) EQ for the WLAN system. The weighted-average phase error tracking (WAPET) is also developed to remove the phase error caused by residual CFO and SCO. The employed FD-MMSE EQ reduces the SNR required for 10% PER by 1.8dB compared with the conventional least-square (LS) EQ, and the DDCT reduces the SNR loss by 0.8~1.9dB when the Doppler frequency is 50Hz. The proposed WAPET reduces the SNR loss by 0.6~2.3dB compared with other referenced approaches.

The proposed synchronization, EQ, CE, and WAPET algorithms are developed based on the trade-off between performance and complexity, so the performance analysis is crucial. Hence we also discuss the system performance of the proposed floating-point designs for OFDM-based WLAN. The key performance measures, including PER, frame error rate (FER), and CFO estimation root-mean-square error (RMSE), are shown. The proposed designs are introduced in section 3-1 and section 3-2. The related performance and the system PER performance are shown in section 3-3 and section 3-4.

# 3-1 Low-Complexity Synchronization for OFDM-Based WLAN System

In the WLAN system the preamble has non-constant signal power, and the high-power signal is more robust to the channel effects. In this section we propose an HPSU AC and an HPCU MF to achieve a good trade-off between low complexity and high performance. By using only the high-power signals, the amount of signal used is reduced, and the main synchronization cost, including the multiplications for correlation and the memory to store the used signal, is reduced linearly. Although the synchronization performance is degraded when less signal is used, it can be kept with only a small loss by using the high-power signals.
After the design introduction, the trade-off between design complexity and performance will be discussed in section 3-3.

### 3-1-1 Synchronization Block Diagram for OFDM-Based WLAN System

In the baseband receiver, the synchronization is used to detect a valid packet, detect the correct FFT window, and estimate and then compensate the CFO distortion. The block diagram of the baseband synchronization for the OFDM-based WLAN system is shown in Figure 3-1, and the preamble format that can be used for synchronization is shown in Figure 3-2. At the start of signal reception, the automatic gain control (AGC) detects the RF gain error and feeds the VGA control signal back to the RF. After the AGC succeeds, the AC begins the packet detection (PD) using the correlation power of the received signal. As shown in Figure 3-2, the PD needs to find the preamble before the end of the short symbols. At the same time the AC also performs the coarse CFO estimation using the correlation phase, which can estimate a CFO within $\pm 625$ kHz, equal to $\pm 125$ ppm of the 5GHz RF frequency. After a successful PD, the MF begins the FFT-window detection (FWD) to find the boundary of the FFT window using the MF power. In the meantime the AC also performs the fine CFO estimation, which covers a CFO within $\pm 156.25$ kHz, equal to $\pm 31.25$ ppm of the RF frequency. The standard requires a CFO within $\pm 20$ ppm in each of the transmitter and the receiver. Hence the total CFO range is -40ppm $\sim +40$ ppm (TX+RX), which exceeds the range of the fine CFO estimation, and the coarse CFO estimation is still needed. After the FWD succeeds and the CFO error is compensated in the complex multiplier, the long symbols and the following Header and data OFDM symbols are sent to the FFT design for channel estimation and data demodulation.
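The ranges quoted above can be reproduced with a few lines of arithmetic. The sketch below assumes the usual limit of a correlation-phase estimator, 1/(2 x repetition period), which the document states later as (3-6), together with the 0.8μs and 3.2μs repetition periods of the short and long symbols; the function names are hypothetical.

```python
def cfo_range_hz(repetition_period_s):
    """Largest |CFO| that a correlation-phase estimator resolves: 1/(2*period)."""
    return 1.0 / (2.0 * repetition_period_s)


def hz_to_ppm(freq_hz, carrier_hz):
    """Express a frequency offset relative to the carrier in ppm."""
    return freq_hz / carrier_hz * 1e6


CARRIER_HZ = 5e9                            # 5 GHz RF frequency from the text

coarse_hz = cfo_range_hz(0.8e-6)            # short symbols repeat every 0.8 us
fine_hz = cfo_range_hz(3.2e-6)              # long symbols repeat every 3.2 us
coarse_ppm = hz_to_ppm(coarse_hz, CARRIER_HZ)
fine_ppm = hz_to_ppm(fine_hz, CARRIER_HZ)

worst_case_ppm = 20.0 + 20.0                # +/-20 ppm at TX plus +/-20 ppm at RX
needs_coarse = worst_case_ppm > fine_ppm    # fine range alone is too narrow

print("coarse:", coarse_hz / 1e3, "kHz =", coarse_ppm, "ppm")
print("fine:  ", fine_hz / 1e3, "kHz =", fine_ppm, "ppm")
print("coarse stage required:", needs_coarse)
```

The worst-case 40ppm offset lies between the fine and coarse ranges, which is why both estimation stages appear in the block diagram.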
Figure 3-1: Block diagram of baseband synchronization for OFDM-based WLAN system

Figure 3-2: Preamble format of IEEE 802.11a system

### 3-1-2 General Auto-Correlation-Based PD

The auto-correlation (AC) is widely used for packet detection and CFO estimation in OFDM-based wireless systems [10, 13, 24, 26, 27, 28, 29, 41]. That is because the initial part of the preamble usually consists of repeated symbols. The repeated symbols produce a peak in the AC power and an AC phase that is linear in the CFO. The AC can be written as

$$\Lambda_{AC}(m) = \sum_{p=0}^{L_S-1} \sum_{n=0}^{D_S-1} r_{(m+p)D_S-n} \cdot r^*_{(m+p+1)D_S-n} \tag{3-1}$$

Where $\Lambda_{AC}(m)$ is the AC, m is the index of the starting symbol, $r_{(m+p)D_S-n}$ is the received signal of the n-th cycle of the (m+p)-th repeated symbol, $D_S$ is the number of cycles in the period of one repeated symbol, and $L_S$ is the number of repeated symbols used. In the short symbols, $D_S$ is equal to 16 and $L_S$ is less than 10. To illustrate the effect of the AC for PD, an example of the AC power $|\Lambda_{AC}(m)|^2$ for a channel with SNR = 20dB and $L_S$ = 1 is shown in Figure 3-3. When a valid packet arrives the AC power rises clearly, since the periodic short symbols produce a high AC power. The synchronization is then informed that a valid packet is being received.

Figure 3-3: Example of the auto-correlation power of the packet detection

Based on the AC function, the PD can be described as

$$\left| \Lambda_{AC}(m) \right|^2 \ge \lambda_{AC} \times P_{m+1}^2 \tag{3-2}$$

Where $\lambda_{AC}$ is the threshold set to distinguish the AC power of valid short symbols from that of noise, and $P_{m+1}$ is the average signal power of the (m+1)-th OFDM symbol. So (3-2) is equivalent to comparing the normalized AC power $|\Lambda_{AC}(m)|^2 / P_{m+1}^2$ with the threshold.
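A minimal sketch of the detection rule (3-1) and (3-2) is given below. It departs from the text in several labeled ways: the correlation is re-indexed per sample (a sliding window starting at sample d) instead of per symbol, the correlation is averaged over the window so that the normalized metric stays at or below 1, the repeated symbol is a hypothetical unit-modulus chirp rather than the real IEEE 802.11a short symbol, and the threshold 0.5 and $L_S$ = 2 are arbitrary illustrative choices.

```python
import cmath
import random

DS = 16      # samples per repeated short symbol (from the text)
LS = 2       # number of repeated symbols accumulated (hypothetical choice)


def ac_metric(r, d, ds=DS, ls=LS):
    """Return (|AC|^2, P^2) for the window that starts at sample d."""
    span = ds * ls
    ac = sum(r[d + n] * r[d + n + ds].conjugate() for n in range(span)) / span
    power = sum(abs(r[d + n + ds]) ** 2 for n in range(span)) / span
    return abs(ac) ** 2, power ** 2


def detect_packet(r, threshold, ds=DS, ls=LS):
    """First sample index where |AC|^2 >= threshold * P^2, or None."""
    last = len(r) - ds * (ls + 1)
    for d in range(last + 1):
        ac2, p2 = ac_metric(r, d, ds, ls)
        if ac2 >= threshold * p2:
            return d
    return None


rng = random.Random(1)


def noise(count, sigma=0.1):
    return [complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
            for _ in range(count)]


# Hypothetical repeated symbol: a unit-modulus chirp, repeated 10 times.
short_symbol = [cmath.exp(1j * cmath.pi * n * n / DS) for n in range(DS)]
preamble = short_symbol * 10

# 64 noise-only samples followed by the noisy preamble.
r = noise(64) + [s + w for s, w in zip(preamble, noise(len(preamble)))]
start = detect_packet(r, threshold=0.5)
print("packet detected at sample", start, "(preamble begins at 64)")
```

Because the window spans two repetition periods, the metric crosses the threshold slightly before the window lies fully inside the preamble, which matches the requirement that detection finish well before the short symbols end.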
### 3-1-3 Proposed High-Power-Signal-Used Auto-Correlation for PD

To develop the low-complexity AC algorithm, we first analyze the signal format of the short symbols. There are 16 points in a short symbol, and the signal power within a short symbol is not constant. It has a 2dB peak to average power ratio (PAPR) and a 7dB maximum to minimum power ratio. The short symbol power and the noise power at an average SNR of 3dB are shown in Figure 3-4. In the channel, the low-power signal is more easily distorted by noise. From Figure 3-4, the signal power at symbol index = 1, 3, 5, 7 is lower than the noise power. Hence the signal distortion at these indices is more serious, and the accuracy of the AC is also degraded by the low-power samples. The AC power obtained from the 50% high-power signal and from the 50% low-power signal is shown in Figure 3-5.

Figure 3-4: Short symbol power and noise power with 3dB SNR

Figure 3-5: Auto-correlation power of high-power and low-power signal

As shown in Figure 3-5, the AC power when the packet arrives follows the order high-power signal > all signal > low-power signal. The AC of all signal is lower than that of the high-power signal because the low-power signal, which is included in all signal, degrades the AC power. If only the low-power signal is used, the AC power is degraded and the PD accuracy is reduced. To reduce the amount of signal used and keep the design performance at the same time, we extract the high-power signal for the AC. According to this analysis, the proposed HPSU AC algorithm can be written as

$$\Lambda_{AC}(m) = \sum_{p=0}^{L_S-1} \sum_{n=0}^{D_S/\omega_{AC}-1} r_{(m+p)D_S-n} \cdot r^*_{(m+p+1)D_S-n} \tag{3-3}$$

Where $D_S$ is the number of cycles in the period of one repeated symbol, $\omega_{AC}$ is the reduction factor of the AC, $D_S/\omega_{AC}$ is the number of high-power samples used per repeated symbol, and n is the index of the high-power signal.
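The idea behind (3-3) can be sketched as follows. The example assumes that the receiver ranks the sample positions by the power of a known reference symbol and keeps the top $D_S/\omega_{AC}$ of them; that selection rule, the function names, and the alternating-amplitude symbol are hypothetical illustrations and not the actual IEEE 802.11a short symbol.

```python
import cmath


def high_power_indices(symbol, omega):
    """Positions of the len(symbol)/omega highest-power samples, ascending."""
    keep = len(symbol) // omega
    ranked = sorted(range(len(symbol)),
                    key=lambda n: abs(symbol[n]) ** 2, reverse=True)
    return sorted(ranked[:keep])


def partial_ac(r, start, ds, indices):
    """Auto-correlation over the selected positions only, in the spirit of (3-3)."""
    return sum(r[start + n] * r[start + n + ds].conjugate() for n in indices)


DS = 16
# Hypothetical symbol: amplitude 0.5 at even positions, 1.2 at odd positions.
symbol = [(1.2 if n % 2 else 0.5) * cmath.exp(1j * cmath.pi * n * n / DS)
          for n in range(DS)]
r = symbol * 3                              # noise-free repeated symbols

high = high_power_indices(symbol, omega=2)  # omega_AC = 2 keeps 50% of samples
low = [n for n in range(DS) if n not in high]

ac_high = partial_ac(r, 0, DS, high)
ac_low = partial_ac(r, 0, DS, low)

print("positions used:", high)
print("complex multiplications per symbol:", len(high), "instead of", DS)
print("AC from high-power half:", round(abs(ac_high), 2),
      "| from low-power half:", round(abs(ac_low), 2))
```

With $\omega_{AC}$ = 2 the number of complex multiplications per repeated symbol is halved, while most of the correlation energy is retained because it is concentrated in the high-power positions.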
When we want to use the 50% high-power signal, $\omega_{AC}$ is set to 2. When the reduction factor $\omega_{AC}$ is increased, the number of complex multiplications, the number of complex additions, and the size of the memory needed to store the used signal are all reduced. The reduction factor $\omega_{AC}$ of (3-3) needs to be chosen based on system performance to achieve low complexity and high performance. The performance analysis is discussed in section 3-3-1. One way to find the high-power signal is the even/odd method, which finds the 50% high-power signal. As shown in Figure 3-4, the high-power signal appears at the even symbol indices (2, 4, 6, 8) and the others are low-power. Hence we can detect the average power of the even samples and the odd samples of the received signal. The advantage of the even/odd method is that the comparison complexity is lower than that of extracting the true high-power signal.

#### 3-1-4 General Auto-Correlation-Based CFO Estimation

The AC is also widely used for CFO estimation [10, 24, 26, 27, 28]. Below we first discuss the features of CFO estimation for the OFDM-based WLAN system, and then analyze the estimation effect of the proposed HPSU design. In a wireless system, the CFO exists because of the RF mismatch between the transmitter and the receiver. Usually the baseband design is required to tolerate a CFO of $\pm 20 \sim \pm 25$ ppm. Since the CFO causes a linear phase error, the CFO is proportional to the phase of the AC of two repeated symbols. Hence the AC-based CFO estimation [10, 24, 27, 28, 31] can be written as

$$\hat{\epsilon} = \frac{1}{2\pi D_S T} \tan^{-1} \left\{ \mathrm{Im} \left[ \Lambda_{AC}(m) \right] / \mathrm{Re} \left[ \Lambda_{AC}(m) \right] \right\} \tag{3-4}$$

Where $\hat{\epsilon}$ is the estimated CFO, $D_S T$ is the period of one repeated symbol, and $\Lambda_{AC}(m)$ is the AC result. After the CFO estimation, the compensation can be done in the baseband receiver.
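The estimator (3-4) can be sketched as follows. The example follows the sign convention of (2-10), where the CFO multiplies the baseband signal by $\exp(-j2\pi\Delta f t)$, and assumes a 50ns sample period (20MHz sampling, so that $D_S T$ = 0.8μs); the repeated symbol is a hypothetical chirp and the function name is illustrative. `cmath.phase` is used as a four-quadrant arc-tangent.

```python
import cmath
import math


def estimate_cfo_hz(r, start, ds, sample_period_s):
    """CFO estimate from the phase of the auto-correlation, as in (3-4)."""
    ac = sum(r[start + n] * r[start + n + ds].conjugate() for n in range(ds))
    return cmath.phase(ac) / (2 * math.pi * ds * sample_period_s)


DS = 16                  # samples per repeated symbol
T = 50e-9                # assumed sample period (20 MHz sampling)
CFO_HZ = 200e3           # 40 ppm of a 5 GHz carrier

# Hypothetical repeated symbol (unit-modulus chirp), two repetitions.
symbol = [cmath.exp(1j * cmath.pi * n * n / DS) for n in range(DS)]
clean = symbol * 2

# Apply the CFO as a linear phase rotation exp(-j*2*pi*df*k*T), as in (2-10).
r = [s * cmath.exp(-2j * math.pi * CFO_HZ * k * T) for k, s in enumerate(clean)]

cfo_est = estimate_cfo_hz(r, 0, DS, T)
print("estimated CFO (kHz):", round(cfo_est / 1e3, 3))
```

In this noise-free case the estimate matches the applied offset; offsets beyond 1/(2 $D_S T$) would wrap around because the arc-tangent output is limited to $\pm\pi$.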
The CFO compensation can be derived as

$$\hat{r}_k = r_k \cdot \exp\left(j2\pi \hat{\epsilon} kT\right) \tag{3-5}$$

where $r_k$ is the received signal, $\hat{\epsilon}$ is the estimated CFO, T is the sample period, $\exp(j2\pi \hat{\epsilon} kT)$ is the compensating phasor, and $\hat{r}_k$ is the compensated signal.

In the IEEE 802.11a system the preamble consists of short and long symbols, and the CFO estimation comprises a coarse estimation for wide range and a fine estimation for high accuracy. The CFO estimation range can be derived as

$$\left|\hat{\epsilon}\right| \leq \frac{\pi}{2\pi NT} \tag{3-6}$$

where $\hat{\epsilon}$ is the estimated CFO, T is the duration of one received sample, and N is the number of points in one repeated symbol. The arc-tangent value in (3-4) is limited to $\pm \pi$ (a range of $2\pi$) because $\pm \pi + \theta$ and $\pm \pi + \theta + 2n\pi$ cannot be distinguished at the arc-tangent output. Hence the estimated CFO is limited to the range given by (3-6). The CFO estimation ranges of the short and long symbols are respectively $\pm \pi/(2\pi \cdot 0.8\mu s) = \pm 625 \text{kHz}$ (125ppm of the RF frequency) and $\pm \pi/(2\pi \cdot 3.2\mu s) = \pm 156.25 \text{kHz}$ (31.25ppm of the RF frequency). In the IEEE 802.11a system a CFO of ±20ppm exists in each of the transmitter and the receiver, so the total CFO is $\pm 40$ ppm (TX CFO + RX CFO). Since $\pm 40$ ppm exceeds the range of the fine estimation, the coarse estimation is needed, and the fine estimation is then used to achieve higher accuracy than the coarse estimation alone.

The phase error after CFO estimation and compensation under 40ppm CFO and 10dB SNR is shown in Figure 3-6. Before CFO compensation, the phase error caused by the 40ppm CFO rotates rapidly between $\pm \pi$. After coarse CFO estimation with a 5ppm CFO error, the phase error is suppressed but still grows noticeably. This residual CFO will cause a serious ICI effect in the frequency domain.
Through fine estimation and compensation with a -0.4ppm CFO error, the phase error is efficiently reduced to nearly zero. So the combination of coarse and fine estimation is needed to resolve the CFO distortion of the OFDM-based WLAN system.

Figure 3-6: An example of phase error after CFO estimation in WLAN system

### 3-1-5 Proposed High-Power-Signal-Used Auto-Correlation-Based CFO Estimation

As with the PD, the AC-based CFO estimation can also be made low-complexity. The phase error after CFO estimation with all signal, the 50% high-power signal, and the 50% low-power signal of each short and long symbol is shown in Figure 3-7. As shown there, the 50% high-power signal gives the higher estimation accuracy and therefore the lower phase error. Using the high-power signal for CFO estimation thus reduces the amount of signal and the design complexity while keeping the estimation accuracy.

Figure 3-7: Phase error of CFO estimation with all signal, high-power signal, and low-power signal

Similar to the AC of PD (3-3), the CFO estimation based on the proposed HPSU AC can be derived as

$$\hat{\epsilon} = \frac{1}{2\pi NT} \tan^{-1} \left\{ \frac{\mathrm{Im}\left[ \sum_{n=0}^{D_S/\omega_{AC}-1} r(n) \cdot r^{*}(n+N) \right]}{\mathrm{Re}\left[ \sum_{n=0}^{D_S/\omega_{AC}-1} r(n) \cdot r^{*}(n+N) \right]} \right\} \tag{3-7}$$

where $\omega_{AC}$ is the reduction factor. The number of complex multiplications and the memory size are therefore reduced from $D_S$ of (3-1) to $D_S/\omega_{AC}$. When the AC for PD and the AC for CFO estimation use the same amount of signal, the AC circuit can be shared to achieve a low hardware area. Hence, in the performance analysis that decides the $\omega_{AC}$ value, we assume the $\omega_{AC}$ values for PD (3-3) and CFO estimation (3-7) are the same.

### 3-1-6 General Matched-Filter-Based FWD

In an OFDM system, the signal within the correct FFT window needs to be sent to the FFT for OFDM demodulation.
The synchronizer needs to detect the correct FFT window within the preamble. For accurate FFT-window detection (FWD), the MF is widely used [27, 29, 57, 58]. It is also called the cross-correlation algorithm. The MF algorithm can be derived as

$$\Lambda_{MF}(n) = \sum_{k=0}^{L-1} r(n-k) \cdot c^{*}(k) \tag{3-8}$$

where $\Lambda_{MF}(n)$ is the MF output, r(n) is the received signal, c(k) is the MF coefficient, and L is the number of MF coefficients, also called the filter tap number. The MF coefficients are equal to the time-reversed long symbol. Hence a 64-tap matched filter is used for the IEEE 802.11a OFDM-based WLAN system. When the received signal is the same as the long symbol, the MF power reaches its maximum. An example of the MF power $|\Lambda_{MF}(k)|^2$ in the IEEE multipath channel [19] with line of sight (LOS) is shown in Figure 3-8. When the multipath channel has LOS, the received signal in the correct FFT window produces the maximum peak of the MF. At other timings the MF also has considerable output power because the signal is distorted by channel noise and multipath fading. The timing of the matched-filter peak power can be derived as

$$K_{peak} = \underset{k}{\operatorname{arg\,max}} \left\{ \left| \Lambda_{MF}(k) \right|^{2} \right\} \tag{3-9}$$

where $K_{peak}$ is the timing with peak power and $\Lambda_{MF}(k)$ is the MF output.

Channel Condition: 10dB SNR and multipath channel with 25ns RMS delay spread

Figure 3-8: An example of matched filter power

When the multipath channel impulse response has LOS, the timing with peak power can be taken as the correct FFT window boundary. When the channel impulse response has no LOS, the correct FFT window boundary is earlier than the timing with peak power, and the correct FFT window can be found with the sub-optimal timing location algorithm [27].

### 3-1-7 Proposed High-Power-Coefficient-Used Matched-Filter for FWD

Similar to the short symbol, the power of the long symbol is not constant either.
The signal power and the AWGN power at an average SNR of 3dB are shown in Figure 3-9. As shown there, the long symbol contains 64 points (the 64 coefficients of the MF), with a 3dB PAPR and a 13dB maximum-to-minimum power ratio. When AWGN is added, the low-power signal is seriously distorted and the MF power is therefore degraded.

Figure 3-9: The long symbol power and AWGN power in average 3dB SNR.

To understand how the MF power behaves with high-power and low-power signal (coefficients) under noise distortion, the power of MFs containing respectively the 25% high-power coefficients, the 75% low-power coefficients, and all 100% of the coefficients is shown in Figure 3-10. As shown there, the timing of the peak power with 100% of the coefficients is the same as that with the 25% high-power coefficients, whereas the peak of the MF with the 75% low-power coefficients is not distinct. That is, although the low-power coefficients outnumber the high-power ones, the MF power is still dominated by the high-power coefficients. Using only the high-power signal as MF coefficients therefore reduces the design complexity while keeping the FWD accuracy.

Figure 3-10: Example of the matched filter power of high-power and low-power coefficients

According to this behavior analysis of the MF design, the proposed low-complexity MF for FWD can be derived as

$$\Lambda_{MF}(n) = \left. \sum_{k=0}^{L/\omega_{MF}-1} r(n-k) \cdot c^{*}(k) \right|_{k\,=\,\text{index of high-power coefficient}} \tag{3-10}$$

where $\omega_{MF}$ is the reduction factor of the MF, $L/\omega_{MF}$ is the number of coefficients, and k is the index of the high-power coefficients. When $\omega_{MF}$ is larger than 1, the number of complex multiplications and additions of the MF is reduced from L to $L/\omega_{MF}$. In the hardware architecture, the number of parallel complex multipliers is likewise reduced to $1/\omega_{MF}$ of that in (3-8).
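
The following sketch compares the full matched filter of (3-8) with the reduced filter of (3-10) on a toy example. Everything specific here is an assumption for illustration: the 16-point `long_symbol` is synthetic (not the 64-point IEEE 802.11a long symbol), the channel is ideal and noise-free, and the helper names are hypothetical. It shows that keeping only the strongest quarter of the taps leaves the peak timing of (3-9) unchanged in this simple setting.

```python
import cmath

def high_power_taps(coeffs, omega_mf):
    """Indices of the len(coeffs)/omega_mf coefficients with the largest power."""
    keep = len(coeffs) // omega_mf
    ranked = sorted(range(len(coeffs)), key=lambda k: abs(coeffs[k]) ** 2, reverse=True)
    return sorted(ranked[:keep])

def mf_peak_timing(r, coeffs, taps):
    """Timing n that maximizes |sum_k r(n-k) c*(k)|^2 over the given taps."""
    length = len(coeffs)
    best_n, best_power = None, -1.0
    for n in range(length - 1, len(r)):
        acc = sum(r[n - k] * coeffs[k].conjugate() for k in taps)
        power = abs(acc) ** 2
        if power > best_power:
            best_n, best_power = n, power
    return best_n

L = 16
# Synthetic training symbol with 16 distinct amplitudes and a chirp phase.
long_symbol = [(0.2 + 0.1 * ((7 * j) % L)) * cmath.exp(1j * cmath.pi * j * j / L)
               for j in range(L)]
coeffs = long_symbol[::-1]                     # time-reversed symbol as MF coefficients
rx = [0j] * 10 + long_symbol + [0j] * 10       # symbol starts at sample 10

all_taps = list(range(L))                      # (3-8): every coefficient
quarter_taps = high_power_taps(coeffs, 4)      # (3-10) with omega_MF = 4
peak_full = mf_peak_timing(rx, coeffs, all_taps)
peak_quarter = mf_peak_timing(rx, coeffs, quarter_taps)
print(quarter_taps, peak_full, peak_quarter)
```

Both filters peak at the last sample of the embedded symbol, while the reduced filter performs a quarter of the multiplications per output.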
The required number of high-power coefficients $L/\omega_{MF}$ is decided from the trade-off between design performance and complexity. This trade-off is shown in section 3-3-2.

## 3-2 Low-Complexity Channel Estimation for WLAN System

As discussed in chapter 2, in an OFDM system the channel estimation/equalization (CE/EQ) and phase error tracking (PET) account for about 95% of the SNR loss at the typical 10% PER. This SNR loss degrades the PER performance and reduces the achievable transmission distance. Hence in this section we propose DDCT and WAPET and employ an improved low-complexity FD-MMSE EQ to reduce the SNR loss and to cope with the time-variant multipath channel of the indoor environment. To keep the complexity low, the DDCT can track at a reduced rate, feedback channel updating is used so that the EQ circuit is shared, and the summation algorithms are modified to reduce the memory needed to store their inputs. The detailed design is introduced below.

### 3-2-1 Basic Channel Equalization with Phase Error Tracking

In an OFDM baseband receiver, the FFT converts the received time-domain signal into the frequency-domain signal. The received frequency-domain signal is then passed through CE, equalization (EQ), and PET to compensate the channel fading and the phase error caused by CFO and SCO [14, 17, 18, 24, 25, 26, 27, 31]. In a general OFDM-based baseband transceiver, CE based on complex division is used to estimate the CFR. The basic zero-forcing (ZF) CE algorithm can be derived as

$$H_E(K) = \frac{Y_L(K)}{X_L(K)} = \frac{X_L(K)H(K) + \Delta\omega(K)}{X_L(K)} = H(K) + \frac{\Delta\omega(K)}{X_L(K)} = H(K) + \Delta H(K) \tag{3-11}$$

where $H_E(K)$ is the estimated channel frequency response (CFR), $Y_L(K)$ is the received frequency-domain preamble, $X_L(K)$ is the defined and transmitted frequency-domain preamble, H(K) is the true CFR, $\Delta\omega(K)$ is the AWGN within $Y_L(K)$, and $\Delta H(K)$ is the resulting CE error.
This channel estimation is also called the zero-forcing (ZF) algorithm [27, 31]. The ZF algorithm is realized with a complex division in the frequency domain, based on the digital signal processing (DSP) property that time-domain circular convolution becomes frequency-domain multiplication. In the channel, the time-domain preamble is convolved with the channel impulse response (CIR); in the receiver, after the FFT, this becomes the product of the frequency-domain preamble and the CFR. The CE error is caused by the white noise $\Delta\omega(K)$. An example of the estimated CFR and the true CFR of the IEEE multipath channel [19] with RMS delay spread = 50ns and SNR = 10dB is shown in Figure 3-11. The difference between the true and the estimated CFR is the estimation error $\Delta\omega(K)/X_L(K)$ caused by the noise.

After CE, the channel fading in the received data can be compensated by a basic least-square equalization (LS EQ) [27, 52, 56]. The LS EQ can be derived as

$$X_{E}(K) = Y_{D}(K) / H_{E}(K) = Y_{D}(K) \cdot [Y_{L}(K) / X_{L}(K)]^{-1} \tag{3-12}$$

where $X_E(K)$ is the equalized data subcarrier and $Y_D(K)$ is the received data subcarrier. After equalization, the data is sent to PET, which compensates the phase error mainly caused by CFO and SCO.

Figure 3-11: Example of estimated CFR and true CFR with 50ns RMS and 10dB SNR

As discussed in chapter 2, the residual CFO and the SCO increase the mean phase error and the linear phase error respectively. Hence the phase error can be estimated from the pilot signal with the minimum-square-error algorithm.
The mean phase error detection can be derived as

$$\psi = \underset{\psi}{\operatorname{arg\,min}} \left\{ \sum_{K} [\theta_{K} - \psi]^{2} \right\} \rightarrow \frac{\partial \left( \sum_{K} [\theta_{K} - \psi]^{2} \right)}{\partial \psi} = 0 \rightarrow \psi = \frac{1}{P} \sum_{K} \theta_{K} \tag{3-13}$$

where $\psi$ is the detected mean phase error, $\theta_K$ is the phase error of each pilot, K is the subcarrier index of the pilots, and P is the number of pilots. To achieve the minimum square error, the estimated mean phase error is simply the mean of the pilot phase errors. To achieve the minimum square error of the linear phase error, the detection of the phase error slope can be derived as

$$L = \underset{L}{\operatorname{arg\,min}} \left\{ \sum_{K} [\theta_{K} - KL]^{2} \right\} \rightarrow \frac{\partial \left( \sum_{K} [\theta_{K} - KL]^{2} \right)}{\partial L} = 0 \rightarrow L = \frac{\sum_{K} K \cdot \theta_{K}}{\sum_{K} K^{2}} \tag{3-14}$$

where L is the detected slope of the linear phase error, and KL is the linear phase error, which is proportional to the subcarrier index K. With this method we can find the mean phase error and the linear phase error as shown in Figure 3-12.

Channel Condition: Residual CFO = 0.1ppm, SCO = 40ppm, SNR = 20dB

Figure 3-12: Example of detected phase error in OFDM-Based WLAN system

As shown in Figure 3-12, the phase error caused by only 0.1ppm residual CFO and 40ppm SCO ideally consists of a mean and a linear phase error across the 20MHz bandwidth. When AWGN is added, however, the phase error is distorted and shows rapid transitions. Because the detection is based on the minimum-square-error algorithm, the mean and linear phase errors distorted by AWGN can still be detected individually, and the total detected phase error approaches the true phase error caused by CFO and SCO alone.
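
A minimal sketch of the detection in (3-13) and (3-14) follows. The pilot positions (-21, -7, 7, 21) are those of IEEE 802.11a; the phase values and the function name `detect_phase_error` are illustrative assumptions, and the pilot phases are noise-free so that the detected mean and slope can be compared with the values used to build them. Because the pilot indices sum to zero, the mean and the slope separate cleanly.

```python
def detect_phase_error(pilot_phase):
    """Mean (3-13) and slope (3-14) of the phase error from pilot phases.

    pilot_phase maps pilot subcarrier index K to its measured phase error theta_K.
    """
    p = len(pilot_phase)
    psi = sum(pilot_phase.values()) / p
    slope = (sum(k * theta for k, theta in pilot_phase.items())
             / sum(k * k for k in pilot_phase))
    return psi, slope

PILOTS = (-21, -7, 7, 21)              # IEEE 802.11a pilot subcarriers
true_psi, true_slope = 0.05, 0.002     # hypothetical mean (rad) and slope (rad/subcarrier)
pilot_phase = {k: true_psi + k * true_slope for k in PILOTS}

psi, slope = detect_phase_error(pilot_phase)
# Residual phase on each pilot after removing the detected mean and slope.
corrected = {k: theta - (psi + k * slope) for k, theta in pilot_phase.items()}
print(psi, slope)
```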
### 3-2-2 Employed MMSE Channel Equalization for OFDM

To improve the system performance under multipath channels, minimum mean square error (MMSE)-based channel equalization (EQ) has been developed for single-carrier systems such as QPSK, CDMA, and so on [47, 48]. The conventional MMSE EQ is derived from time-domain computation and therefore requires high complexity [54, 55]. In addition, several accurate channel estimation (CE) methods such as linear minimum mean-square-error (LMMSE), singular value decomposition (SVD), and maximum-likelihood (ML) CE have been proposed [49, 50, 51]. However, these CE approaches were also developed from time-domain computation and require high complexity. To improve the system performance at low hardware complexity, we derive a frequency-domain MMSE (FD-MMSE) EQ here. The derivation of the FD-MMSE EQ is similar to that of the conventional time-domain MMSE (TD-MMSE) EQ, but because it uses the frequency-domain signal instead of the time-domain signal, the hardware complexity of the EQ becomes much lower. The mean square error (MSE) of the equalization can be derived as

$$\varepsilon = \mathrm{E}\left\{ \left| Y(K) \cdot C(K) - X(K) \right|^{2} \right\} \tag{3-15}$$

where $\varepsilon$ is the mean square error of the equalized data, K is the subcarrier index, Y(K) is the received signal after the channel distortions, C(K) is the compensating signal, and X(K) is the transmitted signal.
For the MMSE criterion we should minimize $\varepsilon$, and therefore C(K) can be derived as

$$C(K) = \underset{C(K)}{\operatorname{arg\,min}} \{ \varepsilon \} \tag{3-16}$$

and $\varepsilon$ can be decomposed as

$$\begin{aligned} \varepsilon &= E \left\{ \left| Y(K) \cdot C(K) - X(K) \right|^{2} \right\} \\ &= E \left\{ [Y(K) \cdot C(K) - X(K)] \cdot [Y(K) \cdot C(K) - X(K)]^{*} \right\} \\ &= E \left\{ |Y(K) \cdot C(K)|^{2} - X(K) \cdot Y(K)^{*} \cdot C(K)^{*} - X(K)^{*} \cdot Y(K) \cdot C(K) + |X(K)|^{2} \right\} \end{aligned} \tag{3-17}$$

To find the C(K) that leads to the MMSE of the equalized data, we set the derivatives of $\varepsilon$ to zero, and then C(K) can be derived as

$$\begin{aligned} C(K) &= c_{1} + jc_{2} \\ \frac{\partial \varepsilon}{\partial C} = 0 &\rightarrow \begin{cases} \dfrac{\partial \varepsilon}{\partial c_{1}} = 0 \\ \dfrac{\partial \varepsilon}{\partial c_{2}} = 0 \end{cases} \\ &\Rightarrow \begin{cases} 2c_{1}E\{|Y(K)|^{2}\} - E\{X(K) \cdot Y(K)^{*} + X(K)^{*} \cdot Y(K)\} = 0 \\ 2c_{2}E\{|Y(K)|^{2}\} - jE\{-X(K) \cdot Y(K)^{*} + X(K)^{*} \cdot Y(K)\} = 0 \end{cases} \\ &\Rightarrow \begin{cases} c_{1} = \operatorname{Re}\left\{\dfrac{E\{X(K) \cdot Y(K)^{*}\}}{E\{|Y(K)|^{2}\}}\right\} \\ c_{2} = \operatorname{Im}\left\{\dfrac{E\{X(K) \cdot Y(K)^{*}\}}{E\{|Y(K)|^{2}\}}\right\} \end{cases} \rightarrow C(K) = \frac{E\{X(K) \cdot Y(K)^{*}\}}{E\{|Y(K)|^{2}\}} \end{aligned} \tag{3-18}$$

where $c_1$ and $c_2$ are the real part and the imaginary part of C(K). In the IEEE 802.11a system we can estimate two CFRs from the 2 OFDM symbols used for CE, and the compensating value can be derived as

$$C(K) = \frac{(Y_{L1}(K)^* + Y_{L2}(K)^*)X_L(K)}{|Y_{L1}(K)|^2 + |Y_{L2}(K)|^2} \tag{3-19}$$

where $Y_{L1}(K)$ and $Y_{L2}(K)$ are the first and the second OFDM symbols used for CE in the IEEE 802.11a system, and $X_L(K)$ is the transmitted OFDM symbol for CE.
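
To make (3-19) concrete, the sketch below evaluates the compensating value for one subcarrier and compares it with a division-based value. The channel value, the noise samples, and both function names are made up for illustration. The comparison value, the inverse of the average of two ZF estimates, is an assumption introduced here and is not taken from the text. Without noise both values equal 1/H(K). When the two received preambles disagree, the magnitude of the (3-19) value never exceeds that of the division-based one, since $|Y_{L1}+Y_{L2}|^2 \leq 2(|Y_{L1}|^2+|Y_{L2}|^2)$, so noise on a weak subcarrier is amplified less.

```python
def basic_mmse_coefficient(y_l1, y_l2, x_l):
    """Per-subcarrier compensating value of (3-19) from two received preambles."""
    return (y_l1.conjugate() + y_l2.conjugate()) * x_l / (abs(y_l1) ** 2 + abs(y_l2) ** 2)

def averaged_zf_coefficient(y_l1, y_l2, x_l):
    """Inverse of the averaged ZF channel estimate (comparison only, an assumption)."""
    h_e = (y_l1 + y_l2) / (2 * x_l)
    return 1 / h_e

h = 0.1 + 0.05j        # hypothetical deeply faded subcarrier
x_l = 1 + 0j           # known preamble value on this subcarrier

# Noise-free preambles: the MMSE value reduces to 1/H(K).
clean = basic_mmse_coefficient(x_l * h, x_l * h, x_l)

# Two noisy observations of the same subcarrier (made-up noise samples).
y_l1 = x_l * h + 0.05
y_l2 = x_l * h - 0.04j
c_mmse = basic_mmse_coefficient(y_l1, y_l2, x_l)
c_zf = averaged_zf_coefficient(y_l1, y_l2, x_l)
print(abs(c_mmse), abs(c_zf))
```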
We call the compensating value of (3-19), derived from the MMSE of each OFDM subcarrier, the "basic MMSE EQ design". To understand the improvement in optimal performance offered by the MMSE EQ, we compare the MSE of the LS EQ with that of the MMSE EQ. Before analyzing the data MSE, we derive the expected compensating value under the assumption of an infinite observation time. The optimal compensating value of the LS EQ can be derived as

$$C_{LS}(K) = 1/E[H_E(K)] \approx 1/H(K) \tag{3-20}$$

where H(K) is the true channel. So in the LS EQ the optimal compensating value is the inverse of the true CFR. The optimal compensating value of the MMSE EQ can be derived as

$$\begin{aligned} C_{MMSE}(K) &= \frac{E\left\{ [X(K)H(K) + \Delta\omega(K)]^{*} X(K) \right\}}{E\left\{ |X(K)H(K) + \Delta\omega(K)|^{2} \right\}} = \frac{E[|X(K)|^{2}H(K)^{*}]}{E[|X(K)|^{2}|H(K)|^{2}] + \sigma^{2}} \\ &= \frac{E[H(K)^{*}]}{E[|H(K)|^{2}] + \sigma^{2} / E[|X(K)|^{2}]} \end{aligned} \tag{3-21}$$

where H(K) is the true CFR and $\sigma^2$ is the noise power. The data MSE of the perfect LS EQ can be derived as

$$\varepsilon_{LS} = E\left\{ \left| Y(K) \cdot C_{LS}(K) - X(K) \right|^{2} \right\} \approx \frac{\sigma^{2}}{E[|H(K)|^{2}]} \tag{3-22}$$

and the data MSE of the perfect MMSE EQ can be derived as

$$\begin{aligned} \varepsilon_{MMSE} &= E \left\{ \left| Y(K) \cdot C_{MMSE}(K) - X(K) \right|^{2} \right\} \\ &= E \left\{ \left| \left[ X(K)H(K) + \Delta \omega(K) \right] \cdot C_{MMSE}(K) - X(K) \right|^{2} \right\} \\ &= E \left\{ \left| \frac{H^{*}(K) \cdot \Delta \omega(K) - \sigma^{2} \cdot X(K) / E[|X(K)|^{2}]}{E[|H(K)|^{2}] + \sigma^{2} / E[|X(K)|^{2}]} \right|^{2} \right\} \\ &= \frac{\sigma^{2}}{E[|H(K)|^{2}] + \sigma^{2} / E[|X(K)|^{2}]} = \frac{\sigma^{2}}{E[|H(K)|^{2}] + \beta \cdot SNR^{-1}} \end{aligned} \tag{3-23}$$

where $\Delta\omega(K)$ is the white noise, $\sigma^2$ is the noise power, and $\beta$ is the ratio of the number of used subcarriers to the FFT size. For the WLAN system $\beta$ is 52/64. Comparing (3-22) with (3-23), we find $\varepsilon_{MMSE} < \varepsilon_{LS}$.
The data MSE of the two schemes in the IEEE 802.11a system under the IEEE Rayleigh fading channel [19] with 50ns RMS delay spread is shown in Figure 3-13.

Figure 3-13: Data MSE of perfect EQ schemes

From (3-19) we know that the compensating value of the basic FD-MMSE EQ is calculated directly from the received preambles. However, we would like the FD-MMSE EQ to be able to employ accurate CE as well and to come closer to the optimal value in (3-21). Hence we derive an "improved FD-MMSE EQ" to replace the basic MMSE EQ in (3-19). The compensating value of the improved FD-MMSE EQ can be derived as

$$C_{MMSE}(K) = \frac{E[H_E(K)^*]}{E[|H_E(K)|^2] + \sigma_E^2 / E[|X(K)|^2]} \tag{3-24}$$

where $H_E(K)$ is the CFR estimated from the two OFDM symbols of the IEEE 802.11a system, $\sigma_E^2$ is the estimated noise power, and $E[|X(K)|^2]$ is the average transmitted preamble power, which can be fixed at 1 in the IEEE 802.11a system. The noise power can be estimated as

$$\begin{aligned} \frac{1}{N} \sum_{K=0}^{N-1} |Y_{P1}(K) - Y_{P2}(K)|^2 &= \frac{1}{N} \sum_{K=0}^{N-1} |\Delta \omega_1(K) - \Delta \omega_2(K)|^2 \approx 2\sigma^2 \\ \rightarrow \sigma_E^2 &= \frac{1}{2N} \sum_{K=0}^{N-1} |Y_{P1}(K) - Y_{P2}(K)|^2 \end{aligned} \tag{3-25}$$

where $Y_{P1}(K)$ and $\Delta\omega_1(K)$ are the received preamble and the white noise of the 1st FFT symbol, $Y_{P2}(K)$ and $\Delta\omega_2(K)$ are the received preamble and the white noise of the 2nd FFT symbol, $\sigma^2$ is the noise power, and $\sigma_E^2$ is the estimated noise power. The improvements from the basic FD-MMSE EQ to the improved FD-MMSE EQ are that (A) the compensating value is closer to the optimal value, and (B) accurate CE schemes can be applied to bring the compensating value even closer to the optimal value $C_{MMSE}(K)$. Hence we employ the improved FD-MMSE EQ in the WLAN system. The hardware complexity of the employed FD-MMSE EQ design, the existing high-accuracy CE methods, and the conventional TD-MMSE EQ is listed in Table 3-1.
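
The noise-power estimate of (3-25) and the compensating value of (3-24) can be tried out with the short simulation below. It is a sketch under several assumptions: the two-path channel, the alternating preamble, the seed, and the helper names are invented for illustration, the two ZF estimates are simply averaged to form $H_E(K)$, and N = 4096 subcarriers are used (far more than the 64-point WLAN FFT) only to make the statistical estimate stable.

```python
import cmath
import math
import random

def estimate_noise_power(y_p1, y_p2):
    """Noise power from the difference of two received copies of the preamble (3-25)."""
    n = len(y_p1)
    return sum(abs(a - b) ** 2 for a, b in zip(y_p1, y_p2)) / (2 * n)

def improved_mmse_coefficients(h_e, sigma_e2, x_power=1.0):
    """Compensating values of (3-24) for every subcarrier."""
    return [h.conjugate() / (abs(h) ** 2 + sigma_e2 / x_power) for h in h_e]

rng = random.Random(0)   # fixed seed, illustration only
N = 4096                 # deliberately large, not the WLAN FFT size
SIGMA2 = 0.1             # true noise power per subcarrier

def noise():
    s = math.sqrt(SIGMA2 / 2)   # real and imaginary parts share the power equally
    return complex(rng.gauss(0, s), rng.gauss(0, s))

# Hypothetical two-path channel with deep fades, and a +/-1 preamble.
h_true = [1 + 0.9 * cmath.exp(2j * math.pi * 3 * k / N) for k in range(N)]
x_p = [1.0 if k % 2 == 0 else -1.0 for k in range(N)]
y_p1 = [x * h + noise() for x, h in zip(x_p, h_true)]
y_p2 = [x * h + noise() for x, h in zip(x_p, h_true)]

sigma_e2 = estimate_noise_power(y_p1, y_p2)
h_e = [(a + b) / (2 * x) for a, b, x in zip(y_p1, y_p2, x_p)]   # averaged ZF estimate
c_mmse = improved_mmse_coefficients(h_e, sigma_e2)
print(round(sigma_e2, 3))
```

A useful side effect of the $\sigma_E^2$ term is that the magnitude of the compensating value stays bounded by $1/(2\sigma_E)$ even when $H_E(K)$ is close to zero, whereas a pure division would grow without limit.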
From Table 3-1 we find that the complexity of the FD-MMSE EQ is slightly higher than that of ZF CE and much lower than that of the existing high-performance designs. The number of complex multiplications of the FD-MMSE EQ is only $3/(4M) \approx 1.17\%$ of that of the conventional TD-MMSE EQ for the WLAN system with M = 64. To further enhance the system performance and to cope with time-variant channel fading, a channel tracking scheme is developed and introduced next.

Table 3-1: Hardware complexity of CE and EQ designs

| Design                  | Complex Multiplications | Real Divisions |
|-------------------------|-------------------------|----------------|
| Employed FD-MMSE EQ     | $3M/2$                  | $M$            |
| Conventional TD-MMSE EQ | $2M^2$                  | $M^2$          |
| ZF CE                   | $M$                     | 0              |
| SVD or ML CE            | $M^2/2$                 | 0<sup>\*</sup> |
| LMMSE CE                | $M^2$                   | $M$            |

M is the FFT point number.

<sup>\*</sup> In SVD and ML CE, the divisions are assumed to be pre-computed.

### 3-2-3 Proposed Decision-Directed Channel Tracking

In the indoor environment of a WLAN system, the user may move at walking speed. As discussed in section 2-4, a relative speed of 10km/h in the 5GHz RF band results in a Doppler frequency of about 50Hz. The Doppler frequency causes each path of the multipath channel to rotate at a different frequency, so with a 50Hz Doppler frequency the frequency-domain CFR varies slowly. The time-variant CFR with 50ns RMS delay spread and 50Hz Doppler frequency is shown in Figure 3-14.

Channel Condition: RMS delay = 50ns, Doppler frequency = 50Hz

Figure 3-14: Example of time-variant CFR with 50Hz Doppler frequency

As shown in Figure 3-14, the CFR varies with time because of the Doppler effect. Within 1.2ms the magnitude in this example varies by up to 5dB. Such a large variation makes the CE inaccurate and distorts the received QAM symbols. Several CE or channel tracking (CT) schemes have been published for OFDM-based systems in time-variant channels.
A matrix-based pilot-aided CT method has been proposed for the OFDM-based WLAN system and implemented in a 0.18µm CMOS process [24, 43]. It uses the pilot information to interpolate the channel variation and the channel error of all data and pilot subcarriers, and it achieves a CE mean-square-error (MSE) 2.5 ~ 3.0dB lower than the general ZF approach [24]. However, this approach needs 18 parallel complex multipliers for the large-matrix calculation. Alternatively, a decision-directed channel tracking (DDCT) approach that uses the received data Y(K) is proposed in [52]. But in an OFDM system not only the multipath fading but also the residual CFO and SCO impair the system performance. Hence the received data Y(K), which carries all of these channel distortions, cannot be used directly for DDCT unless extra circuitry is spent to generate received data distorted by multipath fading only. To reduce the hardware cost and to use a signal distorted by channel fading only, we propose a DDCT that uses the equalized data. Different from [24], the proposed DDCT uses all data and pilot information instead of the pilot information only, and it requires only 2 complex multipliers. Compared with the ZF approach, it achieves a CE MSE that is 6.0 ~ 27dB lower. The proposed DDCT algorithm is introduced below.

The DDCT is designed to extract and remove the channel error caused by noise and the Doppler effect. According to (3-11) and (3-12) we find

$$\begin{aligned} X_{E}(K) &= Y_{D}(K) / H_{E}(K) = X_{D}(K) H(K) / H_{E}(K) + \Delta \omega(K) / H_{E}(K) \\ X_{E}(K) / X_{D}(K) &= H(K) / H_{E}(K) + \Delta \omega(K) / [X_{D}(K) H_{E}(K)] \end{aligned} \tag{3-26}$$

where $X_E(K)$ is the data subcarrier equalized with the erroneous CFR $H_E(K)$, $Y_D(K)$ is the received data subcarrier, $X_D(K)$ is the transmitted data subcarrier, H(K) is the true CFR, $\Delta\omega(K)$ is the white noise in $Y_D(K)$, $H_E(K)$ is the estimated CFR, and $\Delta H(K)$ is the CE error in $H_E(K)$.
Since white noise is zero-mean, the correct channel can be tracked as

$$\begin{aligned} E[X_{E}(K)/X_{D}(K)] &= E\left[H(K)/H_{E}(K) + \Delta\omega(K)/[X_{D}(K) H_{E}(K)]\right] \approx H(K)/H_{E}(K) \\ \rightarrow H(K) &\approx H_{E}(K) \times E[X_{E}(K)/X_{D}(K)] \end{aligned} \tag{3-27}$$

In a time-variant channel, a moving average must be used instead of an infinite average. Hence the proposed DDCT can be modified as

$$H_{N}(K) = H_{E}(K) \times R_{N}(K) = H_{E}(K) \times \frac{1}{W} \sum_{n=N-W+1}^{N} \left[ X_{E,n}(K) / X_{D,n}(K) \right] \tag{3-28}$$

where $H_N(K)$ is the corrected CFR in the N-th OFDM symbol, $H_E(K)$ is the initial CFR, and W is the tracking window in OFDM symbols. Equation (3-28) suggests that W OFDM symbols of the data ratio $X_{E,n}(K)/X_{D,n}(K)$ must be summed for every update of the DDCT. To remove this redundant summation, the moving average can be rewritten as

$$\begin{aligned} R_{N}(K) &= \frac{1}{W} \sum_{n=N-W+1}^{N} \left[ X_{E,n}(K) / X_{D,n}(K) \right] \\ \rightarrow R_{N}(K) &= R_{N-1}(K) + \frac{1}{W} \left[ X_{E,N}(K) / X_{D,N}(K) - X_{E,N-W}(K) / X_{D,N-W}(K) \right] \end{aligned} \tag{3-29}$$

where $R_N(K)$ is the calculated ratio of the updated CFR $H_N(K)$ to the old CFR. With (3-29) the moving-average circuit of the DDCT is simplified. There are two methods to compensate the error caused by the time-variant channel: feedback update of the CFR as in (3-28), and feedforward compensation. The feedforward compensation can be derived as

$$X_{C}(K) = Y_{D}(K)/H(K) = X_{E}(K)/R_{N}(K) \tag{3-30}$$

so the feedforward compensation needs one complex divider. The block diagrams of the feedforward compensation and of the proposed feedback compensation of the DDCT are shown in Figure 3-15.

(a) DDCT with Feedforward Compensation

(b) DDCT with Proposed Feedback Compensation

Figure 3-15: Block diagrams of DDCT with feedforward and feedback compensation

To achieve a low EQ complexity, we employ the feedback CFR update.
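
The moving-average update of (3-29) can be sketched as follows for a single subcarrier. The class name `RatioMovingAverage`, the channel values, and the drift rate are hypothetical. The transmitted symbols `x_d` stand in for correct decisions, which is an assumption, and the data is noise-free. For simplicity the initial estimate `h_e` is kept fixed here (no feedback update), so the example only shows that the recursion reproduces the windowed sum of (3-28) and that the corrected CFR follows a slowly rotating channel.

```python
import cmath
from collections import deque

class RatioMovingAverage:
    """Running version of the windowed average R_N(K) for one subcarrier."""

    def __init__(self, window):
        self.window = window
        self.history = deque()   # last `window` ratios X_E / X_D
        self.value = 0j          # R_N(K); ramps up during the first `window` symbols

    def update(self, x_e, x_d):
        newest = x_e / x_d
        oldest = self.history.popleft() if len(self.history) == self.window else 0j
        self.history.append(newest)
        self.value += (newest - oldest) / self.window   # recursion of (3-29)
        return self.value

W = 4
h_e = 0.8 - 0.3j                                        # initial (stale) CFR estimate
x_d = [1 + 1j, -1 + 1j, 1 - 1j, -1 - 1j] * 5            # decided QPSK symbols
h_true = [h_e * cmath.exp(0.01j * n) for n in range(len(x_d))]   # slowly rotating CFR
x_e = [x * h / h_e for x, h in zip(x_d, h_true)]        # data equalized with h_e

tracker = RatioMovingAverage(W)
for e, d in zip(x_e, x_d):
    r_n = tracker.update(e, d)

direct = sum(e / d for e, d in zip(x_e[-W:], x_d[-W:])) / W   # windowed sum of (3-28)
h_tracked = h_e * r_n
print(abs(h_tracked - h_true[-1]), abs(h_e - h_true[-1]))
```

Only the newest and the oldest ratio enter each update, so the cost per symbol does not depend on the window length W, at the price of storing W ratios.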
The feedback CFR update not only corrects the time-variant channel but also increases the correctness of the equalized data and enhances the accuracy of PET.

### 3-2-4 Proposed Weighted-Average Phase Error Tracking with Pilot Pre-compensation

In the PET design, the minimum-square-error algorithm of (3-13) and (3-14) brings the detected phase error close to the true values. We refer to this detection as the direct detection method. However, two problems still impair the phase detection accuracy.

The first problem is the channel noise [14]. AWGN makes the detected phase error differ from the true one, so the compensated data subcarriers are still rotated by the residual phase error. A linear least square estimation (LLSE) approach reduces the noise across OFDM symbols with linear weighted tracking [14]: it tracks the phase error using linear weighting factors between the OFDM symbols. However, the true phase error of the earlier OFDM symbols is smaller than that of the later ones and is therefore more easily distorted by channel noise. Hence the linear-weighted method is not robust enough against channel noise.

The second problem is the phase error range [25]. When the packet is longer than about 1ms, the mean phase error caused by the residual CFO may exceed the $\pm\pi$ range, and the direct detection of (3-13) and (3-14) will then not find the correct value. An example of incorrect phase detection when the true phase error exceeds $\pm\pi$ is shown in Figure 3-16. When the true phase error exceeds $\pm\pi$, the pilot phase errors are spread over both the positive and the negative range. Hence the phase detection of (3-13) and (3-14) loses the true phase error information, and the total detected phase error ($\psi$+KL) is far from the true phase error. In reference [25], the PET approach sieves out the pilots with a large phase error $\geq \pm\pi$ and then detects the phase error without these large-phase-error pilots.
However, this method does not work well when all pilots have a large phase error, and running the PET with fewer pilots degrades its performance.

To solve these two problems, we propose the non-linear weighted-average phase error tracking (WAPET) with pilot pre-compensation. The non-linear WAPET suppresses the channel noise with non-linear weighting factors, and the pilot pre-compensation lets the PET correctly track a phase error that exceeds $\pm\pi$. Non-linear weighting factors between OFDM symbols suppress the channel noise more efficiently than identical weighting factors across OFDM symbols. Different from [25], in the proposed design all pilots can still be used when the pilot phase error is $\geq \pm\pi$.

Figure 3-16: Incorrect phase detection when the phase error exceeds $\pm \pi$

The proposed non-linear WAPET, based on the detection of (3-13) and (3-14) and on the weighted-average algorithm, can be derived as

$$\begin{cases} \psi_{N} = \sum_{n=1}^{N} \left[ W_{\psi} (1 - W_{\psi})^{N-n} \cdot \dfrac{1}{P} \sum_{K} \theta_{K,n} \right] \\ L_{N} = \sum_{n=1}^{N} \left[ W_{L} (1 - W_{L})^{N-n} \cdot \dfrac{\sum_{K} K \cdot \theta_{K,n}}{\sum_{K} K^{2}} \right] \end{cases} \tag{3-31}$$

where $\psi_N$ is the tracked mean phase error of the N-th OFDM symbol, $L_N$ is the tracked slope of the linear phase error of the N-th OFDM symbol, $W_{\psi}$ is the root of the weighting factor of the mean PET, $W_L$ is the root of the weighting factor of the linear PET, $\theta_{K,n}$ is the detected pilot phase error of the n-th OFDM symbol, P is the number of pilots, and K is the pilot subcarrier index. In (3-31) the tracked phase error is the weighted average over all OFDM symbols. However, in this form the non-linear WAPET needs to store N phase error values.
To achieve low complexity, the non-linear WAPET can be modified as

$$\begin{cases} \begin{aligned} \psi_{N} &= W_{\psi} \frac{1}{P} \sum_{K} \theta_{K,N} + (1 - W_{\psi}) \cdot \sum_{n=1}^{N-1} \left[ W_{\psi} (1 - W_{\psi})^{N-1-n} \cdot \frac{1}{P} \sum_{K} \theta_{K,n} \right] \\ &= W_{\psi} \cdot \frac{1}{P} \sum_{K} \theta_{K,N} + (1 - W_{\psi}) \cdot \psi_{N-1} \end{aligned} \\ \begin{aligned} L_{N} &= W_{L} \frac{\sum_{K} K \cdot \theta_{K,N}}{\sum_{K} K^{2}} + (1 - W_{L}) \cdot \sum_{n=1}^{N-1} \left[ W_{L} (1 - W_{L})^{N-1-n} \cdot \frac{\sum_{K} K \cdot \theta_{K,n}}{\sum_{K} K^{2}} \right] \\ &= W_{L} \cdot \frac{\sum_{K} K \cdot \theta_{K,N}}{\sum_{K} K^{2}} + (1 - W_{L}) \cdot L_{N-1} \end{aligned} \end{cases} \tag{3-32}$$

According to (3-32), the tracked phase error mean $\psi_N$ and slope $L_N$ are the weighted average of the newly detected error and the tracked errors $\psi_{N-1}$ and $L_{N-1}$ of the previous OFDM symbol. Therefore the memory needed to store the WA inputs is reduced to 1/N of that in (3-31). When burst noise causes a large detection error, the WA reduces its influence.

To track a phase error exceeding $\pm \pi$ correctly, pre-compensation of the pilot subcarriers is added to the PET. Before PET, the pilot subcarriers are compensated with the phase error tracked in the previous OFDM symbol. Therefore only the difference in phase error between the previous and the present OFDM symbol needs to be tracked with the pilots.
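
The equivalence between the full weighted sum of (3-31) and the one-step recursion of (3-32) is easy to confirm numerically. In the sketch below the list `detected` is a made-up sequence of per-symbol detection results (for instance the pilot mean of each OFDM symbol), the function names are hypothetical, and the tracker is assumed to start from zero before the first symbol.

```python
def wapet_direct(detected, w):
    """Weighted sum over all past symbols, as in (3-31)."""
    n_sym = len(detected)
    return sum(w * (1 - w) ** (n_sym - n) * d
               for n, d in enumerate(detected, start=1))

def wapet_recursive(detected, w):
    """Same result with one stored value, as in (3-32)."""
    tracked = 0.0
    for d in detected:
        tracked = w * d + (1 - w) * tracked
    return tracked

# Made-up detections: a slow ramp plus a small deterministic disturbance.
detected = [0.01 * n + (0.02 if n % 3 == 0 else -0.01) for n in range(1, 21)]
W_PSI = 0.5

psi_direct = wapet_direct(detected, W_PSI)
psi_recursive = wapet_recursive(detected, W_PSI)

# A burst error of 1.0 rad on the last symbol moves the tracked value by only W_PSI * 1.0.
spiked = detected[:-1] + [detected[-1] + 1.0]
burst_shift = wapet_recursive(spiked, W_PSI) - psi_recursive
print(psi_direct, psi_recursive, burst_shift)
```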
The PET with pilot pre-compensation can be derived as

$$\begin{aligned} \Delta \theta_{K,N} &\equiv \theta_{K,N} - \psi_{N-1} - KL_{N-1} \\ \psi_{N} &= W_{\psi} \cdot \frac{1}{P} \sum_{K} \theta_{K,N} + (1 - W_{\psi}) \cdot \psi_{N-1} \\ &= W_{\psi} \cdot \frac{1}{P} \sum_{K} \Delta \theta_{K,N} + (1 - W_{\psi}) \cdot \psi_{N-1} + \frac{W_{\psi}}{P} \sum_{K} \psi_{N-1} + \frac{W_{\psi}L_{N-1}}{P} \sum_{K} K \\ &= W_{\psi} \cdot \frac{1}{P} \sum_{K} \Delta \theta_{K,N} + \psi_{N-1} = W_{\psi} \cdot \Delta \psi_{N} + \psi_{N-1} \\ L_{N} &= W_{L} \cdot \frac{\sum_{K} K \cdot \theta_{K,N}}{\sum_{K} K^{2}} + (1 - W_{L}) \cdot L_{N-1} \\ &= W_{L} \cdot \frac{\sum_{K} K \cdot \Delta \theta_{K,N}}{\sum_{K} K^{2}} + (1 - W_{L}) \cdot L_{N-1} + W_{L}L_{N-1} \frac{\sum_{K} K^{2}}{\sum_{K} K^{2}} + W_{L}\psi_{N-1} \frac{\sum_{K} K}{\sum_{K} K^{2}} \\ &= W_{L} \cdot \frac{\sum_{K} K \cdot \Delta \theta_{K,N}}{\sum_{K} K^{2}} + L_{N-1} = W_{L} \cdot \Delta L_{N} + L_{N-1} \end{aligned} \tag{3-33}$$

where $\Delta\theta_{K,N}$ is the pilot phase after pre-compensation with the previously tracked phase error $\psi_{N-1}+K\times L_{N-1}$, $\Delta\psi_N$ can be seen as the difference in mean phase error between the previous and the present OFDM symbol, $\Delta L_N$ can be seen as the difference in phase error slope between the previous and the present OFDM symbol, and $\psi_N$ and $L_N$ are the tracked mean and slope of the phase error, the same as in (3-32). Since in an OFDM system the sum of the pilot indices $\sum K$ is usually zero, the PET with pilot pre-compensation simplifies to the final forms in (3-33).

The weighting factors of (3-33) are non-linear because the weighting-factor roots $W_{\psi}$ and $W_L$ are raised to exponents that grow linearly with the symbol distance. An example of the non-linear weighting factors with $W_{\psi}=0.5$ and of linear weighting factors at the 20-th OFDM symbol is shown in Figure 3-17. The block diagram of the proposed non-linear WAPET with pilot pre-compensation in (3-33) is shown in Figure 3-18.
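
The effect of the pre-compensation in (3-33) can be illustrated with the sketch below. It is a simplified model under stated assumptions: the phase growth per symbol, the weights, and the function names are made up, the pilots are noise-free, and each measured pilot phase is wrapped into $(-\pi, \pi]$ to mimic what an arc-tangent delivers. The IEEE 802.11a pilot indices are used, so $\sum K = 0$.

```python
import math

PILOTS = (-21, -7, 7, 21)               # IEEE 802.11a pilot subcarriers, sum is zero
SUM_K2 = sum(k * k for k in PILOTS)

def wrap(phase):
    """Wrap a phase into (-pi, pi], as a measured pilot phase would be."""
    return math.atan2(math.sin(phase), math.cos(phase))

def track(true_psi, true_slope, w_psi=0.5, w_l=0.5, precompensate=True):
    """Track mean and slope over a packet; returns the final (psi, slope)."""
    psi = slope = 0.0
    for psi_n, slope_n in zip(true_psi, true_slope):
        if precompensate:
            # (3-33): measure only what remains after removing the previous estimate.
            delta = {k: wrap(psi_n + k * slope_n - psi - k * slope) for k in PILOTS}
            psi += w_psi * sum(delta.values()) / len(PILOTS)
            slope += w_l * sum(k * d for k, d in delta.items()) / SUM_K2
        else:
            # (3-32): weighted average applied directly to the wrapped pilot phases.
            theta = {k: wrap(psi_n + k * slope_n) for k in PILOTS}
            psi = w_psi * sum(theta.values()) / len(PILOTS) + (1 - w_psi) * psi
            slope = w_l * sum(k * t for k, t in theta.items()) / SUM_K2 + (1 - w_l) * slope
    return psi, slope

# Hypothetical packet: mean grows 0.3 rad and slope 0.002 rad/subcarrier per OFDM symbol.
symbols = range(1, 31)
true_psi = [0.3 * n for n in symbols]
true_slope = [0.002 * n for n in symbols]

psi_pre, slope_pre = track(true_psi, true_slope, precompensate=True)
psi_plain, _ = track(true_psi, true_slope, precompensate=False)
print(true_psi[-1], psi_pre, psi_plain)
```

In this toy run the true mean phase reaches 9 rad, well beyond $\pi$. The pre-compensated tracker follows it with a lag of about one symbol step, while the tracker fed with wrapped phases cannot leave the $\pm\pi$ range.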
Figure 3-17: Weighting Factors of PET designs

Figure 3-18: Block diagram of the proposed non-linear WAPET

In the block diagram of the proposed PET, the pilot subcarriers are first sent for pre-compensation and phase error tracking. The phasor conversion then generates the compensating phasor $\exp\{-j(\psi_N+KL_N)\}$. Finally the pilot and data subcarriers are compensated with the phasors and sent to the DDCT. An example of the tracked phase error with $W_{\psi}=0.5$ and $W_L=0.5$ during 30 OFDM symbols is shown in Figure 3-19. The mean and slope of the true phase error grow linearly with the OFDM symbol index. In Figure 3-19 (a), the noise causes a larger percentage of error in the earlier OFDM symbols: in the fourth OFDM symbol the noise causes 80% error, whereas in the 17-th OFDM symbol it causes only 14% error. Hence the proposed non-linear WAPET, which depends more on the estimates of later OFDM symbols, can achieve higher PET accuracy. As shown in Figure 3-19, compared with direct detection, the phase error tracked by the proposed WAPET is closer to the true one.

(a) Tracked mean phase error caused by CFO (b) Tracked phase error slope caused by SCO
Channel condition: Residual CFO = 0.1ppm, SCO=40ppm, SNR=20dB

Figure 3-19: Tracked (a) mean and (b) slope of the phase error caused by 0.1ppm residual CFO and 40ppm SCO during 30 OFDM symbols

An example of the non-linear WAPET with and without pilot pre-compensation is shown in Figure 3-20. Without pilot pre-compensation, the tracked phase error drifts far from the true one as the number of OFDM symbols grows. With pilot pre-compensation, the tracked phase stays close to the true one even when the phase error exceeds $\pm \pi$.
(a) Tracked mean phase error caused by CFO (b) Tracked phase error slope caused by SCO
Channel condition: Residual CFO = 0.1ppm, SCO=40ppm, SNR=20dB

Figure 3-20: Tracked phase error of WAPET with and without pilot pre-compensation

# 3-3 Performance Analysis of Low-Complexity Design for OFDM-Based WLAN System

In this section the design performance is shown and discussed, including the FER of the proposed PD, the RMSE of the proposed CFO estimation, the FER of the proposed FWD, and the PER curves based on the proposed AC, MF, MMSE EQ, DDCT, and WAPET. In the PER simulation, each transmitted packet contains 1000 data bytes, as required by the standard. The channel condition mainly consists of the IEEE multipath channel with 50ns RMS delay spread (average 10.93 taps) [19], 40ppm CFO with phase noise effect [33, 34], 40ppm SCO, and 50Hz Doppler frequency (DF).

## 3-3-1 Performance of the Proposed Low-Complexity Auto-Correlation

The proposed HPSU AC is used for PD and CFO estimation. For accurate coarse and fine CFO estimation, in the simulations below the parameter $L_S$ of (3-3) is set to 2 in short symbols and 1 in long symbols. This $L_S$ for short symbols is lower than in [27], which sets $L_S$ to 10 in short symbols, so setting $L_S$ to 2 also keeps the auto-correlation complexity low. To understand the PD performance based on the proposed AC, the FER of PD with different $\omega_{AC}$ values and $L_S$ = 2 in (3-3) is shown in Figure 3-21. The FER is the error rate of detecting both the packet and the FFT window. Even when the packet is successfully detected, the detection may come too late and cause an FWD error, in which case the frame is still counted as an error. The FER therefore reflects not only the PD accuracy but also the influence of PD on the overall synchronization correctness. To avoid degrading the typical 10% PER of the IEEE 802.11a system, we focus on keeping the FER below 10%.
As shown in Figure 3-21, when the reduction factor $\omega_{AC}$ is increased from 1 to 2, the SNR for 10% FER increases by only 0.1dB. Although the amount of signal used is halved, using the high-power signal keeps the additional SNR loss at only 0.1dB, which is acceptable in the trade-off between high performance and low complexity. When $\omega_{AC}$ is increased to 4 or 8, the SNR loss for 10% FER increases by 2.0dB $\sim$ 3.8dB.

Figure 3-21: FER of the proposed low-complexity PD for OFDM-based WLAN system

To understand the CFO estimation performance based on the proposed HPSU AC, the RMSE of CFO estimation with different $\omega_{AC}$ values is shown in Figure 3-22. Phase noise is also added in the channel model [33, 34]. To achieve a low residual CFO and thereby reduce the frequency-domain ICI, the CFO estimation error is suggested to be less than about $\pm 0.8$ ppm [24]. As shown in Figure 3-22, the CFO estimation error increases as $\omega_{AC}$ becomes higher, because less signal is used for CFO estimation. For 0.8ppm CFO error, the design with $\omega_{AC}$ = 2 has 2.4dB more SNR loss than that with $\omega_{AC}$ = 1, and the design with $\omega_{AC}$ $\geq$ 4 results in $\geq$ 5.6dB SNR loss. The degraded CFO estimation performance may also cause PER performance loss.

Figure 3-22: RMSE of the proposed low-complexity CFO estimation for OFDM-based WLAN system

To understand the PER performance based on the proposed low-complexity AC, the PER curves of 6Mb/s (lowest data rate) and 54Mb/s (highest data rate) in the AWGN channel and in the multipath channel with 50ns RMS delay spread and frequency-selective fading $\geq$ -15dB are shown in Figure 3-23 and Figure 3-24.
As shown in Figure 3-23, in 6Mb/s mode and the AWGN channel, when $\omega_{AC}$ is 1 and 2, the SNR loss for 10% PER is only 0.1dB and 0.15dB respectively compared with the design with perfect PD and CFO estimation. That is, the increased FER and CFO estimation error of $\omega_{AC}$ = 2 add only 0.05dB of SNR loss. However, when $\omega_{AC}$ is increased to $\geq$ 4, the SNR loss for 10% PER reaches 0.75dB and 2.1dB.

Channel Condition: AWGN or RMS=50ns, CFO=40ppm+phase noise, SCO=40ppm, Doppler=50Hz

Figure 3-23: 6Mb/s PER of the proposed low-complexity auto-correlation

Channel Condition: AWGN or RMS=50ns, CFO=40ppm+phase noise, SCO=40ppm, Doppler=50Hz

Figure 3-24: 54Mb/s PER of the proposed low-complexity auto-correlation

Also as shown in Figure 3-23, when the multipath channel with 50ns RMS delay spread is simulated, the difference in SNR loss between different $\omega_{AC}$ values is smaller than in the AWGN channel. In the multipath channel, the SNR for 10% PER at the 6Mb/s data rate increases from about 2dB to about 6dB because of the frequency-selective fading. As shown in Figure 3-21, when the SNR moves from 2dB to 6dB, the FER decreases from about 1% to less than $10^{-4}$. And as shown in Figure 3-22, the CFO estimation error at SNR = 6dB is only about 20% of that at SNR = 2dB. In the multipath channel with 50ns RMS delay spread, the SNR loss for 10% PER of $\omega_{AC}$ = 1, 2, 4, and 8 is 0.4dB, 0.6dB, 0.8dB, and 1.4dB respectively. The FER with $\omega$ = 8 is shown in Figure 3-24-2 (a). The SNR for 10% FER in the multipath channel is 3.1dB, whereas in Figure 3-23 the SNR for 10% PER of perfect synchronization in the multipath channel is 6dB. Since the SNR for 10% PER in the multipath channel is higher than that for 10% FER, the PER degradation caused by FER in the multipath channel can be smaller than in the AWGN channel. The CFO estimation error with $\omega$ = 4 is shown in Figure 3-24-2 (b).
The CFO estimation RMSE curves of the AWGN and multipath channels are similar to each other, because the auto-correlation used for CFO estimation is robust to the multipath channel. The CFO estimation error is lower when the SNR is higher, and the SNR for 10% PER in the multipath channel is higher than in the AWGN channel. Hence the CFO estimation error that degrades the SNR for 10% PER becomes smaller when the channel changes from AWGN to multipath, and the PER degradation caused by FER and CFO estimation error in the multipath channel can be smaller than in the AWGN channel.

Figure 3-24-2: FER and CFO estimation error with $\omega = 8$

To understand the performance difference between the high-power-signal-used and random-signal-used designs, the PER curves with $\omega_{AC}$ = 1~4 are drawn in Figure 3-24-3. The PER of the random-signal-used design with $\omega_{AC}$ = 2 is higher than that of the high-power-signal-used design and lower than that with $\omega_{AC}$ = 4. By using the high-power-signal-used scheme, the SNR for 10% PER can be reduced.

Figure 3-24-3: CFO estimation RMSE and range for OFDM-based WLAN system

The RMSE and range of the proposed CFO estimation are shown in Figure 3-25. With estimation in both short and long symbols, the estimation range can be -120 $\sim$ 120ppm CFO (TX+RX).

Figure 3-25: CFO estimation RMSE and range for OFDM-based WLAN system

## 3-3-2 Performance of the Proposed Low-Complexity Matched Filter

To understand the influence of the proposed HPCU MF on FWD, the FER curves of different $\omega_{MF}$ values are shown in Figure 3-26. As shown in Figure 3-26, when the coefficient amount (tap number of the MF) is reduced from 64 to 32 or even to 16, the synchronization incurs 0.05dB~0.1dB SNR loss for 10% FER. The SNR loss grows as the coefficients are reduced.
When the coefficient amount is reduced to 8 or 4, the SNR loss for 10% FER increases markedly to 2.2dB, or the FER cannot be brought below 10% at all. To understand the influence of the proposed HPCU MF on the system PER, the PER curves of different $\omega_{MF}$ values are simulated. Since the 6Mb/s mode, which achieves 10% PER in the lowest SNR region, is most influenced by the FER performance, we discuss the PER of the 6Mb/s data rate.

Figure 3-26: FER of the proposed low-complexity MF for OFDM-based WLAN system

Similar to Figure 3-24, the PER curves with different $\omega_{MF}$ values of the proposed MF are close to each other in 54Mb/s mode, so only the PER curves of 6Mb/s mode are analyzed to decide the value of $\omega_{MF}$. The 6Mb/s PER of different $\omega_{MF}$ values in the AWGN channel and in the IEEE multipath channel [19] with 50ns RMS delay spread and > -15dB frequency-selective fading is shown in Figure 3-27 and Figure 3-28 respectively. As shown in Figure 3-27, the SNR loss for 10% PER of $\omega_{MF}$ = 1, 2, and 4 is 0.07, 0.08, and 0.1dB compared with the perfect FWD design. When $\omega_{MF}$ is increased to 8, the SNR loss for 10% PER increases to 0.5dB, and the PER of $\omega_{MF}$ = 16 cannot reach < 10%. As shown in Figure 3-28, the SNR loss for 10% PER of $\omega_{MF}$ = 1, 2, and 4 is 0.1, 0.2, and 0.24dB compared with the perfect FWD design. When $\omega_{MF}$ is increased to 8, the SNR loss for 10% PER increases to 1.6dB, and the PER of $\omega_{MF}$ = 16 again cannot reach < 10%. If the SNR loss caused by the MF should be kept within 1.0dB, $\omega_{MF}$ = 4 can be chosen.

AWGN, CFO=40ppm+phase noise, SCO=40ppm, Doppler=50Hz

Figure 3-27: 6Mb/s PER of the proposed low-complexity MF in AWGN channel

Figure 3-28: 6Mb/s PER of the proposed low-complexity MF in IEEE multipath channel with 50ns RMS delay spread

To achieve low MF complexity and high PER performance, we can use $\omega_{MF} = 4$.
In the 16-tap MF, the required multiplications are reduced to 25% of those of a general 64-tap MF for the IEEE 802.11a WLAN system. The added synchronization loss for 10% PER caused by FWD error is limited to 0.03dB SNR in the AWGN channel and 0.8dB SNR in the multipath channel.

## 3-3-3 Performance of the Proposed DDCT and MMSE EQ

Since the CE, CT, and EQ methods all influence the system performance under the multipath channel, we discuss the performance of both the proposed DDCT and the frequency-domain MMSE (FD-MMSE) EQ here. First the mean square error (MSE) of the existing LMMSE [49], ML [51], and DDCT [52] designs and of the proposed DDCT is discussed. The MSE of CE indicates the CE accuracy and the tracking effect in a time-variant channel. The CE MSE is simulated at 6Mb/s with each packet carrying the typical 1000 data bytes, which means BPSK mapping is applied and each packet has about 330 data OFDM symbols. The multipath channel is the IEEE Rayleigh fading channel with 50ns RMS delay spread [19]. The moving-average window of the proposed DDCT is set to 32, which satisfies both the averaging effect and the tracking accuracy. The MSE of CE under the time-invariant channel (Doppler frequency = 0Hz) and the time-variant channel with Doppler frequency = 50Hz is shown in Figure 3-29 and Figure 3-30 respectively. In Figure 3-29 the LMMSE and ML CE schemes reduce the MSE by about 6 and 10dB compared with the basic zero-forcing (ZF) CE. The DDCT designs, which use each data OFDM symbol for CT, achieve more accurate CE than the LMMSE and ML designs, which only use the preamble for CE. Both the existing DDCT design and the proposed design reduce the MSE by 8~13dB compared with ZF CE.

Figure 3-29: Mean Square Error of CE and CT schemes with 0Hz Doppler frequency

They also reduce the MSE by about 6.5dB relative to the LMMSE design in the high SNR region. At low SNR, i.e. SNR = 0dB, the DDCT has a higher MSE because the larger noise impairs the DDCT accuracy.
The MSE of CE in the time-variant channel with 50Hz Doppler frequency is drawn in Figure 3-30. The simulation result shows the importance of channel tracking in a time-variant channel. Since the ZF, ML, and LMMSE CE schemes only estimate the channel in the initial preamble, the variation caused by the Doppler effect cannot be acquired. Compared with these initial CE approaches, the DDCT [52] and the proposed DDCT reduce the MSE by about 5~15dB when the SNR is higher than 5dB. The CE performance of the proposed DDCT and of the referenced DDCT [52] are close to each other. However, when CFO and SCO are considered in the simulation, the CE accuracy of the referenced DDCT is seriously degraded. This will be discussed in the later PER simulations.

Figure 3-30: Mean Square Error of CE and CT schemes with 50Hz Doppler frequency

After discussing the CE accuracy, we present the system PER simulation of the proposed DDCT and FD-MMSE EQ. To understand the performance degradation caused by serious channel variation, the PER of different CE and EQ approaches is simulated at the 6Mb/s data rate, whose packets are the longest, almost 9 times as long as those of 54Mb/s. First the 6Mb/s PER in the IEEE multipath channel with 50ns RMS delay spread and 0Hz Doppler frequency is shown in Figure 3-31. This simulation shows the performance of the different CE and EQ approaches in the time-invariant channel. Figure 3-31 shows the improvement of the FD-MMSE EQ design. First, the employed FD-MMSE EQ needs 1.2dB lower SNR than the basic FD-MMSE EQ for the typical 10% PER. With the same ZF CE scheme, the FD-MMSE EQ needs 1.7dB lower SNR than the LS EQ design. When the proposed DDCT is also used, the SNR is further reduced by 0.7dB. Hence the FD-MMSE EQ achieves the same SNR as the perfect LS EQ at 10% PER.
Figure 3-31: PER of 6Mb/s in IEEE multipath channel with 0Hz Doppler frequency

In addition, the FD-MMSE EQ achieves a lower SNR for 10% PER than the accurate LMMSE/ML CE scheme. Although the SNR loss of the improved MMSE EQ is higher than that of the conventional time-domain (TD) MMSE EQ, the number of complex multiplications of the FD-MMSE EQ is only 1.17% of that of the conventional TD-MMSE EQ, as listed in Table 3-1. So the FD-MMSE EQ achieves a good trade-off between system performance and hardware complexity. The 6Mb/s PER in the IEEE multipath channel with 50ns RMS delay spread and 50Hz Doppler frequency is shown in Figure 3-32. First, if the referenced DDCT [52] is used directly, the PER stays at 1, because the referenced DDCT directly uses the FFT outputs for tracking without considering the CFO and SCO problems. The proposed DDCT saves ~1.9dB of the SNR loss caused by channel variations compared with the ZF CE without tracking. If the FD-MMSE EQ is used with the proposed DDCT, the proposed design needs 1.5dB lower SNR than the adjusted referenced DDCT design.

Figure 3-32: PER of 6Mb/s in IEEE multipath channel with 50Hz Doppler frequency

The 54Mb/s PER in the IEEE multipath channel with 50ns RMS delay spread and 50Hz Doppler frequency is shown in Figure 3-33. The proposed DDCT achieves 0.9dB better SNR than the ZF CE without tracking.

Figure 3-33: PER of 54Mb/s in IEEE multipath channel with 50Hz Doppler frequency

## 3-3-4 Performance of the Proposed WAPET

To understand the performance of the proposed non-linear WAPET with pilot pre-compensation, the 6Mb/s and 54Mb/s PER curves of the proposed WAPET in the IEEE multipath channel are shown in Figure 3-34 and Figure 3-35. Note that the DDCT and LS EQ are used in the simulation.
As shown in Figure 3-34, compared with the perfect PET, the SNR loss for 10% PER of the proposed WAPET design, the LLSE design [14], general direct detection, and WAPET without pilot pre-compensation (pre-comp.) is 1.2dB, 2.5dB, 3.5dB, and > 12dB respectively. Since the proposed non-linear WAPET method achieves higher PET accuracy, it performs better than the existing approaches. Since a 6Mb/s packet comprises 336 OFDM symbols, the phase error at large OFDM symbol numbers exceeds $\pm \pi$, as shown in Figure 3-20. Hence the PET without pilot pre-compensation cannot achieve < 10% PER. Compared with the existing schemes, the proposed design reduces the SNR for 10% PER by 1.3dB ~ 2.3dB at 6Mb/s.

Figure 3-34: 6Mb/s PER of the proposed WAPET

Figure 3-35: 54Mb/s PER of the proposed WAPET

As shown in Figure 3-35, in 54Mb/s mode the PER curves of the different PET schemes are closer to each other than at 6Mb/s. That is because each 54Mb/s packet has only 39 OFDM symbols, and the difference in PET error is smaller with a small number of OFDM symbols. Since the packet has only 39 OFDM symbols, the phase error also does not exceed $\pm \pi$. Hence the performance of the WAPET without pilot pre-compensation is close to that of the proposed design. At 54Mb/s the SNR loss for 10% PER of the proposed WAPET design, the LLSE design [14], general direct detection, and WAPET without pilot pre-compensation is 1.2, 1.9, 2.0, and 1.4dB respectively. Compared with the existing schemes, the proposed design reduces the SNR for 10% PER by 0.6dB $\sim$ 0.7dB at 54Mb/s. After the performance discussion of the proposed HPSU AC, HPCU MF, DDCT, and WAPET, the SNR loss variation for 10% PER is listed in Table 3-2 as a summary.

Table 3-2: Summary of SNR loss variation of the proposed design

| Proposed Design | SNR Loss Variation |
|-----------------|--------------------|
| AC | 1. Add 0.05~0.4dB SNR in 6Mb/s<br>2. Add 0.07~0.1dB SNR in 54Mb/s |
| MF | 1. Add 0.03~0.8dB SNR in 6Mb/s |
| MMSE EQ | 1. Reduce 0.5~1.8dB SNR than LS EQ<br>2. The same performance as perfect LS EQ when using DDCT |
| DDCT | Reduce 1.5dB SNR than design without DDCT in 6Mb/s simulation with 50Hz Doppler frequency |
| WAPET | 1. Reduce 1.3~2.3dB SNR in 6Mb/s (longer packet)<br>2. Reduce 0.6~0.7dB SNR in 54Mb/s |

# 3-4 Floating-Point PER for OFDM-Based WLAN System

After the detailed performance discussion of the proposed key designs, the system floating-point PER curves of all data rates from 6Mb/s to 54Mb/s are shown. In the PER simulation, each transmitted packet contains 1000 data bytes, as required by the standard. The main channel condition of this section again consists of the IEEE multipath channel with 50ns RMS delay spread (average 10.93 taps) [19], 40ppm CFO with phase noise effect, 40ppm SCO, and 50Hz Doppler frequency. The PER curves of the perfect synchronization (including perfect CE) and of the proposed design in the AWGN channel are shown in Figure 3-36 and Figure 3-37 respectively.

AWGN, CFO=40ppm+phase noise, SCO=40ppm, Doppler=50Hz

Figure 3-36: PER of perfect synchronization in AWGN channel

AWGN, CFO=40ppm+phase noise, SCO=40ppm, Doppler=50Hz

Figure 3-37: PER of the proposed design in AWGN channel

Perfect synchronization means that the estimated timing, CFO, and CFR are the same as the true ones. Therefore no synchronization error exists in the compensated data, and the PER is degraded only by the channel AWGN. The SNR values for 10% PER in the AWGN channel are listed in Table 3-3. As listed in Table 3-3, the system constraint is derived with a 10dB RF noise figure [18, 25, 38]. Compared with the perfect synchronization design, the proposed design has an average SNR loss of 2.85dB. It achieves an average gain of 6.61dB for 10% PER compared with the SNR values allowed by the system constraint.
Table 3-3: SNR for 10% PER of OFDM-based WLAN system in AWGN channel

| Data Rate (Mb/s) | Perfect Synchronization (dB) | Proposed Design (dB) | System Constraint (dB) | SNR Loss (dB) |
|------------------|------------------------------|----------------------|------------------------|---------------|
| 6 | -0.3 | 2.3 | 9.7 | 2.6 |
| 9 | 2.5 | 4.0 | 10.7 | 1.5 |
| 12 | 2.8 | 5.65 | 12.7 | 2.85 |
| 18 | 5.5 | 8.35 | 14.7 | 2.85 |
| 24 | 8.6 | 11.2 | 17.7 | 2.6 |
| 36 | 11.7 | 15.0 | 21.7 | 3.3 |
| 48 | 15.9 | 19.4 | 25.7 | 3.5 |
| 54 | 17.2 | 20.8 | 26.7 | 3.6 |
| Avg. | 7.99 | 10.84 | 17.45 | 2.85 |
| Note | A | B (should < C) | C | B-A |

The SNR values of the system constraint can be seen as the RF SNR calculated with the RF noise figure. Since the SNR for 10% PER of the proposed design is lower than the system constraint, the PER of the proposed design in the SNR region of the system constraint is lower than 10% and satisfies the system requirement. The PER curves of the perfect synchronization and of the proposed design in the time-variant IEEE multipath channel with 50ns RMS delay spread and > -15dB frequency-selective fading are shown in Figure 3-38 and Figure 3-39 respectively. The SNR values for 10% PER in the time-variant multipath channel are listed in Table 3-4. As listed in Table 3-4, the proposed design has an average SNR loss of 2.4dB compared with the perfect synchronization, and an average gain of 1.96dB for 10% PER compared with the system constraint. Even under the severe 11-tap time-variant multipath channel, the proposed design still meets the system performance requirement.

Figure 3-38: PER of perfect synchronization in multipath channel

Figure 3-39: PER of the proposed design in multipath channel

As listed in Table 3-3 and Table 3-4, the proposed design has an average SNR loss of 2.86dB and 2.39dB in the AWGN channel and the time-variant multipath channel respectively. The SNR loss is mainly caused by synchronization error, CE error, and PET error.
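As a quick arithmetic check, the averages quoted for the AWGN channel can be recomputed from the rows of Table 3-3. This is only a sketch; the variable and function names are chosen for illustration, and the tuples simply restate the table entries.

```python
# (data rate in Mb/s, perfect sync, proposed design, system constraint), SNR in dB
TABLE_3_3 = [
    (6, -0.3, 2.3, 9.7), (9, 2.5, 4.0, 10.7), (12, 2.8, 5.65, 12.7),
    (18, 5.5, 8.35, 14.7), (24, 8.6, 11.2, 17.7), (36, 11.7, 15.0, 21.7),
    (48, 15.9, 19.4, 25.7), (54, 17.2, 20.8, 26.7),
]

def mean(values):
    values = list(values)
    return sum(values) / len(values)

# Column B - A: loss of the proposed design against perfect synchronization.
avg_loss = mean(b - a for _, a, b, _ in TABLE_3_3)
# Column C - B: margin of the proposed design against the system constraint.
avg_margin = mean(c - b for _, _, b, c in TABLE_3_3)
# The note row requires B < C for every data rate.
all_within_constraint = all(b < c for _, _, b, c in TABLE_3_3)

print(f"average SNR loss   = {avg_loss:.2f} dB")
print(f"average SNR margin = {avg_margin:.2f} dB")
```

The row-wise averages agree with the "Avg." row of Table 3-3 and with the 6.61dB average gain against the system constraint.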
The distribution of the SNR loss of 6Mb/s and 54Mb/s in the AWGN channel is shown in Figure 3-40. Only 3% of the SNR loss is added by the proposed low-complexity AC and MF, and $19\% \sim 56\%$ of the SNR loss can be removed by the proposed DDCT. Since the SNR region in which 6Mb/s achieves 10% PER is lower than that of 54Mb/s, the synchronization loss caused by FER and CFO estimation error is higher at 6Mb/s than at 54Mb/s. Based on the proposed low-complexity synchronization and high-performance CE, the design achieves a good trade-off between complexity and system performance. The design complexity will be discussed with the hardware architecture in Chapter 5.

Table 3-4: SNR for 10% PER of OFDM-based WLAN system in time-variant IEEE multipath channel

| Data Rate (Mb/s) | Perfect Synchronization (dB) | Proposed Design (dB) | System Constraint (dB) | SNR Loss (dB) |
|------------------|------------------------------|----------------------|------------------------|---------------|
| 6 | 4.4 | 7.2 | 9.7 | 2.8 |
| 9 | 8.6 | 10.1 | 10.7 | 1.5 |
| 12 | 7.3 | 9.6 | 12.7 | 1.7 |
| 18 | 11.6 | 13.4 | 14.7 | 1.8 |
| 24 | 12.6 | 15.1 | 17.7 | 2.5 |
| 36 | 17.2 | 19.8 | 21.7 | 2.6 |
| 48 | 20.4 | 23.2 | 25.7 | 2.8 |
| 54 | 22.9 | 25.5 | 26.7 | 2.6 |
| Avg. | 12.26 | 15.0 | 17.45 | 2.74 |
| Note | A | B (should < C) | C | B-A |

Figure 3-40: SNR loss for 10% PER of OFDM-based WLAN system

## Chapter 4: Low-Complexity Design for OFDM-Based UWB System

In the UWB system with the power constraint listed in chapter 1, power reduction becomes the main design concern. In the OFDM system the synchronization and the channel equalizer, which occupy 22% and 58% of the power respectively, dominate the power consumption of the OFDM transceiver. Therefore in this chapter we propose low-complexity timing synchronization, frequency synchronization, and channel equalizer designs for a low-power and low-area UWB processor.
Following the methodology of the high-power-signal-used synchronization for the OFDM-based WLAN system, we also develop a sub-sampling-based synchronization for the UWB system. In the proposed synchronization, not only is the amount of signal used reduced, but the timing of the used signal is also evenly partitioned. With this data partition method the synchronizer computation is spread evenly over time, so the parallel architecture of a high-throughput UWB synchronizer can be simplified to a serial architecture, and both area and power can be reduced. In addition, a dynamic-threshold design is proposed to enhance the accuracy of timing synchronization. It lets the timing synchronization adapt to the channel condition and achieves a lower frame error rate (FER) than the fixed-threshold design with low overhead. In the existing channel equalizer the complex divider and multipliers occupy 90% of the power. Therefore we propose a divider-and-multiplier-free equalizer for the 480Mb/s UWB system. In the proposed equalizer the power, area, and critical path delay are all lower than in the existing equalizer, so the proposed equalizer can achieve low power and high throughput for UWB. After the algorithm description in sections 4-1 and 4-2, the performance will be discussed in sections 4-3 and 4-4. The design performance of synchronization (frame error rate (FER) and CFO estimation root-mean-square-error (RMSE)), the design performance of the equalizer (channel estimation (CE) MSE), and the system performance of the main data rates for LDPC-COFDM and MB-OFDM systems will be shown.

# 4-1 Low-Complexity Synchronization for OFDM-Based UWB System

In the UWB system the design power increases rapidly for the ≥ 480Mb/s high data rate and > 500MHz wide bandwidth, so low-power design becomes the main concern for this high-speed wireless application.
Hence in this section we propose a data-partition-based auto-correlation and a moving-average-free matched filter for low-power UWB design. Similar to section 3-1, the amount of signal used for synchronization can be reduced based on the trade-off between performance and design complexity. In CFO estimation in particular, as discussed in chapter 1, the UWB design is less sensitive to CFO than the WLAN system. Therefore a larger CFO estimation error can be tolerated in the UWB system, and the complexity reduction of CFO estimation can be more aggressive. Different from the proposed low-complexity design for the WLAN system, the proposed low-complexity UWB design not only reduces the design complexity but also simplifies the design architecture. It simplifies the conventional parallel architecture to a serial architecture that achieves ≥ 528MS/s throughput with a clock of only 132MHz, a quarter of the sample rate. The design architecture will be discussed in chapter 5. As will be shown in the performance discussion in section 4-3, the synchronization adds only 0.21 ~ 0.25dB of SNR loss for 8% PER of the UWB system. Besides the low-complexity design, we also propose a dynamic-threshold design to improve the system performance. The threshold used to detect the preamble timing is tuned dynamically and automatically to adapt to the channel environment, which yields a lower FER than the fixed-threshold designs. It achieves up to 2.33dB SNR improvement for 8% PER compared with the fixed-threshold designs.

## 4-1-1 Synchronization Block Diagram of OFDM-Based UWB System

The synchronization for the OFDM-based UWB system includes not only PD, CFO estimation, and FWD but also band detection and preamble-timing detection (PTD), which control the RF frequency and distinguish the preamble timing for the MB-OFDM UWB system. It is used to find the correct timing and frequency information of the preamble structure defined in the OFDM-based UWB system.
The structure of the PLCP preamble defined in the UWB specification [15] is shown in Figure 4-1. The preamble comprises 21 packet sequences (PS), 3 frame sequences (FS), and 6 channel estimation sequences (CES). The data flow of the UWB synchronization is shown in Figure 4-2.

Figure 4-1: Preamble structure of OFDM-based UWB system

At the beginning of the preamble, the synchronization should detect the start of the packet and then find the correct band ID for correct RF down-conversion. As discussed in chapter 3, the auto-correlation can be used for PD and CFO estimation because the preamble of the OFDM-based WLAN system consists of repeated symbols. In the OFDM-based UWB system, the preamble also begins with 21 repeated PS, so the auto-correlation can be used for the timing detection. The CFO can then be estimated with the auto-correlation, and the FFT window can be detected with the matched filter before the end of the PS. After that, the PTD is used to distinguish the FS from the PS. The difference between FS and PS is that the FS is the sign-inverted PS [15]. At the boundary between PS and FS, the sum of two consecutive auto-correlation results drops sharply, so the PTD can also be done with the auto-correlation. After the PTD, the initial position of the CES can be found. Then the CES and the following header and data OFDM symbols are sent to the FFT for channel estimation and correct data demodulation.

Figure 4-2: Data flow of synchronization for OFDM-based UWB system

As in the WLAN system, the PD and CFO estimation can be done with the auto-correlation. The power and phase of the auto-correlation are shown in Figure 4-3 and Figure 4-4 respectively. When a valid packet arrives, the power and phase can be used to detect the packet and estimate the CFO. After PD, band detection is needed so that the RF correctly down-converts the received signal. We can use the power of the LPF output signal to find the band ID.
First we control the RF to use the fixed band #1 for down-conversion. After the down-conversion and the RF LPF, the signals originally transmitted in band #2 and band #3 are suppressed by the LPF. The signal power with the fixed band #1 down-converter is shown in Figure 4-5.

Figure 4-3: Auto-correlation power used for packet detection in UWB system

Figure 4-4: Auto-correlation phase used for CFO estimation in UWB system

As shown in Figure 4-5, the power of the signals originally transmitted in band #2 and band #3 is suppressed to nearly zero. We can use this feature to find the rough boundary of band #1. We also use the auto-correlation power to check that the band #1 signal belongs to the correct preamble. The FWD then finds the accurate band boundary and FFT window boundary.

Figure 4-5: Received signal power used for band detection

For PTD, the sum of two consecutive auto-correlation results can be used to distinguish the PS and FS, which are sign-inverted versions of each other. The condition can be written as

$$\left|\Lambda_{AC}(m) + \Lambda_{AC}(m-1)\right|^2 \le \lambda_{PTD} \times \left[P_m + P_{m-1}\right]^2 \tag{4-1}$$

where $\Lambda_{AC}(m)$ is the auto-correlation result of the m-th and (m+1)-th OFDM symbols, $\lambda_{PTD}$ is a threshold value, and $P_m$ is the signal power sum of the m-th OFDM symbol. If the m-th OFDM symbol belongs to the PS and the (m+1)-th OFDM symbol belongs to the FS, the sign inversion makes $\Lambda_{AC}(m)$ the sign-inverted version of $\Lambda_{AC}(m-1)$. Thus $|\Lambda_{AC}(m)+\Lambda_{AC}(m-1)|^2$ becomes smaller than the product of the threshold $\lambda_{PTD}$ and the squared sum of the signal power. An example of the sum of two consecutive auto-correlation results for PTD is shown in Figure 4-6. When the received signal changes from PS to FS, the sum drops rapidly, and the boundary between PS and FS can be found.
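The PTD test in (4-1) can be sketched as follows. This is a simplified single-band illustration in which adjacent symbols are correlated, as in the description of (4-1); the symbol length of 32, the noise level, the threshold value 0.25, and all function names are assumptions, not values from the design.

```python
import cmath
import math
import random

def autocorr(sym_a, sym_b):
    """Lambda_AC: sum of a[n] * conj(b[n]) over one symbol."""
    return sum(a * b.conjugate() for a, b in zip(sym_a, sym_b))

def power(sym):
    """P_m: signal power sum of one symbol."""
    return sum(abs(x) ** 2 for x in sym)

def find_ps_fs_boundary(symbols, threshold):
    """Return the first m for which (4-1) holds, i.e. symbol m is the last PS."""
    lam = [autocorr(symbols[m], symbols[m + 1]) for m in range(len(symbols) - 1)]
    for m in range(1, len(lam)):
        lhs = abs(lam[m] + lam[m - 1]) ** 2
        rhs = threshold * (power(symbols[m]) + power(symbols[m - 1])) ** 2
        if lhs <= rhs:
            return m
    return None

rng = random.Random(0)
# Hypothetical unit-modulus packet sequence; the FS is its sign-inverted copy.
base = [cmath.exp(2j * math.pi * rng.random()) for _ in range(32)]

def noisy(sym, sigma=0.05):
    return [x + complex(rng.gauss(0, sigma), rng.gauss(0, sigma)) for x in sym]

# Six PS symbols (indices 0..5) followed by three FS symbols (indices 6..8).
preamble = ([noisy(base) for _ in range(6)]
            + [noisy([-x for x in base]) for _ in range(3)])
boundary = find_ps_fs_boundary(preamble, threshold=0.25)
```

In this sketch the two consecutive correlation results cancel only at the PS/FS transition, so the returned index marks the last PS symbol.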
Based on the proposed synchronization methods, the correct timing and frequency for MB-OFDM demodulation can be acquired within the preamble.

Figure 4-6: Sum of two consecutive auto-correlation results for PTD of UWB system

### 4-1-2 The Proposed Data-Partition-Based Auto-Correlation

For PD, CFO estimation, and PTD, auto-correlation can be used in preamble-based OFDM systems [10, 13, 24, 26, 27, 28, 29]. Similar to (3-1), the algorithm used in the existing approaches is

$$\Lambda_{AC}(m) = \sum_{n=0}^{N-1} r_{m \times N+n} \times r_{(m+3) \times N+n}^* \tag{4-2}$$

Where N is the number of samples in a repeated symbol, and $r_{m \times N+n}$ is the n-th received sample of the m-th repeated symbol. Unlike (3-1), the correlation distance is increased from N to 3N. This is because in the MB-OFDM system with the default mode 1, the m-th and (m+3i)-th OFDM symbols (for integer i) are transmitted in the same RF band. The m-th OFDM symbol is therefore coherent only with the (m+3i)-th OFDM symbols, so the correlation distance becomes 3N. In the UWB system, the preamble comprises repeated OFDM symbols, each of which has 165 samples [15]. N is therefore 165 for an OFDM symbol with a 128-point FFT and a 37-sample guard interval, and 165 samples $r_{m \times N+n}$ must be stored and multiplied in (4-2). To reduce register accesses and design complexity, a data-partition-based auto-correlation algorithm is proposed:

$$\Lambda_{AC}(m) \approx \omega \sum_{n=0}^{\lfloor N/\omega \rfloor - 1} r_{m \times N + \omega n} \times r_{(m+3) \times N + \omega n}^* \tag{4-3}$$

Where $\omega$ is the reduction factor, and $r_{m \times N + \omega n}$ and $r_{(m+3) \times N + \omega n}$ are the samples used. In the proposed algorithm, the input data are partitioned into $\omega$ groups and only one group is used. Thus the number of register accesses and multiplications is reduced to $1/\omega$.
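The effect of the reduction factor in (4-3) can be illustrated with a short Python sketch. It is a simplified model, not the hardware design: the function name `autocorr`, the synthetic Gaussian symbol, and the single contiguous buffer (which ignores band hopping between consecutive symbols) are assumptions made only for illustration.

```python
import random


def autocorr(r, m, n_sym, omega=1, dist=3):
    """Data-partition-based auto-correlation following (4-3).

    r      : list of complex received samples
    m      : index of the first repeated symbol
    n_sym  : samples per repeated symbol (N)
    omega  : reduction factor; omega=1 gives the full sum of (4-2)
    dist   : correlation distance in symbols (3 for default mode 1)
    """
    acc = 0j
    for n in range(n_sym // omega):          # floor(N/omega) products only
        idx = m * n_sym + omega * n
        acc += r[idx] * r[idx + dist * n_sym].conjugate()
    return omega * acc                       # rescale to the full-sum level


# Noise-free toy input: one random symbol repeated four times.
rng = random.Random(0)
N = 165
symbol = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(N)]
r = symbol * 4

full = autocorr(r, 0, N, omega=1)   # 165 complex multiplications
sub = autocorr(r, 0, N, omega=4)    # 41 complex multiplications
energy = sum(abs(s) ** 2 for s in symbol)
```

With $\omega$ = 4 only every fourth sample is read and multiplied, so the loop runs 41 times instead of 165; the result is an estimate of the full correlation rather than an exact copy of it.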
The auto-correlation power can be used to detect a valid packet. The PD criterion is

$$\left| \Lambda_{AC}(m) \right|^2 \ge \lambda_{PD} \times P_{m+3}^{2} \tag{4-4}$$

Where $\Lambda_{AC}(m)$ is the auto-correlation result of the m-th and (m+3)-th repeated symbols, $\lambda_{PD}$ is a pre-defined threshold value, and $P_{m+3}$ is the signal power sum of the (m+3)-th OFDM symbol. An example of the proposed auto-correlation with different reduction factors $\omega$ and different channel conditions is shown in Figure 4-7. The normalized auto-correlation power $|\Lambda_{AC}(m)|^2/P_{m+3}^2$ is simulated in a high-SNR AWGN channel (better channel) and in a low-SNR indoor multipath channel for the UWB system (worse channel) [20]. The valid preamble begins at 0ns; before 0ns only noise is received. The normalized auto-correlation power of the received noise may become higher as $\omega$ is increased, which means that a larger $\omega$ makes a PD false alarm more likely. It is therefore important to find an $\omega$ value that keeps the synchronization performance while reducing the design complexity.

Figure 4-7: Normalized auto-correlation power in (a) better channel and (b) worse channel

In addition, the auto-correlation result can be used for CFO estimation. The same as (3-4), the CFO estimation for the OFDM-based UWB system is

$$\hat{\epsilon} = \frac{1}{2\pi \times 3NT} \tan^{-1} \left\{ \frac{\operatorname{Im} \left[ \Lambda_{AC}(m) \right]}{\operatorname{Re} \left[ \Lambda_{AC}(m) \right]} \right\} \tag{4-5}$$

Where $\hat{\epsilon}$ is the estimated CFO, N is the number of samples in an OFDM symbol, T is the sample period, and $\Lambda_{AC}(m)$ is the auto-correlation result. Using the PS, the CFO estimation range is $\pm 0.5/(3NT)$ = $\pm 0.5/937.5$ ns = $\pm$537KHz, equal to $\pm$53.7ppm of a 10GHz RF frequency.
So the CFO estimation can cover the $\pm 20$ ppm CFO tolerance of each device (TX+RX = $\pm 40$ ppm < $\pm 53.7$ ppm). After CFO estimation, the phase rotation caused by the CFO is compensated as in (3-5). The matched filter can then be used for FWD without CFO distortion.

### 4-1-3 Data-Partition-Based and Moving-Average-Free Matched Filter

In order to detect the correct FFT-window boundary, the matched filter can be used [27, 29]. Similar to (3-9), the algorithm used in existing approaches is

$$\Lambda_{MF}(k) = \sum_{n=0}^{N-1} r_{k+n} \times C_n^* \tag{4-6}$$

Where N is the number of samples in an FFT symbol, k is the FWD timing from 0 to N-1, $r_{k+n}$ is the received sample after CFO compensation, and $C_n$ is the coefficient of the matched filter. The conventional matched filter in (4-6) must store a different set of received samples $r_{k} \sim r_{k+N-1}$ in the registers for each FWD timing k. We propose a moving-average-free matched filter which stores only $r_0 \sim r_{N-1}$, to reduce the number of register writes and the register power consumption. Since the OFDM symbol is repeated, the received samples have a period of N samples, so a received sample $r_{k+n}$ with $k+n \ge N$ can be approximated by $r_{k+n-N}$. The received samples $r_{k+n}$, where $n=0 \sim N-1$, can therefore be approximated by $r_{\ell}$, where $\ell = (k+n) \bmod N$ and $\ell=0 \sim N-1$. After this data rescheduling, only the received samples $r_0 \sim r_{N-1}$ are used for every FWD timing $k=0 \sim N-1$, and the matched filter becomes

$$\Lambda_{MF}(k) = \sum_{n=0}^{N-1} r_{k+n} \times C_n^* \approx \sum_{n=0}^{N-k-1} r_{k+n} \times C_n^* + \sum_{n=N-k}^{N-1} r_{k+n-N} \times C_n^* = \sum_{\ell=0}^{N-1} r_{\ell} \times C_{\ell-k+\lceil (k-\ell)/N \rceil \times N}^* \tag{4-7}$$

Where the received samples $r_{\ell}$ are fixed as $r_0 \sim r_{N-1}$, and the matched filter coefficients $C_{\ell-k+\lceil (k-\ell)/N \rceil \times N}$ are still $C_0 \sim C_{N-1}$. Since the proposed algorithm uses only N fixed received samples to calculate all outputs of the matched filter, the moving-average design can be removed. The computation of the proposed moving-average-free matched filter can be further reduced by the data-partition method. The proposed matched filter algorithm is then

$$\Lambda_{MF}(k) \approx \omega \sum_{\ell=0}^{\lfloor N/\omega \rfloor - 1} r_{\omega\ell} \times C_{\omega\ell-k+\lceil (k-\omega\ell)/N \rceil \times N}^* \tag{4-8}$$

Where $\omega$ is the reduction factor as in (4-3). Similar to (4-3), the number of stored samples and multiplications in (4-8) is reduced to $1/\omega$, and the number of filter taps is reduced as well. The matched-filter output power is used to detect the correct FFT-window boundary. The timing with the peak matched-filter power is

$$K_{peak} = \underset{k}{\operatorname{arg\,max}} \left\{ \left| \Lambda_{MF}(k) \right|^{2} \right\} \tag{4-9}$$

Where $K_{peak}$ is the timing with peak power and $\Lambda_{MF}(k)$ is the matched filter output. The matched-filter power of the received preamble, in the same channel conditions as in Figure 4-7, is shown in Figure 4-8. The correct FFT-window boundary in Figure 4-8 is set to 0ns. As $\omega$ is increased, the number of matched-filter power peaks increases.
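The fixed-buffer indexing of (4-7) and (4-8) can be sketched in Python as follows. The coefficient index $\ell-k+\lceil (k-\ell)/N \rceil \times N$ is written as `(l - k) % N`, which gives the same value for $0 \le k, \ell < N$. The function names, the random unit-magnitude coefficients, and the toy length N = 32 are assumptions for illustration only.

```python
import cmath
import random


def matched_filter_fixed_buffer(r, coeff, omega=1):
    """Moving-average-free matched filter following (4-7) and (4-8).

    r     : the N stored received samples r_0 .. r_{N-1} (never shifted)
    coeff : matched-filter coefficients C_0 .. C_{N-1}
    omega : reduction factor; omega=1 reproduces (4-7)
    """
    n_taps = len(coeff)
    out = []
    for k in range(n_taps):                      # every FWD timing k
        acc = 0j
        for i in range(n_taps // omega):
            l = omega * i                        # only one data partition
            acc += r[l] * coeff[(l - k) % n_taps].conjugate()
        out.append(omega * acc)
    return out


def peak_timing(mf_out):
    """Timing with the peak matched-filter power, as in (4-9)."""
    return max(range(len(mf_out)), key=lambda k: abs(mf_out[k]) ** 2)


# Toy periodic preamble whose symbol boundary sits at timing k0.
rng = random.Random(1)
N = 32
coeff = [cmath.exp(2j * cmath.pi * rng.random()) for _ in range(N)]
k0 = 5
r = [coeff[(l - k0) % N] for l in range(N)]

peak_full = peak_timing(matched_filter_fixed_buffer(r, coeff, omega=1))
peak_sub = peak_timing(matched_filter_fixed_buffer(r, coeff, omega=4))
```

The buffer `r` is written once and only the coefficient index rotates with k, which is the point of removing the moving-average structure; in this noise-free toy case both the full and the partitioned filter peak at `k0`.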
To handle the multiple peaks, the sub-optimal timing location algorithm can be used [27]. First, the number of highest peaks to be searched is pre-defined according to the $\omega$ value. Then these peaks are found, and the earliest of them is identified as the FFT-window boundary. For example, the number of highest peaks searched for $\omega$ = 4 can be defined as 2. As shown in Figure 4-8 with $\omega$ = 4, the earlier of the 2 highest peaks lies exactly on the correct FFT-window boundary (0ns). This sub-optimal timing location algorithm can efficiently recover the correct FFT window.

Figure 4-8: Matched-filter power in (a) better channel and (b) worse channel

### 4-1-4 The Proposed Dynamic-Threshold Design

After FWD, the synchronizer can start the PTD to find the boundary between the PS and FS of the preamble. As shown in (4-1) and Figure 4-6, the sum of two consecutive auto-correlation results is used to detect this timing. For accurate PTD, a dynamic-threshold design, which adapts the $\lambda_{PTD}$ value to the channel condition, is proposed. The adapted threshold is

$$\lambda_{PTD} = \frac{\left| \Lambda_{AC}(m-1) + \Lambda_{AC}(m-2) \right|^2}{\left[ P_{m-1} + P_{m-2} \right]^2} \times \varepsilon \tag{4-10}$$

Where $\varepsilon$ is a fixed ratio that shifts the level of $\lambda_{PTD}$ for accurate PTD. The threshold value $\lambda_{PTD}$ is updated according to the previous auto-correlation results $\Lambda_{AC}(m-1)$ and $\Lambda_{AC}(m-2)$ and the signal power sums $P_{m-1}$ and $P_{m-2}$. The simulation results in section 4-3 show that the proposed dynamic-threshold design achieves lower FER and PER than designs with fixed threshold values.

## 4-2 Low-Complexity Channel Equalization for UWB System

### 4-2-1 Basic Divider-Based Channel Equalization with WAPET

As discussed in chapter 3, the proposed channel equalizer for the WLAN system consists of channel estimation (CE), equalization, channel tracking (CT) to handle the Doppler effect, and weighted-average phase error tracking (WAPET).
In the UWB system the longest packet length is only 102.4µs, equal to 1/10 of that of the WLAN system. As shown in Figure 3-13, within the short packet time of WLAN the channel variation is less than 0.5dB at a 50Hz Doppler frequency. We also found that when the system migrates to the higher-speed UWB, the SNR loss caused by a 50Hz Doppler frequency does not exceed 0.1dB. Hence channel tracking is not required, and the channel equalizer for the UWB system comprises only CE, equalization, and WAPET. In the conventional CE method, the channel frequency response (CFR) is estimated with a complex divider. The same as (3-11), the CE algorithm is

$$H_E(k) = Y_L(k) / X_L(k) = H(k) + \Delta w / X_L(k) \tag{4-11}$$

Where $H_E(k)$ is the estimated CFR, $Y_L(k)$ is the received frequency-domain preamble, $X_L(k)$ is the defined frequency-domain preamble, $H(k)$ is the true CFR, and $\Delta w$ is the AWGN within $Y_L(k)$. This is also called the zero-forcing (ZF) algorithm [27]. Based on the ZF algorithm, an example of the estimated CFR and the true CFR of a UWB CM channel with RMS = 5ns and SNR = 10dB is shown in Figure 4-9. As shown in Figure 4-9, the difference between the true and estimated CFR is the estimation error $\Delta w/X_L(k)$ caused by the noise. After CE, the channel fading in the received data can be equalized. The equalization is

$$X(k) = Y(k) / H_E(k) = Y(k) \cdot [Y_L(k) / X_L(k)]^{-1} \tag{4-12}$$

Where $X(k)$ is the equalized data subcarrier and $Y(k)$ is the received data subcarrier. From (4-12) we can see that a complex division is needed in either CE or equalization to remove the channel fading.

Figure 4-9: Example of estimated CFR and true CFR with 5ns RMS and 10dB SNR in 528MHz UWB system

After the equalization, the data are also compensated for the phase error mainly caused by CFO and SCO.
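The divider-based flow of (4-11) and (4-12) can be sketched per subcarrier in Python. This is a noise-free toy example; the function names and the three-subcarrier channel values are hypothetical and chosen only to show where the complex divisions occur.

```python
def zf_estimate(y_l, x_l):
    """Zero-forcing CE as in (4-11): H_E(k) = Y_L(k) / X_L(k)."""
    return [y / x for y, x in zip(y_l, x_l)]


def zf_equalize(y, h_e):
    """Equalization as in (4-12): X(k) = Y(k) / H_E(k)."""
    return [yk / hk for yk, hk in zip(y, h_e)]


# Hypothetical 3-subcarrier channel, known preamble, and QPSK data.
h_true = [0.8 + 0.3j, -0.2 + 1.1j, 0.05 - 0.6j]
x_l = [1 + 0j, -1 + 0j, 1 + 0j]          # defined preamble X_L(k)
x_data = [1 + 1j, -1 + 1j, 1 - 1j]       # transmitted data X(k)

y_l = [h * x for h, x in zip(h_true, x_l)]       # received preamble
y = [h * x for h, x in zip(h_true, x_data)]      # received data

h_e = zf_estimate(y_l, x_l)
x_hat = zf_equalize(y, h_e)
```

Without noise the estimate equals the true CFR and the data are recovered exactly; with AWGN the error term $\Delta w / X_L(k)$ of (4-11) would appear in `h_e`.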
The same as (3-13) and (3-14), the direct detection of the phase error is

$$\begin{cases} \psi = \dfrac{1}{P} \sum_{K} \theta_{K} \\ L = \dfrac{\sum_{K} K \cdot \theta_{K}}{\sum_{K} K^{2}} \end{cases} \tag{4-13}$$

Where $\psi$ is the estimated mean phase error, $\theta_K$ is the phase error of each pilot, K is the subcarrier index of the pilots, P is the number of pilots, L is the slope of the linear phase error, and LK is the phase error that is linear in the subcarrier index K. According to (4-13), the mean phase error and the linear phase error can be detected from the pilot phase errors. An example of the detected phase error is shown in Figure 4-10. As shown in Figure 4-10, the phase error caused by only 1ppm residual CFO and 40ppm SCO ideally contains a mean and a linear phase error. However, when AWGN is added, the phase error is distorted and shows rapid transitions. Because of the minimum-square-error algorithm (4-13), the mean and linear phase errors distorted by AWGN can still be detected individually, and the total detected phase error approaches the true value.

Figure 4-10: Example of detected phase error in 528MHz UWB system. Channel condition: Residual CFO = 1ppm, SCO = 40ppm, SNR = 10dB

To prevent the phase error detection from being degraded by burst noise and by an accumulated phase error $> \pm \pi$, the WAPET algorithm is also used.
The same as (3-24), the non-linear WAPET with pilot pre-compensation is

$$\begin{aligned} \Delta\theta_{K,N} &\equiv \begin{cases} \theta_{K,N}, & N = 1 \\ \theta_{K,N} - \psi_{N-1} - K L_{N-1}, & \text{otherwise} \end{cases} \\ \psi_{N} &= \begin{cases} \dfrac{1}{P} \sum_{K} \Delta\theta_{K,N}, & N = 1 \\ W_{\psi} \cdot \dfrac{1}{P} \sum_{K} \Delta\theta_{K,N} + \psi_{N-1}, & \text{otherwise} \end{cases} \\ L_{N} &= \begin{cases} \dfrac{\sum_{K} K \cdot \Delta\theta_{K,N}}{\sum_{K} K^{2}}, & N = 1 \\ W_{L} \cdot \dfrac{\sum_{K} K \cdot \Delta\theta_{K,N}}{\sum_{K} K^{2}} + L_{N-1}, & \text{otherwise} \end{cases} \end{aligned} \tag{4-14}$$

Where N is the OFDM symbol index and $\Delta\theta_{K,N}$ is the pilot phase after pre-compensation with the previously tracked phase error $\psi_{N-1}+K\times L_{N-1}$. The term $\Delta\psi_N = \frac{1}{P}\sum_{K}\Delta\theta_{K,N}$ can be seen as the difference of the mean phase error between the previous and the present OFDM symbol, and $\Delta L_N = \sum_{K} K \cdot \Delta\theta_{K,N} / \sum_{K} K^{2}$ as the difference of the phase error slope between the previous and the present OFDM symbol. $W_{\psi}$ and $W_{L}$ are the weights of the weighted average, and $\psi_N$ and $L_N$ are the tracked mean and slope of the phase error. An example of the tracked phase error over successive OFDM symbols is shown in Figure 4-11.

Figure 4-11: Tracked phase error caused by (a) 1ppm residual CFO and (b) 40ppm SCO during OFDM symbols. Channel condition: Residual CFO = 1ppm, SCO = 40ppm, SNR = 10dB

As shown in Figure 4-11, the tracked phase error is closer to the true error than the directly detected one. The WAPET can suppress the PET inaccuracy caused by AWGN. After the tracking and compensation of the PET, the data are sent to the de-QPSK of the UWB system. Based on the data flow from (4-11) to (4-14), the block diagram of the general channel equalization with PET is shown in Figure 4-12.
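One symbol of the tracking recursion in (4-14) can be sketched in Python as below. This is a minimal illustration: the function name `wapet_step`, the weights `w_psi` = `w_l` = 0.5, and the symmetric pilot index set are assumptions for the example, not values taken from the design.

```python
def wapet_step(theta, pilots, state=None, w_psi=0.5, w_l=0.5):
    """One OFDM symbol of the WAPET update in (4-14).

    theta  : pilot phase errors theta_{K,N} of the present symbol
    pilots : pilot subcarrier indices K
    state  : None for the first symbol (N = 1), else (psi_{N-1}, L_{N-1})
    w_psi, w_l : tracking weights W_psi and W_L (illustrative values)
    Returns the tracked (psi_N, L_N).
    """
    n_pilots = len(pilots)
    sum_k2 = sum(k * k for k in pilots)

    if state is None:
        delta = list(theta)                       # no pre-compensation yet
    else:
        psi_prev, l_prev = state
        delta = [t - psi_prev - k * l_prev for t, k in zip(theta, pilots)]

    d_psi = sum(delta) / n_pilots                            # mean term
    d_l = sum(k * d for k, d in zip(pilots, delta)) / sum_k2  # slope term

    if state is None:
        return d_psi, d_l
    return w_psi * d_psi + psi_prev, w_l * d_l + l_prev


# Hypothetical symmetric pilot positions and a noise-free phase error.
pilots = [-55, -45, -35, -25, -15, -5, 5, 15, 25, 35, 45, 55]
psi0, slope0 = 0.2, 0.001
theta1 = [psi0 + k * slope0 for k in pilots]

s1 = wapet_step(theta1, pilots)                   # first symbol
s2 = wapet_step(theta1, pilots, state=s1)         # unchanged phase error
theta3 = [t + 0.1 for t in theta1]                # mean error grows by 0.1
s3 = wapet_step(theta3, pilots, state=s2)
```

Because the pilot phases are pre-compensated with the previous estimate, only the residual is averaged and weighted; in the last step the 0.1 rad jump is followed by half of its size, which is how the weights limit the influence of a single noisy symbol.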
Figure 4-12: Block diagram of the general channel equalizer

As shown in Figure 4-12, the received preamble $Y_L(K)$ is first sent to the complex divider, and $H_E(K)$ is estimated with the defined preamble $X_L(K)$. So that the data can be equalized with a complex multiplication instead of a division, the inverse of the estimated CFR, $1/H_E(K)$, is stored in the RAM. The data $Y(K)$ are then received and compensated with $1/H_E(K)$. In the PET, the pilots are first sent to the arc-tangent to extract the pilot phase errors. Through the phase error detection and tracking, the phase error $\psi_N + K L_N$ is tracked and converted to a phasor. The data are then compensated for the phase error and sent to the de-QPSK.

### 4-2-2 The Proposed Divider-and-Multiplier-Free Channel Equalizer

In the OFDM-based UWB system only QPSK is used for the constellation mapping. The use of QPSK instead of QAM reduces the peak-to-average power ratio (PAPR) of the signal sent to the power amplifier of the RF transmitter. The QPSK constellation of the received symbols and the probability of the symbol phases are shown in Figure 4-13.

Figure 4-13: Example of (a) constellation and (b) phase probability of the received QPSK symbols

From the constellation we can see that the de-QPSK can be done from the symbol phases. As shown in the probability of the symbol phases, the phases of the received symbols are concentrated around $\pm \pi/4$ and $\pm 3\pi/4$ according to the de-QPSK outputs. Since there is no amplitude (de)modulation, the de-QPSK can be done with the received symbol phases alone. The signal flow of the CE, equalization, and PET can therefore be simplified to correct only the phase error and then send the data phase to the de-QPSK. Without amplitude processing, the dividers and multipliers can be eliminated in the proposed channel equalizer. The proposed divider-and-multiplier-free channel equalizer with PET is shown in Figure 4-14.
Figure 4-14: Block diagram of the proposed divider-and-multiplier-free channel equalizer

In the proposed channel equalizer, all received symbols are first sent to the logarithm-based arc-tangent and converted to symbol phases. The received preamble phase $P_{YL}(K)$ is then sent to the channel phase estimation (CPE) to estimate the channel phase $P_{HE}(K)$ with the defined preamble phase $P_{XL}(K)$. Next, the received data phases $P_Y(K)$ are equalized with the estimated channel phase in the channel phase equalization (CPQ). The CPE and CPQ are

$$\begin{aligned} P_{HE}(K) &= P_{YL}(K) - P_{XL}(K) \\ P_X(K) &= P_Y(K) - P_{HE}(K) = P_Y(K) - [P_{YL}(K) - P_{XL}(K)] \end{aligned} \tag{4-15}$$

Where $P_{HE}(K)$ is the estimated channel phase, $P_{YL}(K)$ is the received preamble phase, $P_{XL}(K)$ is the defined preamble phase, $P_{X}(K)$ is the equalized data phase, and $P_{Y}(K)$ is the received data phase. When the equalized data phases are sent to the PET, the PET can be done directly without an arc-tangent, so the original arc-tangent of the PET can be removed. The compensated data phase is then sent to the de-QPSK. Since the phase estimation and compensation in (4-15) use only additions and subtractions, the dividers and multipliers are not needed. The arc-tangent block is simply moved from the PET of the general equalizer to the front of the proposed equalizer, so it is not an additional design overhead. The design complexity, including area and power consumption, is therefore reduced by the elimination of the dividers and multipliers.

A general arc-tangent is realized with a divider that computes $\operatorname{Im}\{Y(K)\}/\operatorname{Re}\{Y(K)\}$. However, this divider also occupies a large percentage of the design complexity. In the proposed equalizer the arc-tangent is realized with the logarithm function.
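The phase-only estimation and equalization of (4-15) can be sketched in Python as follows. This is an illustration under assumptions: `cmath.phase` stands in for the arc-tangent block, the `wrap` helper that folds results back into $[-\pi, \pi)$ is added for the example and is not described in the text, and the channel and symbol values are hypothetical.

```python
import cmath
import math


def wrap(p):
    """Fold a phase into [-pi, pi). Helper assumed for this sketch."""
    return (p + math.pi) % (2 * math.pi) - math.pi


def estimate_channel_phase(p_yl, p_xl):
    """CPE of (4-15): P_HE(K) = P_YL(K) - P_XL(K), subtraction only."""
    return [wrap(a - b) for a, b in zip(p_yl, p_xl)]


def equalize_phase(p_y, p_he):
    """CPQ of (4-15): P_X(K) = P_Y(K) - P_HE(K), subtraction only."""
    return [wrap(a - b) for a, b in zip(p_y, p_he)]


# Hypothetical single-subcarrier example: channel gain 0.3, phase 2.0 rad.
h = 0.3 * cmath.exp(2.0j)
x_l = cmath.exp(1j * math.pi / 4)        # defined preamble symbol
x = cmath.exp(3j * math.pi / 4)          # transmitted QPSK data symbol

p_yl = [cmath.phase(h * x_l)]            # received preamble phase
p_xl = [cmath.phase(x_l)]                # defined preamble phase
p_y = [cmath.phase(h * x)]               # received data phase

p_he = estimate_channel_phase(p_yl, p_xl)
p_x = equalize_phase(p_y, p_he)
```

The channel magnitude 0.3 never enters the computation, which reflects the observation that QPSK decisions depend only on the symbol phase; the recovered `p_x` is the transmitted phase $3\pi/4$.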
The logarithm-based arc-tangent is

$$\psi = \left( \tan^{-1} \left\{ \log^{-1} \left[ \log|Q| - \log|I| \right] \right\} + \operatorname{sign}(I) \times \frac{\pi}{2} \right) \times \left( -1 \right)^{\operatorname{sign}(Q)} \tag{4-16}$$

Where $\psi$ is the output phase, I is the real part of the input symbol, and Q is the imaginary part of the input symbol. The output phase is generated from the comparison of $\log|Q|$ and $\log|I|$ instead of the division of Q by I. The function is thereby split from one division stage into two stages: a logarithm look-up table (LUT) and a subtraction of $\log|Q|$ and $\log|I|$. Hence a pipelined design can be applied conveniently with only a small overhead, and both the critical-path delay and the design complexity are reduced by the elimination of the dividers. Because the critical-path delay is reduced, the proposed equalizer can be implemented with only two data paths at a higher clock speed of 264MHz in a 0.18µm CMOS process, unlike the other OFDM blocks, which need 4 data paths to achieve a 528MS/s throughput at a 132MHz clock speed. The performance of the proposed equalizer, including the CE RMSE and the design PER, is discussed in section 4-3. The design architecture is discussed in chapter 5.

## 4-3 Performance Analysis of Low-Complexity Designs for OFDM-Based UWB System

In this section, the design performance of the low-complexity synchronization and channel equalization, and the system performance of the OFDM-based UWB systems (the LDPC-COFDM and MB-OFDM systems), are shown and discussed with respect to the trade-off between low design complexity and high performance. In the PER simulation, each transmitted packet contains 1024 data bytes, as required by the standard. The simulation environment mainly comprises AWGN, the CFO effect, the SCO effect, and the indoor multipath channel [20, 32] for the 528MHz UWB system. Since the frequency offset of each UWB device is expected to be within ±20ppm, the CFO and SCO between the transmitter and receiver are both set to 40ppm.
The phase noise model [35, 36] is also added.

### 4-3-1 Performance of the Proposed Sub-sampling-Based Auto-Correlation and Matched Filter

To choose the reduction factor $\omega$ and the required number of samples for the proposed sub-sampling-based and moving-average-free auto-correlation and matched filter, the system FER, CFO estimation RMSE, and PER of the UWB system are simulated in the multipath channel described above. The FER with different $\omega$ values of the auto-correlation and matched filter is shown in Figure 4-15.

Figure 4-15: FER with different $\omega$ values of the proposed auto-correlation and matched filter

As shown in Figure 4-15, when $\omega$ is increased to 2 and 4, the SNR loss for 8% FER is $0.6\sim1.2$dB. However, when $\omega$ is increased to $\geq 8$, the total SNR loss increases to 7.2dB. This means that the FER degradation becomes much more serious when $\omega \geq 8$. The CFO RMSE with different $\omega$ values is shown in Figure 4-16. When $\omega$ is raised to reduce more design complexity, the CFO RMSE also becomes higher. The value of the reduction factor $\omega$ should therefore be chosen as a trade-off between design complexity and system performance. To decide on a suitable $\omega$ value, the design PER of the LDPC-COFDM system is simulated with different $\omega$ values. The PER curves of 120Mb/s and 480Mb/s, which are the lowest and highest rates of the proposed LDPC-COFDM system [1], are shown in Figure 4-17 and Figure 4-18.
Figure 4-16: CFO RMSE with different $\omega$ values of the proposed auto-correlation

Figure 4-17: PER of 120Mb/s with different $\omega$ values of the proposed design. Channel condition: RMS=5ns, CFO=40ppm+phase noise, SCO=40ppm

Figure 4-18: PER of 480Mb/s with different $\omega$ values of the proposed design. Channel condition: RMS=5ns, CFO=40ppm+phase noise, SCO=40ppm

As shown in Figure 4-17, when $\omega$ is 1, 2, and 4, the SNR loss for 8% PER compared with perfect synchronization is 0.05dB, 0.06dB, and 0.21dB. However, when $\omega$ is increased to 8, the SNR loss is higher than 2dB, and with $\omega = 16$ the PER cannot be brought below the typical 8%. As shown in Figure 4-18, when $\omega$ is 1, 2, 4, and 8, the SNR loss compared with perfect synchronization is 0.09dB, 0.12dB, 0.24dB, and 0.90dB. However, when $\omega$ is increased to 16, the SNR loss is higher than 2.4dB. In the case of $\omega$ = 8, since the FER in the 480Mb/s mode is lower than that in the 120Mb/s mode, the SNR loss of the design with $\omega$ = 8 in the 480Mb/s mode is far lower than that in the 120Mb/s mode. According to the PER of 120Mb/s and 480Mb/s, $\omega$ = 4 is suitable for the low-complexity design: the SNR loss increases from that of $\omega$ = 1 by only 0.16dB and 0.15dB in 120Mb/s and 480Mb/s, respectively. The algorithm with $\omega$ = 4 also simplifies the synchronizer architecture, which is discussed in chapter 5. The RMSE and range of the CFO estimation with $\omega$ = 4 are shown in Figure 4-19. With the proposed CFO estimation with $\omega$ = 4, a CFO in the range -45ppm $\sim$ 45ppm can be estimated, so the proposed CFO estimation satisfies the ±20ppm CFO specification [15, 16].
### 4-3-2 FER and PER Analysis of the Proposed Dynamic-Threshold Design for UWB System

To present the performance improvement of the PTD with the proposed dynamic-threshold design, the FER and PER of the UWB system are simulated in the Intel multipath channel with 5ns RMS delay spread [32]. The FER of the proposed dynamic-threshold design compared with fixed-threshold designs is shown in Figure 4-20.

Figure 4-19: RMSE and range of CFO estimation with $\omega$ = 4. Channel condition: RMS=5ns, CFO = -80~+80ppm, SCO=40ppm

Figure 4-20: FER with different threshold of PTD in the multipath channel. Channel condition: RMS = 5ns, CFO = 40ppm+phase noise, SCO = 40ppm

When SNR = 0dB, the design with threshold $\lambda_{PTD} = 0.1$ achieves the lowest FER of the fixed-threshold designs. When SNR = $2\sim6$dB, the design with fixed threshold = 0.04 achieves the lowest FER of the fixed-threshold designs. The designs with fixed threshold = 0.1 and 0.04 thus each achieve a low FER in a different SNR region, but neither achieves the lowest FER in all SNR regions. As shown in Figure 4-20, the proposed dynamic-threshold design achieves the lowest FER, because the threshold is automatically tuned to adapt to the channel environment. To understand the PER improvement of the proposed dynamic-threshold design, the PER curves are also simulated. Since the PER of a low data rate is more sensitive to the FER, which raises the PER in the lower SNR region, the PER of the proposed dynamic-threshold design is compared with that of fixed-threshold designs at the 120Mb/s data rate, which is the lowest data rate of the LDPC-COFDM system. The result is shown in Figure 4-21. Since the proposed design achieves the lowest FER, it requires a lower SNR for 8% PER. Compared with the designs with fixed threshold = 0.02 and 0.4, it achieves a large SNR improvement of 2.0dB~2.33dB. The performance of a fixed-threshold design is therefore very sensitive to the threshold value.
Compared with the fixed-threshold designs with threshold = 0.02 ~ 0.4, the proposed dynamic-threshold design reduces the SNR required for 8% PER by 0.13dB~2.33dB. It can therefore improve the system PER of the UWB system.

Figure 4-21: PER with different threshold of PTD in 120Mb/s data rate. Channel condition: RMS = 5ns, CFO = 40ppm+phase noise, SCO = 40ppm

### 4-3-3 CE MSE and PER Analysis of the Proposed Divider-free Channel Equalization

To understand the CE performance of the proposed divider-free channel equalizer, the mean-square-error (MSE) of the CE of the proposed design is shown in Figure 4-22. As shown in Figure 4-22, the MSE of the estimated channel phase of the general complex zero-forcing (ZF) CE and that of the proposed CE are very close. The difference in CE MSE between the two designs is less than 0.1dB. The PER performance of 480Mb/s in the CM2 channel is shown in Figure 4-23. Without the magnitude information, the soft-decision de-mapping accuracy is slightly degraded. Hence the SNR loss for 8% PER of the proposed equalizer is larger than that of the divider-based equalizer by 0.3dB. The SNR loss caused by the CE error is 1.7dB compared with the perfect CE design in the CM2 channel.

Figure 4-22: CE MSE of the proposed channel equalizer. Channel condition: CM4 Channel (RMS=25ns)

Figure 4-23: PER of proposed channel equalizer in 480Mb/s for MB-OFDM. Channel condition: CM2 multipath channel, CFO=40ppm+phase noise, SCO=40ppm

Following the PER performance discussion of the proposed low-complexity synchronization and channel equalizer, the additional SNR loss incurred to achieve low design complexity is listed in Table 4-1 as a summary.
Table 4-1: Summary of SNR loss variation of the proposed design

| Proposed Design | SNR Loss Variation |
|---------------------------------|-----------------------------------|
| Synchronization | Add 0.15~0.16dB SNR loss |
| Dynamic-Threshold Design of PTD | Reduce 0.13~2.33dB SNR in 120Mb/s |
| Divider-free equalizer | Add 0.3dB SNR loss in 480Mb/s |

## 4-4 System Performance of LDPC-COFDM-Based UWB System and MB-OFDM-Based UWB System

In this section the PER curves of the LDPC-COFDM-based UWB system and the MB-OFDM-based UWB system are shown. The synchronization loss, which is the difference in SNR for the typical 8% PER between perfect and non-perfect synchronization, is listed. The channel condition of the LDPC-COFDM-based UWB system comprises AWGN, the Intel multipath channel model [32], 40ppm CFO, and 40ppm SCO. The channel condition of the MB-OFDM-based UWB system comprises AWGN, the CM1 ~ CM4 multipath channel models [20], 40ppm CFO, and 40ppm SCO.

### 4-4-1 PER of LDPC-COFDM-Based UWB System

In this section the PER of the data rates supported by the LDPC system, 120Mb/s, 240Mb/s, and 480Mb/s, is shown in the AWGN channel and the Intel multipath channel [1, 32]. The PER curves of the LDPC-COFDM-based UWB system in the AWGN channel are shown in Figure 4-24. The SNR for the typical 8% PER requested by the UWB system [15, 16] in the AWGN channel is listed in Table 4-2. As listed in Table 4-2, the SNR loss for 8% PER of the proposed design is 1.5dB, 1.6dB, and 2.1dB for 120Mb/s ~ 480Mb/s compared with perfect synchronization (Sync). The PER curves of the LDPC-COFDM-based UWB system in the Intel multipath channel with 5ns RMS delay spread are shown in Figure 4-25. The SNR for 8% PER in the multipath channel is listed in Table 4-3. As listed in Table 4-3, the SNR loss for 8% PER is 1.9dB, 1.2dB, and 2.0dB for the 120Mb/s, 240Mb/s, and 480Mb/s data rates.
Figure 4-24: PER of LDPC-COFDM-Based UWB system in AWGN channel

Table 4-2: SNR for 8% PER of LDPC-COFDM System in AWGN channel

| Data Rate (Mb/s) | Perfect Synchronization (dB) | Proposed Design (dB) | System Constraint (dB) | SNR Loss (dB) |
|------------------|------------------------------|----------------------|------------------------|---------------|
| 120 | 2.6 | 4.2 | 7.5 | 1.5 |
| 240 | 4.1 | 5.7 | 16.0 | 1.6 |
| 480 | 6.1 | 8.2 | 21.1 | 2.1 |
| Average | 4.27 | 6.0 | 14.87 | 1.73 |
| Note | A | B (should < C) | C | B-A |

Figure 4-25: PER of LDPC-COFDM-Based UWB system in multipath channel

Table 4-3: SNR for 8% PER of LDPC-COFDM System in multipath channel

| Data Rate (Mb/s) | Perfect Synchronization (dB) | Proposed Design (dB) | System Constraint (dB) | SNR Loss (dB) |
|------------------|------------------------------|----------------------|------------------------|---------------|
| 120 | 3.4 | 5.5 | 7.5 | 2.1 |
| 240 | 5.4 | 7.3 | 16.0 | 1.9 |
| 480 | 8.3 | 10.1 | 21.1 | 1.8 |
| Average | 5.7 | 7.67 | 14.87 | 1.97 |
| Note | A | B (should < C) | C | B-A |

As listed in Table 4-2 and Table 4-3, the SNR values for 8% PER are lower than the system constraint. The SNR constraint is calculated from the transmission distances required by the system with a 6dB RF noise figure [16]. Since the SNR for 8% PER of the proposed design is lower than the system constraint, the PER of the proposed design is lower than 8% in the SNR region of the system constraint, which satisfies the system requirement. The SNR loss distribution of 480Mb/s and 120Mb/s in the AWGN channel is plotted in Figure 4-26. As shown in Figure 4-26, the proposed dynamic-threshold design (DTD) reduces the SNR loss by 8.7% and 2% at the 120Mb/s and 480Mb/s data rates. The proposed synchronization and divider-free channel equalizer add 23% ~ 24% of the SNR loss, equal to only 0.36dB ~ 0.48dB SNR. Similar to the WLAN system, the SNR loss is still dominated by the CE and PET loss in the OFDM-based UWB system.
Developing the low-complexity synchronization can therefore achieve a good trade-off between low design complexity and high performance.

Figure 4-26: SNR loss for 8% PER of OFDM-based UWB system in AWGN channel

Besides the PER curves versus SNR, the PER versus transmission distance of the LDPC-COFDM system is also simulated and shown in Figure 4-27. The transmission distances are calculated from the SNR values with the standard 6dB noise figure of the RF design [16]. The transmission distances for the typical 8% PER are listed in Table 4-4. The transmission distances for 8% PER of the proposed design are longer than the distances required by the system, which are also listed in Table 4-4 [16]. The proposed design therefore satisfies the transmission distance requirement of the UWB system.

Figure 4-27: PER vs. transmission distances of LDPC-COFDM system. Channel condition: AWGN or RMS = 5ns, CFO=40ppm+phase noise, SCO = 40ppm

Table 4-4: Transmission distance for 8% PER of LDPC-COFDM System

| Data Rate (Mb/s) | AWGN channel (meters) | Intel multipath channel with 5ns RMS delay (meters) | System Requirement [16] (meters) |
|------------------|-----------------------|-----------------------------------------------------|----------------------------------|
| 120 | 14.9 | 12.5 | 10 |
| 240 | 13.0 | 10.8 | 4 |
| 480 | 8.9 | 7.2 | 2 |
| Note | A (should > C) | B (should > C) | C |

### 4-4-2 PER of MB-OFDM-Based UWB System

In this section the PER of the main data rates supported by the MB-OFDM system, 110Mb/s, 200Mb/s, and 480Mb/s, is shown in the AWGN channel and the multipath channels (CM1~CM4) [15, 16, 20]. The PER curves of the MB-OFDM system in the AWGN channel are shown in Figure 4-28. The SNR for the typical 8% PER is listed in Table 4-5. As listed in Table 4-5, the SNR loss for the typical 8% PER of the proposed design is 1.9dB, 1.6dB, and 1.1dB compared with perfect synchronization (sync).
For the multipath channel simulation, the 200Mb/s and 110Mb/s data rates are suitable for 4~10 meter transmission and the 480Mb/s data rate is suitable for 2 meter transmission, so the most complex multipath channels for 110Mb/s, 200Mb/s, and 480Mb/s are the CM4, CM4, and CM2 channels, respectively [15, 16, 20]. The PER curves of the proposed design in the CM multipath channels are shown in Figure 4-29. The SNR values for 8% PER in CM channels are listed in Table 4-6.

Channel Condition: AWGN channel, CFO = 40ppm + phase noise, SCO = 40ppm

Figure 4-28: PER of MB-OFDM-Based UWB system in AWGN channel

Table 4-5: SNR for 8% PER of MB-OFDM-Based UWB system in AWGN channel

| Data Rate (Mb/s) | Perfect Synchronization (dB) | Proposed Design (dB) | System Constraint (dB) | SNR Loss (dB) |
|---|---|---|---|---|
| 110 | -0.7 | 1.2 | 7.1 | 1.9 |
| 200 | 2.2 | 3.8 | 15.1 | 1.6 |
| 480 | 6.1 | 7.2 | 21.1 | 1.1 |
| Average | 2.53 | 4.07 | 17.45 | 1.53 |
| Note | A | B (should be < C) | C | B-A |

Channel Condition: CM channels, CFO = 40ppm + phase noise, SCO = 40ppm

Figure 4-29: PER of MB-OFDM-Based UWB system in CM multipath channel

Table 4-6: SNR for 8% PER of MB-OFDM-Based UWB system in CM channels

| Data Rate (Mb/s) and channel | Perfect Synchronization (dB) | Proposed Design (dB) | System Constraint (dB) | SNR Loss (dB) |
|---|---|---|---|---|
| 110 CM4 | 3.0 | 5.8 | 7.1 | 2.8 |
| 200 CM4 | 11.4 | 14.2 | 15.1 | 2.8 |
| 480 CM2 | 16.8 | 18.5 | 21.1 | 1.7 |
| Average | 10.4 | 12.83 | 17.45 | 2.43 |
| Note | A | B (should be < C) | C | B-A |

As listed in Table 4-6, the SNR loss for 8% PER of the proposed design is 2.8dB, 2.8dB, and 1.7dB compared with perfect synchronization.
As listed in Table 4-5 and Table 4-6, the SNR for 8% PER of the proposed design is lower than the SNR of the system constraint, which satisfies the system performance requirement. The PER curves versus transmission distance in AWGN channel and CM channels are shown in Figure 4-30, and the transmission distances for 8% PER are listed in Table 4-7. As listed in Table 4-7, the proposed design satisfies the transmission distance requirement of the MB-OFDM-based UWB system [15, 16]. With the proposed low-complexity design the PER performance still satisfies the system requirement. The complexity and power consumption reduced by the proposed design are discussed in chapter 5.

Table 4-7: Transmission distance (meters) for 8% PER of MB-OFDM System

| Data Rate | 110Mb/s | 200Mb/s | 480Mb/s | Note |
|---|---|---|---|---|
| AWGN | 19.3 | 14.1.6 | 10.0 | A (should be > F) |
| CM1 | 13.4 | 7.0 | 2.85 | B (should be > F) |
| CM2 | 13.1 | 6.8 | 2.8 | C (should be > F) |
| CM3 | 12.7 | 5.9 | | D (should be > F) |
| CM4 | 11.8 | 4.55 | | E (should be > F) |
| System requirement | 10 | 4 | 2 | F |

Channel Condition: AWGN and CM channels, CFO = 40ppm + phase noise, SCO = 40ppm

Figure 4-30 (a): PER vs. transmission distance of 200Mb/s and 480Mb/s

Channel Condition: AWGN and CM channels, CFO = 40ppm + phase noise, SCO = 40ppm

Figure 4-30 (b): PER vs. transmission distance of 110Mb/s

### Chapter 5:

# Hardware Architecture and Baseband Chip Design

In this chapter the fixed-point simulations, hardware architectures, chip designs, and complexity analysis of the proposed synchronizer, channel equalizer, and complete OFDM baseband transceivers are introduced. To decide the wordlength of the fixed-point design, the system PER is simulated with different wordlengths. The trade-off between high performance and a short ADC wordlength, which results in low ADC power, is discussed.
Then the hardware architectures of the proposed low-complexity and high-performance designs for OFDM-based WLAN and UWB systems are introduced. In the introduction of the hardware architecture, the signal flow and the parts that differ from the existing approaches are discussed. In the complexity analysis, the computations saved by the proposed low-complexity designs are analyzed, and the power-reduction efficiency of the proposed schemes for the high-throughput UWB system is discussed individually. They reduce 57% of the synchronizer power and 51.6% of the equalizer power in the UWB system. Finally, three silicon-proven baseband transceivers for the OFDM-based WLAN system, the LDPC-COFDM-based UWB system, and the MB-OFDM system are introduced, and the percentages of gate count and power belonging to the OFDM transceiver are shown. For the UWB system a multi-stage gated-clock control for chip power saving is also introduced. The percentages of gate count and power of the baseband chips are discussed to show the efficiency of the power saving.

### 5-1 Hardware Design for OFDM-Based WLAN System

In this section the fixed-point simulation, hardware architectures, complexity analysis, and baseband chip design of our implemented OFDM-based WLAN baseband processor [3] are introduced.

#### 5-1-1 Fixed-point Performance of WLAN system

Before the discussion of the fixed-point simulation, note that in the published WLAN design [3] the CE and EQ modules combine ZF CE, DDCT, and LS EQ. So all the fixed-point performance of the proposed WLAN chip is simulated without MMSE EQ. To convert the floating-point baseband design to the fixed-point design, the DAC and ADC wordlengths need to be decided first. In the existing baseband chip designs, 9-bit ADC/DAC [18, 23, 37] and 10-bit ADC/DAC [25, 44] are generally used.
To decide the wordlength, two issues should be considered: the quantization error and the power consumption of the DAC and ADC. To understand the system performance degradation caused by the fixed-point quantization error, the PER curves of the 54Mb/s mode are simulated with different wordlengths. In the fixed-point simulation the key modules, comprising AGC, synchronizer, FFT/IFFT, channel equalizer, shaping filters, and FEC designs, are also converted to fixed-point designs with suitable wordlengths. The PER curves simulated in AWGN channel and in the IEEE multipath channel [19] with 50ns RMS delay spread are shown in Figure 5-1 and Figure 5-2, respectively, and the SNR values for 10% PER are listed in Table 5-1. As shown in Figure 5-1 and Table 5-1, the acceptable wordlength for the WLAN system is ≥ 9 bits, since the SNR for 10% PER then satisfies the system constraint (26.7dB) in both AWGN and multipath channels. Another concern in deciding the wordlength is the DAC and ADC power consumption. Since the PER of the 16-bit and 10-bit fixed-point designs is very close, for low power we only consider 9 bits and 10 bits as the fixed-point wordlength. The normalized DAC and ADC power at the same operation frequency is listed in Table 5-2.
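To make the role of the wordlength concrete, the following minimal sketch quantizes samples on a signed fixed-point grid and measures the signal-to-quantization-noise ratio (SQNR). It is only an illustration: the function names, the ±1 full-scale range, and the uniformly distributed test signal are assumptions and are not taken from the simulation setup of this work.

```python
import math
import random

def quantize(x, bits, full_scale=1.0):
    """Round x to a signed `bits`-bit grid (one sign bit) and saturate at the ends."""
    levels = 2 ** (bits - 1)          # number of steps on the positive side
    step = full_scale / levels        # quantization step size
    code = round(x / step)
    code = max(-levels, min(levels - 1, code))   # saturation instead of wrap-around
    return code * step

def sqnr_db(bits, n=20000, seed=0):
    """SQNR in dB of a uniformly distributed test signal (assumed, hypothetical input)."""
    rng = random.Random(seed)
    signal = [rng.uniform(-0.9, 0.9) for _ in range(n)]
    p_signal = sum(s * s for s in signal) / n
    p_error = sum((s - quantize(s, bits)) ** 2 for s in signal) / n
    return 10 * math.log10(p_signal / p_error)

if __name__ == "__main__":
    for b in (7, 8, 9, 10):
        print(b, "bits:", round(sqnr_db(b), 2), "dB")
```

Each extra bit halves the step size, so the quantization noise power drops by a factor of four, roughly 6dB per bit. The PER penalty at a given wordlength still depends on the whole receiver chain, which is why it is obtained from simulation rather than from the SQNR alone.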
Figure 5-1: 54Mb/s PER with different ADC/DAC wordlength in AWGN channel

Figure 5-2: 54Mb/s PER with different ADC/DAC wordlength in multipath channel

Table 5-1: SNR for 10% PER of 54Mb/s WLAN with different wordlengths

| Design | AWGN channel (dB) | Fixed-point SNR loss in AWGN channel (dB) | Multipath channel with 50ns RMS delay (dB) | Fixed-point SNR loss in multipath channel (dB) | System constraint (dB) |
|---|---|---|---|---|---|
| Floating-point | 20.8 (A1) | | 25.5 (B1) | | |
| 16-bit | 21.25 | 0.45 | 25.6 | 0.1 | |
| 10-bit | 21.35 | 0.55 | 25.8 | 0.3 | 26.7 |
| 9-bit | 21.4 | 0.6 | 26.5 | 1.0 | |
| 8-bit | 22.2 | 1.4 | 27.6 | 2.1 | |
| 7-bit | >23 | >2.2 | >30 | >4.5 | |
| Note | A (should be < C) | A-A1 | B (should be < C) | B-B1 | C |

Table 5-2: Power consumption of DAC and ADC for WLAN system

| State-of-the-art | Wordlength DAC/ADC | I/Q DAC power in 40MS/s (mW) | I/Q ADC power in 40MHz (mW) |
|---|---|---|---|
| Ref. [18] | 9/9 | 17 | 105.5 |
| Ref. [23] | 10/9 | 58 | 155 |
| Ref. [25] | 10/10 | Not Listed | 248 |

As listed in Table 5-2, the power is normalized to a 40MS/s rate, which covers the basic 2x transmitter filtering and 2x receiver interpolation of the 20MHz OFDM-based WLAN system. When the 9-bit wordlength is used instead of the 10-bit wordlength, the DAC and ADC power is reduced by 41mW and 93~142.5mW, respectively [18, 23, 25]. In other words, the saving is equal to 241% of the 9-bit DAC power and 60%~135% of the 9-bit ADC power. The saved ADC power is equivalent to 70% of the Coded-OFDM baseband receiver power [18]. As the trade-off between system performance and low power, we use 9-bit DAC and ADC as the interface to the RF. In addition, a 7-bit DAC is added for the receiver AGC. The DAC and ADC power of other state-of-the-art designs can be found in the references.
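The savings quoted from Table 5-2 can be rechecked with a few lines of arithmetic. The sketch below only restates the table entries; the variable and function names are illustrative assumptions.

```python
# Normalized I/Q converter power from Table 5-2, in mW.
DAC_9BIT = 17.0                               # Ref. [18]
DAC_10BIT = 58.0                              # Ref. [23]
ADC_9BIT = {"[18]": 105.5, "[23]": 155.0}     # both references use a 9-bit ADC
ADC_10BIT = 248.0                             # Ref. [25]

def saving(power_10bit, power_9bit):
    """Return (saved mW, saving as a percentage of the 9-bit power)."""
    saved = power_10bit - power_9bit
    return saved, 100.0 * saved / power_9bit

dac_saved, dac_pct = saving(DAC_10BIT, DAC_9BIT)
adc_savings = {ref: saving(ADC_10BIT, p) for ref, p in ADC_9BIT.items()}

if __name__ == "__main__":
    print("DAC:", dac_saved, "mW,", round(dac_pct), "% of the 9-bit DAC power")
    for ref, (saved, pct) in adc_savings.items():
        print("ADC vs", ref, ":", saved, "mW,", round(pct), "% of the 9-bit ADC power")
```

The percentages exceed 100% for the DAC because the saving is expressed relative to the smaller 9-bit power, not relative to the 10-bit power.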
After the wordlength decision, the fixed-point system block diagram with the main wordlength settings is shown in Figure 5-3. With the clipping design of the transmitter, the peak-to-average-power ratio (PAPR) can be suppressed to improve the linearity of the RF power amplifier. The PAPR of the fixed-point design is listed in Table 5-3.

Figure 5-3: Block diagram of the proposed fixed-point WLAN baseband system

Table 5-3: PAPR of the proposed fixed-point design for WLAN system

| Data Rate (Mb/s) | PAPR (dB) |
|---|---|
| 6 | 10.4 |
| 9 | 9.5 |
| 12 | 9.6 |
| 18 | 10.3 |
| 24 | 9.6 |
| 36 | 10.5 |
| 48 | 9.8 |
| 54 | 9.0 |
| Average | 9.84 |

Based on the wordlength setting, the PER curves of the proposed fixed-point design are simulated. The PER curves of the fixed-point design in (i) AWGN channel and (ii) IEEE multipath channel with 50ns RMS delay spread (frequency-selective fading > -15dB) are shown in Figure 5-4 and Figure 5-5, respectively. The corresponding SNR values for 10% PER are listed in Table 5-4 and Table 5-5. As listed in Table 5-4 and Table 5-5, the average SNR loss caused by the quantization error is only 0.16dB and 0.36dB in the two simulated channels. To show the performance difference from the state-of-the-art, the SNR values for 10% PER of the proposed design are compared with references [18, 25]. The SNR comparison is listed in Table 5-6.
Channel Condition: AWGN, CFO = 40ppm + phase noise, SCO = 40ppm, Doppler = 50Hz

Figure 5-4: PER of fixed-point WLAN design in AWGN channel

Table 5-4: SNR for 10% PER of fixed-point WLAN design in AWGN channel

| Data Rate (Mb/s) | Floating-point Design (dB) | Fixed-point Design (dB) | System Constraint (dB) | Fixed-point Loss (dB) |
|---|---|---|---|---|
| 6 | 2.3 | 2.6 | 9.7 | 0.3 |
| 9 | 4.0 | 4.05 | 10.7 | 0.05 |
| 12 | 5.65 | 5.7 | 12.7 | 0.05 |
| 18 | 8.35 | 8.4 | 14.7 | 0.05 |
| 24 | 11.2 | 11.3 | 17.7 | 0.1 |
| 36 | 15.0 | 15.1 | 21.7 | 0.1 |
| 48 | 19.4 | 19.5 | 25.7 | 0.1 |
| 54 | 20.8 | 21.4 | 26.7 | 0.6 |
| Avg. | 10.84 | 11.00 | 17.45 | 0.16 |
| Note | A | B (should be < C) | C | B-A |

Channel Condition: RMS = 50ns, CFO = 40ppm + phase noise, SCO = 40ppm, Doppler = 50Hz

Figure 5-5: PER of fixed-point WLAN design in multipath channel with RMS=50ns

Table 5-5: SNR for 10% PER of fixed-point WLAN design with RMS=50ns

| Data Rate (Mb/s) | Floating-point Design (dB) | Fixed-point Design (dB) | System Constraint (dB) | Fixed-point Loss (dB) |
|---|---|---|---|---|
| 6 | 7.2 | 7.5 | 9.7 | 0.3 |
| 9 | 10.05 | 10.1 | 10.7 | 0.05 |
| 12 | 9.6 | 9.65 | 12.7 | 0.05 |
| 18 | 13.4 | 13.8 | 14.7 | 0.4 |
| 24 | 15.1 | 15.45 | 17.7 | 0.35 |
| 36 | 19.8 | 20.15 | 21.7 | 0.35 |
| 48 | 23.2 | 23.6 | 25.7 | 0.4 |
| 54 | 25.5 | 26.5 | 26.7 | 1.0 |
| Avg. | 15.48 | 15.84 | 17.45 | 0.36 |
| Note | A | B (should be < C) | C | B-A |

Table 5-6: SNR for 10% PER of fixed-point WLAN processors in AWGN channel

| Data Rate (Mb/s) | Proposed Design SNR (dB) | Design SNR [18] (dB) | Design SNR [25] (dB) | System Requirement (dB) |
|---|---|---|---|---|
| 6 | 2.6 | 5.4 | 4.9 | 9.7 |
| 9 | 4.05 | 5.8 | 5.8 | 10.7 |
| 12 | 5.7 | 7.0 | 8.6 | 12.7 |
| 18 | 8.4 | 9.5 | 9.9 | 14.7 |
| 24 | 11.3 | 11.3 | 12.4 | 17.7 |
| 36 | 15.1 | 14.9 | 15.9 | 21.7 |
| 48 | 19.5 | 18.6 | 20.3 | 25.7 |
| 54 | 21.4 | 20.6 | 21.7 | 26.7 |
| Avg. | 11.00 | 11.68 | 12.44 | 17.45 |

The proposed design differs from references [18, 25] in its use of the low-complexity synchronizer, the high-performance decision-directed channel tracking (DDCT), and the high-performance weighted-average phase error tracking (WAPET). As listed in Table 5-6, the proposed design requires lower SNR than references [18, 25], especially at low data rates. At a low data rate the packet is longer and contains more OFDM symbols. Therefore the channel variation and the phase error caused by CFO and SCO grow larger within a packet, and the phase error can exceed $\pm \pi$, which is the range of normal phase detection. In this low-data-rate condition, the proposed DDCT and WAPET efficiently suppress the estimation error of the channel frequency response (CFR) and the phase error, even when it exceeds $\pm \pi$. Therefore the proposed design requires lower SNR for 10% PER. In terms of the average SNR for 10% PER, the proposed design has 0.68dB, 1.44dB, and 6.45dB gain compared with reference [18], reference [25], and the system constraint, respectively.
#### 5-1-2 Hardware Architecture of the Proposed Designs for WLAN System

In this section the hardware architectures of the proposed high-power-signal-used (HPSU) auto-correlator for packet detection (PD), the high-power-coefficient-used (HPCU) matched filter for FFT-window detection (FWD), the DDCT for channel frequency response (CFR) tracking, and the WAPET for phase error tracking are introduced. The signal flow in these architectures is also discussed.

The architectures of the general auto-correlator and the proposed low-complexity auto-correlator are drawn in Figure 5-6. As shown in Figure 5-6 (a), the general auto-correlator stores 16 and 64 samples and then multiplies them with the conjugate input samples in the short and long symbols [24, 28]. In this architecture, 16 FIFO reads/writes and complex multiplications are needed in each short symbol, and a total of 80 FIFO writes and complex multiplications are needed in the long symbols. As described in chapter 3, the HPSU auto-correlation (3-3) with $\omega_{AC} = 2$ is proposed. The proposed auto-correlator architecture is shown in Figure 5-6 (b). To use only the $1/\omega_{AC} = 50\%$ of the preamble signal that has high power, the high-power signal selector sieves the high-power samples out of the whole preamble. The high-power samples are then passed through the "gate" and sent to the correlation. The overhead circuits of the proposed auto-correlator, the "high-power signal selector" and the "gate", are shown in Figure 5-7. This simple overhead only needs 110 gates.

Figure 5-6: Architecture of the auto-correlators for OFDM-based WLAN system

Figure 5-7: Design of high-power signal selector and gate

With the signal control of the high-power signal selector and the gate, the number of FIFO writes and complex multiplications is reduced. The signal behavior of the general and the proposed auto-correlator is drawn in Figure 5-8.
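The saving from gating the correlator input can be sketched in software. The code below is a simplified illustration and not the HPSU auto-correlation (3-3) itself: it assumes a plain delay-and-correlate structure with a 16-sample delay, and it replaces the high-power signal selector with a fixed stride that keeps every `omega`-th sample (the "even signal" case). The preamble generator and all names are hypothetical.

```python
import cmath
import math

def make_preamble(cfo_per_sample, periods=10, period=16):
    """Hypothetical periodic preamble rotated by a carrier frequency offset."""
    base = [(1.0 + 0.5 * (k % 3)) * cmath.exp(1j * 0.7 * k * k) for k in range(period)]
    return [base[n % period] * cmath.exp(2j * math.pi * cfo_per_sample * n)
            for n in range(periods * period)]

def autocorrelate(r, start, delay=16, length=16, omega=1):
    """Delay-and-correlate over `length` samples, using only every omega-th product.

    Returns the accumulated correlation and the number of complex multiplications.
    """
    acc = 0j
    mults = 0
    for k in range(0, length, omega):      # omega = 2 keeps the even samples only
        acc += r[start + k + delay] * r[start + k].conjugate()
        mults += 1
    return acc, mults

if __name__ == "__main__":
    r = make_preamble(cfo_per_sample=0.002)
    for omega in (1, 2):
        acc, mults = autocorrelate(r, 0, omega=omega)
        print("omega =", omega, "mults =", mults, "phase =", round(cmath.phase(acc), 4))
```

In this noise-free example the correlation phase, which carries the coarse CFO estimate, is the same with half of the products, while the number of multiplications and FIFO accesses is halved. With noise, using fewer samples reduces the averaging gain, which is why the selector keeps the high-power samples.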
(a) General auto-correlator

(b) Proposed auto-correlator (assuming the even samples are used)

Figure 5-8: Signal behavior of the auto-correlators

In Figure 5-8 (a), the received preamble arrives at a 20MHz clock rate (1x of the signal bandwidth). The FIFO read, FIFO write, complex multiplier, and the accumulator that produces the auto-correlator outputs all run at this preamble sample rate. In the proposed auto-correlator, since the high-power signal selector and the gate sieve the preamble, the circuit speed of the auto-correlator with $\omega_{AC} = 2$ is reduced to half. Therefore the required computation and power consumption are also reduced.

The architectures of the general matched filter and the proposed HPCU matched filter are drawn in Figure 5-9. In Figure 5-9 (a), 64 complex multipliers and a 64-sample FIFO are used to realize the general matched filter (3-8). This high number of parallel complex multiplications consumes high power. In Figure 5-9 (b), the proposed matched filter only uses 16 complex multipliers for the 16-tap matched filtering. Hence the multiplier power can be reduced by 75%. The indices of the high-power coefficients of a long symbol used by the proposed matched filter are listed in Table 5-7. Since the first index is 0 and the last index is 53, a 54-sample FIFO is used to continuously generate the outputs of the proposed matched filter. The proposed matched filter saves the hardware cost and power consumption of the multipliers and the FIFO.

(b) Proposed matched filter, with a 54-sample FIFO (the maximum distance of the high-power coefficients is 54)

Figure 5-9: Architecture of the matched filters for OFDM-based WLAN system

Table 5-7: Coefficient index of the matched filter for OFDM-based WLAN system

| Design | Coefficient index |
|---|---|
| General matched filter | 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, ..., 60, 61, 62, 63 (total 64 coefficients) |
| Proposed matched filter | 0, 11, 14, 19, 24, 26, 29, 30, 32, 34, 35, 38, 40, 45, 50, 53 (total 16 coefficients) |

To handle the CFR variation caused by the Doppler effect and the phase error caused by CFO and SCO, the channel equalizer combining FD-MMSE EQ as $(3-24) \sim (3-25)$, DDCT as $(3-28) \sim (3-29)$, and WAPET as (3-33) is proposed in chapter 3. The proposed EQ is shown in Figure 5-10.

($\psi/2$: noise power estimated in the time-domain preamble)

Figure 5-10: Architecture of the proposed channel equalizer

In Figure 5-10 the proposed CE and EQ comprise three parts: ZF CE, the FD-MMSE EQ, and DDCT. In ZF CE the CFR is estimated by dividing the received preamble by the transmitted preamble $X_L(K)$. Since $X_L(K)$ is a constant-magnitude value, the division can be replaced by a low-complexity sign conversion. In the FD-MMSE EQ the compensating value $H_E(K)^*/[|H_E(K)|^2+\sigma_E^2]$ is calculated from the estimated CFR and the noise power. The FD-MMSE EQ consists of a complex multiplier, a pair of real dividers, and a square calculator, and therefore its cost is the same as the complex divider used by the LS EQ. For a complete FD-MMSE EQ, one more square circuit is needed for the time-domain noise power estimation. In the DDCT part, two complex multipliers are used for $X_E(K)\times X_D(K)^{-1}$ and $H_C(K)=H_E(K)\times R_N(K)$. Furthermore, to save power, the channel update $H_C(K)=H_E(K)\times R_N(K)$ need not run in every OFDM symbol; the update interval should be chosen together with a performance-loss analysis.
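The difference between the FD-MMSE compensating value and direct division can be sketched per subcarrier as follows. This is an illustration under assumptions: the function names are hypothetical, the noise power is passed in as a known number, and the fixed-point effects of the actual datapath are ignored.

```python
def mmse_coeff(h_est, noise_power):
    """FD-MMSE compensating value H* / (|H|^2 + sigma^2) for one subcarrier."""
    power = h_est.real ** 2 + h_est.imag ** 2   # square calculator
    denom = power + noise_power                 # real adder
    conj = h_est.conjugate()
    # A pair of real divisions, one for each part of the conjugate.
    return complex(conj.real / denom, conj.imag / denom)

def ls_coeff(h_est):
    """Direct-division (LS) compensating value 1 / H for comparison."""
    return 1 / h_est

def equalize(y, h_est, noise_power):
    """Compensate one received subcarrier value with the FD-MMSE coefficient."""
    return y * mmse_coeff(h_est, noise_power)   # one complex multiplication

if __name__ == "__main__":
    for h in (1.0 + 0.2j, 0.05 + 0.0j):         # a strong and a deeply faded subcarrier
        print(h, abs(ls_coeff(h)), abs(mmse_coeff(h, 0.01)))
```

On a strong subcarrier the two coefficients are nearly equal. On a deeply faded subcarrier the direct division applies a very large gain to the noise, while the $\sigma_E^2$ term in the denominator keeps the MMSE gain bounded.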
The proposed architecture of the WAPET is shown in Figure 5-11.

Figure 5-11: Architecture of the proposed WAPET

As shown in Figure 5-11, the proposed WAPET comprises the pilot pre-compensation, the weighted-average (WA) part, and the general PET with phase error detection (PED) and data compensation. To reduce the number of complex multipliers, the pilot pre-compensation and the data compensation share one complex multiplier. The architecture is realized with one complex multiplier and 2 constant multipliers that are equivalent to simple bit shifting. In the WAPET, first the pilot signal is pre-compensated with the old phasor $\exp\{-j\theta_{K,N-1}\}$, and then the residual phase error is detected with the PED and WA part. Then, through the phase combination and the look-up table (LUT)-based phasor generator, the data are compensated with the updated phasor $\exp\{-j\theta_{K,N}\}$ and sent to the DDCT.

In the proposed channel equalizer architecture shown in Figure 5-10, two memory blocks are used. Memory (I) stores the CFR as in general equalizers, and memory (II) stores $A(K,N)$ for the proposed DDCT. The total memory size is 512 Bytes, equivalent to 35% of the total memory size of an OFDM transceiver [3]. The main combinational circuits of the proposed channel equalizer with DDCT and WAPET comprise one complex divider and 4 complex multipliers. The complex divider compensates the received data with the CFR. Three complex multipliers are used for the data and pilot compensation of the equalizer and WAPET. One complex multiplier is used for the error extraction $[X_E(K) \times X_D(K)^{-1}]$ and the channel correction $[H_C(K)=H_E(K)\times A(K,N)]$ of the DDCT. Other multiplications, for $N\cdot A(K,N)\times 1/N=A(K,N)$ and for the weighted averaging, can be realized by real multipliers. The known $1/N$ can be approximated as $\Sigma 2^{-n}$ and generated by a low-complexity look-up table (LUT).
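The approximation of $1/N$ by a short sum of powers of two can be illustrated with the sketch below. The greedy term selection, the limit of four terms, and the function names are assumptions made for this example; the actual LUT contents of the design are not given here.

```python
def shift_terms(n, max_terms=4, max_shift=12):
    """Greedy choice of shifts s so that sum(2**-s) approximates 1/n from below."""
    target = 1.0 / n
    shifts = []
    approx = 0.0
    for s in range(max_shift + 1):
        if len(shifts) == max_terms:
            break
        if approx + 2.0 ** -s <= target:   # keep the term only if it does not overshoot
            approx += 2.0 ** -s
            shifts.append(s)
    return shifts

def scale_by_inverse(value, shifts):
    """Approximate value / n on integers with right shifts and additions only."""
    return sum(value >> s for s in shifts)

if __name__ == "__main__":
    lut = {n: shift_terms(n) for n in range(1, 9)}   # contents of a small hypothetical LUT
    for n, shifts in lut.items():
        print(n, shifts, scale_by_inverse(3000, shifts), 3000 / n)
```

For example, $1/3$ is approximated by $2^{-2}+2^{-4}+2^{-6}+2^{-8}$, so the scaling needs four shifted copies and three additions instead of a divider. The residual error can be reduced by allowing more terms, at the cost of more adders.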
For low power, the DDCT can be stopped in a slow-fading channel with low Doppler frequency. When the DDCT stops, the power of memory (II), memory (III), the complex divider, and the 2 complex multipliers used for DDCT is also saved. In terms of hardware cost, the gate count of the proposed channel equalizer with DDCT and WAPET is only 161K, which is only 60% of a matrix-based channel tracking design in 0.18µm CMOS process [24, 43].

#### 5-1-3 Complexity Analysis of the Proposed Designs for WLAN System

The design complexity of the whole OFDM baseband receiver, comprising AGC, synchronizer, FFT, FFT I/O ordering, and channel equalizer, is listed in Table 5-8. In the OFDM receiver, first the AGC uses complex multipliers to estimate the signal power $|Y(n)|^2 = Y(n)Y(n)^*$ of 80 samples. Then the synchronizer detects the symbol timing and the CFO. In the synchronizer, seven auto-correlations are used for PD, coarse CFO estimation, and the detection of the short symbol/long symbol boundary. Then the matched filter is used to find the FFT window among 64 possible timings. Hence the general synchronizer with a 100%-signal-used auto-correlator and a 64-tap matched filter uses $16 \times 7 + 64 \times 1 + 64 \times 64 = 4272$ complex multiplications and memory reads. The proposed synchronizer with a 50%-signal-used auto-correlation and a 16-tap matched filter uses $(16\times7+64\times1)/2+16\times64 = 1112$ complex multiplications and memory reads for auto-correlation and matched filtering. After synchronization, 64×M complex multiplications are needed for CFO compensation. To compensate the FFT input signal with the estimated CFO, each FFT input sample needs to be buffered. Hence the FFT-input buffer needs 64 memory reads/writes for each OFDM symbol, or 64×M per packet.
Then, in a radix-2<sup>2</sup> 64-point pipelined FFT design, $64\times3/4\times(\log_4 64-1) = 96$ complex multiplications and 64-1 = 63 memory reads/writes are needed in each OFDM symbol [24].

Table 5-8: Design complexity of the baseband receiver in each packet for OFDM-based WLAN system

| Design | Complex multiplication | Complex division | Memory Write (Samples) | Memory Read (Samples) |
|---|---|---|---|---|
| AGC (A) | 80 | 0 | 0 | 0 |
| General Synchronizer (B1) | 4272 | 0 | 240 | 4272 |
| Proposed Synchronizer (B2) | 1112 | 0 | 174 | 1112 |
| CFO compensation (C) | 64×M | 0 | 0 | 0 |
| FFT-input buffer (D) | 0 | 0 | 64×M | 64×M |
| FFT (E) | 96×M | 0 | 63×M | 63×M |
| FFT-output ordering (F) | 0 | 0 | 52×M | 52×M |
| Proposed EQ (G) | 208×M | 52×M | 104×M | 104×M |
| General OFDM RX (A+B1+C+D+E+F+G) | 4352 + 368×M | 52×M | 240 + 283×M | 4272 + 283×M |
| Proposed OFDM RX (A+B2+C+D+E+F+G) | 1192 + 368×M | 52×M | 174 + 283×M | 1112 + 283×M |
| Reduced | 3160 | 0 | 66 | 3160 |

Note: M is the number of OFDM symbols in one packet.

Since the FFT is generally designed as a Decimation in Frequency (DIF) type, its output needs to be reordered to the natural order. The pilots also need to be sent before the data for correct PET. Hence the FFT output ordering needs 2×52 (number of used subcarriers) = 104 memory reads/writes to reorder and buffer the FFT output signals in each OFDM symbol duration. In the channel EQ, as introduced in section 5-1-2, the proposed EQ with PET has 4 complex multipliers and 2 memory blocks. In the PET the complex multiplier works only 4 times for the pilot pre-compensation and 48 times for the data compensation. In the other parts of the EQ, each complex multiplier and memory block works for the 52 used subcarriers.
Hence the proposed EQ uses 4+48+52×3=208 complex multiplications and 52×2 memory reads/writes for each OFDM symbol. As shown in Table 5-8, a total of 3160 complex multiplications and memory reads is saved by the proposed synchronizer. In the 54Mb/s mode, the number of OFDM symbols per packet is 41 for transmitting a typical 1000 data bytes. In this case the proposed synchronizer removes 3160/4272 = 74% of the complex multiplications of a general synchronizer, and 3160/(4272+368×41) = 16.3% of the complex multiplications of the whole OFDM baseband receiver.

After the analysis of our whole OFDM baseband processor, the complexity of the existing CE schemes is discussed. The hardware costs of the existing LMMSE CE, SVD CE, ML CE, ZF CE, and the referenced DDCT are compared with our proposed DDCT with the improved MMSE EQ in Table 5-9. The CE approaches [49]-[51] pay a large number of complex multipliers for accurate CE. However, the simulation results in section 3-3-3 show that if the LS EQ is applied, the performance improvement of these CE schemes is limited. Compared with the referenced DDCT, our proposed design needs one more square circuit for the noise-power estimation. The square circuit consists of two real multipliers and one real adder, so its cost can be counted as 1/2 of a complex multiplier. Based on the comparison of performance and hardware cost, the proposed design achieves an efficient performance improvement with low hardware cost.
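The multiplication counts of Table 5-8 and the reduction ratios quoted for the 54Mb/s mode can be reproduced with the short sketch below. The function names are illustrative assumptions; the constants are the ones given in the complexity analysis.

```python
def general_sync_mults():
    """Seven 16-sample and one 64-sample auto-correlation plus a 64-tap MF over 64 timings."""
    return 16 * 7 + 64 * 1 + 64 * 64

def proposed_sync_mults(omega_ac=2, mf_taps=16):
    """50%-signal-used auto-correlation plus a 16-tap MF over 64 timings."""
    return (16 * 7 + 64 * 1) // omega_ac + mf_taps * 64

def fft_mults(points=64):
    """Complex multiplications of the radix-2^2 pipelined FFT: N * 3/4 * (log4(N) - 1)."""
    log4 = (points.bit_length() - 1) // 2      # exact for powers of four
    return points * 3 // 4 * (log4 - 1)

def eq_mults(pilots=4, data=48, used=52):
    """PET compensation of pilots and data plus three multipliers over all used subcarriers."""
    return pilots + data + used * 3

PER_SYMBOL = 64 + fft_mults() + eq_mults()     # CFO compensation + FFT + EQ
REDUCED = general_sync_mults() - proposed_sync_mults()

def reduction_of_receiver(symbols):
    """Share of the receiver multiplications removed, as computed in the text."""
    return REDUCED / (general_sync_mults() + PER_SYMBOL * symbols)

if __name__ == "__main__":
    print(general_sync_mults(), proposed_sync_mults(), REDUCED, PER_SYMBOL)
    print(round(100 * REDUCED / general_sync_mults()), "% of the synchronizer")
    print(round(100 * reduction_of_receiver(41), 1), "% of the receiver for M = 41")
```

Because the synchronizer cost is paid once per packet while the FFT and equalizer costs scale with M, the relative saving for the whole receiver is larger for short packets and smaller for long ones.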
Table 5-9: Hardware cost of CE and EQ designs

| Design | Complex Multiplier | Complex Divider* |
|---|---|---|
| LS EQ with ZF CE | 1 | 1 |
| LS EQ with LMMSE CE [49] | 64 | 1 |
| LS EQ with SVD CE [50] | 32 | 1 |
| LS EQ with ML CE [51] | 32 | 1 |
| LS EQ with DDCT [52] | 3 | 1 |
| Proposed DDCT with FD-MMSE EQ | 3+1/2 | 1 |

<sup>\*</sup> A complex divider includes a complex multiplier, a pair of real dividers, and a square circuit for power calculation.

### 5-1-4 Proposed Baseband Chip for OFDM-Based WLAN System

With the general synchronizer, the proposed DDCT, and the proposed WAPET, a baseband processor for the IEEE 802.11a system is designed. The system architecture of the proposed design is shown in Figure 5-12. As shown in Figure 5-12, to save area the FFT/IFFT is shared by the transmitter (lower part) and the receiver (upper part). In this half-duplex baseband processor, not only the FFT block but also some of the FEC blocks, namely the scrambler/descrambler and the interleaver/de-interleaver, are shared by the transmitter and the receiver. In this way 18% of the baseband memory (3.3Kbytes) is saved. The proposed baseband processor is designed in a standard 0.18μm CMOS process. The chip microphoto and chip summary are shown in Figure 5-13 and Table 5-10.

Figure 5-12: System architecture of the proposed IEEE 802.11a baseband processor

Figure 5-13: Chip microphoto of OFDM-based WLAN baseband processor

Integrating the OFDM transceiver and the FEC codec, the proposed baseband processor achieves a 54Mb/s data rate with 123mW receiver power. The percentages of gate count and receiver power of the baseband processor core are listed in Table 5-11. The OFDM transceiver occupies 317K gate count (86%), 1.5K memory (46%), and 68mW receiver power consumption (55%). To show the distribution among the sub-blocks, the percentages of hardware complexity in the OFDM transceiver are shown in Figure 5-14.
Note that the gate count of EQ+DDCT+PET is 161.7K, equal to 60% of the gate count of a WLAN equalizer design [43].

Table 5-10: Chip summary of OFDM-based WLAN baseband processor

| Technology | 0.18μm CMOS, 1P 6M |
|---|---|
| Transistor Count | 2.1M (Include I/O) |
| Package | 144 CQFP |
| Core Size | 3.8×3.8 mm<sup>2</sup> |
| System Clock | 80MHz |
| Supply Voltage | 1.8V Core, 3.3V I/O |
| Core power at 54Mbits/s (Tx/Rx) | 52.4mW/123.5mW |
| I/O Power | 61mW |

Table 5-11: Hardware complexity of OFDM-based WLAN baseband processor

| Block | Gate Count | Memory Size (Bytes) | Power (mW) |
|---|---|---|---|
| OFDM transceiver | 317K | 1.5K | 68 |
| FEC codec | 53K | 1.8K | 37 |
| Clock Trees | <10K | 0 | 18.5 |
| Total | 380K | 3.3K | 123.5 |

(a) Percentage of OFDM transceiver gate-count

(b) Percentage of OFDM transceiver memory

(c) Percentage of OFDM RX power

Figure 5-14: Percentages of hardware complexity of OFDM transceiver for WLAN system

As shown in Figure 5-14 (c), the synchronizer and the channel equalizer together occupy 80% of the receiver power consumption. Hence a low-power design of these two blocks efficiently reduces the power of the whole OFDM transceiver. To show the difference from the state-of-the-art in hardware complexity, the baseband power values of the proposed design and references [18, 23, 24, 39] are listed in Table 5-12. Compared with the existing designs, the proposed baseband design published in [3] consumes only 22% ~ 62% of their power.
Table 5-12: Comparison of baseband power consumption for OFDM-based WLAN system

| | Proposed (VLSI '04) [3] | ISSCC '02 | ISSCC '01 | JSSC '01 [24] | ISCAS '05 [39] |
|---|---|---|---|---|---|
| Process | 0.18µm CMOS | 0.25μm CMOS | 0.25μm CMOS | 0.18µm CMOS | 0.25µm CMOS |
| FEC+OFDM RX power | 123.5mW | 203mW | 540mW | | |
| FEC+OFDM TX power | 52.4mW | 210mW | 290mW | | |
| OFDM RX power | 68mW | | | 212mW | 109mW |
| OFDM TX power | 31mW | | | 199mW | Not Listed |

## 5-2 Hardware Design for OFDM-Based UWB System

In this section the fixed-point performance, hardware architecture, and baseband chip design of the LDPC-COFDM-based and MB-OFDM-based UWB systems are introduced. The design complexity and the power reduction achieved by the proposed low-complexity synchronizer and channel equalizer are also discussed.

#### 5-2-1 Fixed-point Performance of OFDM-Based UWB Systems

In the UWB system, 5-bit I/Q DAC and ADC are suggested for the maximum 480Mb/s transmission rate [15, 16]. To understand the performance degradation caused by the quantization error, the PER of 480Mb/s is simulated in AWGN channel with different wordlength settings. The PER curves of 480Mb/s with several wordlength settings for the LDPC-COFDM system and the MB-OFDM system are shown in Figure 5-15 and Figure 5-16. As shown in Figure 5-15 and Figure 5-16, the PER curves for wordlengths $\geq 5$ bits are close to each other. When the wordlength is $\leq 4$ bits, the SNR for 8% PER increases noticeably. The SNR values for 8% PER of Figure 5-15 and Figure 5-16 are listed in Table 5-13. With the 5-bit wordlength, the fixed-point SNR loss for 8% PER is 0.38dB and 0.75dB for the LDPC-COFDM and MB-OFDM systems, respectively.
When the wordlength is reduced to 4 bits, the SNR loss for 8% PER increases to 1.15dB and 1.35dB. To reduce the ADC power we also simulate the fixed-point design with a 5-bit DAC and a 4-bit ADC. In the LDPC system, the fixed-point SNR loss for 8% PER with the 5-bit DAC and 4-bit ADC is 0.95dB, more than 2x of that with 5-bit DAC and ADC at the 480Mb/s rate. To keep the fixed-point SNR loss low while using a short ADC wordlength, the ADC and DAC wordlengths are chosen as 5 bits.

Figure 5-15: 480Mb/s PER with different wordlength setting for LDPC-COFDM

Figure 5-16: 480Mb/s PER with different wordlength setting for MB-OFDM system

Table 5-13: SNR for 8% PER of 480Mb/s UWB with different wordlengths

| Design | LDPC-COFDM System Design SNR (dB) | LDPC-COFDM System Fixed-point SNR loss (dB) | MB-OFDM System Design SNR (dB) | MB-OFDM System Fixed-point SNR loss (dB) |
|---|---|---|---|---|
| Floating-point design | 8.2 (A1) | | 7.2 (B1) | |
| 8-bit | 8.37 | 0.17 | 7.74 | 0.54 |
| 7-bit | 8.43 | 0.23 | 7.76 | 0.56 |
| 6-bit | 8.44 | 0.24 | 7.80 | 0.6 |
| 5-bit | 8.58 | 0.38 | 7.95 | 0.75 |
| 4-bit | 9.35 | 1.15 | 8.5 | 1.35 |
| 3-bit | >10 | >1.8 | >10 | >2.8 |
| Note | A | A-A1 | B | B-B1 |

After the wordlength decision, the
baseband block diagram with key wordlength setting is shown in Figure 5-17. With the fixed-point wordlength setting, the fixed-point PER curves of the OFDM-based UWB systems are simulated in the Intel multipath channel, the CM channels, and the AWGN channel [20, 32]. The most complex CM channels, CM2 for 480Mb/s, CM4 for 200Mb/s, and CM4 for 110Mb/s, are used in the MB-OFDM system PER simulation [15, 16, 20]. These PER curves are shown in Figure 5-18 for the LDPC-COFDM system and Figure 5-19 for the MB-OFDM system. The SNR values for 8% PER are listed in Table 5-14 for the LDPC-COFDM system and Table 5-15 for the MB-OFDM system.

Figure 5-17: Block diagram of the fixed-point baseband design for UWB system

Channel Condition: Intel multipath channel and AWGN, 40ppm CFO + phase noise, 40ppm SCO

Figure 5-18: PER of fixed-point LDPC-COFDM-based UWB system

Table 5-14: SNR for 8% PER of fixed-point LDPC-COFDM-based UWB system

| | 120Mb/s | 240Mb/s | 480Mb/s | Note |
|----------------------------------------|------|------|------|----------------|
| Floating-point SNR in AWGN (dB) | 4.1 | 5.7 | 8.2 | A |
| Fixed-point SNR in AWGN (dB) | 4.4 | 6.0 | 8.6 | B (should < E) |
| Fixed-point SNR loss in AWGN (dB) | 0.3 | 0.3 | 0.4 | B-A |
| Floating-point SNR in multipath (dB) | 5.5 | 7.3 | 10.1 | C |
| Fixed-point SNR in multipath (dB) | 6.2 | 7.7 | 10.5 | D (should < E) |
| Fixed-point SNR loss in multipath (dB) | 0.7 | 0.4 | 0.4 | D-C |
| System Constraint (dB) | 7.5 | 16.0 | 21.1 | E |

Figure 5-19: PER of fixed-point MB-OFDM-based UWB system

Table 5-15: SNR for 8% PER of fixed-point MB-OFDM-based UWB system

| | 110Mb/s | 200Mb/s | 480Mb/s | Note |
|--------------------------------------------|------------|------------|------------|----------------|
| Floating-point SNR in AWGN (dB) | 1.2 | 3.8 | 7.2 | A |
| Fixed-point SNR in AWGN (dB) | 1.6 | 4.4 | 8.2 | B (should < E) |
| Fixed-point SNR loss in AWGN (dB) | 0.4 | 0.6 | 1.0 | B-A |
| Floating-point SNR in CM channel (dB) | 5.8 (CM4) | 14.2 (CM4) | 18.5 (CM2) | C |
| Fixed-point SNR in CM channel (dB) | 6.4 (CM4) | 14.8 (CM4) | 19.8 (CM2) | D (should < E) |
| Fixed-point SNR loss in CM channel (dB) | 0.6 (CM4) | 0.6 (CM4) | 1.3 (CM2) | D-C |
| System Constraint (dB) | 7.1 | 15.1 | 21.1 | E |

In the LDPC-COFDM system, as listed in Table 5-14, the fixed-point SNR loss is 0.42dB on average. The SNR for 8% PER of the proposed fixed-point LDPC-COFDM system is lower than the system constraint by 3.1dB~12.5dB in the AWGN channel and by 1.7dB~10.6dB in the multipath channel. In the MB-OFDM system, as listed in Table 5-15, the fixed-point SNR loss is 0.75dB on average. The SNR for 8% PER of the proposed fixed-point MB-OFDM system is lower than the system constraint by 5.5dB~12.9dB in the AWGN channel and by 0.3dB~1.9dB in the multipath CM channel.

The PER versus transmission distance, calculated with a 6dB noise figure [15, 16], is shown in Figure 5-20 and Figure 5-21. The transmission distances for 8% PER are listed in Table 5-16 and Table 5-17. With the proposed low-complexity synchronizer and channel equalizer, the UWB baseband system can still meet the SNR constraints and the required transmission distances for 8% PER.

Figure 5-20: PER vs. transmission distances of fixed-point LDPC-COFDM-based UWB system

Table 5-16: Transmission distances for 8% PER of fixed-point LDPC-COFDM-based UWB system

| Data Rate (Mb/s) | AWGN channel (meters) | Multipath channel with RMS=5ns (meters) | System requirement (meters) |
|------------------|-----------------------|-----------------------------------------|-----------------------------|
| 120 | 14.5 | 11.8 | 10 |
| 240 | 12.5 | 10.8 | 4 |
| 480 | 8.5 | 6.8 | 2 |
| Note | A (should > C) | B (should > C) | C |

Figure 5-21: PER vs.
transmission distances of fixed-point MB-OFDM-based UWB system

Table 5-17: Transmission distances for 8% PER of fixed-point MB-OFDM-based UWB system

| Data Rate (Mb/s) | AWGN channel (meters) | CM channel (meters) | System requirement (meters) |
|------------------|-----------------------|---------------------|-----------------------------|
| 110 | 19.1 | 11.0 (CM4) | 10 |
| 200 | 13.8 | 4.2 (CM4) | 4 |
| 480 | 9.0 | 2.4 (CM2) | 2 |
| Note | A (should > C) | B (should > C) | C |

According to reference [42], the PAPR of an OFDM-based UWB system is suggested to be $\leq$ 9dB. The peak-to-average-power ratio measured in the fixed-point simulation is listed in Table 5-18. The PAPR satisfies the system suggestion.

Table 5-18: PAPR of OFDM-based UWB systems

| Data Rate, system | PAPR (dB) |
|------------------------------|-----------|
| 120Mb/s of LDPC-COFDM system | 6.8 |
| 240Mb/s of LDPC-COFDM system | 8.8 |
| 480Mb/s of LDPC-COFDM system | 8.8 |
| 110Mb/s of MB-OFDM system | 8.4 |
| 200Mb/s of MB-OFDM system | 8.4 |
| 480Mb/s of MB-OFDM system | 8.5 |

# 5-2-2 Hardware Architecture of the Proposed Designs for OFDM-Based UWB Systems

As discussed in chapter 4, the proposed data-partition-based and moving-average-free auto-correlation and matched filter with reduction factor $\omega = 4$ can reduce the complex multiplications and register size of the synchronizer by 75%, with an additional 0.15dB ~ 0.16dB SNR loss for the OFDM-based UWB system. To achieve the high throughput required by UWB, a parallel architecture is generally used [30, 41]. For the 528Msamples/s throughput of the 480Mb/s UWB system, the synchronizer is designed with a 4-parallelism architecture and a 132MHz working frequency. The architectures of the general auto-correlator [24, 28] with 4 parallelisms and of the proposed auto-correlation algorithm of equation (4-3) are drawn in Figure 5-22. As shown in Figure 5-22 (a), the general 4-parallelism auto-correlator is realized with parallel FIFO registers and complex multipliers.
In the parallel architecture the power and hardware area grow linearly with the parallelism. For low power, the proposed auto-correlator with $\omega = 4$ can be realized with one FIFO and one complex multiplier. Since the number of correlated samples is reduced to 1/4 of the original number, only one of the four input paths is used by the auto-correlator. The FIFO size and the number of complex multipliers of the proposed design are therefore only 1/4 of those of the general auto-correlator design.

Figure 5-22: Architecture of auto-correlators for 528MS/s OFDM-based UWB systems

The architectures of the general matched filter [57], the general tap-reduction matched filter [41], and the proposed matched filter with $\omega = 4$ in equation (4-8) [2] are shown in Figure 5-23. Since the MF coefficients are real values with constant magnitude and variable phase (sign), each coefficient can be represented with a one-bit wordlength [57]. According to the derivation of (5-1), the proposed MF can be realized with adders and subtractors.

As shown in Figure 5-23 (a), in the general 4-parallelism matched filter design there are 128 complex multipliers in each matched filter. The number of complex multipliers grows linearly with the parallelism, resulting in $4\times128 = 512$ complex multipliers in the whole matched filter design. To store the samples used by the matched filter, a 128-sample FIFO is needed in the 128-tap matched filter. In the general tap-reduction matched filter [41], the number of complex multipliers can be reduced to $4\times32=128$ since the tap number is reduced. However, the tap-reduction method cannot reduce the FIFO size. In the proposed matched filter, the moving-average-free design and the tap-reduction design are combined as in equation (4-8). Hence both the number of complex multipliers and the FIFO size can be reduced in the proposed matched filter architecture.
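The claim that a constant-magnitude, sign-only coefficient set lets the matched filter use adders and subtractors can be illustrated with a minimal Python sketch. The function names, the sign pattern `f`, the magnitude `K`, and the test samples are all hypothetical and chosen only for illustration; the sketch compares a multiply-based accumulation against an add/subtract accumulation that scales by `K` once at the end.

```python
import random

def mf_multiply(r, f, K, k):
    """Direct form: each sample is multiplied by its coefficient K*(-1)**f[n]."""
    return sum(r[k + n] * (K * (-1) ** f[n]) for n in range(len(f)))

def mf_add_sub(r, f, K, k):
    """Add/subtract form: accumulate +/- samples, then scale by K once."""
    acc = 0
    for n in range(len(f)):
        if f[n]:              # sign bit 1 -> subtract the sample
            acc -= r[k + n]
        else:                 # sign bit 0 -> add the sample
            acc += r[k + n]
    return K * acc

# Hypothetical test data: integer-valued complex samples keep the arithmetic exact.
rng = random.Random(0)
N, K = 8, 3
f = [rng.randint(0, 1) for _ in range(N)]
r = [complex(rng.randint(-15, 15), rng.randint(-15, 15)) for _ in range(2 * N)]
timings = range(len(r) - N + 1)
same = all(mf_multiply(r, f, K, k) == mf_add_sub(r, f, K, k) for k in timings)
print(same)
```

In hardware the final scaling by `K` is a constant factor common to every timing, so it does not affect which timing gives the peak.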
To make the difference between the three matched filters clearer, their equations are repeated in (5-1), (5-2), and (5-3). The flows of received samples and coefficients of the matched filters are shown in Figure 5-24.

$$\Lambda_{MF}(k) = \sum_{n=0}^{N-1} r_{k+n} \times C_n^* = \sum_{n=0}^{N-1} r_{k+n} \times K(-1)^{f(n)} = K \sum_{n=0}^{N-1} r_{k+n} \cdot (-1)^{f(n)}$$ (5-1)

$$\Lambda_{MF}(k) = \omega K \sum_{\ell=0}^{\lfloor N/\omega \rfloor - 1} r_{\omega\ell+k} \cdot (-1)^{f(\omega\ell)}$$ (5-2)

$$\Lambda_{MF}(k) = \omega K \sum_{\ell=0}^{\lfloor N/\omega \rfloor - 1} r_{\omega\ell} \cdot (-1)^{f(\omega\ell - k + \lceil (k - \omega\ell)/N \rceil \times N)}$$ (5-3)

Here $\omega$ is the reduction factor, N is the number of samples in an FFT symbol, k is the FFT-window detection timing from 0 to N-1, r is the received sample after CFO compensation, $C_n$ is the coefficient of the matched filter, K is the magnitude of the coefficients, and f is the function giving the coefficient sign values. In Figure 5-23 all received signals "r" are sign-converted with "f" and then sent to adders. In (5-1) the coefficient can be written as $K(-1)^f$ because the preamble has constant magnitude and variable phase in [15]. In this case the complex multiplications can be reduced to additions or subtractions. In (5-3), for $0 \le k \le N-1$ and $0 \le \omega\ell \le N-1$, the ceiling term is 0 when $k \le \omega\ell$ and 1 otherwise, so the coefficient index equals $(\omega\ell - k) \bmod N$; that is, the coefficient sequence is cyclically rotated by k.

Figure 5-23: Architecture of matched filters for 528MS/s OFDM-based UWB systems. (a) General parallel-4 128-tap matched filters; (b) General parallel-4 32-tap matched filters; (c) Proposed 32-tap parallel-4 matched filters

Figure 5-24: Example of signal in the three kinds of the matched filter

The only difference between (5-1) and (5-2) is that (5-2) uses $1/\omega$ of the samples used by (5-1). The coefficient indexes of (5-1) and (5-2) are fixed and do not depend on the timing k.
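The index arithmetic of (5-3) and its effect on sample storage can be checked with a small Python sketch. It is an illustration under stated assumptions, not the thesis implementation: the sizes `N = 16` and `omega = 4`, the sign pattern `f`, the real-valued samples, and the assumption that the received window is a cyclically delayed copy of the preamble are all hypothetical.

```python
import math

def coeff_index(l, k, omega, N):
    """Coefficient index of (5-3): omega*l - k + ceil((k - omega*l)/N) * N."""
    return omega * l - k + math.ceil((k - omega * l) / N) * N

def mf_rotating_coeff(r, f, K, omega, k):
    """(5-3): samples r[omega*l] stay fixed, coefficient signs are rotated by k."""
    N = len(f)
    acc = 0
    for l in range(N // omega):
        sign = -1 if f[coeff_index(l, k, omega, N)] else 1
        acc += sign * r[omega * l]
    return omega * K * acc

# Hypothetical sizes, sign pattern, and delay (not values from the thesis).
N, omega, K, k0 = 16, 4, 1, 5
f = [0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 0, 1, 1]
preamble = [(-1) ** b for b in f]
# Assumption: the received window is the preamble cyclically delayed by k0.
r = [preamble[(n - k0) % N] for n in range(N)]
metric = [mf_rotating_coeff(r, f, K, omega, k) for k in range(N)]

# Sample indexes each scheme has to read over all timings k = 0..N-1.
fixed_idx = {omega * l for l in range(N // omega)}                           # (5-3)
sliding_idx = {omega * l + k for l in range(N // omega) for k in range(N)}   # (5-2)
print(len(fixed_idx), len(sliding_idx))
```

Under these assumptions the metric reaches its largest possible value `K*N` at the true delay, and the rotating-coefficient form reads only `N/omega` distinct samples, while the sliding-sample form of (5-2) eventually reads every sample position.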
Because the coefficient indexes of (5-1) and (5-2) are fixed, when the tap number is reduced from Figure 5-24 (a) to Figure 5-24 (b), the coefficients used in Figure 5-24 (b) remain fixed and the samples used change with the timing k. Therefore all samples need to be stored [41], and the 128-sample FIFO is still needed in the 32-tap tap-reduction matched filter. Between (5-1) and (5-3) the differences are not only the reduction of multiplications, but also that the coefficient indexes of (5-3) are tuned according to the timing k. As shown in Figure 5-24 (c), since the coefficients can be tuned for each timing k, the samples used can stay fixed for all timings, and the FIFO needs to store only 32 samples in the proposed 32-tap matched filter. Since the coefficient indexes are pre-defined while the samples are unknown, the hardware complexity of dynamic coefficients is lower than that of dynamic samples. For example, we can store the coefficients in a ROM and use address control to select the coefficients to be used. The FIFO power is thereby reduced with the proposed matched filter scheme.

The architectures of the divider-and-multiplier-based channel equalizer [18, 24, 25] and the proposed channel equalizer [2] are shown in Figure 5-25. In the UWB system the transmission distances are $0 \sim 20$ meters. Unlike in the WLAN system, the Doppler effect is weaker in the short-distance UWB system, so the DDCT is not needed and the zero-forcing CE scheme is suitable for the UWB system. As shown in Figure 5-25 (a), the general zero-forcing channel equalizer is realized with a complex divider, 1 complex multiplier, and a memory storing $2\times6\times112\times3$ bits = 504 Bytes of channel frequency response (CFR) (wordlength of I and Q = 6, number of used subcarriers = 112, number of used bands = 3).
In Figure 5-25 (a), the WAPET of the general channel equalizer comprises arc-tangent, phase error detection (PED), phase error tracking (PET), phase combining, phasor generator, and 1 complex multiplier for pilot pre-compensation and data compensation. As shown in Figure 5-25 (b), the proposed divider-and-multiplier-free channel equalizer uses phase addition/subtraction instead of complex multiplication/division. At the front of the proposed channel equalizer, the pilot and data subcarriers from the FFT are converted from real/imaginary parts to symbol phases by the arc-tangent design. The wordlength of each subcarrier is thereby reduced from 6×2 (I+Q) to 6 (phase).

Figure 5-25: Architectures of the channel equalizers for UWB system. (a) General zero-forcing channel equalizer; (b) Proposed divider-and-multiplier-free channel equalizer

Then, in channel phase estimation, the channel phase $H_E(K)$ is estimated by subtraction of the preamble phases. The estimated channel phases of the three multi-bands, with 112 used subcarriers in each, are stored in a memory of $6\times112\times3$ bits = 252 bytes. In the proposed channel equalizer, when the data subcarriers arrive, the data phases are compensated by a real subtraction of the channel phases. In the WAPET, since the subcarriers have already been converted to symbol phases, the arc-tangent is not needed. Unlike the WAPET design of a general channel equalizer, the WAPET of the divider-and-multiplier-free channel equalizer needs neither the arc-tangent nor the phasor generator design, and all the complex multipliers of the WAPET in Figure 5-25 (a) can be replaced with adders.

To remove the divider, the arc-tangent design is also improved. The architecture of the logarithm-based arc-tangent design is shown in Figure 5-26. In the arc-tangent design the subtraction of log values is used to replace the division of the imaginary part by the real part.
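The log-subtraction idea and the phase-domain compensation can be sketched in Python as follows. This is only a behavioural illustration under assumptions: the quantisation step `STEP`, the clipping range `D_MAX`, the table size, and the use of `math.log2` in place of a hardware logarithm are hypothetical choices, not values from the thesis.

```python
import math

STEP = 0.25    # hypothetical quantisation step of log2|Q| - log2|I|
D_MAX = 8.0    # hypothetical clipping range of the log difference
# First-quadrant table: phase = atan(2**d) for the quantised log difference d.
LUT = [math.atan(2.0 ** (-D_MAX + i * STEP)) for i in range(int(2 * D_MAX / STEP) + 1)]

def log_atan2(q, i):
    """Phase of i + jq in [-pi, pi], found without dividing q by i."""
    if i == 0 and q == 0:
        return 0.0
    if q == 0:
        theta = 0.0
    elif i == 0:
        theta = math.pi / 2
    else:
        d = math.log2(abs(q)) - math.log2(abs(i))    # replaces |q| / |i|
        idx = round((d + D_MAX) / STEP)
        idx = min(max(idx, 0), len(LUT) - 1)         # clip to the table range
        theta = LUT[idx]                             # value in 0 .. pi/2
    # Signs of the real and imaginary parts select the quadrant.
    if i >= 0:
        return theta if q >= 0 else -theta
    return math.pi - theta if q >= 0 else theta - math.pi

def wrap(p):
    """Wrap a phase into [-pi, pi)."""
    return (p + math.pi) % (2 * math.pi) - math.pi

# Phase-domain equalisation: phase(y / h) = phase(y) - phase(h).
h = complex(7, -3)      # hypothetical channel coefficient
x = complex(1, 1)       # transmitted QPSK symbol, phase pi/4
y = h * x
phase_eq = wrap(log_atan2(y.imag, y.real) - log_atan2(h.imag, h.real))
print(round(phase_eq, 3))
```

With this step size the table lookup differs from `math.atan2` by a few hundredths of a radian at most, which is small compared with the pi/2 spacing of QPSK phases; a real design would choose the table resolution from the fixed-point simulation.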
The symbol phase is then obtained from the look-up table and the sign information. For low complexity the LUT only finds phase values within $0\sim\pi/2$ according to the logarithm results. The full phase range from $-\pi$ to $+\pi$ is recovered from the signs of the real part and imaginary part of the input symbols. With the logarithm-based arc-tangent, the whole channel equalizer can be designed without any divider. Without any divider or multiplier, the critical path delay is dominated only by the additions and the look-up table (LUT) of the arc-tangent, and the working frequency can be raised to 264MHz in 0.18 $\mu$m CMOS process.

Figure 5-26: Architecture of the logarithm-based arc-tangent design

# 5-2-3 Complexity and Power Analysis of the Proposed Designs for OFDM-Based UWB Systems

The design complexity of the divider-and-multiplier-based approaches [18, 24, 25, 28] and of the proposed design is listed in Table 5-19. With the proposed low-complexity designs, the complex multiplications, complex divisions, and memory reads/writes can be reduced. In the OFDM-based UWB system, one FFT symbol consists of 128 samples at 528MS/s and one OFDM symbol consists of 165 samples at 528MS/s. The number of used subcarriers including pilots is 100 for each OFDM symbol. In the AGC, the signal power of 5 OFDM symbols is estimated, needing 5×128/2=320 complex multiplications. In the UWB system, the FFT size (128) is double that of the WLAN system (64), so the number of matched filter results needed is increased. In the synchronizer, 5 auto-correlation computations and 165 matched filter computations are needed to detect the correct symbol timing. In the general synchronizer, 128 complex multiplications and memory reads are needed for each auto-correlation result and each matched filter result. In total, 128× (5+165) = 21760 complex multiplications and memory reads are needed in the general synchronizer.
$128\times5+128+165=933$ memory writes are also needed in the general synchronizer. In the proposed synchronizer with $\omega=4$, 128/4=32 complex multiplications and memory reads are needed for each auto-correlation result and each matched filter result. In total, $32\times(5+165)=5440$ complex multiplications and memory reads are needed in the proposed synchronizer. $32\times5+32=192$ memory writes are also needed in the proposed synchronizer. After synchronization, 128×M complex multiplications are needed for CFO compensation. In the FFT-input buffering, 128 memory reads/writes are needed for cyclic-prefix addition. In the 528MS/s 128-point mixed-radix 4-parallelism FFT design, 176 complex multiplications and 124 memory reads/writes are needed for each FFT symbol [5]. In the FFT-output buffering, 2×112 = 224 memory reads/writes are needed to reorder the FFT output signal. In the channel equalizer, $112\times2\times3 = 672$ memory writes and $112\times3 = 336$ memory reads are needed to estimate the CFR of 3 multi-bands with 2 OFDM symbols in each band. In the general channel equalizer with complex divider and complex multipliers, 336 complex divisions are needed to estimate the inverse CFR of the 3 bands. In each OFDM symbol, $2\times112 = 224$ complex multiplications are needed to compensate the subcarriers with the inverse CFR and the phasor error caused by CFO and SCO.
Table 5-19: Design complexity of a baseband receiver in each packet for OFDM-based UWB system

| Design | Complex multiplication | Complex division | Memory Write (Samples) | Memory Read (Samples) |
|------------------------------------|------------------------|------------------|------------------------|-----------------------|
| AGC (A) | 320 | 0 | 0 | 0 |
| General Synchronizer (B1) | 21760 | 0 | 933 | 21760 |
| Proposed Synchronizer (B2) | 5440 | 0 | 192 | 5440 |
| CFO compensation (C) | 128×M | 0 | 0 | 0 |
| FFT-input buffer (D) | 0 | 0 | 128×M | 128×M |
| FFT [5] (E) | 176×M | 0 | 124×M | 124×M |
| FFT-output ordering (F) | 0 | 0 | 224×M | 224×M |
| General EQ (G1) | 224×M | 336 | 672 | 336+112×M |
| Proposed EQ (G2) | 0 | 0 | 672 | 336+112×M |
| General OFDM RX (A+B1+C+D+E+F+G1) | 22080+528×M | 336 | 1605+476×M | 22096+588×M |
| Proposed OFDM RX (A+B2+C+D+E+F+G2) | 5760+304×M | 0 | 864+476×M | 5776+588×M |
| Reduced | 16320+224×M | 336 | 741 | 16320 |
| Reduced in 480Mb/s (M = 72) | 32448 | 336 | 741 | 16320 |

Note: M is the number of OFDM symbols.

Compared with the general synchronizer, the proposed synchronizer reduces the multiplications and memory reads by (21760-5440)/21760 = 75%. In the proposed channel equalizer, the channel estimation and data compensation are done with additions and subtractions, so the number of complex divisions and complex multiplications is reduced to zero. Compared with the general OFDM receiver [18, 24, 25, 28], the proposed design saves 32448 complex multiplications, 336 complex divisions, 741 memory writes, and 16320 memory reads. The saved complex multiplications, complex divisions, and memory reads are 54%, 100%, and 25% of those of the general OFDM algorithms in 480Mb/s mode.
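The totals of Table 5-19 and the percentages quoted above can be re-derived with a short Python sketch. The per-block entries are copied from the table as (constant term, per-OFDM-symbol term) pairs; the dictionary and function names are only illustrative.

```python
# Each block: (mul, div, write, read); each entry is (constant, per-OFDM-symbol).
COMMON = {
    "AGC":                 ((320, 0), (0, 0), (0, 0),   (0, 0)),
    "CFO compensation":    ((0, 128), (0, 0), (0, 0),   (0, 0)),
    "FFT-input buffer":    ((0, 0),   (0, 0), (0, 128), (0, 128)),
    "FFT":                 ((0, 176), (0, 0), (0, 124), (0, 124)),
    "FFT-output ordering": ((0, 0),   (0, 0), (0, 224), (0, 224)),
}
GENERAL = {
    "synchronizer": ((21760, 0), (0, 0),   (933, 0), (21760, 0)),
    "equalizer":    ((0, 224),   (336, 0), (672, 0), (336, 112)),
}
PROPOSED = {
    "synchronizer": ((5440, 0), (0, 0), (192, 0), (5440, 0)),
    "equalizer":    ((0, 0),    (0, 0), (672, 0), (336, 112)),
}

def totals(specific, M):
    """Sum (mul, div, write, read) over all receiver blocks for M OFDM symbols."""
    rows = list(COMMON.values()) + list(specific.values())
    return tuple(sum(row[c][0] + row[c][1] * M for row in rows) for c in range(4))

M = 72                                   # OFDM symbols per packet in 480Mb/s mode
general = totals(GENERAL, M)
proposed = totals(PROPOSED, M)
reduced = tuple(g - p for g, p in zip(general, proposed))
mul_saving = round(100 * reduced[0] / general[0])    # percent of multiplications
read_saving = round(100 * reduced[3] / general[3])   # percent of memory reads
print(reduced, mul_saving, read_saving)
```

Changing `M` shows how the share of saved multiplications varies with the packet length, since the synchronizer saving is a fixed per-packet cost while the equalizer saving scales with the number of OFDM symbols.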
To evaluate the hardware complexity of the proposed synchronizer, the power of the proposed synchronizer and of the general synchronizer [24, 28] is estimated by post-layout simulation in 0.18µm CMOS process. Both synchronizers work at 528MS/s throughput with a 132MHz clock, so the 4-parallelism scheme is the basic architecture. The gate counts and simulated power values are listed in Table 5-20. With the data-partition-based auto-correlation and the moving-average-free matched filter, the proposed synchronizer needs only 37.6% of the gate count and 43.3% of the power of a general approach. In an OFDM transceiver consuming 162mW [1], the saving of the proposed synchronizer is equivalent to 27.0% of the OFDM transceiver power.

To evaluate the hardware complexity of the proposed channel equalizer, the power of the proposed equalizer and of the divider-and-multiplier-based equalizer [18, 24, 25] is estimated by post-layout simulation in 0.18µm CMOS process. Both equalizers work at 528MS/s throughput with a 132MHz clock. The gate counts and simulated power values are listed in Table 5-21.

Table 5-20: Hardware complexity of the proposed synchronizer and general 4-parallelism synchronizer

| | Proposed design: Gate-count | Proposed design: Power (mW) | 4-parallelism architecture: Gate-count | 4-parallelism architecture: Power (mW) |
|------------------|-------|------|-------|------|
| Auto-correlator | 4K | 3.2 | 13K | 12.8 |
| Matched filter | 33K | 6.7 | 106K | 15.4 |
| Register | 7.2K | 8.5 | 28.8K | 34.0 |
| CFO compensation | 14K | 13.1 | 14K | 13.1 |
| FSM and Others | 4.2K | 1.9 | 4.2K | 1.9 |
| Total | 62.4K | 33.4 | 166K | 77.2 |

As listed in Table 5-21, in the general divider-and-multiplier-based zero-forcing channel equalizer, 300 complex divisions are needed for each packet and 200 complex multiplications are needed for each OFDM symbol. The complex divider and multipliers consume 27.8+34.7 = 62.5mW power.
With the divider-and-multiplier-free equalization scheme, the power dissipation of the complex divider and multipliers is saved. The proposed equalizer needs only 48.6% of the gate count and 40.4% of the power of the divider-and-multiplier-based approach. In an OFDM transceiver consuming 162mW [1], the saving of the proposed equalizer is equivalent to 38% of the OFDM transceiver power. The proposed equalizer can also achieve 528MS/s throughput with a 264MHz clock, and its maximum working frequency reaches 270MHz (3.7ns) in the post-layout simulation. This means the design can be realized with a 2-parallelism architecture, in which case the equalizer gate count can be reduced to 41K.

Table 5-21: Hardware complexity of the proposed divider-and-multiplier-free channel equalizer and the general divider-based channel equalizer with 4-parallelism

| | Proposed design: Gate-count | Proposed design: Power (mW) | General divider-based design: Gate-count | General divider-based design: Power (mW) |
|--------------------------------|-----|------|------|-------|
| Complex divider | 0 | 0 | 24K | 27.8 |
| Complex multipliers | 0 | 0 | 16K | 34.7 |
| Add/Sub for CE or compensation | 3K | 7.6 | 0 | 0 |
| WAPET + arc-tangent | 31K | 25.1 | 31K | 25.1 |
| Memory/Reg. | 18K | 9.3 | 36K | 16.2 |
| Total | 52K | 42 | 107K | 103.8 |

The complexity of the proposed design, the magnitude-and-phase-based (polar-coordinate-based) design, and the divider-and-multiplier-based design is listed in Table 5-21-2. The divider-and-multiplier-based design uses one complex multiplier and one complex divider for channel estimation and equalization. A memory of $112\times3\times2\times6$ bits = 504 Bytes is needed to store the CFR of the three bands with 6-bit I and Q. There are 3 adders to calculate the traced phase error and to compute log(Q)-log(I) in the logarithm-based arc-tangent design.
One 12-bit in/6-bit out look-up table is needed to generate the pilot phase in the phase error tracking (PET) design. When we use the magnitude-and-phase-based design, both the magnitude part and the phase part need 2 adders for CE and EQ, so the number of adders increases by 4. One more look-up table is added to convert the I and Q signals to the magnitude value. When we use the proposed design, the number of adders is lower than in the magnitude-and-phase-based design since the magnitude part is not needed, and the number of look-up tables stays at 1. In both the magnitude-and-phase-based design and the proposed design, the divider and multiplier are completely removed.

Table 5-21-2: Complexity of the proposed design, magnitude-and-phase-based design, and divider-and-multiplier-based design

| | Proposed design | Magnitude-and-phase-based design | Divider-and-multiplier-based design |
|-------------------------------------|-----------|-----------|-----------|
| Adder | 5 | 7 | 3 |
| Complex Divider | 0 | 0 | 1 |
| Complex Multiplier | 0 | 0 | 1 |
| Memory Size | 252 Bytes | 504 Bytes | 504 Bytes |
| Look-up Table (32-bit in/6-bit out) | 1 | 2 | 1 |

#### 5-2-4 Proposed Baseband Chip for LDPC-COFDM-Based UWB System

With the proposed low-complexity synchronizer and channel equalizer, an LDPC-COFDM-based UWB baseband processor is designed in 0.18µm CMOS process and in 0.13µm CMOS process. The system architecture of the half-duplex baseband processor is shown in Figure 5-27. The baseband processor is linked to the RF through a 5-bit I/Q DAC and a 5-bit I/Q ADC, and a 7-bit DAC is used for digital AGC. To satisfy the power spectrum mask, a shaping filter with 11 taps is used and the DAC sampling rate is increased to 1.056GHz (2x of passband bandwidth).
As in the proposed WLAN baseband processor, the FFT design is shared by the transmitter part and the receiver part of the UWB baseband processor. Unlike the WLAN baseband processor, the proposed UWB baseband processor comprises four parallel paths with a 132MHz working clock to achieve the 528MS/s throughput rate. Hence parallel-to-serial (P/S) and serial-to-parallel (S/P) conversion is needed to link the DAC and ADC.

Figure 5-27: System architecture of the proposed baseband processor for OFDM-based UWB systems

The chip microphoto of the LDPC-COFDM-based UWB baseband processor designed in 0.18µm CMOS process is shown in Figure 5-28, and the chip testing summary is listed in Table 5-22. As shown in Figure 5-28, the LDPC decoder, FFT, synchronizer, and channel equalizer occupy 50%, 10%, 8%, and 6% of the chip area, respectively. The core power is 523mW for the transmitter part and 575mW for the receiver part. The gate count and power of the OFDM transceiver, FEC codec, and clock trees are listed in Table 5-23.

Figure 5-28: Chip microphoto of the proposed LDPC-COFDM-based UWB baseband processor

Table 5-22: Chip summary of the proposed LDPC-COFDM-based UWB baseband processor

| Item | Value |
|-------------------------------|---------------------|
| Technology | 0.18μm CMOS 1P6M |
| Package | 208 CQFP |
| Die Size | 6.5mm×6.5mm |
| Gate Count (Including I/O) | 1.056M |
| Maximum data rate (b/s) | 480M |
| Maximum signal bandwidth (Hz) | 528M |
| Supply voltage | 1.8V Core, 3.3V I/O |
| Core power at 480Mb/s (TX/RX) | 523mW / 575mW |

Table 5-23: Hardware complexity of the LDPC-COFDM baseband chip

| Block | Gate Count | RX Power (mW) |
|------------------|--------------------------|---------------|
| OFDM transceiver | 350K | 162 |
| FEC codec | 567K | 211 |
| Clock Trees | <10K | 202 |
| Total | 918K (Not Including I/O) | 575 |

As shown in Table 5-23, the OFDM transceiver occupies 38% of the gate count and 29.6% of the power of the baseband core.
The clock buffers dissipate 35% of the baseband receiver power. To show the hardware complexity of the sub-blocks of the OFDM transceiver, the percentages of the gate count and of the receiver power within the OFDM transceiver are shown in Figure 5-29. With the proposed low-complexity schemes and low-power architecture designs, the synchronizer and equalizer together occupy only 32% of the OFDM transceiver gate count and 49% of the OFDM receiver power consumption. Compared with those in the WLAN design (Figure 5-16), the percentages of gate count and receiver power are reduced by 43% and 31%. As listed in Table 5-20 and Table 5-21, the proposed synchronizer and channel equalizer schemes together save 158.6K gates and 105.6mW power. This corresponds to 45.3% (158.6/350) of the gates and 65.1% (105.6/162) of the power of the proposed UWB OFDM transceiver.

Figure 5-29: Percentage of gate-count and receiver power of the OFDM transceiver for LDPC-COFDM-based UWB. (a) Gate count percentage of OFDM transceiver; (b) RX power percentage of OFDM transceiver

#### 5-2-5 Proposed Baseband Chip for MB-OFDM-Based UWB System

With the proposed low-complexity synchronizer and channel equalizer, the MB-OFDM-based UWB baseband processor is designed in 0.13µm CMOS process. The high clock-buffer power problem of the LDPC-COFDM chip is also solved. The clock-buffer power of the LDPC-COFDM chip is so high because the clock buffers of the OFDM transceiver remain turned on even when part of the circuits is asleep.

To solve this problem, a multi-stage gated-clock control is developed. To save clock-buffer power, we separate the gated clock buffers into 5 stages according to the FSM of the overall OFDM transceiver. The stages of clock buffers in the OFDM transceiver are shown in Figure 5-30.
RX: ① on; after packet detection succeeds, ②&③ on, others off. TX: ④ on; when the preamble transmission is about to end, ②&⑤ on, others off.

Figure 5-30: Stages of clock buffers of OFDM transceiver

As shown in Figure 5-30, the receiver comprises stage 1, stage 2, and stage 3 of the clock buffers, and the transmitter comprises stage 4, stage 2, and stage 5. The FSM that turns the stages of clock buffers on or off is listed in Table 5-24.

Table 5-24: Finite state machine of the proposed multi-stage gated-clock control

| State | Stage 1 | Stage 2 | Stage 3 | Stage 4 | Stage 5 |
|----------------------------------|-----|-----|-----|-----|-----|
| RX: Initial | On | Off | Off | Off | Off |
| RX: PD is OK | On | On | On | Off | Off |
| TX: Initial | Off | Off | Off | On | Off |
| TX: End of Preamble transmission | Off | On | Off | On | On |

In receiving mode, stage 4 and stage 5 of the clock buffers, which belong to the transmitter part, are turned off. At the start of signal reception, the symbol timing of a valid packet is unknown. Hence only the AGC and synchronizer need to work for packet and timing detection, and only stage 1 of the clock buffers needs to be turned on. When the packet detection (PD) succeeds, the other receiver blocks, mainly the FFT, channel equalizer, and De-QPSK, need to work for data demodulation, so stage 2 and stage 3 of the clock buffers are also turned on. In transmission mode, stage 1 and stage 3 of the clock buffers, which belong to the receiver part, are turned off. At the start of signal transmission, a PLCP preamble is transmitted, which typically takes 9.375µs. During the preamble transmission, the (I)FFT and the other transmitter parts do not need to work, so stage 2 and stage 5 of the clock buffers can be turned off. After the end of the preamble transmission, stage 2 and stage 5 of the clock buffers are turned on for data modulation.
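The control in Table 5-24 amounts to a small lookup from FSM state to the set of enabled clock-buffer stages. The Python sketch below is a behavioural illustration only; the state names, event names, and function names are hypothetical labels introduced here, while the on/off pattern is taken from the table.

```python
# Enabled clock-buffer stages per FSM state, following Table 5-24.
CLOCK_STAGES = {
    "RX_INITIAL":      {1},          # AGC and synchronizer only
    "RX_PD_OK":        {1, 2, 3},    # FFT, equalizer, De-QPSK also clocked
    "TX_INITIAL":      {4},          # preamble transmission
    "TX_PREAMBLE_END": {2, 4, 5},    # (I)FFT and the rest of the transmitter
}

# Hypothetical event names for the two transitions described in the text.
TRANSITIONS = {
    ("RX_INITIAL", "packet_detected"): "RX_PD_OK",
    ("TX_INITIAL", "preamble_end"):    "TX_PREAMBLE_END",
}

def next_state(state, event):
    """Return the next FSM state; unknown events leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)

def clock_enables(state):
    """Gate-enable flags for stages 1..5 in the given state."""
    return tuple(stage in CLOCK_STAGES[state] for stage in range(1, 6))

state = next_state("RX_INITIAL", "packet_detected")
print(state, clock_enables(state))
```

Stage 2 appears in both the receive and transmit sets because the FFT is shared between the transmitter and the receiver.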
This multi-stage gated-clock control turns on the clock buffers only when the driven circuits need to work, so the baseband power is saved efficiently. The microphoto of the MB-OFDM transceiver comprising the proposed synchronizer, channel equalizer, and multi-stage gated-clock control is shown in Figure 5-31, and the chip testing summary is listed in Table 5-25. With the multi-stage gated-clock control, the core power at 480Mb/s is only 15.8mW in transmission mode and 31.2mW in receiving mode. To show the power reduction achieved by the multi-stage gated-clock control, the power consumption with and without the power control is drawn in Figure 5-32. As shown in Figure 5-32, the proposed clock control saves 17% of the OFDM receiver power and 28% of the OFDM transmitter power.

Figure 5-31: Chip microphoto of MB-OFDM OFDM transceiver

Table 5-25: Chip summary of MB-OFDM OFDM transceiver

| Item | Value |
|--------------------------------|---------------------|
| Technology | 0.13μm CMOS 1P6M |
| Package | 208 CQFP |
| Die Size | 3.975mm×3.98mm |
| Gate Count (Not Including I/O) | 344K |
| Maximum data rate (b/s) | 480M |
| Maximum signal bandwidth (Hz) | 528M |
| Supply voltage | 1.2V Core, 3.3V I/O |
| Core power at 480Mb/s (TX/RX) | 15.8mW / 31.2mW |

Figure 5-32: Power percentage of MB-OFDM UWB baseband transceiver. (a) Power percentage of transmission mode; (b) Power percentage of receiving mode

## Chapter 6: Conclusions and Future Work

The proposed HPSU or sub-sampling-based auto-correlator designs lead to low complexity for packet detection and CFO estimation. The proposed HPCU or moving-average-free matched filter schemes lead to low complexity for FFT-window detection. The proposed DDCT or divider-and-multiplier-free channel equalizer leads to low complexity for channel tracking and equalization. Based on the proposed low-complexity synchronizer and channel equalizer schemes, the OFDM baseband processor can achieve low power while keeping high performance.
In the WLAN system, the proposed synchronizer removes 74% of the multiplications of a general synchronizer, equivalent to 16.3% of the multiplications of the whole OFDM transceiver in 54Mb/s mode. The SNR loss for 10% PER added by the proposed synchronizer is limited to 0.1dB (54Mb/s) ~ 1.3dB (6Mb/s). With the proposed channel equalizer for WLAN, the CE MSE is reduced by 6~27dB and the SNR for 10% PER is reduced by 0.7~1.9dB compared with the conventional zero-forcing scheme. Compared with the existing WLAN equalizer [24, 43], the proposed design achieves 3~24dB better CE MSE with 16 fewer complex multipliers, and its gate count is only 60% of that of [24, 43]. With the proposed schemes, the SNR for 10% fixed-point PER is lower than that of the existing chips by 0.68~6.45dB, and the power consumption is only 22%~62% of that of the existing designs.

In the UWB system, the proposed synchronizer and channel equalizer together reduce the gate count by 45.3% and the power by 65.1% of a UWB OFDM transceiver. The proposed synchronizer removes 75% of the multiplications of a general synchronizer, which is equivalent to 54% of the multiplications of the whole OFDM transceiver in 480Mb/s mode. It needs only 37.6% of the gate count and 43.3% of the power of a general synchronizer approach, which removes the equivalent of 27% of the OFDM transceiver power. The SNR loss for 8% PER added by the proposed synchronizer is limited to 0.2dB. The proposed channel equalizer eliminates all complex divisions and multiplications of a general channel equalizer. In the 480Mb/s case it eliminates 300 complex divisions (100% of the divisions in the OFDM transceiver) and 14400 complex multiplications (29% of the multiplications in the OFDM transceiver). The proposed equalizer needs only 48.6% of the gate count and 40.4% of the power of a divider-and-multiplier-based equalizer, which removes the equivalent of 38% of the OFDM transceiver power.
The SNR loss for 8% PER added by the proposed equalizer is only 0.3dB. Compared with the system constraint, the proposed fixed-point LDPC-COFDM system reduces the SNR requirement for 8% PER by 1.7dB ~ 12.5dB, and the proposed fixed-point MB-OFDM system reduces the SNR requirement for 8% PER by 0.3dB ~ 12.9dB.

With the proposed low-complexity designs, the OFDM-based baseband transceivers achieve a 480Mb/s data rate with lower power consumption. The proposed MB-OFDM transceiver in 0.13µm CMOS process consumes only 31.2mW. The complexity reduction and the change in SNR loss for 10% PER of WLAN and 8% PER of UWB caused by the proposed synchronizer (Sync.) and channel equalizer (EQ) are listed in Table 6-1. The chip performance, OFDM gate count, and power consumption are listed in Table 6-2 as a summary of the proposed low-power designs.

Future work is the integration of the digital baseband and the analog RF front-end. In our research process, we found that several bottlenecks remain in baseband/RF integration, such as RF filtering distortion, I/Q mismatch, AGC control, and interface problems. Although these non-ideal effects have been modeled and simulated in our research, a mismatch between the RF behavior model and the silicon-proven RF circuit still exists, so a programmable RF-effect calibration is needed in the baseband chip. For example, the most important task of the baseband receiver is to find the valid packet and the correct symbol boundary with the synchronizer. However, when the filter response of the RF circuit differs from that of the RF behavior model, the synchronizer may fail to find the distorted packets. Therefore a "training machine" should be added to the synchronizer to adapt to the RF circuit effects. In the training mode we can send the signal from the RF circuits to the baseband and inform the baseband design that the incoming signal is a valid packet.
The baseband chip can then automatically tune the parameters of the synchronizer and adapt to the real RF circuit effects. With the integration of RF, baseband, and even MAC, the low-power design can be improved at the system level and complete low-power wireless products can be realized.

Table 6-1: Complexity reduction and SNR-loss change of the proposed design

| Design | Complexity reduction | SNR-loss change |
|--------------------------|------------------------------------------|------------------------|
| Proposed WLAN Sync. | Reduce 3160 complex multiplications (CM) | 0.1~1.3dB SNR loss |
| Employed WLAN FD-MMSE EQ | Complexity is close to ZF CE and much lower than existing CE designs | 1.8dB SNR gain over LS EQ designs |
| Proposed WLAN DDCT | 1. 16 fewer complex multipliers than [24]; 2. Only 60% of the gate count of [24, 43] | 1. 8~15dB CE MSE gain over ZF CE; 2. 1.5dB SNR gain over ZF CE |
| Proposed UWB Sync. | 1. Reduce 16320 CM; 2. Reduce 103.6K gate count; 3. Reduce 43.8mW power (27% of 0.18μm CMOS OFDM) | 0.2dB SNR loss |
| Proposed UWB EQ | 1. Eliminate all CM and divisions; 2. Reduce 55K gate count; 3. Reduce 61.8mW power (38% of 0.18μm CMOS OFDM) | 0.3dB SNR loss |

Table 6-2: Chip performance and OFDM hardware complexity

| Proposed baseband chips | SNR for 8% or 10% PER compared with system constraint in AWGN | OFDM gate count | RX core power (mW) |
|-------------------------------|--------------------|------|------|
| OFDM WLAN in 0.18μm CMOS [3] | Avg. better 6.45dB | 317K | 68 |
| LDPC-COFDM in 0.18μm CMOS [1] | Avg. better 8.6dB | 350K | 162 |
| MB-OFDM in 0.13μm CMOS [53] | Avg. better 9.7dB | 344K | 31.2 |

## References

- [1] Hsuan-Yu Liu, Chien-Ching Lin, Yu-Wei Lin, Ching-Che Chung, Kai-Li Lin, Wei-Che Chang, Lin-Hung Chen, Hsie-Chia Chang, and Chen-Yi Lee, "A 480Mb/s LDPC-COFDM-based UWB Baseband Transceiver," *IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers*, pp. 444-446, February 2005.
- [2] Hsuan-Yu Liu and Chen-Yi Lee, "A Low-Complexity Synchronizer for OFDM-Based UWB System," to appear in *IEEE Transactions on Circuits and Systems II*.
- [3] Hsuan-Yu Liu, Yi-Hsin Yu, Chien-Ching Lin, Ching-Che Chung, Terng-Yin Hsu, and Chen-Yi Lee, "A COFDM Baseband Processor with Robust Synchronization for High-Speed WLAN Applications," *IEEE Symposium on VLSI Circuits*, pp. 156-159, June 2004.
- [4] Hsuan-Yu Liu, Yi-Hsin Yu, Chien-Jen Hung, Terng-Yin Hsu, and Chen-Yi Lee, "Combining Adaptive Smoothing and Decision-Directed Channel Estimation Schemes for OFDM WLAN Systems," *International Symposium on Circuits and Systems*, vol. 2, pp. II-149–II-152, May 2003.
- [5] Yu-Wei Lin, Hsuan-Yu Liu, and Chen-Yi Lee, "A 1-GS/s FFT/IFFT Processor for UWB Applications," *IEEE Journal of Solid-State Circuits (JSSC)*, vol. 40, issue 8, pp. 1726–1735, August 2005.
- [6] Yu-Wei Lin, Hsuan-Yu Liu, and Chen-Yi Lee, "A Dynamic Scaling FFT Processor for DVB-T Applications," *IEEE Journal of Solid-State Circuits (JSSC)*, vol. 39, issue 11, pp. 2005–2013, November 2004.
- [7] Wei-Che Chang, Lin-Hung Chen, Wan-Chun Liao, Hsuan-Yu Liu, and Chen-Yi Lee, "An Area and Power Efficient Frame Synchronizer for 480Mb/s OFDM-based UWB System," *VLSI-TSA-DAT*, April 2005.
- [8] Lin-Hung Chen, Wei-Che Chang, Wan-Chun Liao, Hsuan-Yu Liu, and Chen-Yi Lee, "A 528MS/s Frequency Synchronizer for OFDM-based UWB System," *VLSI-TSA-DAT*, April 2005.
- [9] Yi-Hsin Yu, Hsuan-Yu Liu, Terng-Yin Hsu, and Chen-Yi Lee, "A Joint Scheme of Decision-Directed Channel Estimation and Weighted-Average Phase Error Tracking for OFDM WLAN Systems," *IEEE Asia-Pacific Conference on Circuits and Systems*, vol. 2, pp. 985-988, December 2004.
- [10] C.S. Peng and K.A. Wen, "Synchronization for Carrier Frequency Offset in Wireless LAN 802.11a System," *Wireless Personal Multimedia Communications*, vol. 3, October 2002.
- [11] M. Morelli and U. Mengali, "An Improved Frequency Offset Estimator for OFDM Applications," *IEEE Communication Letters*, vol. 3, pp. 75-77, March 1999.
- [12] P.H. Moose, "A Technique for Orthogonal Frequency Division Multiplexing Frequency Offset Correction," *IEEE Trans. on Communications*, vol. 42, no. 10, pp. 2908-2914, October 1994.
- [13] T. M. Schmidl and D. C. Cox, "Robust Frequency and Timing Synchronization for OFDM," *IEEE Trans. on Commun.*, vol. 45, pp. 1613-1621, December 1997.
- [14] Shou-Yin Liu and Jong-Wha Chong, "A Study of Joint Tracking Algorithms of Carrier Frequency Offset and Sampling Clock Offset for OFDM-Based WLANs," *IEEE International Conference on Communications, Circuit and System and West Sino Expositions*, vol. 1, pp. 109-113, July 2002.
- [15] A. Batra, et al., "Multi-band OFDM Physical Layer Proposal Update," *IEEE P802.15-03267r6-TG3a*, September 2003.
- [16] A. Batra, et al., "Multi-band OFDM Physical Layer Proposal for IEEE 802.15 Task Group 3a," *IEEE P802.15-04/0943r0-TG3a*, September 2004.
- [17] W. Eberle, et al., "A Digital 72 Mb/s 64-QAM OFDM Transceiver for 5 GHz Wireless LAN in 0.18 µm CMOS," *IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers*, pp. 336-337, February 2001.
- [18] J. Thomson, et al., "An Integrated 802.11a Baseband and MAC Processor," *IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers*, pp. 126-127, February 2002.
- [19] Bob O'Hara and Al Petrick, "The IEEE 802.11 Handbook: A Designer's Companion," January 2000.
- [20] J. Foerster et al., "Channel Modeling Sub-committee Report Final," Doc. No. 802.15-02/490r1-SG3a, *IEEE P802.15 WPAN*, February 2003, available at http://grouper.ieee.org/groups/802/15/.
- [21] John G. Proakis, "Digital Communications," McGraw Hill, pp. 808-810, 2001.
- [22] John G. Proakis and Masoud Salehi, "Communication Systems Engineering," Second Edition, Prentice Hall, 2002.
- [23] P. Ryan, et al., "A Single Chip PHY COFDM Modem for IEEE 802.11a with Integrated ADCs and DACs," *IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers*, pp. 338-339, 463, 2001.
- [24] Wolfgang Eberle, et al., "80-Mb/s QPSK and 72-Mb/s 64-QAM Flexible and Scalable Digital OFDM Transceiver ASICs for Wireless Local Area Networks in the 5-GHz Band," *IEEE Journal of Solid-State Circuits (JSSC)*, vol. 36, issue 11, pp. 1829-1838, November 2001.
- [25] Fujisawa, et al., "A Single-Chip 802.11A MAC/PHY with a 32b RISC Processor," *IEEE Journal of Solid-State Circuits (JSSC)*, vol. 38, issue 11, pp. 2001-2009, November 2003.
- [26] J. Liu and J. Li, "Parameter Estimation and Error Reduction for OFDM-Based WLANs," *IEEE Trans. Mobile Computing*, vol. 3, issue 2, pp. 152-163, April 2004.
- [27] Chien-Fang Hsu, Yuan-Hao Huang, and Tzi-Dar Chiueh, "Design of An OFDM Receiver for High-Speed Wireless LAN," *International Symposium on Circuits and Systems*, vol. 4, pp. 558-561, May 2001.
- [28] M. Krstic, A. Troya, K. Maharatna, and E. Grass, "Optimized Low-power Synchronizer Design for the IEEE 802.11a Standard," *ICASSP*, vol. 2, pp. II-333-6, April 2003.
- [29] L. Schwoerer, "VLSI Suitable Synchronization Algorithms and Architecture for IEEE 802.11a Physical Layer," *IEEE International Symposium on Circuits and Systems*, vol. 5, pp. 721-724, May 2002.
- [30] Marian Verhelst, et al., "Architecture for Low Ultra-Wideband Radio Receivers in the 3.1-5GHz Band for Data Rates <10Mbps," *Proceedings of the 2004 International Symposium on Low Power Electronics and Design*, August 2004.
- [31] S. Furrer and D. Dahlhaus, "Mean Bit-Error Rates for OFDM Transmission with Robust Channel Estimation and Space Diversity Reception," *International Zurich Seminar on Broadband Communications, Access, Transmission, and Networking*, pp. 47-1~47-6, 2002.
- [32] J. Foerster and Q. Li, "UWB Channel Modeling Contribution from Intel," *IEEE P802.15-02/279-SG3a*, June 2002. Available: http://grouper.ieee.org/groups/802/15/pub/2002/Jul02/02279r0P802-15_SG3a-Channel-Model-Cont-Intel.doc
- [33] Zhiwei Xu, et al., "A Compact Dual-Band Direct-Conversion CMOS Transceiver for 802.11a/b/g WLAN," *IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers*, pp. 98-100, February 2005.
- [34] J.W.M. Rogers, M. Cavin, F. Dai, and D. Rahn, "A ΔΣ Fractional-N Frequency Synthesizer with Multi-Band PMOS VCOs for 2.4 and 5GHz WLAN Applications," *European Solid-State Circuit Conference (ESSCIRC)*, pp. 651–654, September 2003.
- [35] G.Y. Tak, S.B. Hyun, T.Y. Kang, B.G. Choi, and S.S. Park, "A 6.3–9-GHz CMOS Fast Settling PLL for MB-OFDM UWB Applications," *IEEE Journal of Solid-State Circuits (JSSC)*, vol. 40, issue 8, pp. 1671-1679, August 2005.
- [36] Jri Lee and Da-Wei Chiu, "A 7-Band 3-8GHz Frequency Synthesizer with 1ns Band-Switching Time in 0.18µm," *IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers*, pp. 204-206, February 2005.
- [37] William McFarland, et al., "A WLAN SoC for Video Applications Including Beamforming and Maximum Ratio Combining," *IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers*, pp. 452-454, February 2005.
- [38] Manish Bhardwaj, et al., "A 180MS/s, 162Mb/s Wideband Three-Channel Baseband and MAC Processor for 802.11a/b/g," *IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers*, pp. 454-456, February 2005.
- [39] Wei-Hsiang Tseng, Ching-Chi Chang, and Chorng-Kuang Wang, "Digital VLSI OFDM Transceiver Architecture for Wireless SoC Design," *IEEE International Symposium on Circuits and Systems*, pp. 5794-5797, May 2005.
- [40] A. Saleh and R. Valenzuela, "A Statistical Model for Indoor Multipath Propagation," *IEEE Selected Areas in Communications (JSAC)*, vol. SAC-5, no. 2, pp. 128-137, February 1987.
- [41] I. D. O'Donnell, S. W. Chen, B. T. Wang, and R. W. Brodersen, "An Integrated, Low Power, Ultra-Wideband Transceiver Architecture for Low-Rate Indoor Wireless System," *IEEE CAS Workshop on Wireless Communications and Networking*, September 2002.
- [42] Eldon Staggs, "Ultrawideband Radio Design System Analysis," Patterns in Design, Ansoft, July 2003, available at http://users.ece.gatech.edu/rincon-mora/publicat/trade_jrnls/pmdl_0705_pa.pdf
- [43] Wolfgang Eberle, et al., "A Digital 72Mb/s 64-QAM OFDM Transceiver for 5GHz Wireless LAN in 0.18µm CMOS," *IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers*, pp. 336-337, 462-463, 2001.
- [44] David Su, et al., "A 5 GHz CMOS Transceiver for IEEE 802.11a wireless LAN," *IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers*, vol. 1, pp. 92-449, February 2002.
- [45] F. Chen and Zhang Guoping, "Parallel FFT with CORDIC for Ultra Wide Band," *IEEE International Symposium on Personal, Indoor and Mobile Radio Communications*, vol. 2, pp. 1173-1177, September 2004.
- [46] P. Robertson and S. Kaiser, "The Effects of Doppler Spreads in OFDM(A) Mobile Radio Systems," *IEEE Vehicular Technology Conference*, vol. 1, pp. 329-333, September 1999.
- [47] A. Klein, G.K. Kaleh, and P.W. Baier, "Zero Forcing and Minimum Mean-square-error Equalization for Multiuser Detection in Code-division Multiple-access Channels," *IEEE Transactions on Vehicular Technology*, vol. 45, pp. 276-287, May 1996.
- [48] Sheng-Chou Lin and Juin-Ming Tsai, "Optimum Performance of Zero-forcing and Minimum Mean-square-error Equalization for Spatial Combining Diversity over Mobile Radio Channels," *IEEE Vehicular Technology Conference*, vol. 4, pp. 2497-2501, Oct. 2001.
- [49] J. J. van de Beek, O. Edfors, M. Sandell, S.K. Wilson, and P.O. Borjesson, "On Channel Estimation in OFDM Systems," *IEEE Vehicular Technology Conference*, vol. 2, pp. 815-819, July 1995.
- [50] O. Edfors, M. Sandell, J. J. van de Beek, S.K. Wilson, and P.O. Borjesson, "OFDM Channel Estimation by Singular Value Decomposition," *IEEE Transactions on Communications*, vol. 46, pp. 931-939, July 1998.
- [51] L. Deneire, P. Vandenameele, L. van der Perre, B. Gyselinckx, and M. Engels, "A Low-Complexity ML Channel Estimator for OFDM," *IEEE Transactions on Communications*, vol. 51, pp. 135-240, Feb. 2003.
- [52] J. Ran, R. Grunheid, H. Rohling, E. Bolinth, and R. Kern, "Decision-Directed Channel Estimation Method for OFDM Systems with High Velocities," *IEEE Vehicular Technology Conference*, vol. 4, pp. 2358-2361, April 2003.
- [53] Jui-Yuan Yu et al., "A 31.2mW UWB Baseband Transceiver with All-Digital I/Q-mismatch Calibration and Dynamic Sampling," to appear in *IEEE Symposium on VLSI Circuits*, 2006.
- [54] A. Scaglione, G.B. Giannakis, and S. Barbarossa, "Redundant filterbank precoders and equalizers Part I: Unification and optimal designs," *IEEE Trans. Signal Processing*, vol. 47, pp. 1988-2006, July 1999.
- [55] B. Muquet et al., "Cyclic Prefixing or Zero Padding for Wireless Multicarrier Transmissions," *IEEE Transactions on Communications*, vol. 50, pp. 2136-2148, Dec. 2002.
- [56] W. Li, Z. Wang, Y. Yan, T.
M, "An Efficient Low-cost LS Equalization in COFDM Based UWB Systems by Utilizing Channel-State-Information (CSI)," *IEEE Vehicular Technology Conference*, vol. 4, pp. 2167-2171, Sept. 2005.
- [57] T. Ha, S. Lee and J. Jim, "Low-complexity Correlation System for Timing Synchronization in IEEE802.11a Wireless LANs," *Proceedings of Radio and Wireless Conference*, pp. 51-54, Aug. 2003.
- [58] A. Fort, J.-W. Weijers, V. Derudder, W. Eberle, and A. Bourdoux, "A Performance and Complexity Comparison of Auto-correlation and Cross-correlation for OFDM burst Synchronization," *IEEE International Conference on Acoustics, Speech, and Signal Processing*, vol. 2, pp. II 341-344, Apr. 2003.

# Appendix A: Supplementary of OFDM-Based System SPEC

Appendix A supplements the system specification (SPEC) of the OFDM-based systems. The derivation of the main system parameters and the system requirements are discussed below.

### A-1: System Parameter Derivation

In an OFDM system, all parameters can be derived from a few key parameters, and in the standards these key parameters determine the system data rate and system performance. The key parameters of the OFDM-based WLAN, LDPC-COFDM-based UWB, and MB-OFDM-based UWB systems are listed in Table A-1 ~ Table A-3. The number of pilot subcarriers determines the receiver synchronization performance achievable with pilots, and the other parameters determine the data rates. Unlike Table A-1, Table A-2 and Table A-3 add a new parameter, the spreading factor, because the OFDM-based UWB systems use spreading to overcome channel noise and large interference.

In an OFDM-based system, the transmitted data packet consists of OFDM symbols. Each OFDM symbol consists of an FFT symbol and a guard interval.
The FFT symbol duration (T<sub>FFT</sub>) and the OFDM symbol duration (T<sub>OFDM</sub>) are derived as

$$T_{FFT} = N/BW \tag{A-1}$$

$$T_{OFDM} = T_{FFT} + T_{GI} \tag{A-2}$$

where N is the FFT size, BW is the signal bandwidth, and $T_{GI}$ is the guard-interval duration listed in Table A-1, Table A-2, and Table A-3.

Table A-1: Dominant parameters of OFDM-based WLAN system

| Data rate | Bandwidth (BW) | FFT size (N) | Guard-interval length ($T_{GI}$) | No. of used data subcarriers ($N_{SD}$) | No. of used pilot subcarriers ($N_{SP}$) | Coded bits per subcarrier ($N_{CBPC}$) | FEC coding rate (R) |
|---------|-------|----|-------|----|---|---|-----|
| 6Mb/s | 20MHz | 64 | 0.8µs | 48 | 4 | 1 | 1/2 |
| 9Mb/s | 20MHz | 64 | 0.8µs | 48 | 4 | 1 | 3/4 |
| 12Mb/s | 20MHz | 64 | 0.8µs | 48 | 4 | 2 | 1/2 |
| 18Mb/s | 20MHz | 64 | 0.8µs | 48 | 4 | 2 | 3/4 |
| 24Mb/s | 20MHz | 64 | 0.8µs | 48 | 4 | 4 | 1/2 |
| 36Mb/s | 20MHz | 64 | 0.8µs | 48 | 4 | 4 | 3/4 |
| 48Mb/s | 20MHz | 64 | 0.8µs | 48 | 4 | 6 | 2/3 |
| 54Mb/s | 20MHz | 64 | 0.8µs | 48 | 4 | 6 | 3/4 |

The IEEE 802.11a system provides 8 data rates. The data rate can be derived as

$$\text{Data rate} = N_{CBPC} \times R / S \times N_{SD} / T_{OFDM} \tag{A-3}$$

where R is the FEC coding rate and S is the spreading factor. In the WLAN system the spreading factor can be taken as 1. According to (A-3), the number of data bits transmitted in each OFDM symbol duration, $N_{DBPS}$, can be derived as

$$N_{DBPS} = N_{CBPC} \times R \times N_{SD} \tag{A-4}$$

Here $N_{CBPC}$ is the number of coded bits carried by each constellation symbol, and R is equivalent to the average number of data bits in each coded bit. Therefore $N_{CBPC} \times R$ is the number of data bits carried by each constellation symbol. Each OFDM symbol transmits $N_{SD}$ data constellation symbols, so the number of data bits transmitted in each OFDM symbol duration is $N_{CBPC} \times R \times N_{SD}$.

Table A-2: Key parameters of LDPC-COFDM-based UWB system

| Data rate | 120Mb/s | 240Mb/s | 480Mb/s |
|--------------------------------------------------|---------|---------|---------|
| Bandwidth (BW) | 528MHz | 528MHz | 528MHz |
| FFT size (N) | 128 | 128 | 128 |
| Guard-interval length ($T_{GI}$) | 70.1ns | 70.1ns | 70.1ns |
| No. of used data subcarriers ($N_{SD}$) | 100 | 100 | 100 |
| No. of used pilot subcarriers ($N_{SP}$) | 12 | 12 | 12 |
| Coded bits per subcarrier ($N_{CBPC}$) | 2 | 2 | 2 |
| Coding rate (R) | 3/4 | 3/4 | 3/4 |
| Spreading factor (S) | 4 | 2 | 1 |

Table A-3: Key parameters of MB-OFDM-based UWB system

| Data rate | 110Mb/s | 200Mb/s | 480Mb/s |
|--------------------------------------------------|----------|----------|----------|
| Bandwidth (BW) | 528MHz | 528MHz | 528MHz |
| FFT size (N) | 128 | 128 | 128 |
| Guard-interval length ($T_{GI}$) | 70.1ns | 70.1ns | 70.1ns |
| No. of used data subcarriers ($N_{SD}$) | 100 | 100 | 100 |
| No. of used pilot subcarriers ($N_{SP}$) | 12 or 22 | 12 or 22 | 12 or 22 |
| Coded bits per subcarrier ($N_{CBPC}$) | 2 | 2 | 2 |
| Coding rate (R) | 11/32 | 5/8 | 3/4 |
| Spreading factor (S) | 2 | 2 | 1 |

Combining (A-1), (A-2), and (A-3), the data rate can also be written as

$$\text{Data rate} = N_{CBPC} \times R / S \times N_{SD} / (N/BW + T_{GI}) \tag{A-5}$$

So the data rate increases with $N_{CBPC}$ (QAM constellation size), the coding rate, $N_{SD}$, and the bandwidth, and it decreases when the spreading factor, the FFT size, or the guard-interval duration increases.

### A-2: Power Spectrum Density Requirement

In the WLAN system, besides the PER, CFO, and SCO requirements, another important specification is the power spectrum mask (PSM): the power spectrum density (PSD) of the transmitted signal must stay within the PSM constraint. The PSM is shown in Figure A-1.
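As a numerical cross-check of Section A-1, the data-rate relation (A-5) can be evaluated with the parameters of Tables A-1 to A-3. This is only an illustrative sketch; the function name `data_rate_mbps` and its argument names are hypothetical. Because $T_{GI}$ = 70.1ns is a rounded table value, the UWB results come out very close to, but not exactly at, the nominal rates.

```python
def data_rate_mbps(n_cbpc, coding_rate, n_sd, fft_size, bw_mhz, t_gi_us, spreading=1):
    """Evaluate (A-5). With BW in MHz and T_GI in microseconds the result is in Mb/s."""
    t_fft_us = fft_size / bw_mhz                      # (A-1)
    t_ofdm_us = t_fft_us + t_gi_us                    # (A-2)
    bits_per_symbol = n_cbpc * coding_rate / spreading * n_sd
    return bits_per_symbol / t_ofdm_us                # (A-3): bits per microsecond = Mb/s


if __name__ == "__main__":
    # Table A-1, 54Mb/s row: 64-point FFT, 20MHz, 0.8us guard interval.
    print("WLAN 54Mb/s mode :", round(data_rate_mbps(6, 3 / 4, 48, 64, 20.0, 0.8), 2))
    # Table A-3 columns: 128-point FFT, 528MHz, 70.1ns guard interval.
    print("UWB 480Mb/s mode :", round(data_rate_mbps(2, 3 / 4, 100, 128, 528.0, 0.0701), 2))
    print("UWB 200Mb/s mode :", round(data_rate_mbps(2, 5 / 8, 100, 128, 528.0, 0.0701, 2), 2))
    print("UWB 110Mb/s mode :", round(data_rate_mbps(2, 11 / 32, 100, 128, 528.0, 0.0701, 2), 2))
```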
To satisfy the PSM constraint, we use 15-tap (total length = 350ns, for WLAN) and 21-tap (total length = 19.8ns, for UWB) finite-length raised-cosine shaping filters in the WLAN and UWB baseband designs. With 2x up-sampling, the transmitted baseband signal satisfies the PSM within 2x the signal bandwidth.

Figure A-1: Power spectrum mask of IEEE 802.11a WLAN system

Figure A-2: Power spectrum mask of OFDM-based UWB system

### A-3: RF Band Allocation, Spreading Scheme, and Overcoming Jamming Technique of MB-OFDM System

The band groups are listed in Table A-4. There are 5 band groups to choose from for UWB transmission. The allocated band groups are also shown in Figure A-3.

Table A-4: Band location of IEEE 802.15.3a OFDM-based UWB system

| Band group | BAND_ID | Lower frequency | Center frequency | Upper frequency |
|---|----|-----------|-----------|-----------|
| 1 | 1 | 3168 MHz | 3432 MHz | 3696 MHz |
| 1 | 2 | 3696 MHz | 3960 MHz | 4224 MHz |
| 1 | 3 | 4224 MHz | 4488 MHz | 4752 MHz |
| 2 | 4 | 4752 MHz | 5016 MHz | 5280 MHz |
| 2 | 5 | 5280 MHz | 5544 MHz | 5808 MHz |
| 2 | 6 | 5808 MHz | 6072 MHz | 6336 MHz |
| 3 | 7 | 6336 MHz | 6600 MHz | 6864 MHz |
| 3 | 8 | 6864 MHz | 7128 MHz | 7392 MHz |
| 3 | 9 | 7392 MHz | 7656 MHz | 7920 MHz |
| 4 | 10 | 7920 MHz | 8184 MHz | 8448 MHz |
| 4 | 11 | 8448 MHz | 8712 MHz | 8976 MHz |
| 4 | 12 | 8976 MHz | 9240 MHz | 9504 MHz |
| 5 | 13 | 9504 MHz | 9768 MHz | 10032 MHz |
| 5 | 14 | 10032 MHz | 10296 MHz | 10560 MHz |

Figure A-3: Band location of IEEE 802.15.3a OFDM-based UWB system

As shown in Figure A-3, 2~3 carriers are used for hopping in each band group. For example, if the system chooses band group 1, the signal hops among band ID = 1~3 (3432MHz, 3960MHz, and 4488MHz). An example of signal hopping is shown in Figure A-4, where the transmitted OFDM symbols #1, #2, and #3 are carried in band ID = 1~3 in turn.
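The band plan of Table A-4 and the hopping "in turn" described above can be sketched as follows. Two assumptions are made here and are not taken from the text: the closed form center = 2904 + 528 × BAND_ID (MHz) is simply read off the rows of Table A-4, and the round-robin order is the simplest pattern matching Figure A-4, whereas the actual time-frequency interleaving patterns may differ. All names are hypothetical.

```python
# Band groups and their BAND_IDs as listed in Table A-4.
BAND_GROUPS = {1: (1, 2, 3), 2: (4, 5, 6), 3: (7, 8, 9), 4: (10, 11, 12), 5: (13, 14)}


def band_frequencies_mhz(band_id):
    """Return (lower, center, upper) in MHz for one 528MHz-wide band."""
    if not 1 <= band_id <= 14:
        raise ValueError("BAND_ID must be in 1..14")
    center = 2904 + 528 * band_id      # assumed closed form that reproduces Table A-4
    return center - 264, center, center + 264


def hop_sequence(band_group, n_symbols):
    """Assign consecutive OFDM symbols to the bands of one group in round-robin order."""
    bands = BAND_GROUPS[band_group]
    return [bands[i % len(bands)] for i in range(n_symbols)]


if __name__ == "__main__":
    for symbol, band in enumerate(hop_sequence(1, 6), start=1):
        print("OFDM symbol", symbol, "-> band", band, band_frequencies_mhz(band))
```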
Neighboring OFDM symbols are transmitted in different RF bands, so the ISI between OFDM symbols becomes weaker. This hopping operation is done by time-frequency interleaving in the RF and is controlled by the baseband design.

Figure A-4: An example of MB hopping of MB-OFDM-based UWB system

To solve the large jamming problem, the UWB baseband system uses spreading methods to provide several data-rate modes. The spreading method comprises frequency-domain spreading and time-domain spreading. Frequency-domain spreading is used when the data rate is ≤ 80Mb/s [16]. In this mode only the half of the data subcarriers in the negative frequency range is used to transmit the mapped data symbols. The other half of the data subcarriers, in the positive frequency range, carries the complex conjugates of the first half. Therefore the IFFT outputs become real numbers: the imaginary parts of the IFFT outputs are zero, and the related circuits, including the DAC and RF circuit of the imaginary path, can be turned off to save power.

Time-domain spreading duplicates the time-domain OFDM symbols. Combined with the MB technique, the symbol duplication can solve the jamming problem. Figure A-5 is an example of a UWB signal in the frequency domain with jamming in band #2. When time-domain spreading is used, the OFDM symbols are transmitted as shown in Figure A-6: OFDM symbols #A, #B, and #C are each transmitted in two bands. When band #2 is hit by jamming, the receiver can ignore the OFDM symbols in band #2, and OFDM symbols #A and #C can still be received in bands #1 and #3. So, by combining the time-domain and frequency-domain spreading methods, the baseband can reduce the power consumption and mitigate the jamming degradation in the UWB system.
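The claim that conjugate-symmetric subcarriers give a real IFFT output can be checked with a small sketch. It uses a direct (slow) inverse DFT so that it needs only the standard library. The bin mapping in the hypothetical helper `frequency_spread` (symbol i on bin −i, its conjugate on bin +i, DC left empty) is an assumption for illustration and is not the tone allocation of the standard.

```python
import cmath


def idft(freq_bins):
    """Direct inverse DFT, O(N^2); adequate for a small demonstration."""
    n = len(freq_bins)
    return [
        sum(freq_bins[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def frequency_spread(symbols, n_fft):
    """Put each symbol on a negative-frequency bin and its conjugate on the mirrored bin."""
    if len(symbols) > n_fft // 2 - 1:
        raise ValueError("too many symbols for this FFT size")
    bins = [0j] * n_fft
    for i, s in enumerate(symbols, start=1):
        bins[n_fft - i] = s              # negative frequency -i
        bins[i] = s.conjugate()          # positive frequency +i carries the conjugate
    return bins


if __name__ == "__main__":
    qpsk = [1 + 1j, -1 + 1j, 1 - 1j, -1 - 1j, 1 + 1j]
    time_samples = idft(frequency_spread(qpsk, 16))
    print("largest imaginary part:", max(abs(x.imag) for x in time_samples))
```

The largest imaginary part is at the level of floating-point rounding, which is why the imaginary-path DAC and RF circuit can be switched off in this mode.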
Figure A-5: An example of jamming happening in UWB bands

Figure A-6: An example of received OFDM symbols with jamming

### A-4: Conversion Scheme from Zero-Pad to Cyclic-Prefix in OFDM-Based UWB Systems

Unlike a guard interval formed by the cyclic prefix of the FFT symbol, the zero pad consists of zeros. The advantage of the zero pad is that it reduces the transmitted signal power. To still obtain the cyclic-convolution effect between the multipath channel and the received signal, the cyclic prefix of the FFT symbol is then added to the front of the FFT symbol in the receiver. Examples of the transmitted OFDM symbols with cyclic prefix and with zero pad are shown in Figure A-7 and Figure A-8.

Figure A-7: The transmitted OFDM signal with cyclic prefix

Figure A-8: The transmitted OFDM signal with zero pad

In an OFDM-based WLAN system such as IEEE 802.11a, the cyclic prefix, which serves as the guard interval, is added in front of the FFT symbol in the transmitter. After the channel, the linear convolution of the channel with the FFT symbol plus cyclic prefix equals the circular convolution of the channel with the FFT symbol. Therefore, in the receiver, each used subcarrier after the FFT equals the product of the transmitted subcarrier and the channel response on that subcarrier, and one-tap channel equalization can be used instead of time-domain channel equalizers to reduce the design complexity.

In the UWB system with zero pad, the baseband also needs one-tap channel equalization for low complexity and low power. Although the cyclic prefix is not added in the transmitter, it can still be added in the receiver. As shown in Figure A-8, the cyclic prefix is added to the received FFT symbol. This operation makes the received FFT symbols the same as those in Figure A-7.
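One common way to realize such a receiver-side conversion is overlap-add: the samples received during the zero pad, which hold the channel tail of the FFT symbol, are added onto the first samples of that symbol. The sketch below assumes this technique (the text does not spell out the exact operation), a noise-free channel, and a zero pad at least as long as the channel memory; all function names are hypothetical. It checks that one-tap equalization then recovers the transmitted subcarriers.

```python
import cmath


def dft(x):
    """Direct DFT, O(N^2); adequate for a small demonstration."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def channel(tx, taps):
    """Linear convolution of the transmitted samples with the channel taps."""
    out = [0j] * (len(tx) + len(taps) - 1)
    for i, a in enumerate(tx):
        for j, b in enumerate(taps):
            out[i + j] += a * b
    return out


def zero_pad_to_cyclic(rx, n_fft):
    """Overlap-add: fold the samples received during the zero pad onto the symbol head."""
    head = list(rx[:n_fft])
    for i, sample in enumerate(rx[n_fft:]):
        head[i] += sample
    return head


def one_tap_equalize(rx_symbol, taps):
    """Divide each subcarrier by the channel response on that subcarrier (noise-free case)."""
    n = len(rx_symbol)
    h_freq = dft(list(taps) + [0j] * (n - len(taps)))
    return [y / h for y, h in zip(dft(rx_symbol), h_freq)]


if __name__ == "__main__":
    n_fft, zero_pad = 8, 3
    symbol = [1 + 0j, -1j, 0.5 + 0.5j, -1 + 0j, 1j, 1 + 1j, -0.5 + 0j, 0.25j]
    taps = [1 + 0j, 0.5 + 0j, 0.25j]                     # channel memory 2 <= zero pad
    rx = channel(symbol + [0j] * zero_pad, taps)[:n_fft + zero_pad]
    recovered = one_tap_equalize(zero_pad_to_cyclic(rx, n_fft), taps)
    error = max(abs(a - b) for a, b in zip(recovered, dft(symbol)))
    print("largest subcarrier error:", error)
```

After the fold, the N received samples equal the circular convolution of the FFT symbol with the channel, which is the same relation a transmitted cyclic prefix would have produced.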