# National Chiao Tung University, Department of Electronics Engineering and Institute of Electronics

### Master's Thesis

**Synchronization Method for** 

Crystal-less OFDM-based

**Wireless Body Area Network Applications** 

Student: Hsiao-Han Ma (馬曉涵)

Advisor: Dr. Chen-Yi Lee (李鎮宜)

July 2009

### Synchronization Method for Crystal-less OFDM-based

### Wireless Body Area Network Applications

Student: Hsiao-Han Ma

Advisor: Dr. Chen-Yi Lee



A Thesis

Submitted to the Department of Electronics Engineering and Institute of Electronics, College of Electrical and Computer Engineering, National Chiao Tung University, in Partial Fulfillment of the Requirements for the Degree of Master in Electronics Engineering

July 2009

Hsinchu, Taiwan, Republic of China

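### Synchronization Method for Crystal-less OFDM-based Wireless Body Area Network Applications

Student: Hsiao-Han Ma Advisor: Prof. Chen-Yi Lee Department of Electronics Engineering and Institute of Electronics, National Chiao Tung University

### Chinese Abstract

This thesis introduces a synchronization method for orthogonal frequency-division multiplexing (OFDM) systems. It is applied to wireless body area networks that have no crystal oscillator, in order to increase the tolerance of the whole system to frequency error between the transmitter and the receiver.

Healthcare systems based on wireless body area networks are receiving more and more attention, particularly for the detection of human biomedical signals, because they can lower medical costs and improve medical outcomes. In a typical scenario, wireless sensors worn on the body monitor body signals over long periods and transmit the data wirelessly to a receiver integrated in a mobile phone or personal digital assistant. For such an application, extremely low power consumption and a highly integrated area are indispensable system requirements. We therefore use a CMOS oscillator in place of the conventional quartz crystal oscillator that generates the system clock, to reduce system power consumption and raise the level of integration.

Crystal-less oscillator technology is not yet mature. Using it causes a large frequency error between the transmitter and the receiver, which leads to data loss across the whole system. This thesis therefore proposes a baseband synchronization algorithm for OFDM that works together with a tunable digitally controlled oscillator to reduce the frequency error between the transmitter and the receiver.

Integrating the crystal oscillator into a single chip reduces the manufacturing cost, area, and power consumption of the system. In this thesis we describe the behavior of the complete OFDM wireless body area network system in detail, analyze the baseband algorithm for frequency calibration, and finally build an OFDM baseband prototype to verify this behavior. Applied to the application system of this thesis, the method extends the frequency error tolerance of the whole system by a factor of 140, enabling a baseband system-on-chip design with small area, low power, and high integration.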

### Synchronization Method for Crystal-less OFDM-based Wireless Body Area Network Applications

Student: Hsiao-Han Ma Department of Electronics Engineering and Institute of Electronics, National Chiao-Tung University

### Abstract

In this thesis, we propose a synchronization method for crystal-less OFDM-based wireless body area network (WBAN) applications that enlarges the frequency error tolerance between the transmitter and the receiver.

WBAN systems for ubiquitous health monitoring are attracting increasing attention because they can reduce medical costs and improve treatment outcomes. Wireless sensor nodes (WSNs) are placed on the human body to allow long-term health monitoring. The signals gathered by multiple WSNs are transmitted wirelessly to a remote central processing node (CPN), such as a mobile phone or PDA. For these applications, low power and high integration are indispensable. To meet these needs, a CMOS oscillator is used instead of the conventional quartz crystal oscillator.

However, crystal-less technology is not yet mature, and a system that uses a crystal-less oscillator suffers from a large frequency error. We therefore propose a synchronization method for OFDM-based WBAN systems. Using the crystal-less oscillator substantially reduces the area and power of the system. This thesis describes a complete WBAN system and the novel synchronization algorithm, and an OFDM emulation system is also established. In our application system, the overall frequency error tolerance is expanded by a factor of 140, enabling a small-area, highly integrated SoC design.

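### Acknowledgements

During my master's research in the Si2 Lab I learned a great deal, and the experience was very fulfilling. In those days I spent on average more than 14 hours a day in the lab, weekends and holidays included, and I never tired of it.

First, I would like to thank my advisor, Dr. Chen-Yi Lee (李鎮宜), for providing an excellent research environment and resources and for his frequent and patient guidance. I thank my senior labmate 鍾菁哲, who has contributed tirelessly to the lab. Without his teaching us the tapeout flow, the lab would not have its present rich results.

Next, I thank 游瑞元, the senior member of the WBAN Group, for his care and guidance during my master's studies. Although he was quite strict, I learned from him a great deal about how capable people approach their work, along with much of his experience.

I also thank the seniors and classmates in the lab. Thanks to the brilliant seniors 阿龍 and 義閉, who often encouraged and looked after me and solved many of our technical problems. Thanks to 上賓, who has since graduated. He taught me a great deal during my master's studies, and from him I learned the good habit of organizing material. Thanks to the graduated seniors of the class of 95, 俊廷, 冠麟, 茗智, 琇茹, and 清峰, who took good care of me in the lab and kept cheering me on even after they graduated. Thanks to 建螢, who was like a father to me. Beyond guiding my research, he taught me much about how to treat people and conduct myself. Thanks to 燦文 for giving me good advice when I was confused. Thanks to 智超 for always cheering me up when I was down. Thanks to 偉豪 and 書餘. We discussed coursework and research together, learned from each other, and improved together.

Finally, I thank my parents and my elder brother for their constant care and encouragement, which allowed me to devote my master's years fully to research and study. With everyone's support I was able to complete my master's degree smoothly. Last of all, I thank the members of the oral defense committee for their guidance and valuable comments.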


### CONTENTS

- Chinese Abstract
- Abstract
- Acknowledgements
- CONTENTS
- LIST OF FIGURES
- LIST OF TABLES
- Chapter 1 Introduction
  - 1-1 Introduction to WBAN Systems
  - 1-2 Introduction to Crystal-less Systems
  - 1-3 Motivation
  - 1-4 Design Target
  - 1-5 Outline of the Thesis
- Chapter 2 Application System
  - 2-1 Overview
  - 2-2 System Behavior
    - 2-2-1 Downlink
    - 2-2-2 Uplink
  - 2-3 Packet Format
    - 2-3-1 Short Preamble
    - 2-3-2 Long Preamble
    - 2-3-3 Packet Format
  - 2-4 WiBoC System Block Diagram
    - 2-4-1 Wireless Sensor Node
    - 2-4-2 Central Processing Node
  - 2-5 Channel Model
    - 2-5-1 AWGN
    - 2-5-2 Carrier Frequency Offset
    - 2-5-3 Sampling Clock Offset
    - 2-5-4 Link Budget
  - 2-6 Frequency Allocation
- Chapter 3 Synchronization PAFEE Method
  - 3-1 PAFEE Packet Detection Algorithm
    - 3-1-1 Conventional Algorithm
    - 3-1-2 Proposed Algorithm
  - 3-2 PAFEE Frequency Error Estimation Algorithm
    - 3-2-1 Conventional Algorithm
    - 3-2-2 Proposed Algorithm
  - 3-3 PAFEE Method Flow
- Chapter 4 Simulation Results and Performance Analysis
  - 4-1 Simulation on PAFEE Packet Detection
    - 4-1-1 ROC Curves
    - 4-1-2 Detection Rate, False Alarm Rate versus SNR
    - 4-1-3 Detection Rate versus Frequency Error Estimation
  - 4-2 Simulation on PAFEE Frequency Error Estimation
  - 4-3 Simulation on System PER Performance
- Chapter 5 Hardware Implementation
  - 5-1 Overall Architecture
    - 5-1-1 Packet Detector
    - 5-1-2 Phase Rotator
    - 5-1-3 Frequency Error Estimator
  - 5-2 Simulation on Quantization Results
    - 5-2-1 Simulation on PAFEE Packet Detection
    - 5-2-2 Simulation on PAFEE Frequency Error Estimation
    - 5-2-3 Simulation on System PER Performance
  - 5-3 Hardware Overhead
- Chapter 6 Emulation of Crystal-less Baseband System
  - 6-1 Building Block
  - 6-2 Evaluation System
  - 6-3 Emulation Results
- Chapter 7 Conclusions and Future Work
  - 7-1 Conclusions
  - 7-2 Future Work
- References
- Appendix A Embedded Crystal Design
  - A-1 Introduction
  - A-2 Proposed DCO
    - A-2-1 DCO Architecture
    - A-2-1 Coarse-tuned Stage
    - A-2-2 Fine-tuned Stage
  - A-3 Measurement Results
    - A-3-1 Coarse-tuned Stage
    - A-3-2 Fine-tuned Stage
  - A-4 System Comparison
  - A-5 Conclusion
- Appendix B IP Deliverable
  - B-1 Overall Architecture
  - B-2 Architecture of Receiver
    - B-2-1 Components
    - B-2-2 External Signal Descriptions
    - B-2-3 FSM
    - B-2-4 Area
    - B-2-5 Power
  - B-3 Architecture of Transmitter
    - B-3-1 Components
    - B-3-2 External Signal Descriptions
    - B-3-3 FSM
    - B-3-4 Area
    - B-3-5 Power



### LIST OF FIGURES

- Figure 1-1: Motivation, pros and cons
- Figure 1-2: Motivation and PAFEE method
- Figure 2-1: The target operation scenario of WBAN system
- Figure 2-2: WiBoC system behavior
- Figure 2-3: WiBoC system device connection behavior
- Figure 2-4: Short preamble format
- Figure 2-5: Preamble format with 24-tone in frequency domain
- Figure 2-6: Preamble format with 6-tone in frequency domain
- Figure 2-7: Long preamble format in time domain
- Figure 2-8: Downlink packet format
- Figure 2-9: Uplink packet format
- Figure 2-10: WSN block diagram
- Figure 2-11: CPN block diagram
- Figure 2-12: CFO effect
- Figure 2-13: Mathematical CFO model
- Figure 2-14: MATLAB CFO model
- Figure 2-15: SCO effect
- Figure 3-1 (a): ROC curves from mathematics deduction
- Figure 3-1 (b): ROC curves from mathematics deduction (Zoom-in)
- Figure 3-2: ROC curves under different M
- Figure 3-3: Frequency error estimation block
- Figure 3-4: Synchronization PAFEE algorithm flow
- Figure 4-1 (a): Frequency error estimation block
- Figure 4-1 (b): Frequency error estimation block (Zoom-in)
- Figure 4-3: ROC curves in different L
- Figure 4-4: ROC curves in conventional one and proposed one
- Figure 4-5: Detection rate in conventional one and proposed one
- Figure 4-6: False alarm rate in conventional one and proposed one
- Figure 4-7: Detection rate under different CFO and SCO conditions, SNR=1 dB
- Figure 4-8: Detection rate under different CFO and SCO conditions, SNR=2 dB
- Figure 4-9: Detection rate under different FO and SNR conditions
- Figure 4-10: Estimation correct rate versus frequency error
- Figure 4-11 (a): Maximum remaining error versus frequency error
- Figure 4-11 (b): Maximum remaining error versus frequency error (Zoom-in)
- Figure 4-12: Packet error rate versus SNR, N=3
- Figure 4-13: Packet error rate versus SNR, N=1
- Figure 5-1: Overall architecture of WSN system
- Figure 5-2: Hardware architecture of packet detector
- Figure 5-3: Hardware architecture of moving average
- Figure 5-4: Hardware architecture of phase rotator in packet detection state
- Figure 5-5: Hardware architecture of phase rotator after detecting the packet
- Figure 5-6: Hardware architecture of frequency error estimator
- Figure 5-7: Real part of the Ri, i is from 1 to 16
- Figure 5-8: Imaginary part of the Ri, i is from 1 to 16
- Figure 5-9: Fixed point ROC curves
- Figure 5-10: Fixed point detection rate under different frequency error condition
- Figure 5-11: Fixed point estimation correct rate versus frequency error
- Figure 5-12: Fixed point maximum remaining error versus frequency error
- Figure 5-13: Fixed point packet error rate versus SNR
- Figure 6-1: Hardware emulation building blocks
- Figure 6-2: Evaluation system
- Figure 6-3: Received signal shown in spectrum analyzer
- Figure 6-4: Estimation results with frequency error 0, 400, 800, 1200 ppm
- Figure 6-5: Estimation results with frequency error 1600, 2000, 2400, 2800 ppm
- Figure A-1: DCO Architecture
- Figure A-2: Coarse-tuned stage
- Figure A-3: Schmitt trigger cell
- Figure A-4: Fine-tuned stage
- Figure A-5: The period of DCO in coarse-tuned stage
- Figure A-6: The RMS jitter of DCO in coarse-tuned stage
- Figure A-7: The period of DCO in fine-tuned stage
- Figure A-8: The RMS jitter of DCO in fine-tuned stage
- Figure B-1: Overall architecture of WSN system
- Figure B-2: FSM of WSN_RX
- Figure B-3: P&R view of WSN_RX
- Figure B-4: FSM of WSN_TX
- Figure B-5: P&R view of WSN_TX



### LIST OF TABLES

- Table 1-1: The power consumption, area, and cost of the quartz crystal [11]
- Table 1-2: Existing solutions for crystal-less communications systems
- Table 1-3: Candidate clock source in CMOS technology
- Table 1-4: WBAN system with their allowable frequency errors
- Table 1-5: Design target
- Table 2-1: Link Budget
- Table 5-1: Area overhead
- Table 5-2: Power overhead
- Table 6-1: Baseband components
- Table 6-2: The relationship between frequency error and clock period
- Table 7-1: System comparison summary
- Table 7-2: WiBoC OFDM system summary
- Table A-1: Delay element in coarse-tuned stage
- Table A-2: Delay element in fine-tuned stage
- Table A-3: DCO in coarse-tuned stage
- Table A-4: DCO in fine-tuned stage
- Table A-5: DCO power consumption under different supply voltage
- Table A-6: DCO performance comparison
- Table B-1: The component in WSN_RX
- Table B-2: Mode of operation in WSN_RX
- Table B-3: External signal description in WSN_RX
- Table B-4: State description in WSN_RX
- Table B-5: Area reports in WSN_RX
- Table B-6: Power classification and definition
- Table B-7: Power report in WSN_RX
- Table B-8: The component in WSN_TX
- Table B-9: Mode of operation in WSN_TX
- Table B-10: External signal description in WSN_TX
- Table B-11: State description in WSN_TX
- Table B-12: Area report in WSN_TX
- Table B-13: Power report in WSN_TX



## Chapter 1 Introduction

### 1-1 Introduction to WBAN Systems

With the progress of medical technology, increasingly accurate and sophisticated instruments are being introduced in hospitals. These devices help people detect disease at an early stage, so that doctors can prevent and treat it more effectively.

In recent years, ubiquitous healthcare devices have been introduced to the public and have quickly become popular. Ubiquitous healthcare monitoring plays a crucial role in tracking and recording body signals. Patients want comfortable, portable devices to examine their physical status. Such a device is not only portable and convenient, but also extends medical services from indoor rooms to any open roaming space.

The wireless body area network (WBAN) is designed for gathering and monitoring body signals to provide reliable physical information. It is composed of multiple wireless sensor nodes (WSNs) and a central processing node (CPN). A WSN is capable of sampling, processing, and communicating body information to the CPN. These nodes are placed on the human body as tiny patches, making it comfortable and convenient for people to observe their physical status at any time from the CPN. The CPN is built into a portable device, such as a PDA, cell phone, or laptop. Existing solutions for WBAN applications can be found in [1-5]. Well-defined systems such as Bluetooth [1] and Zigbee [2] are designed for a wide range of applications, including entertainment, positioning, factory product management, and healthcare. With the increasing demands of wireless body area applications, however, these two candidate systems have difficulty serving consumer electronics functions and healthcare at the same time.

Some customized systems, such as MIThril [3], CodeBlue [4], and Human++ [5], have also been explored for constructing WBAN systems in body-oriented applications.

Our intention is to design WBAN systems that are tiny and low power. To achieve this target, the crystal oscillator is replaced. A system is called crystal-less if it does not use a crystal oscillator as its clock source.


### 1-2 Introduction to Crystal-less Systems

With the progress of system-on-chip (SoC) technology, most of the circuits in a system can be integrated on one chip. Applying SoC technology to communications systems is very attractive. Nowadays the baseband circuits are integrated on one chip, except for the clock source, a crystal oscillator.

Although the crystal oscillator plays an important role as the system clock, it makes system integration harder. The quartz crystal consumes considerable power and occupies a large area, as shown in Table 1-1 [11].


| Item              | Type       | Value                        |
|-------------------|------------|------------------------------|
| Power consumption | In-crystal | 1 µW~200 µW                  |
|                   | Oscillator | 1 mW~50 mW (active)          |
|                   |            | 10 µW~50 µW (standby)        |
| Area              | SMD        | 3.2 mm x 2.5 mm x 0.5 mm     |
|                   | DIP        | 11.5 mm x 4.7 mm x 3.5 mm    |
| Cost              |            | US\$0.15~2                   |

Table 1-1: The power consumption, area, and cost of the quartz crystal [11]

Wireless sensor nodes require transceivers that are small, cheap, and power efficient. To achieve better integration, lower power, and more cost-effective baseband chips, crystal-less systems are being explored. One crystal-less solution, based on a self-calibrated embedded crystal (eCrystal), has been proposed by NCTU [6]. The eCrystal system uses an embedded clock generator instead of a crystal oscillator.

There are some existing customized solutions for crystal-less communications systems [7-10]. They use CMOS circuits, e.g., a ring oscillator, a digitally controlled oscillator, or a voltage-controlled oscillator, as the on-chip system reference clock. CMOS circuits are sensitive to process, voltage, and temperature (PVT) variations, so these solutions use frequency-insensitive modulation to overcome the inaccuracy of the system clock. Table 1-2 shows the modulation and oscillator design of the existing solutions. The initial error of the oscillator is large, i.e., 1% or more.

| Systems         | Modulation | Oscillator design             | Initial error |
|-----------------|------------|-------------------------------|---------------|
| [7] ISCAS 2008  | IR + PPM   | CMOS clock generator          | 1.1 %         |
| [8] ISSCC 2008  | OOK        | CMOS Ring Oscillator          | 2.5 %         |
| [9] PIMRC 2006  | BFSK       | voltage controlled oscillator | 1 %           |
| [10] ISSCC 2009 | ASK        | Ring Oscillator               | N/A           |

Table 1-2: Existing solutions for crystal-less communications systems

The proposed eCrystal system needs a clock source designed in CMOS. Table 1-3 lists some candidate clock sources made in CMOS technology. The LC tank oscillator is not suitable for our design because of its large frequency drift and high power. Although the MEMS oscillator has good frequency drift and power properties, it is not considered because of its non-standard process and long processing period.

The eCrystal method at NCTU has been proposed for a self-defined system specification, with a generated frequency of 5 MHz and a frequency drift of 0.28%.

|                 | LC [31]                   | MEMS [32]       | RO [33] | eCrystal |
|-----------------|---------------------------|-----------------|---------|----------|
| Process         | CMOS                      | CMOS/MEMS       | CMOS    | CMOS     |
| Fgen            | Carrier freq., 0.1~10 GHz | Wide, 1.5~4 GHz | 7 MHz   | 5 MHz    |
| Frequency drift | 10~15%                    | ±0.03%          | ±2.65%  | ±0.28%   |
| Power           | 6~20 mW                   | 0.1 mW          | 1.5 mW  | 0.25 mW  |
| Freq.           | Yes                       | N/A             | N/A     | Yes      |
Table 1-3: Candidate clock source in CMOS technology


### 1-3 Motivation

To achieve low-power, highly integrated ubiquitous healthcare monitoring, we apply crystal-less technology to the WBAN system.

Communications systems need an accurate clock generator. Otherwise the data cannot be recovered correctly at the receiver. Table 1-4 shows existing WBAN systems and their allowable frequency error tolerances.

| Systems       | Modulation | Frequency error tolerance |
|---------------|------------|---------------------------|
| [1] Bluetooth | FHSS       | ±20 ppm                   |
| [2] Zigbee    | DSSS       | ±40 ppm                   |
| [3] MIThril   | FHSS       | ±30 ppm                   |
| [4] CodeBlue  | DSSS       | ±25 ppm                   |
| [5] Human++   | UWB        | N/A                       |

Table 1-4: WBAN system with their allowable frequency errors

The allowable frequency error is very small, i.e., 40 ppm or less. In contrast, the eCrystal clock generator has a limited initial frequency offset of about 0.28% (2800 ppm) under process, voltage, and temperature variations.
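To put the two figures on a common scale, the percentage offset can be converted to parts per million and compared with the loosest and tightest tolerances in Table 1-4. The conversion is plain arithmetic. The two ratios are an illustration added here and are not values listed in the table.

$$0.28\% = 0.28 \times 10^{-2} = 2800 \times 10^{-6} = 2800\ \text{ppm}, \qquad \frac{2800\ \text{ppm}}{40\ \text{ppm}} = 70, \qquad \frac{2800\ \text{ppm}}{20\ \text{ppm}} = 140$$

The offset of the clock generator is therefore 70 to 140 times larger than what the systems in Table 1-4 allow. The factor of 140 agrees with the expansion quoted in the abstract if that figure is measured against a 20 ppm reference, which is an assumption here.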

The target is to establish a standard-compatible system with the crystal replaced. An OFDM-based system is selected. Because of the low precision of the eCrystal oscillator, our motivation is to design a baseband calibration algorithm that detects the clock error and compensates for the frequency mismatch between the transmitter and the receiver, as shown in Figure 1-1.



Figure 1-1: Motivation, pros and cons

With a large clock error, the first problem encountered is that the packet cannot be detected. We therefore propose an algorithm that detects the packet under a large frequency error. After packet detection, the frequency error is estimated and fed back to the clock generator. The algorithm is called the packet detection and frequency error estimation (PAFEE) method, as shown in Figure 1-2.



Figure 1-2: Motivation and PAFEE method

A system called wireless body on the chip (WiBoC), based on the OFDM scheme, is used as the evaluation platform in this thesis. The crystal-less system behavior and hardware design are discussed and analyzed. A prototype platform is also established to verify the behavior of the overall system.

### 1-4 Design Target

In the self-defined crystal-less system, the eCrystal clock generator has a frequency drift of 0.28%. To handle this large frequency error while keeping the performance unchanged, we have to reach the design target shown in Table 1-5.

| Category          | Item                            | Conventional [12]    | Proposed PAFEE           |
|-------------------|---------------------------------|----------------------|--------------------------|
| Target            | Frequency error tolerance range | 100 ppm              | 2800 ppm                 |
|                   | Final accuracy                  | 20 ppm               | 20 ppm                   |
|                   | Detection rate                  | 99.95% @ SNR ≥ 2 dB  | 99.95% @ SNR ≥ 2 dB      |
|                   | False alarm rate                | < 0.05%              | < 0.05%                  |
|                   | Estimation correct rate         | > 0.9999 @ SNR ≥ 2 dB | > 0.9999 @ SNR ≥ 2 dB   |
| Spec. requirement | Short preamble                  | 2.5                  | 12 (downlink only, once) |
|                   | Estimation time                 | 32 ms                | 153.6 ms                 |
|                   | Packet format                   | the same             | modified (strengthened)  |

Table 1-5: Design target

We take the 802.11a specification [12], applied to our WiBoC system, as the conventional approach. Its frequency tolerance range is 100 ppm. Our system has to meet the same constraints on detection rate, false alarm rate, and estimation correct rate under a 2800 ppm frequency error. This places some requirements on the specification: the short preamble has to be strengthened.
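The two specification rows of Table 1-5 appear to scale together. As a quick check, assume that the estimation time is proportional to the short preamble length. This proportionality is an inference from the table values, not something the specification states.

$$\frac{12}{2.5} = 4.8, \qquad 32\ \text{ms} \times 4.8 = 153.6\ \text{ms}$$

Under that assumption, the longer estimation time of the proposed method is accounted for entirely by the lengthened short preamble.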

In the following simulations, the SNR is set to 2 dB because the packet error rate (PER) is 1 at SNR = 2 dB. If the synchronization performance is stable at SNR = 2 dB, the synchronizer is not the limiting factor and the PER performance remains the same.

### 1-5 Outline of the Thesis

The thesis is organized as follows. Chapter 2 introduces the application system, WiBoC, which is an OFDM-based WBAN system, including the overall system behavior and system specification. Chapter 3 presents the proposed synchronization algorithm. Chapter 4 shows the system simulation results of the proposed algorithm. The hardware implementation details are described in Chapter 5, and the overall system emulation results are shown in Chapter 6. Finally, Chapter 7 gives the conclusions and future work.


## Chapter 2 Application System

### 2-1 Overview

This chapter introduces the WiBoC system, an OFDM-based WBAN system that serves as the evaluation platform in this thesis. The target operation scenario is shown in Figure 2-1.

The WBAN system contains one central processing node (CPN) and multiple wireless sensor nodes (WSNs). A WSN gathers physical information from the human body, such as EEG and ECG waves, glucose level, and blood pressure. It is capable of sampling, processing, and communicating, and takes the form of a tiny patch on the human body. The signals gathered by multiple WSNs are transmitted wirelessly to a remote CPN, which is integrated in a portable device such as a mobile phone or PDA. Both the CPN and the WSN have a transmitter and a receiver, and they exchange data through the downlink and uplink paths.

For simplicity, the WiBoC system evaluated in this thesis has one central processing node and one wireless sensor node.



Figure 2-1: The target operation scenario of WBAN system

### 2-2 System Behavior

The WiBoC system behavior is shown in Figure 2-2. A successful transmission is divided into two parts, downlink and uplink. The downlink is defined as the CPN transmitting a preamble to the WSN. The uplink is defined as the WSN transmitting both a preamble and data to the CPN. The downlink process helps the WSN detect the packet and estimate the frequency error from the packet information.



Figure 2-2: WiBoC system behavior

Figure 2-3 shows the system behavior along the time axis and in terms of device connection. The body signals are gathered and transmitted by the WSNs, and the CPN receives them for ubiquitous monitoring. When the WSNs and the CPN are activated, they are initially in a reset state. At the beginning of network establishment, the CPN waits for sign-in signals from possible WSNs. After all WSNs have joined the network, the CPN broadcasts packets to every WSN (downlink process) for timing and frequency synchronization. After network synchronization in the downlink process, each WSN starts to gather body signals and transmits them to the CPN (uplink process). Finally, the CPN receives the body information from the WSNs.
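The connection sequence described above can be read as a small state machine. The sketch below is only an illustration of that reading. The state names, the event names, and the `step` and `run` helpers are hypothetical and do not come from the WiBoC design.

```python
from enum import Enum, auto


class NetworkState(Enum):
    RESET = auto()          # nodes have just been activated
    SIGN_IN = auto()        # CPN waits for sign-in signals from WSNs
    DOWNLINK_SYNC = auto()  # CPN broadcasts packets for timing/frequency sync
    UPLINK = auto()         # WSNs gather body signals and send them to the CPN


# Hypothetical event names, chosen to mirror the prose description.
TRANSITIONS = {
    (NetworkState.RESET, "activated"): NetworkState.SIGN_IN,
    (NetworkState.SIGN_IN, "all_wsn_joined"): NetworkState.DOWNLINK_SYNC,
    (NetworkState.DOWNLINK_SYNC, "synchronized"): NetworkState.UPLINK,
}


def step(state, event):
    """Return the next state; events that do not apply leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)


def run(events, state=NetworkState.RESET):
    """Apply a sequence of events and return the list of visited states."""
    trace = [state]
    for event in events:
        state = step(state, event)
        trace.append(state)
    return trace


if __name__ == "__main__":
    for visited in run(["activated", "all_wsn_joined", "synchronized"]):
        print(visited.name)
```

In this sketch the uplink state is reachable only through the downlink synchronization state, which matches the order given in the text.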



Figure 2-3: WiBoC system device connection behavior

#### 2-2-1 Downlink

In the downlink process, the CPN transmits the downlink preamble, a known sequence, to the WSN. The WSN stays active after reset and monitors the downlink channel for data. The preamble is transmitted through the channel and is distorted under varying channel conditions. Once the packet is detected, the WSN uses the remaining distorted received data to estimate the frequency error.
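As background for the detect-then-estimate order described above, the sketch below shows the textbook delay-and-correlate approach for a repeated preamble. It is not the PAFEE algorithm, which is presented in Chapter 3. The toy preamble, the frequency error value, and the function name `delay_correlate` are assumptions made for illustration.

```python
import cmath
import math


def delay_correlate(r, start, period=16):
    """Correlate one period of samples with the following period.

    Returns (metric, freq). The metric is the normalized correlation
    magnitude, between 0 and 1. freq is the frequency error in cycles per
    sample, taken from the phase of the correlation.
    """
    corr = 0j
    energy_a = 0.0
    energy_b = 0.0
    for n in range(start, start + period):
        corr += r[n].conjugate() * r[n + period]
        energy_a += abs(r[n]) ** 2
        energy_b += abs(r[n + period]) ** 2
    energy = math.sqrt(energy_a * energy_b)
    metric = abs(corr) / energy if energy > 0 else 0.0
    freq = cmath.phase(corr) / (2 * math.pi * period)
    return metric, freq


# Toy preamble: an arbitrary 16-sample block repeated 4 times.
# This is not the WiBoC short preamble.
block = [cmath.exp(1j * 0.7 * k * k) for k in range(16)]
preamble = block * 4

true_freq = 0.01  # hypothetical frequency error, in cycles per sample
received = [x * cmath.exp(2j * math.pi * true_freq * n)
            for n, x in enumerate(preamble)]

metric, freq = delay_correlate(received, start=0)
print(f"metric = {metric:.3f}, estimated frequency error = {freq:.4f} cycles/sample")
```

In this noise-free example the metric equals 1 and the estimate recovers the injected error. The phase-based estimate in this sketch is unambiguous only while the magnitude of the frequency error stays below 1/(2 x period) cycles per sample.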

#### 2-2-2 Uplink

After the downlink process, the clock source in the WSN is tuned correctly, and the frequency mismatch between the CPN and the WSN is eliminated. The WSN starts to gather body signals and transmits them to the CPN. In the uplink process, a preamble is added in front of the data and transmitted with it. The CPN then receives the complete uplink data for further inspection.

### 2-3 Packet Format

#### 2-3-1 Short Preamble

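The short preamble format is taken from the IEEE 802.11a [12] specification. The system takes advantage of the properties of the short preamble for synchronization and frequency error estimation. The short preamble in the frequency domain, S(f), for subcarriers f = -26, ..., 26, is given in Equation 2-1,

$$\begin{split} S(f) = AS \times [\, & 0, 0, 1+i, 0, 0, 0, -1-i, 0, 0, 0, 1+i, 0, 0, 0, -1-i, 0, 0, 0, -1-i, 0, 0, 0, 1+i, 0, 0, 0, \\ & 0, 0, 0, 0, -1-i, 0, 0, 0, -1-i, 0, 0, 0, 1+i, 0, 0, 0, 1+i, 0, 0, 0, 1+i, 0, 0, 0, 1+i, 0, 0 \,], \end{split} \tag{2-1}$$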

where AS is a scaling factor that depends on the front-end device. At the receiver, AS varies with the automatic gain control (AGC) of the analog-to-digital converter (ADC). From Equation 2-1, the short preamble can be regarded as 12 tones in the frequency domain. The time-domain short preamble s(t) is expressed in Equation 2-2. Only 16 of the 64 time-domain samples are shown because the short preamble is a repeated sequence, i.e., s(1:16)=s(17:32)=s(33:48)=s(49:64). Here *as* is a scaling factor analogous to *AS* in Equation 2-1.

$$\begin{split} s(t) &= \mathrm{ifft}(S(f), 64) \\ s(1:16) &= as \times [\, 0.0313 + 0.0313i, \ -0.0900 + 0.0016i, \ -0.0092 - 0.0533i, \ 0.0970 - 0.0086i, \\ & \qquad 0.0625, \ 0.0970 - 0.0086i, \ -0.0092 - 0.0533i, \ -0.0900 + 0.0016i, \\ & \qquad 0.0313 + 0.0313i, \ 0.0016 - 0.0900i, \ -0.0533 - 0.0092i, \ -0.0086 + 0.0970i, \\ & \qquad 0 + 0.0625i, \ -0.0086 + 0.0970i, \ -0.0533 - 0.0092i, \ 0.0016 - 0.0900i \,] \end{split} \tag{2-2}$$

The short preamble in the frequency domain and the time domain is plotted in Figure 2-4. The real part and the imaginary part are shown separately.



Figure 2-4: Short preamble format

The frequency-domain properties of this preamble are well suited to frequency error estimation. It has 12 tones in the frequency domain, and in the time domain it repeats 4 times with a period of 16 samples. The repeated structure is used for packet detection, and the remaining preamble is used for frequency error estimation, so this format suits the algorithm. If a preamble with 24 tones in the frequency domain were used, the time-domain data would repeat 2 times with a period of 32 samples, as shown in Figure 2-5. If a preamble with 6 tones were used, the time-domain data would repeat 8 times with a period of 8 samples, as shown in Figure 2-6.
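The link between tone spacing and time-domain repetition can be checked with a small sketch. It does not use the actual IEEE 802.11a preamble values: the helper names (`idft`, `tone_spectrum`, `period_of`) are illustrative, and the spectrum is a hypothetical one with unit tones on every `spacing`-th bin of a 64-point grid, DC excluded.

```python
import cmath

def idft(spectrum):
    """Naive inverse DFT (O(N^2)); adequate for a 64-point illustration."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def tone_spectrum(n_fft, spacing):
    """Hypothetical spectrum: a unit tone on every `spacing`-th bin, DC skipped."""
    return [1.0 if (k % spacing == 0 and k != 0) else 0.0 for k in range(n_fft)]

def period_of(samples, tol=1e-9):
    """Smallest circular shift p for which samples[t] equals samples[t + p] for all t."""
    n = len(samples)
    for p in range(1, n + 1):
        if all(abs(samples[t] - samples[(t + p) % n]) < tol for t in range(n)):
            return p
    return n

# Tone spacing of 2, 4, and 8 bins on a 64-point grid.
periods = {spacing: period_of(idft(tone_spectrum(64, spacing))) for spacing in (2, 4, 8)}
```

Under these assumptions, a spacing of 4 bins gives a period of 16 samples, a spacing of 2 bins gives 32, and a spacing of 8 bins gives 8, which matches the repetition counts described for the 12-tone, 24-tone, and 6-tone formats.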



Figure 2-5: Preamble format with 24-tone in frequency domain



Figure 2-6: Preamble format with 6-tone in frequency domain

If the 6-tone format is selected as the short preamble, the repeated 8-sample blocks are used for packet detection. However, its frequency-domain information is insufficient to estimate the frequency error, so this preamble is not used in our system. If the 24-tone format is selected, the repeated 32-sample blocks are used for packet detection. Because 32 samples are needed for packet detection, the preamble length and the hardware overhead both increase, which is not appropriate for our system.

#### 2-3-2 Long Preamble

The long preamble format is taken from the IEEE 802.11a [12] specification. Its properties are exploited for boundary detection and fine CFO estimation. The frequency-domain long preamble L(f) is expressed in Equation 2-3, where AL is a scaling factor that depends on the front-end device.

The long preamble format in time domain l(t) is expressed as Equation 2-4, where *al* is also a scaling factor similar to the *AL* in Equation 2-3.

$$\begin{split} l(t) &= \mathrm{ifft}(L(f)) \\ &= al \times [\, 0.1250, \ -0.0082 - 0.1322i, \ 0.0704 - 0.0899i, \ 0.1059 + 0.0564i, \\ & \qquad -0.0078 + 0.0544i, \ 0.0451 - 0.1098i, \ -0.0891 - 0.0405i, \ -0.0185 - 0.1127i, \\ & \qquad 0.0754 - 0.0259i, \ 0.0292 + 0.0073i, \ 0.0184 - 0.1175i, \ -0.1092 - 0.0490i, \\ & \qquad 0.0125 - 0.0512i, \ 0.0288 - 0.0270i, \ -0.0164 + 0.1741i, \ 0.1503 - 0.0137i, \\ & \qquad 0.0625 - 0.0625i, \ 0.0058 + 0.1126i, \ -0.0633 + 0.0086i, \ -0.1014 + 0.1109i, \\ & \qquad 0.0942 + 0.0372i, \ 0.0420 + 0.0702i, \ -0.0777 + 0.0346i, \ -0.0323 + 0.0053i, \\ & \qquad -0.0129 - 0.1509i, \ -0.1417 - 0.0470i, \ -0.1533 + 0.0383i, \ 0.0898 - 0.1538i, \\ & \qquad 0.0261 + 0.1428i, \ -0.1010 + 0.0310i, \ 0.0611 + 0.1713i, \ 0.0153 + 0.0618i, \\ & \qquad -0.1250, \ 0.0153 - 0.0618i, \ 0.0611 - 0.1713i, \ -0.1010 - 0.0310i, \\ & \qquad 0.0261 - 0.1428i, \ 0.0898 + 0.1538i, \ -0.1533 - 0.0383i, \ -0.1417 + 0.0470i, \\ & \qquad -0.0129 + 0.1509i, \ -0.0323 - 0.0053i, \ -0.0777 - 0.0346i, \ 0.0420 - 0.0702i, \\ & \qquad 0.0942 - 0.0372i, \ -0.1014 - 0.1109i, \ -0.0633 - 0.0086i, \ 0.0058 - 0.1126i, \\ & \qquad 0.0625 + 0.0625i, \ 0.1503 + 0.0137i, \ -0.0164 - 0.1741i, \ 0.0288 + 0.0270i, \\ & \qquad 0.0125 + 0.0512i, \ -0.1092 + 0.0490i, \ -0.0184 + 0.1175i, \ -0.0292 - 0.0073i, \\ & \qquad 0.0754 + 0.0259i, \ -0.0185 + 0.1127i, \ -0.0891 + 0.0405i, \ 0.0451 + 0.1098i, \\ & \qquad -0.0078 - 0.0544i, \ 0.1059 - 0.0564i, \ 0.0704 + 0.0899i, \ -0.0082 + 0.1322i \,] \end{split} \tag{2-4}$$

Figure 2-7 shows the long preamble format in time domain and frequency domain.



Figure 2-7: Long preamble format in time domain

### 2-3-3 Packet Format

The packet consists of the short preamble, the long preamble, and the data. The preamble is at the head of the packet, followed by the data. Figure 2-8 shows the downlink broadcast packet. The first N repeated short preambles are used by the synchronization algorithm. N is determined by simulation and by the number of synchronization iterations; N is 3 in our approach. The following SGI is half of a short preamble and serves as the guard interval (GI) of the short preamble. The LGI, which is half of a long preamble, is used for boundary detection. The subsequent two long preambles are used for channel estimation and further fine frequency error estimation.



Figure 2-8: Downlink packet format

Uplink packet structure is shown in Figure 2-9. The preambles and the data are scrambled into a packet. The preamble is the same as the downlink one. It is used to perform symbol timing synchronization and symbol boundary detection. The following two long preambles are used for channel estimation. The final part of the packet is the payload data.



Figure 2-9: Uplink packet format

### 2-4 WiBoC System Block Diagram

This section introduces the WiBoC system block diagram. The WiBoC system considered here consists of one WSN and one CPN. The CPN is modeled only for simulation; this thesis emphasizes the WSN behavior and its hardware implementation.

#### 2-4-1 Wireless Sensor Node

Figure 2-10 shows the block diagram of the WSN. In the downlink process, the WSN first receives the downlink preamble from the CPN. The packet detector runs continuously after the reset state. Once a packet has been detected, the phase rotator calculates the signal phase between two carriers and rotates the received signal. After the phase rotator, the frequency error estimator estimates the frequency error while the short preamble is being received.

The overall frequency error is the sum of the amounts estimated by the phase rotator, the frequency error estimator, and the fine CFO estimator. This estimate is fed back to the eCrystal, which is the tunable clock generator. The clock generator produces an accurate clock within a short period, so the data after the short preamble (i.e., the long preamble) is not lost. Boundary detection and fine CFO estimation are then performed.

After the clock has been tuned, the WSN gathers the body signal into the uplink packet and transmits it to the CPN. The body signal is encoded into an FEC bitstream. The FEC encoder is a convolutional encoder with coding rate R=1/2, constraint length K=7, and generator polynomials g0=133<sub>8</sub> and g1=171<sub>8</sub>. The FEC bitstream is QPSK modulated and then passed through the IFFT block. After the IFFT and GI insertion, the packet is transmitted over the uplink channel.
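A minimal sketch of a rate-1/2, K=7 convolutional encoder with the stated generators is given below. The bit-ordering convention (newest input bit enters at the most significant position of the register), the zero initial state, and the absence of tail bits or puncturing are assumptions made for illustration; the function names are hypothetical and not taken from the WiBoC implementation.

```python
G0 = 0o133  # generator polynomial g0 in octal, as stated in the text
G1 = 0o171  # generator polynomial g1 in octal, as stated in the text
K = 7       # constraint length

def parity(x):
    """Return the XOR of all bits of x."""
    return bin(x).count("1") & 1

def conv_encode(bits):
    """Rate-1/2 encoder: returns the interleaved output [a0, b0, a1, b1, ...]."""
    reg = 0  # assumed all-zero initial state
    out = []
    for b in bits:
        # Assumed convention: the newest bit enters at the MSB of a K-bit register.
        reg = ((b & 1) << (K - 1)) | (reg >> 1)
        out.append(parity(reg & G0))
        out.append(parity(reg & G1))
    return out

# Impulse response: a single 1 followed by K-1 zeros.
impulse = conv_encode([1, 0, 0, 0, 0, 0, 0])
```

With this convention, the even-indexed outputs of the impulse response spell out 1011011 (133 in octal) and the odd-indexed outputs spell out 1111001 (171 in octal), which is a quick way to confirm that the taps match the generator polynomials.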

The WSN is composed of one transmitter (WSN\_TX) and one receiver (WSN\_RX). The yellow part in Figure 2-10 is the block diagram of synchronization algorithm, PAFEE method.





#### 2-4-2 Central Processing Node

In the CPN, the data received from the WSN have been pre-calibrated. The synchronizer detects the symbol timing from the symbol correlation results. The synchronized data are then sent to the FFT block. After the transform, phase recovery is performed to calibrate the data. The recovered data are then demapped for FEC decoding. The CPN block diagram is shown in Figure 2-11.



Figure 2-11: CPN block diagram

### 2-5 Channel Model

Channel conditions are important in wireless communication systems, and the system design depends on them. In WBAN applications, the channel is assumed to be simple because the communication distance is short, i.e., less than 3 meters, so the multipath effect is ignored. This section focuses on three channel impairments: additive white Gaussian noise (AWGN), carrier frequency offset (CFO), and sampling clock offset (SCO).

#### 2-5-1 AWGN

In the wireless channel, thermal vibration of atoms in the antennas, shot noise, black body radiation, and similar effects prevent the receiver from receiving a signal identical to the transmitted one. This impairment is modeled as additive white Gaussian noise (AWGN). When the original signal passes through the AWGN channel, noise is linearly added to it, and the noise follows a Gaussian distribution. The AWGN generated in MATLAB can be expressed as Equation 2-5,

$$\begin{split} w(t) &= randn(1, L) \times rms + j \times randn(1, L) \times rms, \\ \text{where } rms &= \frac{1}{\sqrt{2}} \times 10^{(P_{data} - SNR)/20} \end{split} \tag{2-5}$$

Here randn(1, L) returns a 1-by-L matrix of random values drawn from a normal distribution with zero mean and unit standard deviation. The rms term is the root-mean-square noise amplitude applied to each of the real and imaginary parts of the incoming L samples. $P_{data}$  is the power of the data in dB.
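As a rough cross-check of Equation 2-5 outside MATLAB, the sketch below generates the same kind of noise with the Python standard library. The function names and the use of `random.gauss` in place of `randn` are illustrative assumptions; the thesis simulations themselves use MATLAB.

```python
import math
import random

def awgn_rms(p_data_db, snr_db):
    """Per-component noise amplitude from Equation 2-5."""
    return (1 / math.sqrt(2)) * 10 ** ((p_data_db - snr_db) / 20)

def awgn(length, p_data_db, snr_db, rng=random):
    """Complex noise samples; rng.gauss(0, 1) plays the role of MATLAB's randn."""
    rms = awgn_rms(p_data_db, snr_db)
    return [complex(rng.gauss(0, 1) * rms, rng.gauss(0, 1) * rms)
            for _ in range(length)]

# Example: 0 dB signal power and 10 dB SNR, with a seeded generator for repeatability.
noise = awgn(20000, 0.0, 10.0, random.Random(0))
noise_power = sum(abs(w) ** 2 for w in noise) / len(noise)
```

Because the real and imaginary parts each have variance rms², the average noise power approaches 2·rms² = 10^((P_data − SNR)/10), i.e., about 0.1 in this example.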

#### 2-5-2 Carrier Frequency Offset

Carrier frequency offset is caused by the frequency mismatch between the transmitter and receiver in the front-end device, i.e., synthesizer, modulator, and demodulator.

Taking the short preamble in our system as an example, Figure 2-12 shows the effect of the frequency offset in the frequency domain. The system operates in the 1.4 GHz radio band and occupies a 5 MHz bandwidth. The frequency domain is represented by the 64-point FFT result: the x-axis is the index from 1 to 64, and the y-axis is the magnitude of the subcarrier. Consider the frequency allocation in the RF band as a fixed window from fc - BW/2 to fc + BW/2, where fc and BW are the center frequency and the bandwidth, respectively. The frequency offset shifts the data out of this window, which results in a large loss of data. In Figure 2-12, the carriers are shifted left by an amount proportional to the frequency offset. When CFO = 3000 ppm, the carriers are shifted by about 1.4 GHz × 3000 ppm = 4.2 MHz, which is close to the 5 MHz bandwidth, so no data remain in the window under such a large frequency offset.


Figure 2-12: CFO effect

The effects of the frequency offset fall into two types: integral frequency offset (IFO) and fractional frequency offset (FFO). IFO places the received frequency-domain data at the wrong frequency positions. FFO causes loss of orthogonality among the subcarriers. Figure 2-12 (2), (3), and (6) show the joint influence of IFO and FFO on the subcarriers. Figure 2-12 (4) and (5) show cases in which the IFO impact dominates.

Considering the short preamble format in our system, the FFT size and the frequency allocation give the IFO as Equation 2-6,

$$\begin{split} IFO &= \frac{BW}{RF} \times \frac{1}{\text{Minimum Carrier Spacing}} \times N \\ &= \frac{5\,\text{MHz}}{1.4\,\text{GHz}} \times \frac{1}{16} \times N = 223.19 \times N \ \text{ppm} \end{split} \tag{2-6}$$

where N is an integer. Because the FFT size is 64 and a subcarrier is located every 4 points, the minimum carrier spacing is defined as 64/4 = 16.
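The arithmetic behind Equation 2-6 and the 3000 ppm example can be reproduced with a few lines. The constant names are illustrative, and the carrier is taken as exactly 1.4 GHz, which is an assumption about the rounding used in the text.

```python
BW_HZ = 5e6        # occupied bandwidth
RF_HZ = 1.4e9      # carrier frequency (assumed to be exactly 1.4 GHz)
FFT_SIZE = 64
TONE_STEP = 4      # a short-preamble tone sits on every 4th FFT point

# "Minimum carrier spacing" as defined in the text: 64 / 4 = 16.
carrier_positions = FFT_SIZE // TONE_STEP

# Frequency offset, in ppm of the carrier, that moves the spectrum by one tone position.
ifo_step_ppm = BW_HZ / RF_HZ / carrier_positions * 1e6

# Absolute shift of the spectrum for a 3000 ppm carrier frequency offset.
shift_hz_at_3000ppm = RF_HZ * 3000e-6
```

With these inputs the step evaluates to roughly 223.2 ppm per tone position, in line with the value quoted in Equation 2-6, and the 3000 ppm shift evaluates to 4.2 MHz.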

The CFO effect is modeled mathematically in Equations 2-7 to 2-16; refer to Figure 2-13 for the corresponding signal flow.

The baseband signal first passes through the DAC.

$$a(t) = \sum_{k=1}^{\infty} a_k p(t - kT_s), \quad b(t) = \sum_{k=1}^{\infty} b_k p(t - kT_s)$$
(2-7~8)

The modulator combines a(t) and b(t) to s(t) and transmits to the channel.

$$s(t) = a(t)\cos(2\pi f_c t) - b(t)\sin(2\pi f_c t)$$
  

$$r(t) = s(t) + n(t)$$
(2-9~10)

Let  $\Delta f = f_c - f_e$  and  $\Delta f_1 = f_c + f_e$. The receiver demodulates r(t) into c(t) and d(t).

$$\begin{split} c(t) &= r(t) \times (2\cos(2\pi f_e t)) + n_1(t) \\ &= a(t) \{\cos(2\pi\Delta f_1 t) + \cos(2\pi\Delta f t)\} - b(t) \{\sin(2\pi\Delta f_1 t) + \sin(2\pi\Delta f t)\} + n_1(t) \\ d(t) &= r(t) \times (2\sin(2\pi f_e t)) + n_2(t) \\ &= a(t) \{\sin(2\pi\Delta f_1 t) - \sin(2\pi\Delta f t)\} + b(t) \{\cos(2\pi\Delta f_1 t) - \cos(2\pi\Delta f t)\} + n_2(t) \end{split}$$
(2-11~12)

After the low-pass filter, the terms containing  $\Delta f_1$  are filtered out, and e(t) and f(t) are obtained.

$$e(t) = a(t)\cos(2\pi\Delta ft) - b(t)\sin(2\pi\Delta ft)$$

$$f(t) = a(t)\sin(2\pi\Delta ft) + b(t)\cos(2\pi\Delta ft)$$
(2-13~14)
Consider the CFO effect only, so that Te = Ts, and let  $x(t) = a(t) + jb(t)$.

$$\begin{split} c_n &= \int_{nT_e}^{(n+1)T_e} e(t)dt = \int_{nT_s}^{(n+1)T_s} \{a(t)\cos(2\pi\Delta ft) - b(t)\sin(2\pi\Delta ft)\}dt \\ &= \int_{nT_s}^{(n+1)T_s} \Re\{x(t)\exp(j2\pi\Delta ft)\}dt = \Re\{x_n\exp(j2\pi\Delta fnT_s)\} \\ d_n &= \int_{nT_e}^{(n+1)T_e} f(t)dt = \int_{nT_s}^{(n+1)T_s} \{a(t)\sin(2\pi\Delta ft) + b(t)\cos(2\pi\Delta ft)\}dt \\ &= \int_{nT_s}^{(n+1)T_s} \Im\{x(t)\exp(j2\pi\Delta ft)\}dt = \Im\{x_n\exp(j2\pi\Delta fnT_s)\} \end{split}$$
(2-15~16)

Here  $c_n$  and  $d_n$  are the received signals in the baseband.



Figure 2-13: mathematical CFO model

From the mathematical model, the CFO model on the MATLAB platform is built as shown in Figure 2-14. To model the analog signal, we upsample the signal by a factor M and use a low-pass filter (LPF) to remove the spectral images created by the upsampling. The data are then multiplied by the exponential terms that follow from Equations 2-15 and 2-16. The MATLAB expressions are given in Equations 2-17 to 2-19.

$$\begin{split} f &= 1.4 \times 10^{9} \times (CFO\_ppm/M) \times 10^{-6} \\ t &= 1/(5 \times 10^{6}) \\ data\_cfo(k) &= data(k) \times \exp(j \times 2\pi \times f \times t \times k) \end{split}$$
(2-17~19)

where CFO\_ppm is the frequency offset in ppm, data\_cfo is the data distorted by the CFO effect, and k is the sample index of the data. After the multiplication, the LPF and the down-sampling filter in Figure 2-14 act like the front-end LPF before the ADC.
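The rotation in Equations 2-17 to 2-19 can be sketched as follows. This covers only the multiplication step: the upsampling, the LPF, and the down-sampling of Figure 2-14 are omitted, the function name and default arguments are illustrative, and the index starts at 0 here whereas MATLAB indexing starts at 1.

```python
import cmath

def apply_cfo(data, cfo_ppm, upsample=1, carrier_hz=1.4e9, sample_rate_hz=5e6):
    """Rotate each sample by exp(j*2*pi*f*t*k), following Equations 2-17 to 2-19."""
    f = carrier_hz * (cfo_ppm / upsample) * 1e-6   # frequency offset in Hz
    t = 1 / sample_rate_hz                          # sample period in seconds
    # Assumption: k starts at 0, so the first sample is left unrotated.
    return [x * cmath.exp(2j * cmath.pi * f * t * k) for k, x in enumerate(data)]

# A constant input makes the per-sample phase advance easy to read off.
rotated = apply_cfo([1 + 0j] * 4, cfo_ppm=100)
phase_step = cmath.phase(rotated[1])
```

For a 100 ppm offset with M = 1, the offset is 140 kHz and each 200 ns sample advances the phase by 2π × 0.028 rad; the magnitude of every sample is unchanged, so in this simplified form the CFO appears purely as a progressive phase rotation.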



Figure 2-14: MATLAB CFO model

#### 2-5-3 Sampling Clock Offset

Sampling clock offset (SCO) is caused by the mismatch between the DAC sampling clock at the transmitter and the ADC sampling clock at the receiver. The effect is modeled at the receiver side in Equation 2-20.  $R_{preADC}(t)$  and R(t) represent the signal before and after the ADC, respectively.  $\Delta P_n$  is the sampling error, and  $\ell$  sets the range of the sampling-point index k in the summation.

$$\begin{split} R(nT_{s}) &= R_{preADC}(nT_{s}) * \operatorname{sinc}\left(\frac{nT_{s} - \Delta P_{n}}{T_{s}}\right) \\ &= \sum_{k=-\ell}^{\ell} R_{preADC}(nT_{s} - kT_{s}) \times \operatorname{sinc}\left(k - \frac{\Delta P_{n}}{T_{s}}\right) \end{split} \tag{2-20}$$

Taking the short preamble as an example, Figure 2-15 shows the distortion caused by the SCO on the real part of the preamble in the time domain. The imaginary part is distorted in the same way and is omitted for simplicity. In general, the SCO does not distort the signal severely: in Figure 2-15, the SCO applied to the signal ranges from 100 to 3000 ppm and has little influence on the data.

The major problem caused by the SCO is that sample data may be lost. Our system clock frequency is 5 MHz, so the sampling period is 200 ns. If the SCO is 200 ppm, the sampling clock is 5.001 MHz and its period is 199.96 ns. In other words, there is a 0.04 ns mismatch per sample, so after 200/0.04 = 5000 samples, one sample may be lost.
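The sample-slip estimate can be reproduced without rounding the period mismatch. The function name below is illustrative, and the sketch assumes the offset clock runs fast by the stated ppm amount.

```python
def samples_until_slip(nominal_hz, sco_ppm):
    """Number of samples after which the accumulated timing error equals one sample period."""
    nominal_period = 1 / nominal_hz
    offset_period = 1 / (nominal_hz * (1 + sco_ppm * 1e-6))  # assumed: offset clock runs fast
    mismatch_per_sample = nominal_period - offset_period
    return nominal_period / mismatch_per_sample

slip_200ppm = samples_until_slip(5e6, 200)
```

Without rounding the mismatch to 0.04 ns, the result is about 5001 samples, consistent with the estimate of roughly 5000 samples in the text. The count scales inversely with the offset, so a 3000 ppm offset would lose a sample about every 334 samples under the same assumption.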



Figure 2-15: SCO effect

#### 2-5-4 Link Budget

In a wireless communication system, the transmitted signal loses energy during propagation from the transmitter to the receiver through distance-dependent propagation loss, diffraction loss, penetration loss, and other losses. In the WBAN system, the transmitter-receiver (T-R) separation is short, but the path loss still affects the system. The key factor that determines the path loss is the channel model of the system. However, few attempts have been made to characterize electromagnetic propagation around the human body, and it is difficult to develop a simple mathematical channel model of it. EM waves can propagate around the body via two paths: penetration through the body, and a creeping wave that follows the surface of the body. Reference [13] uses simulators with an anatomically accurate model of the human body derived from the Visible Human Project of the National Library of Medicine. It reports that the penetration loss is even higher than the propagation loss, so the contribution of the penetrating wave can be neglected. Reference [14] shows that the overall peak-to-peak variation of the path loss between the belt and the shoulders in an anechoic chamber with changes in posture is 16 dB at 2.45 GHz, with S21 ranging from -68 to -86 dB. The statistics in [15] show that the channel exhibits great variability due to body movements, with the path loss behaving in different manners depending mainly on the freedom of the communication link between the transmitting and receiving antennas.

Because information about the channel model and a mathematical path loss model at 1.4 GHz is lacking, we instead examine the path loss that the channel may encounter and use a link budget to set the restrictions and specifications of receiver parameters such as the required SNR and sensitivity. The link budget parameters are listed in Table 2-1.

| Parameter                                                                                      | Value   |
|------------------------------------------------------------------------------------------------|---------|
| Transmitted Power (Pt)                                                                         | 10 dBm  |
| Transmitter Antenna Gain (Gt)                                                                  | 0 dB    |
| Transmitter Cable Loss (Lt)                                                                    | 3 dB    |
| Receiver Antenna Gain (Gr)                                                                     | 0 dB    |
| Receiver Cable Loss (Lr)                                                                       | 3 dB    |
| Receiver Sensitivity (Rs)                                                                      | -80 dBm |
| Free space T-R separation path loss (PL1), PL1 = $20\log(4\pi f_c/c) + 20\log(d)$, d = 3 m     | 45.5 dB |
| Path loss from the human body (PL2)                                                            | 35 dB   |
| Remaining margin (Prm)                                                                         | 3.5 dB  |

Table 2-1: Link Budget

A radio link consists of three basic elements: the effective transmitting power (Pte), the propagation loss (PL), and the effective receiving sensitivity (Prs). The effective transmitting power is the transmitter power minus the cable loss plus the antenna gain, as shown in Equation 2-21.

$$Pte = Pt + Gt - Lt \tag{2-21}$$

In the WBAN channel, the path loss has two components: the loss due to the free-space T-R separation and the loss from the human body. The total path loss is their sum, as shown in Equation 2-22.

$$PL = PL1 + PL2 \tag{2-22}$$

The receiver sensitivity (Rs) combines the noise power and the noise figure of the receiver. Depending on the manufacturer, it lies between -78 and -85 dBm. The overall effective receiving sensitivity is given in Equation 2-23.

$$Prs = Gr - Lr - Rs \tag{2-23}$$

For the overall link, the remaining margin (Prm) is the summation of these three components, as shown in Equation 2-24.
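The link budget can be checked numerically with the values of Table 2-1. The sketch below assumes that the three components combine as Prm = Pte − PL + Prs, i.e., that the path loss enters with a negative sign; this combination is an interpretation chosen because it reproduces the 3.5 dB margin listed in the table, and the function and argument names are illustrative.

```python
def link_margin_db(pt_dbm, gt_db, lt_db, gr_db, lr_db, rs_dbm, pl1_db, pl2_db):
    """Remaining link margin from the quantities defined in Equations 2-21 to 2-23."""
    pte = pt_dbm + gt_db - lt_db      # effective transmitting power (Equation 2-21)
    pl = pl1_db + pl2_db              # total path loss (Equation 2-22)
    prs = gr_db - lr_db - rs_dbm      # effective receiving sensitivity (Equation 2-23)
    # Assumed combination: the path loss is subtracted from the other two terms.
    return pte - pl + prs

margin = link_margin_db(pt_dbm=10, gt_db=0, lt_db=3, gr_db=0, lr_db=3,
                        rs_dbm=-80, pl1_db=45.5, pl2_db=35)
```

With the table values, Pte = 7 dBm, PL = 80.5 dB, and Prs = 77 dB, which leaves a margin of 3.5 dB.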

In our WBAN system, the wireless medical telemetry service (WMTS) radio bands are used. Wireless medical telemetry is generally used to monitor patient physiological parameters (e.g., ECG and EEG signals) over a distance via radio-frequency (RF) communication between a transmitter worn by the patient (i.e., the WSN) and a central monitoring station (i.e., the CPN). WMTS provides three bands: 608-614 MHz, 1395-1400 MHz, and 1427-1432 MHz. Our WiBoC system targets the 1395-1400 MHz WMTS band, which has a 5 MHz bandwidth.

## Chapter 3 Synchronization PAFEE Method

This chapter describes the synchronization algorithm at the receiver side. The algorithm is called the PAFEE method, an abbreviation of packet detection and frequency error estimation. It runs continuously after the reset state. The transmitter sends a known preamble for receiver synchronization. Once a packet has been detected, the remaining preamble data are used for frequency error estimation. The estimate is sent to the embedded crystal (eCrystal) so that the system clock can be tuned accurately, which completes the synchronization procedure.

## 3-1 PAFEE -- Packet Detection Algorithm

Under our channel condition (i.e., large frequency error), conventional packet detection algorithms may fail because of the large loss of data. The proposed packet detection algorithm is developed from the conventional ones. Receiver operating characteristic (ROC) curves are used to show that the proposed packet detection succeeds under large frequency error.

#### 3-1-1 Conventional Algorithm

There are several conventional approaches to symbol timing synchronization and packet detection. The work of Van de Beek [16] presents a joint maximum likelihood estimator of symbol timing and carrier frequency offset for OFDM systems. The method correlates the received signal with the cyclic prefix to find the symbol. The symbol timing is given by Equation 3-1.

$$\hat{\theta} = \arg \max_{\theta} \left\{ \left| \sum_{k=\theta}^{\theta+L+1} r(k) r^{*}(k+N) \right| - \frac{\rho}{2} \sum_{k=\theta}^{\theta+L+1} (|r(k)|^{2} + |r(k+N)|^{2}) \right\}$$
where  $\rho = \left| \frac{E\{r(k)r^{*}(k+N)\}}{\sqrt{E\{|r(k)|^{2}\}E\{|r(k+N)|^{2}\}}} \right|$ 
(3-1)

Here r(k) is the received baseband signal. The detector takes the argument that maximizes the metric as the packet arrival time.

Schmidl and Cox [17] present an auto-correlation method for packet detection. The algorithm operates near the Cramer-Rao lower bound for the variance of the frequency offset estimate and is widely used for packet detection in OFDM systems. The timing metric is defined in Equation 3-2.

$$V(d) = \frac{\left|\sum_{m=0}^{L-1} r_{d+m}^* r_{d+m+L}\right|^2}{\left(\sum_{m=0}^{L-1} \left|r_{d+m+L}\right|^2\right)^2}$$
(3-2)

where d is a time index corresponding to the first sample in a window of 2L samples. The window slides along in time as the receiver searches the first training symbol.

To improve the timing metric, Minn [18] presents another synchronization scheme applicable to OFDM systems. In Minn's approach, the preamble symbol is designed to give a sharp timing metric trajectory. The timing metric is defined through the minimum Euclidean distance; Equation 3-3 shows the normalized metric.

$$\begin{split} V_{n}(d) &= \frac{E(d) - 2|P(d)|}{E(d)} = 1 - \frac{2|P(d)|}{E(d)}, \\ \text{where } E(d) &= \sum_{i=0}^{M-1} (|r(i+d+M)|^{2} + |r(i+d)|^{2}), \\ \text{and } P(d) &= \sum_{i=0}^{M-1} r(i+d+M)r^{*}(i+d) \end{split} \tag{3-3}$$

where d is a time index and M is the data length for auto-correlation. Finding the minimum of the Euclidean distance  $V_n(d)$  is equivalent to finding the maximum of  $\frac{|P(d)|}{E(d)}$ . The method is similar to the approach of Schmidl and Cox, but Minn's approach uses a different preamble format. The time-domain short preamble from Chapter 2 is repeated 4 times. Let the repeated part be A; the original preamble is  $s(t) = [A \ A \ A \ A]$ . In Minn's approach, the preamble becomes  $[-A \ A \ -A \ -A]$ .

These three well-known packet detection approaches for OFDM systems all exploit the similarity between two repeated data blocks. However, they are intended for OFDM systems with small frequency error, i.e., FFO only. Our system faces large frequency errors, i.e., joint FFO and IFO, so a new approach has to be derived.

#### 3-1-2 Proposed Algorithm

Under our channel conditions, the data are distorted severely. Despite the severe data loss, the similarity between the repeated data still exists. Considering both the minimum Euclidean distance and the maximum likelihood criteria, the timing metric is defined in Equation 3-4.

$$\begin{split} V(n) &= \frac{|C(n)|^2}{P(n)}, \\ \text{where } C(n) &= \sum_{i=0}^{M-1} r_{i+n} r_{i+n+M}^*, \text{ and } P(n) = \sum_{i=0}^{M-1} |r_{i+n+M}|^2 \end{split} \tag{3-4}$$

where n is a time index and M is the data length for auto-correlation. Equation 3-4 can be expanded as Equation 3-5.

$$\begin{split} V(n) &= \frac{|C(n)|^{2}}{P(n)} = \frac{\left|\sum_{i=0}^{M-1} r_{i+n}r^{*}_{i+n+M}\right|^{2}}{\sum_{i=0}^{M-1} |r_{i+n+M}|^{2}} = \frac{u_{1}^{2} + u_{2}^{2}}{\sum_{i=0}^{M-1} \left((u_{4i})^{2} + (u_{5i})^{2}\right)}, \\ \text{where } u_{1} &= \sum_{i=0}^{M-1} \left(r_{i+n}^{I}r_{i+n+M}^{I} + r_{i+n}^{Q}r_{i+n+M}^{Q}\right), \\ u_{2} &= \sum_{i=0}^{M-1} \left(r_{i+n}^{Q}r_{i+n+M}^{I} - r_{i+n}^{I}r_{i+n+M}^{Q}\right), \\ u_{4i} &= r_{i+n+M}^{I}, \quad u_{5i} = r_{i+n+M}^{Q}, \end{split}$$
(3-5)

and the superscripts I and Q denote the real and imaginary parts of the received sample.
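A direct evaluation of the timing metric of Equation 3-4 can be sketched as follows. The test signal is hypothetical: a unit-magnitude 16-sample block repeated four times and rotated by a constant phase step per sample, which is a simplified stand-in for a frequency offset and not the thesis channel model. The function names are illustrative.

```python
import cmath

def correlation(r, n, m):
    """C(n): correlation between two adjacent windows of length m."""
    return sum(r[i + n] * r[i + n + m].conjugate() for i in range(m))

def window_power(r, n, m):
    """P(n): energy of the second window."""
    return sum(abs(r[i + n + m]) ** 2 for i in range(m))

def timing_metric(r, n, m):
    """V(n) = |C(n)|^2 / P(n), following Equation 3-4."""
    p = window_power(r, n, m)
    return abs(correlation(r, n, m)) ** 2 / p if p > 0 else 0.0

# Hypothetical received signal: repeated block with a constant phase step per sample.
block = [cmath.exp(1j * 0.7 * i * i) for i in range(16)]
phase_step = 0.3  # radians per sample, illustrative value
rx = [x * cmath.exp(1j * phase_step * k) for k, x in enumerate(block * 4)]
v0 = timing_metric(rx, 0, 16)
```

For a perfectly repeated, noise-free block, the rotation only changes the phase of C(n), so |C(n)|² = P(n)² and V(n) equals P(n), which is 16 for this unit-magnitude example. This illustrates why the magnitude of the correlation remains usable when a phase rotation is present.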

Let  $H_0$  be the null hypothesis, that is, there is no OFDM preamble transmission present. Let  $H_1$  be the alternate hypothesis, i.e., the OFDM preambles are present, as shown in Equation 3-6.

$$H_0: r_i = \omega_i, \qquad H_1: r_i = s_i + \omega_i, \qquad \text{where } \omega_i \sim N(0, \sigma_n^2) \tag{3-6}$$

The noise is assumed to be white with mean 0 and variance  $\sigma_n^2$ . Let the energy of  $s_i$  be  $\sigma_s^2$ , so that its real part and imaginary part each carry  $\sigma_s^2/2$ . Under H<sub>0</sub>, by the central limit theorem (CLT), u<sub>1</sub> and u<sub>2</sub> can be regarded as Gaussian distributed;  $u_{4i}$  and  $u_{5i}$  are Gaussian as well. The probability distribution of V(n) can then be approximated by the F-distribution [19] with 2 degrees of freedom in the numerator and 2M degrees of freedom in the denominator. To obtain the probability density function of the F-distribution, the means and variances of  $u_1$ ,  $u_2$ ,  $u_{4i}$ , and  $u_{5i}$  have to be derived. Because white noise samples are statistically uncorrelated, the means of  $u_1$ ,  $u_2$ ,  $u_{4i}$ , and  $u_{5i}$  are all zero, as shown in Equations 3-7 to 3-10.

$$\begin{split} E[u_{1} | H_{0}] &= E\left[\sum_{i=0}^{M-1} \left(r_{i+n}^{I} r_{i+n+M}^{I} + r_{i+n}^{Q} r_{i+n+M}^{Q}\right)\right] = E\left[\sum_{i=0}^{M-1} \left(\omega_{i+n}^{I} \omega_{i+n+M}^{I} + \omega_{i+n}^{Q} \omega_{i+n+M}^{Q}\right)\right] = 0 \\ E[u_{2} | H_{0}] &= E\left[\sum_{i=0}^{M-1} \left(r_{i+n}^{Q} r_{i+n+M}^{I} - r_{i+n}^{I} r_{i+n+M}^{Q}\right)\right] = E\left[\sum_{i=0}^{M-1} \left(\omega_{i+n}^{Q} \omega_{i+n+M}^{I} - \omega_{i+n}^{I} \omega_{i+n+M}^{Q}\right)\right] = 0 \\ E[u_{4i} | H_{0}] &= E\left[r_{i+n+M}^{I}\right] = E\left[\omega_{i+n+M}^{I}\right] = 0 \\ E[u_{5i} | H_{0}] &= E\left[r_{i+n+M}^{Q}\right] = E\left[\omega_{i+n+M}^{Q}\right] = 0 \end{split}$$
(3-7~10)

To obtain the variance, the second moment of  $u_1$  is calculated as in Equation 3-11.

$$E\left[u_{1}^{2} \mid H_{0}\right] = E\left[\left(\sum_{i=0}^{M-1} \left(r_{i+n}^{I} r_{i+n+M}^{I} + r_{i+n}^{Q} r_{i+n+M}^{Q}\right)\right)^{2}\right] = E\left[\left(\sum_{i=0}^{M-1} \left(\omega_{i+n}^{I} \omega_{i+n+M}^{I} + \omega_{i+n}^{Q} \omega_{i+n+M}^{Q}\right)\right)^{2}\right]$$
(3-11)

Let  $a_i = r_{i+n}^{I} r_{i+n+M}^{I}$  and  $b_i = r_{i+n}^{Q} r_{i+n+M}^{Q}$ . They can be regarded as uncorrelated noise.

Equation 3-11 can then be written as

$$\begin{split} E\left[u_{1}^{2} \mid H_{0}\right] &= E\left[\left(\sum_{i=0}^{M-1} (a_{i} + b_{i})\right)^{2}\right] \\ &= E\left[\sum_{i=0}^{M-1} a_{i}^{2}\right] + E\left[\sum_{i=0}^{M-1} b_{i}^{2}\right] + E\left[\sum_{i=0}^{M-1}\sum_{j=0}^{M-1} \left(2a_{i}b_{j} + (2a_{i}a_{j})\mid_{i \neq j} + (2b_{i}b_{j})\mid_{i \neq j}\right)\right] \\ &= E\left[\sum_{i=0}^{M-1} (a_{i}^{2} + b_{i}^{2})\right] = E\left[\sum_{i=0}^{M-1} \left((\omega_{i+n}^{I} \omega_{i+n+M}^{I})^{2} + (\omega_{i+n}^{Q} \omega_{i+n+M}^{Q})^{2}\right)\right] \\ &= \sum_{i=0}^{M-1} \left(E((\omega_{i+n}^{I})^{2})E((\omega_{i+n+M}^{I})^{2}) + E((\omega_{i+n}^{Q})^{2})E((\omega_{i+n+M}^{Q})^{2})\right) \\ &= M\left[\left(\frac{\sigma_{n}^{2}}{2}\right)\left(\frac{\sigma_{n}^{2}}{2}\right) + \left(\frac{\sigma_{n}^{2}}{2}\right)\left(\frac{\sigma_{n}^{2}}{2}\right)\right] \\ &= M\left(\frac{\sigma_{n}^{4}}{2}\right) \end{split}$$

In the same manner, the second moment of  $u_2$  is obtained as

$$E\left[u_{2}^{2} \mid H_{0}\right] = M\left(\frac{\sigma_{n}^{4}}{2}\right)$$
(3-12)

The second moments of  $u_{4i}$  and  $u_{5i}$  are shown in Equation 3-13.

$$E\left[u_{4i}^{2} \mid H_{0}\right] = E\left[u_{5i}^{2} \mid H_{0}\right] = E\left[\omega_{i}^{2}\right] = \frac{\sigma_{n}^{2}}{2}$$
(3-13)
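The noise-only second moment M·σ<sub>n</sub><sup>4</sup>/2 can be sanity-checked with a small Monte Carlo experiment. The sketch below draws independent real and imaginary noise components with variance σ<sub>n</sub><sup>2</sup>/2 each, as assumed in the derivation; the function name, the trial count, and the seed are illustrative choices.

```python
import random

def second_moment_u1_h0(m, sigma_n2, trials, rng):
    """Monte Carlo estimate of E[u1^2 | H0] for noise-only input."""
    std = (sigma_n2 / 2) ** 0.5  # standard deviation of each I or Q noise component
    total = 0.0
    for _ in range(trials):
        u1 = 0.0
        for _ in range(m):
            # w^I_{i+n} * w^I_{i+n+M} + w^Q_{i+n} * w^Q_{i+n+M}, all four draws independent
            u1 += rng.gauss(0, std) * rng.gauss(0, std) + rng.gauss(0, std) * rng.gauss(0, std)
        total += u1 * u1
    return total / trials

M = 16
SIGMA_N2 = 1.0
estimate = second_moment_u1_h0(M, SIGMA_N2, trials=10000, rng=random.Random(1))
theory = M * SIGMA_N2 ** 2 / 2
```

For M = 16 and unit noise variance the closed form gives 8, and the simulated estimate should land close to that value, with the residual difference shrinking as the number of trials grows.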

From Equations 3-11 to 3-13 and the zero means in Equations 3-7 to 3-10, the variances are obtained in Equations 3-14 to 3-17.

$$\begin{split} Var\left[u_{1} \mid H_{0}\right] &= E\left[u_{1}^{2} \mid H_{0}\right] - \left(E\left[u_{1} \mid H_{0}\right]\right)^{2} = M\left(\frac{\sigma_{n}^{4}}{2}\right) \\ Var\left[u_{2} \mid H_{0}\right] &= E\left[u_{2}^{2} \mid H_{0}\right] - \left(E\left[u_{2} \mid H_{0}\right]\right)^{2} = M\left(\frac{\sigma_{n}^{4}}{2}\right) \\ Var\left[u_{4i} \mid H_{0}\right] &= E\left[u_{4i}^{2} \mid H_{0}\right] - \left(E\left[u_{4i} \mid H_{0}\right]\right)^{2} = \left(\frac{\sigma_{n}^{2}}{2}\right) \\ Var\left[u_{5i} \mid H_{0}\right] &= E\left[u_{5i}^{2} \mid H_{0}\right] - \left(E\left[u_{5i} \mid H_{0}\right]\right)^{2} = \left(\frac{\sigma_{n}^{2}}{2}\right) \end{split}$$
(3-14~17)

In the same way, the means and variances of  $u_1$ ,  $u_2$ ,  $u_{4i}$ , and  $u_{5i}$  under  $H_1$  can be derived. First, the means of  $u_1$  and  $u_2$  are calculated in Equations 3-18 and 3-19.

$$\begin{split} E[u_{1} | H_{1}] &= E\left[\sum_{i=0}^{M-1} \left(r_{i+n}^{I} r_{i+n+M}^{I} + r_{i+n}^{Q} r_{i+n+M}^{Q}\right)\right] \\ \because E\left[\sum_{i=0}^{M-1} r_{i+n}^{I} r_{i+n+M}^{I}\right] &= E\left[\sum_{i=0}^{M-1} (s_{i+n}^{I} + \omega_{i+n}^{I})(s_{i+n+M}^{I} + \omega_{i+n+M}^{I})\right] \\ &= E\left[\sum_{i=0}^{M-1} \left(s_{i+n}^{I} s_{i+n+M}^{I} + s_{i+n}^{I} \omega_{i+n+M}^{I} + \omega_{i+n}^{I}(s_{i+n+M}^{I} + \omega_{i+n+M}^{I})\right)\right] \\ &= \sum_{i=0}^{M-1} \left[E(s_{i+n}^{I} s_{i+n+M}^{I}) + E(s_{i+n}^{I})E(\omega_{i+n+M}^{I}) + E(\omega_{i+n}^{I})E(s_{i+n+M}^{I} + \omega_{i+n+M}^{I})\right] \\ &= M \times E(s_{i+n}^{I} s_{i+n+M}^{I}) = M \frac{\sigma_{s}^{2}}{2} \\ \text{and } E\left[\sum_{i=0}^{M-1} r_{i+n}^{Q} r_{i+n+M}^{Q}\right] &= M \frac{\sigma_{s}^{2}}{2} \\ \therefore E[u_{1} | H_{1}] &= M \sigma_{s}^{2} \end{split}$$
(3-18)

In the same manner, the mean of  $u_2$  is obtained.

$$E[u_2 | H_1] = E\left[\sum_{i=0}^{M-1} \left(r_{i+n}^{Q} r_{i+n+M}^{I} - r_{i+n}^{I} r_{i+n+M}^{Q}\right)\right] = 0$$
(3-19)

The means of  $u_{4i}$  and  $u_{5i}$  are easily obtained in Equations 3-20 and 3-21.

$$\begin{split} E[u_{4i} | H_1] &= E[r^{I}_{i+n+M}] = E[s^{I}_{i+n+M} + \omega^{I}_{i+n+M}] = 0 \\ E[u_{5i} | H_1] &= E[r^{Q}_{i+n+M}] = E[s^{Q}_{i+n+M} + \omega^{Q}_{i+n+M}] = 0 \end{split}$$
(3-20~21)

To obtain the variance, the second moment of  $u_1$  is calculated as in Equation 3-22.

$$\begin{split} E\left[u_{1}^{2} \mid H_{1}\right] &= E\left[\left(\sum_{i=0}^{M-1} \left(r_{i+n}^{I} r_{i+n+M}^{I} + r_{i+n}^{Q} r_{i+n+M}^{Q}\right)\right)^{2}\right] \\ &= E\left[\sum_{i=0}^{M-1} a_{i}^{2}\right] + E\left[\sum_{i=0}^{M-1} b_{i}^{2}\right] + E\left[\sum_{i=0}^{M-1} \sum_{j=0}^{M-1} \left(2a_{i}b_{j} + (2a_{i}a_{j})\mid_{i\neq j} + (2b_{i}b_{j})\mid_{i\neq j}\right)\right] \end{split}$$
(3-22)

To simplify, each term is analyzed individually.

$$\begin{split} E\left[\sum_{i=0}^{M-1} a_{i}^{2}\right] &= E\left[\sum_{i=0}^{M-1} \left(r_{i+n}^{I} r_{i+n+M}^{I}\right)^{2}\right] = E\left[\sum_{i=0}^{M-1} \left(\left(s_{i+n}^{I} + \omega_{i+n}^{I}\right)\left(s_{i+n+M}^{I} + \omega_{i+n+M}^{I}\right)\right)^{2}\right] \\ &= E\left[\sum_{i_{1}=0}^{M-1} \sum_{i_{2}=0}^{M-1} \left(s_{i_{1}+n}^{I} + \omega_{i_{1}+n}^{I}\right)\left(s_{i_{1}+n+M}^{I} + \omega_{i_{1}+n+M}^{I}\right)\left(s_{i_{2}+n}^{I} + \omega_{i_{2}+n}^{I}\right)\left(s_{i_{2}+n+M}^{I} + \omega_{i_{2}+n+M}^{I}\right)\right] \\ &= \sum_{i_{1}=0}^{M-1} \sum_{i_{2}=0}^{M-1} E\left[\left(s_{i_{1}+n}^{I} + \omega_{i_{1}+n}^{I}\right)\left(s_{i_{1}+n+M}^{I} + \omega_{i_{1}+n+M}^{I}\right)\left(s_{i_{2}+n}^{I} + \omega_{i_{2}+n}^{I}\right)\left(s_{i_{2}+n+M}^{I} + \omega_{i_{2}+n+M}^{I}\right)\right] \end{split}$$
(3-23)

If a, b, c, and d are jointly Gaussian random variables, then from [20] we have

$$E[abcd] = E[ab]E[cd] + E[ac]E[bd] + E[ad]E[bc] - 2E[a]E[b]E[c]E[d]$$
(3-24)

Applying Equation 3-24 to Equation 3-23,

$$E[(s^{I}_{i_{1}+n} + \omega^{I}_{i_{1}+n})(s^{I}_{i_{1}+n+M} + \omega^{I}_{i_{1}+n+M})(s^{I}_{i_{2}+n} + \omega^{I}_{i_{2}+n})(s^{I}_{i_{2}+n+M} + \omega^{I}_{i_{2}+n+M})]$$

$$= E[(s^{I}_{i_{1}+n} + \omega^{I}_{i_{1}+n})(s^{I}_{i_{1}+n+M} + \omega^{I}_{i_{1}+n+M})]E[(s^{I}_{i_{2}+n} + \omega^{I}_{i_{2}+n})(s^{I}_{i_{2}+n+M} + \omega^{I}_{i_{2}+n+M})]$$

$$+ E[(s^{I}_{i_{1}+n} + \omega^{I}_{i_{1}+n})(s^{I}_{i_{2}+n} + \omega^{I}_{i_{2}+n})]E[(s^{I}_{i_{1}+n+M} + \omega^{I}_{i_{1}+n+M})(s^{I}_{i_{2}+n+M} + \omega^{I}_{i_{2}+n+M})]$$

$$+ E[(s^{I}_{i_{1}+n} + \omega^{I}_{i_{1}+n})(s^{I}_{i_{2}+n+M} + \omega^{I}_{i_{2}+n+M})]E[(s^{I}_{i_{1}+n+M} + \omega^{I}_{i_{1}+n+M})(s^{I}_{i_{2}+n} + \omega^{I}_{i_{2}+n})]$$

$$= \left(\frac{\sigma_{s}^{2}}{2}\right)^{2} + \left(\frac{\sigma_{s}^{2} + \sigma_{n}^{2}}{2}\right)^{2} + \left(\frac{\sigma_{s}^{2}}{2}\right)^{2} - 0$$

$$= \left(\frac{\sigma_{s}^{2} + \sigma_{n}^{2}}{2}\right)^{2} + 2\left(\frac{\sigma_{s}^{2}}{2}\right)^{2}$$

$$E[\sum_{i=0}^{M-1} a_{i}^{2}] = \frac{M^{2}}{4} \left[\left(\sigma_{s}^{2} + \sigma_{n}^{2}\right)^{2} + 2\left(\sigma_{s}^{2}\right)^{2}\right]$$
(3-25)

In the same way, Equation 3-26 follows.


$$E\left[\sum_{i=0}^{M-1} b_i^2\right] = \frac{M^2}{4} \left[ \left(\sigma_s^2 + \sigma_n^2\right)^2 + 2\left(\sigma_s^2\right)^2 \right]$$
(3-26)

As for the third term in Equation 3-22, it is derived in Equation 3-27.

$$\begin{split} E\left[\sum_{i=0}^{M-1}\sum_{j=0}^{M-1} \left(2a_{i}b_{j}\right)\right] &= E\left[\sum_{i=0}^{M-1}\sum_{j=0}^{M-1} \left(2r_{i+n}^{I}r_{i+n+M}^{I}r_{j+n}^{Q}r_{j+n+M}^{Q}\right)\right] \\ &= 2\sum_{i=0}^{M-1}\sum_{j=0}^{M-1} E\left(r_{i+n}^{I}r_{i+n+M}^{I}r_{j+n}^{Q}r_{j+n+M}^{Q}\right) = M^{2}\left(\frac{\sigma_{s}^{4}}{2}\right) \\ \because\ E\left(r_{i+n}^{I}r_{i+n+M}^{I}r_{j+n}^{Q}r_{j+n+M}^{Q}\right) &= E\left(r_{i+n}^{I}r_{i+n+M}^{I}\right) E\left(r_{j+n}^{Q}r_{j+n+M}^{Q}\right) + E\left(r_{i+n}^{I}r_{j+n}^{Q}\right) E\left(r_{i+n+M}^{I}r_{j+n+M}^{Q}\right) + E\left(r_{i+n}^{I}r_{j+n+M}^{Q}\right) E\left(r_{i+n+M}^{I}r_{j+n}^{Q}\right) \\ &= \left(\frac{\sigma_{s}^{2}}{2}\right)\left(\frac{\sigma_{s}^{2}}{2}\right) + 0 + 0 = \left(\frac{\sigma_{s}^{4}}{4}\right) \end{split}$$

$$\begin{split} E\left[\sum_{i=0}^{M-1}\sum_{j=0}^{M-1} \left(2a_{i}a_{j}\right)\Big|_{i\neq j}\right] &= E\left[\sum_{i=0}^{M-1}\sum_{j=0}^{M-1} \left(2r_{i+n}^{I}r_{i+n+M}^{I}r_{j+n}^{I}r_{j+n+M}^{I}\right)\Big|_{i\neq j}\right] \\ &= 2\sum_{i=0}^{M-1}\sum_{j=0}^{M-1} E\left(r_{i+n}^{I}r_{i+n+M}^{I}r_{j+n}^{I}r_{j+n+M}^{I}\right)\Big|_{i\neq j} \simeq M^{2}\left(\frac{\sigma_{s}^{4}}{2}\right) \\ \because\ E\left(r_{i+n}^{I}r_{i+n+M}^{I}r_{j+n}^{I}r_{j+n+M}^{I}\right) &= E\left(r_{i+n}^{I}r_{i+n+M}^{I}\right) E\left(r_{j+n}^{I}r_{j+n+M}^{I}\right) + E\left(r_{i+n}^{I}r_{j+n}^{I}\right) E\left(r_{i+n+M}^{I}r_{j+n+M}^{I}\right) + E\left(r_{i+n}^{I}r_{j+n+M}^{I}\right) E\left(r_{i+n+M}^{I}r_{j+n}^{I}\right) \\ &= \left(\frac{\sigma_{s}^{4}}{4}\right) + 0 + 0 \end{split}$$

$$\begin{split} E\left[\sum_{i=0}^{M-1}\sum_{j=0}^{M-1} \left(2b_{i}b_{j}\right)\Big|_{i\neq j}\right] &= E\left[\sum_{i=0}^{M-1}\sum_{j=0}^{M-1} \left(2r_{i+n}^{Q}r_{i+n+M}^{Q}r_{j+n}^{Q}r_{j+n+M}^{Q}\right)\Big|_{i\neq j}\right] \\ &= 2\sum_{i=0}^{M-1}\sum_{j=0}^{M-1} E\left(r_{i+n}^{Q}r_{i+n+M}^{Q}r_{j+n}^{Q}r_{j+n+M}^{Q}\right)\Big|_{i\neq j} \simeq M^{2}\left(\frac{\sigma_{s}^{4}}{2}\right) \end{split}$$

$$\therefore E\left[\sum_{i=0}^{M-1}\sum_{j=0}^{M-1} \left(2a_ib_j + \left(2a_ia_j\right)\Big|_{i\neq j} + \left(2b_ib_j\right)\Big|_{i\neq j}\right)\right] = 3M^2\left(\frac{\sigma_s^4}{2}\right)$$
(3-27)

Equation 3-22 can therefore be rewritten as Equation 3-28.

$$E\left[u_{1}^{2} \mid H_{1}\right] = E\left[\left(\sum_{i=0}^{M-1} \left(r_{i+n}^{I} r_{i+n+M}^{I} + r_{i+n}^{Q} r_{i+n+M}^{Q}\right)\right)^{2}\right]$$

$$= E\left[\sum_{i=0}^{M-1} a_{i}^{2}\right] + E\left[\sum_{i=0}^{M-1} b_{i}^{2}\right] + E\left[\sum_{i=0}^{M-1} \sum_{j=0}^{M-1} \left(2a_{i}b_{j} + \left(2a_{i}a_{j}\right)\big|_{i\neq j} + \left(2b_{i}b_{j}\right)\big|_{i\neq j}\right)\right]$$

$$= \frac{M^{2}}{2}\left[\left(\sigma_{s}^{2} + \sigma_{n}^{2}\right)^{2} + 2\left(\sigma_{s}^{2}\right)^{2}\right] + 3M^{2}\left(\frac{\sigma_{s}^{4}}{2}\right)$$

$$= \frac{M^{2}}{2}\left[\left(\sigma_{s}^{2} + \sigma_{n}^{2}\right)^{2} + 5\left(\sigma_{s}^{2}\right)^{2}\right]$$
(3-28)

In the same way, the second moment of $u_2$ can be derived, as shown in Equation 3-29, with $a_i = r_{i+n}^{Q}r_{i+n+M}^{I}$ and $b_i = r_{i+n}^{I}r_{i+n+M}^{Q}$ in this case.

$$E\left[u_{2}^{2} \mid H_{1}\right] = E\left[\left(\sum_{i=0}^{M-1} \left(r_{i+n}^{Q}r_{i+n+M}^{I} - r_{i+n}^{I}r_{i+n+M}^{Q}\right)\right)^{2}\right]$$

$$= E\left[\sum_{i=0}^{M-1}a_{i}^{2}\right] + E\left[\sum_{i=0}^{M-1}b_{i}^{2}\right] - E\left[\sum_{i=0}^{M-1}\sum_{j=0}^{M-1}\left(2a_{i}b_{j} - (2a_{i}a_{j})\big|_{i\neq j} - (2b_{i}b_{j})\big|_{i\neq j}\right)\right]$$

$$= \frac{M^{2}}{2}\left[\left(\sigma_{s}^{2} + \sigma_{n}^{2}\right)^{2} + 2\left(\sigma_{s}^{2}\right)^{2}\right] - M^{2}\left(\frac{\sigma_{s}^{4}}{2}\right)$$

$$= \frac{M^{2}}{2}\left[\left(\sigma_{s}^{2} + \sigma_{n}^{2}\right)^{2} + \left(\sigma_{s}^{2}\right)^{2}\right]$$
(3-29)

The second moments of $u_{4i}$ and $u_{5i}$ are shown in Equation 3-30.

$$E\left[u_{4i}^{2} \mid H_{1}\right] = E\left[u_{5i}^{2} \mid H_{1}\right] = E\left[(s_{i} + \omega_{i})^{2}\right] = \frac{\sigma_{s}^{2} + \sigma_{n}^{2}}{2}$$
(3-30)

From Equations 3-18 to 3-30, the variances are obtained in Equations 3-31 to 3-34.

$$Var\left[u_{1} \mid H_{1}\right] = E\left[u_{1}^{2} \mid H_{1}\right] - \left(E\left[u_{1} \mid H_{1}\right]\right)^{2} = \frac{M^{2}}{2}\left[\left(\sigma_{s}^{2} + \sigma_{n}^{2}\right)^{2} + 3\left(\sigma_{s}^{2}\right)^{2}\right]$$

$$Var\left[u_{2} \mid H_{1}\right] = E\left[u_{2}^{2} \mid H_{1}\right] - \left(E\left[u_{2} \mid H_{1}\right]\right)^{2} = \frac{M^{2}}{2}\left[\left(\sigma_{s}^{2} + \sigma_{n}^{2}\right)^{2} + \left(\sigma_{s}^{2}\right)^{2}\right]$$

$$Var\left[u_{4i} \mid H_{1}\right] = E\left[u_{4i}^{2} \mid H_{1}\right] - \left(E\left[u_{4i} \mid H_{1}\right]\right)^{2} = \left(\frac{\sigma_{s}^{2} + \sigma_{n}^{2}}{2}\right)$$

$$Var\left[u_{5i} \mid H_{1}\right] = E\left[u_{5i}^{2} \mid H_{1}\right] - \left(E\left[u_{5i} \mid H_{1}\right]\right)^{2} = \left(\frac{\sigma_{s}^{2} + \sigma_{n}^{2}}{2}\right)$$

(3-31~34)

From these equations, V(n) approximately follows a non-central F-distribution with 2 and 2M degrees of freedom under the H<sub>1</sub> hypothesis. The non-centrality parameter $\lambda$ is derived in Equation 3-35.

$$\lambda = \sum_{i=0}^{1} \left( \frac{\mu_{i}^{2}}{\sigma_{i}^{2}} \right) = \frac{(M\sigma_{s}^{2})^{2}}{\frac{M^{2}}{2} \left[ \left( \sigma_{s}^{2} + \sigma_{n}^{2} \right)^{2} + 3 \left( \sigma_{s}^{2} \right)^{2} \right] \times \left( \frac{2}{2M} \right)^{2}} = \frac{2M^{2}}{\left( 1 + \frac{1}{SNR} \right)^{2} + 3}$$
(3-35)

The signal-to-noise ratio, SNR, is defined as $SNR = \sigma_s^2 / \sigma_n^2$. To analyze the performance, we summarize the probability distribution of V(n) under the H<sub>0</sub> and H<sub>1</sub> conditions.

H0: F-distribution with 2 and 2M degrees of freedom.

H1: Non-central F-distribution with 2 and 2M degrees of freedom and non-centrality parameter $\lambda$.
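As a small numerical companion to Equation 3-35, the following sketch evaluates $\lambda$ for a given M and SNR. It also evaluates the tail probability of a central F variate with (2, 2M) degrees of freedom, which has the closed form $(1 + x/M)^{-M}$. The function names are illustrative, and treating the detector threshold directly as a threshold on the F variate (without any extra scaling of V(n)) is an assumption of this sketch.

```python
import math


def noncentrality(M: int, snr_db: float) -> float:
    """Non-centrality parameter lambda of Equation 3-35."""
    snr = 10.0 ** (snr_db / 10.0)  # linear SNR = sigma_s^2 / sigma_n^2
    return 2.0 * M ** 2 / ((1.0 + 1.0 / snr) ** 2 + 3.0)


def pfa_central_f(threshold: float, M: int) -> float:
    """P(F > threshold) for a central F variate with (2, 2M) degrees of freedom.

    Assumption of this sketch: the threshold is applied to the F variate itself.
    """
    return (1.0 + threshold / M) ** (-M)


if __name__ == "__main__":
    for M in (16, 32):
        lam = noncentrality(M, snr_db=2.0)
        print(f"M={M:2d}  lambda={lam:8.2f}  Pfa(x=3)={pfa_central_f(3.0, M):.4f}")
```

For a fixed SNR, $\lambda$ grows with $M^2$, which is the reason a longer auto-correlation window separates the two hypotheses more clearly.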

The frequency offset effect can be modeled into SNR loss. Equation 3-36 shows the degradation of SNR due to frequency error.

$$\begin{split} D = {} & 10\log_{10}\left(1 - \frac{IFO}{BW/RF} - \frac{4}{64}\right) \\ & + 20 \times M \times \log_{10}\left|\mathrm{real}\left(\exp\left(\frac{FFO}{BW/RF}\right)\right) - \mathrm{imag}\left(1 - \exp\left(\frac{FFO}{BW/RF}\right)\right)\right| \\ & + 20 \times M \times \log_{10}\left|\mathrm{real}\left(\exp\left(\frac{FFO}{BW/RF}\right)\right) + \mathrm{imag}\left(1 - \exp\left(\frac{FFO}{BW/RF}\right)\right)\right| \end{split}$$
(3-36)

where BW is the bandwidth, RF is the radio frequency, and IFO and FFO represent the integer frequency offset and fractional frequency offset, respectively. For example, BW is 5 MHz and RF is 1.4 GHz in our system. If FO = 3000 ppm, the IFO and FFO are given by Equations 3-37 and 3-38, respectively.

$$IFO = \mathrm{floor} \left( \frac{3000 \times 10^{-6}}{BW/RF/16} \right) \times (BW/RF/16) = 2899 \text{ ppm}$$

$$FFO = 3000 - 2899 = 101 \text{ ppm}$$
(3-37~38)

The SNR degradation is -9.24 dB, as shown in Equation 3-39.

$$\begin{split} D = {} & 10\log_{10}\left(1 - \frac{2899}{BW/RF} - \frac{4}{64}\right) \\ & + 20 \times M \times \log_{10} \left| \mathrm{real} \left( \exp\left(\frac{101}{BW/RF}\right) \right) - \mathrm{imag} \left( 1 - \exp\left(\frac{101}{BW/RF}\right) \right) \right| \\ & + 20 \times M \times \log_{10} \left| \mathrm{real} \left( \exp\left(\frac{101}{BW/RF}\right) \right) + \mathrm{imag} \left( 1 - \exp\left(\frac{101}{BW/RF}\right) \right) \right| \\ = {} & -9.24 \text{ dB} \end{split}$$
(3-39)

Using the MATLAB functions ncfpdf (probability density function of the non-central F-distribution) and ncfcdf (cumulative distribution function of the non-central F-distribution), the ROC curves are plotted under different SNR and frequency error conditions, as shown in Figure 3-1. Pd represents the detection rate and Pfa represents the false alarm rate. The derived results are similar to the simulation results, and the comparison is shown in Chapter 4.
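For readers without access to ncfcdf, the shape of such an ROC curve can be approximated with the Python standard library alone. The sketch below is a Monte Carlo stand-in, not the MATLAB procedure used for Figure 3-1: it draws non-central F variates with (2, 2M) degrees of freedom directly from Gaussian samples to estimate Pd, and uses the closed-form tail $(1 + x/M)^{-M}$ of the central F-distribution for Pfa. The value of `lam` in the example is a hypothetical effective non-centrality after SNR loss, chosen only so that the curve is not trivially ideal.

```python
import math
import random


def roc_points(M, lam, thresholds, trials=4000, seed=1):
    """Return (threshold, Pfa, Pd) triples for an F(2, 2M) detector statistic."""
    rng = random.Random(seed)
    stats = []
    for _ in range(trials):
        # Non-central chi-square, 2 degrees of freedom, non-centrality lam.
        num = (rng.gauss(0.0, 1.0) + math.sqrt(lam)) ** 2 + rng.gauss(0.0, 1.0) ** 2
        # Central chi-square with 2M degrees of freedom.
        den = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(2 * M))
        stats.append((num / 2.0) / (den / (2.0 * M)))
    points = []
    for t in thresholds:
        pd = sum(1 for v in stats if v > t) / trials  # empirical detection rate
        pfa = (1.0 + t / M) ** (-M)                   # exact central-F tail
        points.append((t, pfa, pd))
    return points


if __name__ == "__main__":
    # lam=8.0 is a hypothetical value used only for illustration.
    for t, pfa, pd in roc_points(M=16, lam=8.0, thresholds=[1.0, 3.0, 6.0]):
        print(f"threshold={t:4.1f}  Pfa={pfa:.4f}  Pd={pd:.4f}")
```

Sweeping the threshold over a finer grid and plotting Pd against Pfa gives the ROC curve for the chosen M and `lam`.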



Figure 3-1 (b): ROC curves from mathematics deduction (Zoom-in)

The mathematical derivation shows that the ROC performance improves considerably as M increases. Figure 3-2 shows the ROC curves for different M values under different SNR conditions. In our design, M=32 is chosen as the auto-correlation length.



Figure 3-2: ROC curves under different M

## 3-2 PAFEE -- Frequency Error Estimation Algorithm

Many approaches exist for overcoming the frequency mismatch problem. Among hardware approaches, the best known are the phase-locked loop (PLL) and the frequency-locked loop (FLL), which are electronic control systems that generate a clock locked to a reference frequency.

In baseband digital signal processing, several frequency offset estimation and tracking algorithms exist for OFDM systems. They use baseband processing to achieve frequency error estimation and calibration. Our system likewise estimates the frequency error by baseband processing.

#### 3-2-1 Conventional Algorithm

The most popular conventional approach to frequency error estimation is Moose's work [21], which uses a repeated training sequence to estimate the frequency error. The estimator is derived by maximum likelihood estimation (MLE), as shown in Equation 3-40.



where $\hat{\varepsilon}$ is the estimated frequency error, and Y<sub>1k</sub> and Y<sub>2k</sub> are the two repeated parts received from the transmitter side.
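As an illustration of this kind of estimator, the sketch below uses the form in which Moose's maximum likelihood estimator is commonly cited: the angle of $\sum_k Y_{2k} Y_{1k}^{*}$ divided by $2\pi$. This form and the normalization (offset expressed as a fraction of the repetition rate) are assumptions of the sketch. The training block, noise level, and function names are hypothetical.

```python
import cmath
import math
import random


def moose_estimate(y1, y2):
    """Offset estimate from two repeated blocks: angle(sum(y2 * conj(y1))) / (2*pi)."""
    acc = sum(b * a.conjugate() for a, b in zip(y1, y2))
    return math.atan2(acc.imag, acc.real) / (2.0 * math.pi)


def noisy(block, sigma, rng):
    """Add complex Gaussian noise with standard deviation sigma per component."""
    return [v + complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)) for v in block]


rng = random.Random(0)
true_eps = 0.2  # hypothetical offset, as a fraction of the repetition rate
y1 = [complex(rng.choice((-1, 1)), rng.choice((-1, 1))) for _ in range(64)]
y2 = [v * cmath.exp(2j * math.pi * true_eps) for v in y1]  # second copy, rotated

y1_rx = noisy(y1, 0.1, rng)
y2_rx = noisy(y2, 0.1, rng)

if __name__ == "__main__":
    print("estimate:", moose_estimate(y1_rx, y2_rx))
```

Because the estimate is an angle, it is only unambiguous for offsets within half the repetition rate: a noise-free offset of 0.7 is returned as -0.3. This wrap-around is the limitation that an additional integer-offset search has to resolve.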

Sliskovic's approach [22] enables joint carrier and sampling frequency offset estimation from repeated pilot symbols, also using MLE. The sampling frequency offset can be derived from the difference of the phase shifts on two neighboring subcarriers, where $R_{1i}$ and $R_{2i}$ are the received signals.

$$\hat{\varepsilon}_{s,i} = (\varepsilon_s i + \varepsilon_c) - (\varepsilon_s (i-1) + \varepsilon_c) = \frac{1}{2\pi} \left[ \angle \left( \frac{R_{1i}}{R_{2i}} \right) - \angle \left( \frac{R_{1i-1}}{R_{2i-1}} \right) \right]$$
(3-41)

Because of the noise, the estimate is obtained by combining the per-subcarrier values with weighting factors:

$$\hat{\varepsilon}_{s} = \sum_{i=1}^{M-1} \omega_{i} \cdot \hat{\varepsilon}_{s,i}$$
(3-42)

The carrier frequency offset can also be derived by using the same weighting factor.

$$\hat{\varepsilon}_{c,i} = \frac{1}{2\pi} \left[ \angle \left( \frac{R_{1i}}{R_{2i}} \right) - i \cdot \hat{\varepsilon}_s \right]$$

$$\hat{\varepsilon}_c = \sum_{i=1}^{M-1} \omega_i \cdot \hat{\varepsilon}_{c,i}$$
(3-43)

These equations take advantage of the time-domain phase shift to estimate the frequency offset by an inner product. This method is popular for frequency error estimation in OFDM systems. However, it is suitable only for the FFO effect; if the system also has an IFO, this approach fails.


#### 3-2-2 Proposed Algorithm

To handle the IFO effect, a correlator bank is introduced to detect the frequency error. The aim of the correlator bank is to capture frequency-domain information from the correlation results. Assume the frequency-domain signal is s[n], H0 is the null hypothesis, and H1 is the alternative hypothesis, as shown in Equation 3-44. With w[n] as white Gaussian noise, the probability densities under H1 and H0 are given by Equation 3-45 and Equation 3-46, respectively.

$$H_0: x[n] = w[n], \quad n = 0, 1, ..., N-1$$

$$H_1: x[n] = s[n] + w[n], \quad n = 0, 1, ..., N-1$$
(3-44)

where $\underline{x} = [x[0], x[1], ..., x[N-1]]^T$.

$$p(\underline{x}; H_1) = \frac{1}{(2\pi\sigma^2)^{N/2}} \exp\{-\frac{1}{2\sigma^2} \sum_{n=0}^{N-1} (x[n] - s[n])^2\}$$
(3-45)

$$p(\underline{x}; H_0) = \frac{1}{(2\pi\sigma^2)^{N/2}} \exp\{-\frac{1}{2\sigma^2} \sum_{n=0}^{N-1} (x[n])^2\}$$
(3-46)

By the Neyman-Pearson theorem [23], the detection probability is maximized for a given false alarm rate by deciding H1 when the likelihood ratio exceeds a threshold $\gamma$.

$$L(x) = \frac{p(\underline{x}; H_1)}{p(\underline{x}; H_0)} > \gamma$$

$$L(x) = \exp\left\{-\frac{1}{2\sigma^2} \left(\sum_{n=0}^{N-1} (x[n] - s[n])^2 - \sum_{n=0}^{N-1} (x[n])^2\right)\right\} > \gamma$$

$$\ln L(x) = -\frac{1}{2\sigma^2} \left(\sum_{n=0}^{N-1} (x[n] - s[n])^2 - \sum_{n=0}^{N-1} (x[n])^2\right) > \ln \gamma$$

$$\frac{1}{\sigma^2} \sum_{n=0}^{N-1} x[n] s[n] - \frac{1}{2\sigma^2} \sum_{n=0}^{N-1} s^2[n] > \ln \gamma$$
(3-47~50)

From the previous equations, the test statistic T(x) and threshold can be decided as Equation 3-51,

$$T(x) = \sum_{n=0}^{N-1} x[n] s[n] > \sigma^2 \ln \gamma + \frac{1}{2} \sum_{n=0}^{N-1} s^2[n]$$
(3-51)
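The decision rule of Equation 3-51 can be sketched in a few lines. The example below assumes a real-valued known signal in white Gaussian noise, as in the derivation above. The test tone, noise variance, and $\gamma = 1$ are hypothetical values chosen only to exercise the rule.

```python
import math
import random


def np_correlator(x, s, sigma2, gamma):
    """Replica-correlator test of Equation 3-51.

    Returns (T, threshold, decide_H1) with T = sum(x[n] * s[n]).
    """
    t = sum(xn * sn for xn, sn in zip(x, s))
    threshold = sigma2 * math.log(gamma) + 0.5 * sum(sn * sn for sn in s)
    return t, threshold, t > threshold


# Hypothetical example: a 32-sample cosine in noise of variance 0.04.
N = 32
sigma2 = 0.04
s = [math.cos(2.0 * math.pi * 3 * n / N) for n in range(N)]
rng = random.Random(0)
x_h0 = [rng.gauss(0.0, math.sqrt(sigma2)) for _ in range(N)]
x_h1 = [sn + rng.gauss(0.0, math.sqrt(sigma2)) for sn in s]

if __name__ == "__main__":
    print("H0 input:", np_correlator(x_h0, s, sigma2, gamma=1.0))
    print("H1 input:", np_correlator(x_h1, s, sigma2, gamma=1.0))
```

With $\gamma = 1$ the threshold reduces to half the signal energy, so the statistic only has to cross the midpoint between its mean values under the two hypotheses.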

Assume the signals at different positions in the frequency domain are $s_1[n]$, $s_2[n]$, ..., $s_N[n]$. The hypothesis test can then be expressed as follows, with test statistics $T_1(x), T_2(x), ..., T_N(x)$ corresponding to $s_1[n], s_2[n], ..., s_N[n]$. Let

$$H_{0}: x[n] = w[n], \quad n = 0, 1, ..., N-1$$

$$H_{1}: x[n] = s_{1}[n] + w[n], \quad n = 0, 1, ..., N-1$$

$$H_{2}: x[n] = s_{2}[n] + w[n], \quad n = 0, 1, ..., N-1$$

$$\vdots$$

$$H_{N}: x[n] = s_{N}[n] + w[n], \quad n = 0, 1, ..., N-1$$

Hypothesis $H_k$ is then tested with

$$T_{k}(x) = \sum_{n=0}^{N-1} x[n]s_{k}[n] > \sigma^{2} \ln \gamma + \frac{1}{2} \sum_{n=0}^{N-1} s_{k}^{2}[n], \quad k = 1, 2, ..., N$$

Figure 3-3 shows the frequency error estimator block with the phase rotator and correlator bank. After packet detection, the FFO, $\varepsilon_1$, is estimated by the symbol inner product, and the FFO of the data is then calibrated. The pilot tracer detects the signal properties in the frequency domain, i.e., the number and position of the tones. Each branch in the correlator bank calculates the correlation of the signal with a constant sequence $R_i$, which has the form expressed in Equation 3-52.

$$R_i = \mathrm{ifft}\left([\,\mathrm{zeros}(1, i*4)\ \ {-1-j}\ \ \mathrm{zeros}(1, 63-i*4)\,]\right), \quad i = 1, ..., 15$$
(3-52)

There are 16 branches in the correlator bank. The frequency shift $\varepsilon_2$ is decided by the pilot tracer. The overall frequency offset is then given by

$$\varepsilon = \varepsilon_1 + \varepsilon_2. \tag{3-53}$$



The estimated value is sent to the tunable clock generator so that the WSN clock can be tuned accurately.
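To make the correlator bank more concrete, the sketch below builds the reference sequences of Equation 3-52 with a plain inverse DFT and picks the branch whose correlation magnitude is largest. It is a minimal illustration under stated assumptions: the MATLAB vector is read as a 64-point spectrum with a single tone of value -1-j at zero-based bin 4i, branch selection by maximum magnitude is our simplification of the pilot tracer, and the conversion from branch index to $\varepsilon_2$ in ppm is omitted because the mapping table is not given in the text.

```python
import cmath
import math

N = 64  # correlation length


def idft(spectrum):
    """Inverse DFT with the 1/N scaling used by MATLAB's ifft."""
    n_pts = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * n / n_pts) for k in range(n_pts)) / n_pts
        for n in range(n_pts)
    ]


def reference(i):
    """R_i of Equation 3-52: a single tone -1-j at zero-based bin 4*i."""
    spectrum = [0j] * N
    spectrum[4 * i] = complex(-1, -1)
    return idft(spectrum)


BANK = {i: reference(i) for i in range(1, 16)}


def detect_branch(rx):
    """Return (best branch index, all correlation magnitudes) for 64 received samples."""
    scores = {
        i: abs(sum(r * c.conjugate() for r, c in zip(rx, ref)))
        for i, ref in BANK.items()
    }
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    # Hypothetical received block: the tone of branch 5 with unknown gain and phase.
    rx = [2.0 * v * cmath.exp(0.3j) for v in reference(5)]
    best, _ = detect_branch(rx)
    print("detected branch:", best)
```

Because the tones sit on distinct DFT bins, the references are mutually orthogonal over 64 samples, so only the matching branch produces a significant output in the noise-free case.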

## 3-3 PAFEE Method Flow

The overall PAFEE method flow is shown in Figure 3-4. The system starts in the initial state, and the packet detector then continuously searches for a packet. Once a packet has been detected, the first FFO is estimated by the inner product. The phase rotator compensates the received data and reduces the FFO for the next procedure. After packet detection, the correlator bank searches for the pilot tone in the frequency domain using the cross-correlation results, and the IFO is then estimated. The sum of the FFO and IFO is sent to the eCrystal to calibrate the frequency mismatch relative to the CPN side.

This procedure is iterated N times, with 4 short preambles used in each iteration. N depends on the required performance, and N=3 in our system. After that, the eCrystal is supposed to be tuned accurately within one clock cycle (200 ns), and the incoming data will be distorted by the adjusted clock. After N iterations, the eCrystal is accurate at 5 MHz. The boundary detector then performs boundary detection, and a further fine FFO estimate is obtained.



Figure 3-4: Synchronization PAFEE algorithm flow

# Chapter 4 Simulation Results and Performance Analysis

The proposed PAFEE method is evaluated in the WiBoC OFDM-based system under AWGN, CFO, and SCO channel conditions. This chapter illustrates the system performance to show the effect of the CFO and SCO pre-calibration. In our system, the target SNR at FER = 1% is about 5.4 dB. The evaluation of our synchronization algorithm therefore focuses mainly on SNR = 2 dB and compares it with the conventional one.

Our target system tolerance frequency error is 2800 ppm. The bandwidth in our system is 5MHz, and the RF frequency is 1.4GHz. The number of preamble pilots in the frequency domain is 12. So the maximum system tolerance is about

$$\max FO = \frac{12}{64/4} \times \frac{5MHz}{1.4GHz} = 2678.6\,ppm \tag{4-1}$$

Here FO is the abbreviation of frequency offset. The target system frequency error tolerance is therefore defined as 2800 ppm.
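The arithmetic of Equation 4-1 can be checked with a short script. The parameter names are ours; reading the denominator 64/4 as a 64-point FFT with a pilot spacing of 4 subcarriers is an interpretation of the equation, not a statement from the text.

```python
def max_tolerable_fo_ppm(num_pilots, fft_size, pilot_spacing, bw_hz, rf_hz):
    """Evaluate Equation 4-1 and return the result in ppm."""
    return num_pilots / (fft_size / pilot_spacing) * (bw_hz / rf_hz) * 1e6


if __name__ == "__main__":
    fo = max_tolerable_fo_ppm(num_pilots=12, fft_size=64, pilot_spacing=4,
                              bw_hz=5e6, rf_hz=1.4e9)
    print(f"max FO = {fo:.1f} ppm")  # prints "max FO = 2678.6 ppm"
```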

### 4-1 Simulation on PAFEE -- Packet Detection

#### 4-1-1 ROC Curves

To depict the packet detection performance, the ROC curves are discussed under different conditions, i.e., different M, L, SNR, and frequency error. The characteristic curve of the popular conventional approach is shown in Figure 4-1. Each point is simulated with 2000 packets. The ROC curves have a right angle; in other words, there is a threshold value that gives Pd = 1 and Pfa = 0. However, when the frequency error is 3000 ppm, the curve is no longer ideal. We want to find an ROC curve that meets our requirement under the large frequency error condition.

Figure 4-2 shows the comparison between the simulation results and the theoretical derivation from Chapter 3. Taking M=16 and L=1 as an example, the curves are very close under frequency errors of 2000 ppm and 3000 ppm. In other words, the derived model is close to the simulated one.



Figure 4-1(a): Frequency error estimation block



Figure 4-2 (a): Comparison between simulation and theorem deduction



According to the theoretical derivation in Chapter 3, the ROC curves can be brought closer to the ideal one by increasing M. The other way is to increase the depth L, a parameter introduced here. The packet detection procedure is shown in Equation 4-2.

$$V(n) = \frac{|C(n)|^{2}}{P(n)}, \quad \text{where } C(n) = \sum_{i=0}^{M-1} r_{i+n}r_{i+n+M}^{*} \text{ and } P(n) = \sum_{i=0}^{M-1} |r_{i+n+M}|^{2}$$
(4-2)

The average of V(n) over L consecutive samples is shown in Equation 4-3.

$$V_{L}(n) = \frac{1}{L} \sum_{i=0}^{L-1} V(n+i)$$
(4-3)

$V_{L}(n)$ is then defined as the new timing metric. This operation smooths out the noise and improves the performance. Figure 4-3 shows the simulation results under different L conditions.
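Equations 4-2 and 4-3 translate directly into code. The sketch below is a straightforward, unoptimized evaluation of both metrics on a list of complex samples. The test signal (a repeated unit-modulus block with a constant phase increment per sample) and the small values of M and L are hypothetical and only serve to check the formulas.

```python
import cmath
import math


def timing_metric(r, n, M):
    """V(n) of Equation 4-2: |C(n)|^2 / P(n)."""
    c = sum(r[n + i] * r[n + i + M].conjugate() for i in range(M))
    p = sum(abs(r[n + i + M]) ** 2 for i in range(M))
    return abs(c) ** 2 / p if p > 0.0 else 0.0


def averaged_metric(r, n, M, L):
    """V_L(n) of Equation 4-3: mean of V(n), ..., V(n + L - 1)."""
    return sum(timing_metric(r, n + i, M) for i in range(L)) / L


# Hypothetical noise-free preamble: a unit-modulus block of length M repeated
# four times, with a frequency offset modeled as a phase step theta per sample.
M, L = 8, 4
theta = 0.05
period = [cmath.exp(2j * math.pi * k * k / M) for k in range(M)]
r = [period[k % M] * cmath.exp(1j * theta * k) for k in range(4 * M)]

if __name__ == "__main__":
    print("V(0)   =", timing_metric(r, 0, M))
    print("V_L(0) =", averaged_metric(r, 0, M, L))
```

In this noise-free example the phase rotation between the two halves is common to every product in C(n), so it cancels in |C(n)| and the metric equals the window energy P(n).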

In our system, M is chosen to be 32 and L is 32. This brings the ROC curve to the ideal one under FO = 2800 ppm, as shown in Figure 4-4.



Figure 4-3: ROC curves in different L



Figure 4-4: ROC curves in conventional one and proposed one

#### 4-1-2 Detection Rate, False Alarm Rate Versus SNR

Although the proposed algorithm guarantees a detection rate of 1 at SNR = 2 dB, the low-SNR condition, i.e., SNR < 1 dB, is also simulated. The proposed algorithm performs poorly under low-SNR conditions. If the same performance is required at a lower SNR, M and L have to be redefined. This effect can be explained by the algorithm analysis in Chapter 3-2: under a low-SNR condition, the distribution under the alternative hypothesis approaches the central F-distribution, and the ROC curve becomes close to a straight line. The detection rate and false alarm rate versus SNR are shown in Figure 4-5 and Figure 4-6. The threshold is chosen to be 0.29 for the conventional algorithm and 0.19 for the proposed one. The false alarm rate of the proposed algorithm is lower than that of the conventional one because the noise is white over the whole spectrum. In Figures 4-5 and 4-6, 2000 packets are simulated.



Figure 4-6: False alarm rate in conventional one and proposed one

Figure 4-6 shows the false alarm rate of the proposed algorithm compared with the conventional algorithm. Because the data are all noise, the points are randomly distributed.

#### 4-1-3 Detection Rate Versus Frequency Error

The detection rate is simulated under different frequency error conditions, i.e., SCO from -2800 ppm to 2800 ppm and CFO from -2800 ppm to 2800 ppm, with a threshold of 0.19. Figure 4-7 and Figure 4-8 show the detection rate versus frequency error under SNR = 1 dB and 2 dB, respectively. Each condition is simulated with 2000 packets. The detection rate is guaranteed to be 1 for any frequency error under SNR = 2 dB.



Figure 4-7: Detection rate under different CFO and SCO conditions, SNR=1 dB



Figure 4-8: Detection rate under different CFO and SCO conditions, SNR=2 dB


Here we simulate different frequency offset conditions, i.e., CFO and SCO. Because the baseband and synthesizer clocks in our system both come from the eCrystal, the CFO and SCO values are the same. We therefore use the frequency offset, FO, to represent the CFO and SCO effects together. Figure 4-9 shows the detection rate under different FO and SNR conditions. The worst case of the detection rate occurs at 2800 ppm.



## 4-2 Simulation on PAFEE -- Frequency Error Estimation

To depict the performance of frequency error estimation, the estimation correct rate versus frequency error is shown in Figure 4-10. An estimation is counted as correct when the estimation error is smaller than 50 ppm. The maximum remaining error after estimation is smaller than 2800 ppm. In other words, this procedure will not make the system diverge, i.e., reach a frequency error > 2800 ppm. The maximum remaining error versus frequency error under different SNR conditions is shown in Figure 4-11. Each condition is simulated with 5000 packets.





Figure 4-11 (b): Maximum remaining error versus frequency error (Zoom-in)

To make the frequency error estimation stable at 2 dB, we increase the short preamble length and repeat the estimation procedure. Let the estimation correct rate be Pec and the estimation error rate be Pee = 1 - Pec, and let N be the number of repetitions. The new estimation error rate and estimation correct rate are given by Equations 4-4 and 4-5, respectively.

$$P_{ee}' = P_{ee}^{N} = (1 - P_{ec})^{N}$$
(4-4)

$$P_{ec}' = 1 - P_{ee}^{N} = 1 - (1 - P_{ec})^{N}$$
(4-5)

With N=3, the estimation correct rate is approximately 1, i.e., 0.999973, under SNR = 1 dB and FO = 2800 ppm. That is, 12 short preambles are transmitted for synchronization.
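Equations 4-4 and 4-5 are easy to evaluate numerically. In the sketch below, the single-pass correct rate of 0.97 is an assumed input; it is the value for which Equation 4-5 reproduces the quoted 0.999973 at N=3. The function name is illustrative.

```python
def correct_rate_after(p_ec: float, n: int) -> float:
    """Equation 4-5: correct rate after n repetitions, P_ec' = 1 - (1 - P_ec)^n."""
    return 1.0 - (1.0 - p_ec) ** n


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, round(correct_rate_after(0.97, n), 6))
```

The formula treats the N estimation passes as independent trials, so the residual error rate shrinks geometrically with N.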

#### 4-3 Simulation on System PER Performance

The overall system performance is illustrated in Figure 4-12. The performance target is a packet error rate (PER) of 1%, i.e., PER = 1% at SNR = 5.4 dB. With the conventional method, the system only tolerates frequency errors below 100 ppm, and it cannot converge under large frequency error conditions, i.e., PER = 100%. With the proposed technique, the system tolerates a larger frequency error while the performance remains the same. Each condition is simulated with 5000 packets.



If the iteration number N=1 is simulated, the performance curves are as shown in Figure 4-13. The performance degrades because the estimation correct rate is slightly lower. Four short preambles are used in each iteration, so 3 iterations use 12 preambles. The clock is tuned after each synchronization procedure; that is, the clock is tuned 3 times when N is 3.



# Chapter 5

# Hardware Implementation

### 5-1 Overall Architecture

This chapter introduces the hardware implementation of the synchronization algorithm in the receiver part of the wireless sensor node. For detailed information on the overall WSN system tape-out chip, please refer to Appendix B. Figure 5-1 shows the receiver part of the WSN; the blocks that implement our algorithms are shown in gray.



Figure 5-1: Overall architecture of WSN system

#### 5-1-1 Packet Detector

Equation 5-1 shows the operation of the packet detector.

$$V(n) = \frac{\left|\sum_{i=0}^{M-1} r_{i+n} r^{*}_{i+n+M}\right|^{2}}{\sum_{i=0}^{M-1} \left|r_{i+n+M}\right|^{2}}, \text{ where } M=32.$$
(5-1)

There is one input FIFO with 192 registers in the receiver, 96 for I channel data and 96 for Q channel data. To implement Equation 5-1, the following components are required: one FIFO with 96 registers, 4 multipliers, 4 square multipliers, 7 adders, 3 subtracters, and one divider. Figure 5-2 shows the hardware architecture of the packet detector.

Equation 5-2 shows the moving average operation.

$$V_{\rm L}(n) = \frac{1}{L} \sum_{i=0}^{\rm L-1} V({\rm n+i})$$
(5-2)

The moving average has a simple architecture, as shown in Figure 5-3. In our application L = 32, so 33 registers, 1 adder, and 1 subtracter are used.
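A moving average that needs only one adder and one subtracter per sample is typically realized as a running sum: the newest value is added and the oldest one is subtracted. The sketch below models that behavior in software. It is an assumption about how the architecture of Figure 5-3 operates, and the storage layout (a length-L window plus one accumulator) is not meant to match the register count of the actual hardware.

```python
from collections import deque


class MovingAverage:
    """Running-sum moving average over the last L inputs."""

    def __init__(self, L: int):
        self.L = L
        self.window = deque([0.0] * L, maxlen=L)  # delay line, initially zero
        self.total = 0.0                          # accumulator register

    def push(self, v: float) -> float:
        oldest = self.window[0]
        self.window.append(v)     # the oldest entry is dropped automatically
        self.total += v - oldest  # one addition and one subtraction per sample
        return self.total / self.L


if __name__ == "__main__":
    ma = MovingAverage(32)
    out = [ma.push(float(k)) for k in range(1, 41)]
    print(out[-1])  # mean of the last 32 inputs, 9..40
```

Since L = 32 is a power of two, the final division would reduce to a 5-bit right shift in a fixed-point implementation.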



Figure 5-2: Hardware architecture of packet detector



Figure 5-3: Hardware architecture of moving average

#### 5-1-2 Phase Rotator

The phase rotator is implemented in order to solve the FFO problems. First the phase error is calculated by Equation 5-3.



Figure 5-4: Hardware architecture of phase rotator in packet detection state

The hardware architecture of the phase rotator is shown in Figure 5-4. Once the packet detector detects the data, the phase rotator estimates the FFO $\widehat{\varepsilon_{f1}}$. After packet detection, the system waits 32 cycles to calibrate the data, as shown in Equation 5-4.

$$r'_{i+n+M} = r_{i+n+M} \times (\cos(M\widehat{\varepsilon_{f1}}) + j\sin(M\widehat{\varepsilon_{f1}})), \quad \text{M is from 0 to 31}$$
(5-4)

The calibrated data is stored after FIFO 32.
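The calibration of Equation 5-4 amounts to multiplying sample m by a unit phasor whose angle grows by $\widehat{\varepsilon_{f1}}$ every cycle. The sketch below models this with a phase accumulator. The sign convention is an assumption: `eps` is taken to be the correction angle per sample, i.e., the negative of the rotation observed on the received data. The function name and test values are hypothetical.

```python
import cmath


def rotate(samples, eps, start=0):
    """Multiply sample m by cos(m*eps) + j*sin(m*eps), m = start, start+1, ..."""
    out = []
    phase = start * eps
    for r in samples:
        out.append(r * cmath.exp(1j * phase))  # cos(phase) + j*sin(phase)
        phase += eps                           # phase accumulator update
    return out


if __name__ == "__main__":
    eps = 0.01  # hypothetical correction angle per sample, in radians
    rx = [cmath.exp(-1j * eps * m) for m in range(32)]  # data rotating by -eps per sample
    print(max(abs(v - 1.0) for v in rotate(rx, eps)))   # residual after calibration
```

In hardware, the cosine and sine values would come from the lookup tables rather than from a floating-point exponential.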

After 32 cycles, a second, finer FFO estimation is performed, as shown in Figure 5-5. The FFO $\widehat{\varepsilon_{f2}}$ is estimated from the calibrated data to obtain a more accurate phase value. The subsequent data calibration is shown in Equation 5-5, where N denotes the N<sup>th</sup> cycle after the first 32 cycles.

$$r'_{i+n+N+32} = r_{i+n+N+32} \times (\cos((32+N)\widehat{\varepsilon_{f1}} + \sum_{i=0}^{N} \widehat{\varepsilon_{f2}}) + j\sin((32+N)\widehat{\varepsilon_{f1}} + \sum_{i=0}^{N} \widehat{\varepsilon_{f2}}))$$
(5-5)

The two operations happen at different times, so the hardware can be shared by using a multiplexer. The total hardware is composed of 6 multipliers, 4 adders, 2 subtracters, 1 arc tangent table, 1 cosine table, 1 sine table, and 1 phase accumulator. The phase accumulator is composed of one multiplier and one adder.



#### 5-1-3 Frequency Error Estimator

The hardware architecture of the frequency error estimator is shown in Figure 5-6. From Chapter 3, the correlation coefficient is given by Equation 5-6.

$$R_i = \mathrm{ifft}\left([\,\mathrm{zeros}(1, i*4)\ \ {-1-j}\ \ \mathrm{zeros}(1, 63-i*4)\,]\right), \quad i = 1, ..., 15$$
(5-6)

A 64-point cross-correlation is used in our system, so the length of $R_i$ is 64. Since the 64 points repeat every 16 points, the hardware can be simplified considerably. The real and imaginary parts of the 16 points of $R_i$ are shown in Figures 5-7 and 5-8. The calibrated data stored in positions 32 to 95 of the input FIFO are used for the cross-correlation. The pilot tracer selects the correlation coefficients from the correlator bank, and the correlation results then tell whether the pilot is detected. After all 15 pilot positions have been checked, the pilot tracer consults the mapping table to determine the frequency error.
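One way the 16-point periodicity of $R_i$ could be exploited is to fold the 64 input samples into 16 partial sums before multiplying by the coefficients, which cuts the number of complex multiplications per branch from 64 to 16. The sketch below checks numerically that the folded and the full correlation agree. Whether the hardware of Figure 5-6 is organized in this way is not stated in the text, so the folding scheme is an assumption of this sketch.

```python
import cmath
import math
import random


def ref64(i):
    """Closed form of R_i: (-1-j)/64 * exp(j*2*pi*4*i*n/64), n = 0..63."""
    return [complex(-1, -1) / 64 * cmath.exp(2j * math.pi * 4 * i * n / 64) for n in range(64)]


def corr_full(x, i):
    """Direct 64-point cross-correlation with R_i."""
    return sum(xn * c.conjugate() for xn, c in zip(x, ref64(i)))


def corr_folded(x, i):
    """Same correlation computed from 16 folded sums and 16 coefficients."""
    folded = [x[m] + x[m + 16] + x[m + 32] + x[m + 48] for m in range(16)]
    coeffs = ref64(i)[:16]  # one period of R_i
    return sum(f * c.conjugate() for f, c in zip(folded, coeffs))


if __name__ == "__main__":
    rng = random.Random(3)
    x = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(64)]
    worst = max(abs(corr_full(x, i) - corr_folded(x, i)) for i in range(1, 16))
    print("largest difference:", worst)
```

The folded sums do not depend on the branch index, so in this arrangement they could be computed once and shared by all branches of the bank.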



Figure 5-6: Hardware architecture of frequency error estimator



Figure 5-7: Real part of the Ri, i is from 1 to 16



The baseband processor has been implemented in fixed point hardware. In this section, the simulation results after the quantization are illustrated.

#### 5-2-1 Simulation on PAFEE -- Packet Detection

To show the performance of packet detection, the ROC curve after quantization is shown. Figure 5-9 shows the ROC curve under a frequency error of 2800 ppm and SNR = 2 dB. Each point is simulated with 2000 packets. There is one point with Pd = 1 and Pfa = 0. The threshold 0.19 is chosen in the timing metric for detecting the packet, which is the same as in the floating-point case. Figure 5-10 shows the detection rate under different frequency error conditions. Each condition is simulated with 2000 packets. The detection rate is 1 under all these frequency error conditions. In other words, the quantization does not affect the performance of packet detection.



Figure 5-9: Fixed point ROC curves



#### 5-2-2 Simulation on PAFEE -- Frequency Error Estimation

After quantization, the estimation correct rate decreases from 0.97 to 0.957. Figure 5-11 shows the estimation correct rate versus frequency error. The threshold of the correlation output is 0.0024. The worst-case estimation correct rate is 0.957. The maximum remaining error versus frequency error is shown in Figure 5-12. The worst case of the maximum remaining error is smaller than 2800 ppm. In other words, the system converges after N iterations, and N has to be decided here.

With Pec the estimation correct rate, Pee the estimation error rate, and N the number of iterations, Equation 5-7 and Equation 5-8 show the new estimation error rate and estimation correct rate after N iterations.

$$P_{ee}' = P_{ee}^{N} = (1 - P_{ec})^{N}$$
 (5-7)

$$P_{ec}' = 1 - P_{ee}^{N} = 1 - (1 - P_{ec})^{N}$$
(5-8)

The worst case of Pec is 0.957. To achieve an estimation correct rate larger than 0.9999 under SNR = 2 dB, N is chosen to be 3, which is the same as in the floating-point case.
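The choice of N can be cross-checked by searching for the smallest iteration count that satisfies Equation 5-8 for a given target. The helper below is an illustrative sketch; the function name is ours.

```python
def min_iterations(p_ec: float, target: float) -> int:
    """Smallest N with 1 - (1 - p_ec)^N >= target (Equation 5-8)."""
    n = 1
    while 1.0 - (1.0 - p_ec) ** n < target:
        n += 1
    return n


if __name__ == "__main__":
    print(min_iterations(0.957, 0.9999))  # prints 3
```

With Pec = 0.957, two iterations give about 0.99815, which misses the target, while three iterations give about 0.99992.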



Figure 5-11: Fixed point estimation correct rate versus frequency error



Figure 5-13 shows the packet error rate versus SNR with iteration number N=3. The fixed-point curve closely follows the floating-point one. In other words, the performance remains almost the same after quantization.



## 5-3 Hardware Overhead

Table 5-1 shows the hardware area overhead, relative to the conventional synchronization method, required to achieve the large frequency error tolerance. The design is evaluated in the UMC 90 nm process. In the conventional approach, the synchronization hardware is composed of a packet detector and a phase rotator, and it is able to overcome a 100 ppm frequency error. To overcome the larger frequency error, a different packet detector and a frequency error estimator are used. From the synthesis results, the overall hardware overhead is 60104  $\mu$m<sup>2</sup> in 90 nm technology. The synthesis tool is Synopsys Design Compiler.

The power overhead is shown in Table 5-2. The power analysis tool is Synopsys PrimePower, and 500 packets are simulated. The total power overhead of the proposed design is about 148.5  $\mu$W. The packet detector consumes about 50  $\mu$W more than the conventional one, and the frequency error estimator consumes about 100  $\mu$W.

| Hardware                  | Sub-block         | Conventional one (μm²) | Proposed one (μm²) |
|---------------------------|-------------------|------------------------|--------------------|
| Packet detector           |                   | 20762                  | 39535              |
| Phase rotator             | (total)           | 24765                  | 24765              |
|                           | FFO estimator     | 11093                  | 11093              |
|                           | cosine table      | 5325                   | 5325               |
|                           | sine table        | 4857                   | 4857               |
|                           | arc tangent table | 3490                   | 3490               |
| Frequency error estimator |                   | 0                      | 41331              |
| Total                     |                   | 45527                  | 105631             |
| Hardware overhead         |                   | 0                      | 60104              |
| Gate count overhead (area / 2.822 for 90 nm process) |  | 0          | 21298              |

Table 5-1: Area overhead
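The totals in Table 5-1 follow from simple arithmetic on the block areas. The sketch below re-derives them; the dictionary layout and variable names are assumptions made for illustration, while the numbers and the 2.822 μm² per gate factor are taken from the table.

```python
# Block areas in square micrometres: (conventional, proposed), from Table 5-1.
AREA_UM2 = {
    "packet_detector": (20762, 39535),
    "phase_rotator": (24765, 24765),
    "frequency_error_estimator": (0, 41331),
}
AREA_PER_GATE_UM2 = 2.822  # conversion factor quoted for the 90 nm process

conventional_total = sum(conv for conv, _ in AREA_UM2.values())
proposed_total = sum(prop for _, prop in AREA_UM2.values())
area_overhead = proposed_total - conventional_total
gate_count_overhead = round(area_overhead / AREA_PER_GATE_UM2)

print(conventional_total, proposed_total, area_overhead, gate_count_overhead)
```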

| Hardware                  | Conventional one (μW) | Proposed one (μW) |
|---------------------------|-----------------------|-------------------|
| Packet detector           | 52.21                 | 106.6             |
| Cosine table              | 2.053                 | 2.090             |
| Sine table                | 1.900                 | 1.939             |
| Arc tangent table         | 0.932                 | 0.918             |
| Frequency error estimator | 0                     | 94.97             |
| Others (FIFO + FSM)       | 117.0                 | 117.0             |
| Total                     | 174.6                 | 323.1             |
| Hardware overhead         | 0                     | 148.5             |

Table 5-2: Power overhead

# Chapter 6 Emulation of Crystal-less Baseband System

#### 6-1 Building Block

In this chapter, hardware emulation is implemented for the downlink procedure of the crystal-less WiBoC baseband system. The overall system building blocks are shown in Figure 6-1. In the downlink procedure, the CPN acts as the transmitter and the WSN as the receiver. The CPN transmits preambles over the downlink channel, and the WSN detects the preambles and performs synchronization in the FPGA. The clock source in the CPN is an accurate crystal clock source, whereas the WSN uses a crystal-less tunable clock source. After the downlink procedure, the WSN clock is tuned accurately, and its mismatch with the CPN clock is reduced.



Figure 6-1: Hardware emulation building blocks

#### 6-2 Evaluation System

The baseband and ADC/DAC systems are integrated on the WPI board. The received signal from the ADC is stored in the FPGA and then sent to the PC. The RF system is composed of several commercial products: one modulator, one demodulator, and two synthesizers. The modulator has 2 differential inputs for I-channel data and 2 for Q-channel data; the demodulator has 2 differential outputs for I-channel data and 2 for Q-channel data. To emulate a channel with frequency error, two synthesizers are used, one for the modulator and the other for the demodulator. Each synthesizer is programmed by the MCU to multiply its 5 MHz reference by 280, i.e., to 1.4 GHz.



Figure 6-2: Evaluation system

|   | Component   | Type Specification                   |
|---|-------------|--------------------------------------|
| A | FPGA        | Xc2v4000 4bf957                      |
| B | FPGA        | Xc2v6000 4bf957                      |
| C | DAC         | MAX5888 16-bit 500Msps DAC           |
| D | ADC         | AD6644 14-bit 40Msps ADC             |
| E | Modulator   | AD8346 0.8 to 2.5GHz Modulator       |
| F | Demodulator | AD8347 0.8 to 2.7GHz Demodulator     |
| G | Synthesizer | AD4360-5 1.2 to 1.4GHz Synthesizer   |
| H | CPN clock   | Crystal oscillator clock generator   |
| I | WSN clock   | Clock source from function generator |
| J | RF antenna  | N/A (RF cable now), lack of BPF      |
| K | MCU         | MCU integration board                |

Table 6-1: Baseband components

The clock source on the CPN side is an accurate crystal oscillator. The clock source on the WSN side is a function generator, which plays the role of a tunable clock source with a large error.

To demonstrate a frequency error of 1000 ppm, for example, the two clock frequencies can be set to 5 MHz and 5.005 MHz, so that the synthesizers produce 1.4 GHz and 1.4014 GHz, respectively. The components of the overall evaluation system are listed in Table 6-1.
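The mapping from a desired frequency error to the function generator setting and the resulting carrier can be sketched as follows. This is a minimal illustration under the 5 MHz reference and 280x multiplication stated above; the function names `offset_clock_hz` and `carrier_hz` are hypothetical.

```python
NOMINAL_CLOCK_HZ = 5_000_000   # CPN crystal clock
SYNTH_MULTIPLIER = 280         # synthesizer ratio programmed by the MCU


def offset_clock_hz(ppm: float, nominal_hz: float = NOMINAL_CLOCK_HZ) -> float:
    """Clock frequency that deviates from the nominal one by `ppm`."""
    return nominal_hz * (1_000_000 + ppm) / 1_000_000


def carrier_hz(clock_hz: float, multiplier: int = SYNTH_MULTIPLIER) -> float:
    """Synthesizer output for a given reference clock."""
    return clock_hz * multiplier


for ppm in (0, 1000, 2000, 2800):
    clk = offset_clock_hz(ppm)
    print(f"{ppm:>5} ppm: clock {clk / 1e6:.4f} MHz, carrier {carrier_hz(clk) / 1e9:.5f} GHz")
```

Because both synthesizers use the same multiplication ratio, the relative error of the carrier equals the relative error of the reference clock.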

#### 6-3 Emulation Results

The received signal (I channel), as displayed on the spectrum analyzer, is shown in Figure 6-3. Part (A) shows the preamble signal without any frequency error. Parts (B), (C), and (D) show the preamble with frequency errors of 1000, 2000, and 2800 ppm, respectively. Because there is no bandpass filter in the RF band, the pilot tones that are shifted out of the bandwidth are preserved.



Figure 6-3: Received signal shown in spectrum analyzer

Table 6-2 shows the relationship between frequency error and clock period. The crystal oscillator clock source used in the CPN has a frequency of 5 MHz and an RMS jitter of 210 ps. The clock source in the WSN is the function generator.

| FO       | CPN clock (crystal) frequency (MHz) | CPN clock (crystal) RMS jitter (ps) | WSN clock (crystal-less) frequency (MHz) | WSN clock (crystal-less) RMS jitter (ps) |
|----------|-------------------------------------|-------------------------------------|------------------------------------------|------------------------------------------|
| 0 ppm    | 5.0000                              | 210                                 | 5.0000                                   | 126.6                                    |
| 400 ppm  | 5.0000                              | 210                                 | 5.0020                                   | 142.6                                    |
| 800 ppm  | 5.0000                              | 210                                 | 5.0040                                   | 177.4                                    |
| 1200 ppm | 5.0000                              | 210                                 | 5.0060                                   | 221.4                                    |
| 1600 ppm | 5.0000                              | 210                                 | 5.0080                                   | 269.4                                    |
| 2000 ppm | 5.0000                              | 210                                 | 5.0101                                   | 320.8                                    |
| 2400 ppm | 5.0000                              | 210                                 | 5.0120                                   | 368.8                                    |
| 2800 ppm | 5.0000                              | 210                                 | 5.0140                                   | 415.6                                    |

Table 6-2: The relationship between frequency error and clock period

Figures 6-4 and 6-5 show the processed received signal and its estimation results in MATLAB. After the FFO in the preamble has been calibrated, the frequency estimator finds the pilot position and uses this information to estimate the frequency error. The FFT of the signal is also shown for comparison with the frequency estimator results.



Figure 6-4: Estimation results with frequency error 0, 400, 800, 1200 ppm



Figure 6-5: Estimation results with frequency error 1600, 2000, 2400, 2800 ppm

# Chapter 7 Conclusions and Future Work

#### 7-1 Conclusions

The system performance of the crystal WiBoC system and the crystal-less WiBoC system is compared in Table 7-1. In the crystal platform, the main components for clock generation are the external crystal and the crystal oscillator pad. The external crystal oscillator occupies a large area and makes a system-on-chip (SoC) design difficult to implement, and the oscillator pad consumes a lot of power. In the crystal-less system, by comparison, the synchronization hardware and the DCO together occupy only 0.46 mm<sup>2</sup>, a 98.4% area reduction relative to the crystal system.

The power consumed by the overhead components in the crystal-less platform is only 10.7% of that in the crystal system. The eCrystal system thus achieves an 89.3% power reduction and a 98.4% area reduction.

A synchronization method for OFDM-based WBAN applications is proposed. With this method, the communications system can be integrated into a crystal-less system and tolerates larger frequency errors, i.e., 2800 ppm. This work can also be applied to other OFDM systems. To apply the method to another system, the CPN has to transmit the downlink preamble to the WSN for symbol timing synchronization and frequency error estimation. The tolerable frequency error is bounded by the ratio of the bandwidth to the RF frequency.
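To give a feeling for the bound of bandwidth over RF frequency, the sketch below evaluates it for the WiBoC parameters, assuming the 5 MHz spectrum bandwidth and the 1.4 GHz carrier used in the emulation. The function name `max_tolerable_offset_ppm` is hypothetical, and expressing the bound in ppm is an interpretation of the statement above rather than a formula given in the thesis.

```python
def max_tolerable_offset_ppm(bandwidth_hz: float, rf_hz: float) -> float:
    """Upper bound on the tolerable frequency error, expressed in ppm."""
    return bandwidth_hz / rf_hz * 1e6


BANDWIDTH_HZ = 5e6    # spectrum bandwidth of the WiBoC system
RF_HZ = 1.4e9         # carrier frequency used in the emulation

limit_ppm = max_tolerable_offset_ppm(BANDWIDTH_HZ, RF_HZ)
print(f"bound: {limit_ppm:.0f} ppm, demonstrated tolerance: 2800 ppm")
```

Under these assumptions the bound is roughly 3570 ppm, so the demonstrated 2800 ppm tolerance lies below it.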

|                   | WiBoC on crystal platform | WiBoC on crystal-less platform |
|-------------------|---------------------------|--------------------------------|
| Hardware overhead | (1) Crystal oscillator [11]<br>(2) Oscillator pad | (1) Proposed algorithm<br>(2) eCrystal [24] |
| Area overhead     | (1) 8 mm<sup>2</sup> [11]<br>(2) 0.0147 mm<sup>2</sup><br>8.01 mm<sup>2</sup> (total) | (1) 0.06 mm<sup>2</sup><br>(2) 0.40 mm<sup>2</sup> [24]<br>0.46 mm<sup>2</sup> (total) |
| Power overhead    | (1) 1.8 μW (crystal) [11]<br>(2) 3625 μW (osc pad)<br>3626.8 μW (total) | (1) 149 μW<br>(2) 237 μW [24]<br>386 μW (total) |

Table 7-1: System comparison summary

The overall design characteristics are summarized in Table 7-2 in terms of hardware and system specifications. The designed hardware is evaluated in a UMC 90 nm process. For data processing, the overhead power consumption is 148.5 $\mu$W. In terms of system design, the WiBoC follows the constraints defined in WMTS and provides a maximum data rate of 4.85 Mbps.

| Parameter                 | Feature           |
|---------------------------|-------------------|
| Technology                | Std. 90nm CMOS    |
| Area overhead             | $60104 \ \mu m^2$ |
| Gate count overhead       | 21298             |
| Power overhead            | 148.5 μW          |
| RF band                   | 1395-1400 MHz     |
| Spectrum Bandwidth        | 5 MHz             |
| Constellation Mapper      | QPSK              |
| IFFT/FFT Block Size       | 64                |
| Maximum Data Throughput   | 4.85 Mbps         |
| Guard Interval Duration   | 0.4 µs            |
| Data Bit per OFDM symbols | 48                |

Table 7-2: WiBoC OFDM system summary

## 7-2 Future Work

In the future, the preamble packet can be redesigned to improve the baseband processing, i.e., to reduce the number of iterations and enlarge the frequency error tolerance.

The prototype platform will be upgraded by replacing the RF cable with an RF antenna and a low noise amplifier. An automatic gain control (AGC) mechanism will also be added to the prototype platform. The integrated communication PCB is under development.

However, the algorithm is limited by the bandwidth and the RF specification, so the initial error of the crystal-less clock generator needs to be improved. If the initial error of the eCrystal is reduced, the design effort in the algorithm part will be eased.

Based on this progress, the WSN can be redesigned and improved to become a low-power, low-cost, stable, and miniaturized single chip.



# References

- [1] Wireless Medium Access Control (MAC) and Physical Layer (PHY) Specifications for Wireless Personal Area Networks (WPANs), IEEE Standard 802.15.1, 2005.
- [2] Wireless Medium Access Control (MAC) and Physical Layer (PHY) Specifications for Low-Rate Wireless Personal Area Networks (LR-WPANs), IEEE Standard 802.15.4, 2003
- [3] S. Pentland, "Healthwear: Medical Technology Becomes Wearable," IEEE Computer Society, pp. 42-49, May 2004.
- [4] K. Lorincz, D.J. Malan, Fulford-Jones, Thaddeus R.F., A. Nawoj, A. Clavel, V. Shnayder, G. Mainland, S. Moulton, and M. Welsh, "Sensor Networks for Emergency Response: Challenges and Opportunities, " IEEE Pervasive Computing, pp.16-23, Oct. 2004.
- [5] B. Gyselinckx, C. Van Hoof, J. Ryckaert, R.F.Yazicioglu, P. Fiorini, V. Leonov, "Human++: autonomous wireless sensors for body area networks," Proceedings of the IEEE Custom Integrated Circuits Conference, pp.13-19, Sept. 2005.
- [6] Chen-Yi Lee and Jui-Yuan Yu, "Crystal-less Communications Device and Self-Calibrated Embedded Virtual Crystal Clock Generation Method," US/TW/JPA/Euro patent, Filed on Jul. 2008.
- [7] F. Sebastiano, S. Drago, L. Breems, D. Leenaerts, K. Makinwa, and B. Nauta, "Impulse Based Scheme for Crystal-less ULP Radios," IEEE Int. Symp. on Circuits and Systems (ISCAS), pp. 1508-1511, May 2008.
- [8] N. M. Nathan, S. Gambini, and J. M. Rabaey, "A 2GHz 52uW Wake-Up Receiver with -72dBm Sensitivity Using Uncertain-IF Architecture," IEEE Int. Solid-State Circuits Conference (ISSCC), pp. 524-633, Feb. 2008.
- [9] E. Lopelli, J. v. d. Tang, and A. H. M. v. Roermund, "A Frequency Offset Recovery Algorithm for Crystal-Less Transmitters," IEEE Int. symp. on Personal, Indoor and Mobile Radio Communications (PIMRC), pp. 1-5, Sept. 2006.

- [10] F. Sebastiano, L. Breems, K. Makinwa, S. Drago, D. Leenaerts, and B. Nauta, "A Low-Voltage Mobility-Based Frequency Reference for Crystal-Less ULP Radios," IEEE European Solid-State Circuits Conference (ESSCIRC), pp. 306-309, Sept. 2008.
- [11] Citizen [Online]. Available: http://www.citizencrystal.com
- [12] Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) specifications High-speed Physical Layer in the 5 GHz Band, IEEE Standard 802.11a, 1999.
- [13] J. Ryckaert, P. De Doncker, R. Meys, A. de Le Hoye, S. Donnay, "Channel model for wireless communication around human body," Electronics Letters, Volume 40, Issue 9, 29, pp.543-544, Apr. 2004.
- [14] P. S. Hall, M. Ricci, T. W. Hee, "Characterization of on-body communication channels, Microwave and Millimeter Wave Technology," Proceedings. ICMMT, 3rd International Conference, pp. 770-772, Aug. 2002.
- [15] Hao Yang, A. Alomainy, Zhao Yan, C. G. Parini, Y. Nechayev, P. Hall, C. C. Constantinou, "Statistical and deterministic modelling of radio propagation channels in WBAN at 2.45GHz," Antennas and Propagation Society International Symposium, pp. 2169-2172, Jul. 2006.
- [16] J.-J. van de Beek, M. Sandell, and P. O. Börjesson, "ML Estimation of Time and Frequency Offset in OFDM Systems," IEEE Transactions on Signal Processing, Vol. 45, No. 7, pp. 1800–1805, Jul. 1997.
- [17] T. M. Schmidl, D. C. Cox, "Robust frequency and timing synchronization for OFDM," IEEE Trans. Commun., Vol. 45, pp. 1613–1621, Dec. 1997.
- [18] H. Minn, V. K. Bhargava, K. B. Letaief, "A Robust Timing and Frequency Synchronization for OFDM Systems," IEEE Trans. on Wireless Communications, Vol. 2, No. 4, pp. 822- 839, Jul. 2003.
- [19] W. G. Bulgren, "On Representations of the Doubly Non-Central F Distribution," J. Amer. Stat. Assoc. 66, 184, 1971.
- [20] P. Stoica, R. L. Moses, "Introduction to Spectral Analysis", Prentice Hall, 1997.
- [21] P. H. Moose. A Technique for Orthogonal Frequency Division Multiplexing Frequency Offset Correction. IEEE Transactions on Communications. pp. 2908 -2914, Vol. 42, No. 10, Oct. 1994.

- [22] M. Sliskovic, "Carrier and sampling frequency offset estimation and correction in multicarrier systems", Global Telecommunications Conference, IEEE, Vol. 1, pp.285-289, Nov. 2001
- [23] Fundamentals of Statistical Signal Processing, Volume II: Detection Theory, Steven M. Kay, Prentice Hall, 1993
- [24] Chien-Ying Yu, Jui-Yuan Yu, and Chen-Yi Lee, "An eCrystal Oscillator with Self-Calibration Capability," ISCAS, 2009
- [25] R.B. Staszewski, D. Leipold, K. Muhammad, P.T. Balsara, "Digitally controlled oscillator (DCO)-based architecture for RF frequency synthesis in a deep-submicrometer CMOS Process," IEEE Trans. Circuits and Systems II, Analog and Digital Signal Processing., vol. 50, no. 11, pp. 815-822, Nov. 2003.
- [26] Wei Pang, Hongyu Yu, Hao Zhang, Eun Sok Kim, "Electrically tunable and temperature compensated FBAR," Microwave Symposium Digest, 2005 IEEE MTT-S International, June 2005
- [27] Duo Sheng, Ching-Che Chung, Chen-Yi Lee, " An Ultra-Low-Power and Portable Digitally Controlled Oscillator for SoC Applications," IEEE Trans., Circuits and Systems II, Express Briefs, vol 54, no. 11, pp. 954-958, Nov. 2007
- [28] P.-L. Chen, C.-C. Chung, and C.-Y. Lee, "A portable digitally controlled oscillator using novel varactors," IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 52, no. 5, pp. 233–237, May 2005.
- [29] T. Olsson and P. Nilsson, "A digitally controlled PLL for SoC applications," IEEE J. Solid-State Circuits, vol. 39, no. 5, pp. 751–760, May 2004.
- [30] C.-C. Chung and C.-Y. Lee, "An all digital phase-locked loop for highspeed clock generation," IEEE J. Solid-State Circuits, vol. 38, no. 2, pp. 347–351, Feb. 2003.
- [31] A. Mazzanti, et al, "On the Amplitude and Phase Errors of Quadrature LC-Tank CMOS Oscillators," IEEE J. Solid-State Circuits, vol. 41, no. 6, Jun. 2006.
- [32] W. T. Hsu, et al, "The New Heart Beat of Electronics Silicon MEMS Oscillators," Proc. Electronic Components and Technology Conference, pp. 1895-1899, May 29, 2007
- [33] K. Sundaresan, et al, "Process and Temperature Compensation in a 7-MHz CMOS Clock Oscillator," IEEE J. Solid-State Circuits, vol. 41, no. 2, Feb. 2006.
# Appendix A Embedded Crystal Design

## A-1 Introduction

The wireless body area network (WBAN) plays a crucial role in human health monitoring for medical services. For low-cost, low-power, and tiny-area WBAN applications, the embedded crystal (eCrystal) method was proposed in [6] by NCTU. In a wireless communications system, the system clock is generated by a crystal oscillator, which occupies a large area and consumes considerable power. The eCrystal technique aims to replace the crystal oscillator with a low-power oscillator circuit in a standard CMOS process.

One solution for a fine-resolution DCO is the LC-tank DCO [25] in a 0.13 $\mu$m CMOS process. However, the LC-tank DCO is not available in a standard CMOS process and needs extra lithography steps. The LC tank also occupies a large area and consumes a lot of power, so it is not suitable for WBAN applications.

Another solution for an accurate DCO is the MEMS oscillator [26]. The MEMS oscillator has a smaller frequency error and better accuracy than other oscillators in a standard CMOS process. Its drawbacks are the long design cycle and the fact that its tuning mechanism is not digital.

A standard CMOS oscillator offers the design flexibility of CMOS circuits and is easy to integrate with other CMOS circuits. The well-known CMOS oscillator directly uses an inverter chain to create the oscillation, but it consumes much power in transition current. To save power, the hysteresis delay cell (HDC) is used in [27]. Based on the segmental delay line and hysteresis delay cell architecture, the power consumption can be reduced by up to 70%.

In our design, we propose another hysteresis delay cell to save power. However, the hysteresis cell does not have good resolution, so we use a fine-tuned stage to cover the coarse-tuned range and increase the resolution.

The DCO is realized in a 1p9m 90 nm CMOS process. The target frequency is from 10 MHz to 35 MHz, which is the frequency range for baseband clock applications in the WBAN system. The design has been simulated in three PVT conditions: (FF, 1.1 V, 0 °C), (TT, 1.0 V, 25 °C), and (SS, 0.9 V, 100 °C). According to the HSPICE simulation results, the operating frequency range in the typical case is from 3 MHz to 350 MHz. The measurement results of the test chip are given in section A-3. We measure the chip under several supply voltage conditions to observe the effect of voltage variation on this DCO test circuit.

This appendix is organized as follows. Section A-2 presents the proposed DCO architecture, and section A-3 shows the test chip measurement results. The test chip is measured under different supply voltage conditions, and the results are presented in terms of frequency range and RMS jitter performance. A comparison with existing DCO solutions is provided in section A-4, and the conclusion follows in section A-5.

# A-2 Proposed DCO

#### A-2-1 DCO Architecture

Figure A-1 illustrates the proposed DCO architecture. The design is composed of one coarse-tuned stage and one fine-tuned stage. The SEL signal is the codeword that controls the DCO frequency. The coarse-tuned stage provides a longer delay and consumes less power, while the fine-tuned stage provides a finer resolution. In each stage, the output period is proportional to the delay time selected by the codeword.



#### A-2-1 Coarse-tuned Stage

To reduce the power and area of the DCO, we use a Schmitt trigger cell, which provides a large delay. The architecture of the coarse-tuned stage is shown in Figure A-2. The tuning codeword has 7 bits, so there are 128 codewords. Each element contains a delay cell, and the delay time of each element is twice that of the previous one; in other words, the delay time of the (N+1)th element is twice that of the Nth element, for N from 1 to 6. The delay cells ST1D, ST2D, and ST3D shown in Figure A-2 each consist of one Schmitt trigger cell with some UMC 90 nm standard cells as its output loading. The delay cells ST4D, ST5D, ST6D, and ST7D are composed of the three elements ST1D, ST2D, and ST3D, combined to create different delays. The transistor-level structure of the Schmitt trigger delay cell is shown in Figure A-3, and the cells in each stage are listed in Table A-1.

The delay time of the coarse-tuned stage is given by the following equation,

$$T_{coarse} = \sum_{N=0}^{6} T_{STND} \times SEL[N] + T_{constant}$$
(A-1)

where $T_{STND}$ represents the delay of each delay element of the Schmitt trigger stage, e.g., N=3 means ST3D. Each bit of the select signal can be 1 or 0. If the delay times follow the rule that the delay of element N+1 is twice that of element N, then $T_{coarse}$ can be expressed as

$$T_{coarse} = \sum_{N=0}^{6} 2^{N} \times SEL[N] + T_{constant}$$
(A-2)

This structure eliminates one decoder and saves the area of the wide-input multiplexers used in the traditional DCO structure.
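The binary-weighted selection of Equation A-2 can be modeled in a few lines. The sketch below is illustrative only: the explicit `unit_delay` scale factor (the delay of the smallest element) is an assumption added so that the result has units of time, and the names `codeword_to_bits` and `stage_delay` are hypothetical.

```python
def codeword_to_bits(codeword: int, width: int = 7) -> list[int]:
    """Split a codeword into SEL[0..width-1], least significant bit first."""
    if not 0 <= codeword < 2 ** width:
        raise ValueError("codeword out of range")
    return [(codeword >> n) & 1 for n in range(width)]


def stage_delay(sel_bits: list[int], unit_delay: float, t_constant: float = 0.0) -> float:
    """Delay of one binary-weighted stage: element N adds 2**N unit delays if SEL[N] is 1."""
    weighted = sum((2 ** n) * bit for n, bit in enumerate(sel_bits))
    return weighted * unit_delay + t_constant


# With binary weighting, the delay grows linearly with the codeword value,
# so the select bits drive the elements directly without a decoder.
delays = [stage_delay(codeword_to_bits(code), unit_delay=1.0) for code in range(128)]
print(delays[:4], delays[-1])
```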



Figure A-2: Coarse-tuned stage



Figure A-3: Schmitt trigger cell

|      | ST cell (W/L) | Cload                 |
|------|---------------|-----------------------|
| ST1D | 0.2/0.08      | 1 BUFM16H             |
| ST2D | 0.2/0.08      | 1 BUFM16H + 2 BUFM14H |
| ST3D | 0.2/0.08      | 5 BUFM20H + 1 BUFM8H  |
| ST4D | 0.2/0.08      | 2 ST3D                |
| ST5D | 0.2/0.08      | 3 ST3D + 1 ST2D       |
| ST6D | 0.2/0.08      | 7 ST3D + 1 ST1D       |
| ST7D | 0.2/0.08      | 13 ST3D + 2 ST2D      |

Table A-1: Delay element in coarse-tuned stage

# A-2-2 Fine-tuned Stage

Figure A-4 shows the fine-tuned stage. The delay element in this stage is INVM0N, which is a standard cell in the UMC 90 nm CMOS process. The structure of this stage is the same as that of the coarse-tuned stage. The fine-tuned stage has 7 delay elements: INV1D, INV2D, INV3D, INV4D, INV5D, INV6D, and INV7D. The number of delay cells in element N+1 is designed to be twice that in element N, for N from 1 to 6, so the delay time of element N+1 is also twice that of element N. This achieves linearity in the fine-tuned stage. There are also 128 codewords in this stage. This stage provides a fine resolution, and its delay range can cover the LSB resolution of the coarse-tuned stage. The delay time is proportional to the number of delay inverter cells. The cells in each stage are listed in Table A-2.

In the same way, the delay time of the fine-tuned stage can be expressed as follows,

$$T_{fine} = \sum_{N=0}^{6} T_{INVND} \times SEL[N+7] + T_{constant} = \sum_{N=0}^{6} 2^{N} \times SEL[N+7] + T_{constant}$$
(A-3)

The overall period of the DCO output clock can be expressed as the following equation,

$$T_{\text{period}} = T_{\text{coarse}} + T_{\text{fine}} + T_{\text{constant}}$$
(A-4)

where $T_{coarse}$ denotes the delay time of the coarse-tuned stage determined by the coarse-tuned codeword as in Equation A-2, and $T_{fine}$ represents the delay time of the fine-tuned stage as in Equation A-3.



Figure A-4: Fine-tuned stage

|       | INV cell (W/L) | C<sub>load</sub> |
|-------|----------------|------------------|
| INV1D | 0.36/0.08      | 2 INVM0N         |
| INV2D | 0.36/0.08      | 4 INVM0N         |
| INV3D | 0.36/0.08      | 8 INVM0N         |
| INV4D | 0.36/0.08      | 16 INVM0N        |
| INV5D | 0.36/0.08      | 32 INVM0N        |
| INV6D | 0.36/0.08      | 64 INVM0N        |
| INV7D | 0.36/0.08      | 128 INVM0N       |

Table A-2: Delay element in fine-tuned stage

## A-3 Measurement Results

The test chip is fabricated in a standard 90 nm 1p8m CMOS process. The DCO behavior is described at gate level in Verilog HDL. The DCO model is also built in HSPICE, and the delay time is accurately simulated. The design has been simulated in three PVT conditions: (FF, 1.1 V, 0 °C), (TT, 1.0 V, 25 °C), and (SS, 0.9 V, 100 °C). The Schmitt trigger cell is laid out with an EDA tool in standard-cell format. An automatic placement and routing (APR) tool is used to complete the physical layout with the added hand-made Schmitt trigger cell. The post-layout design after APR is also simulated to ensure the codeword linearity of the DCO.

The proposed DCO occupies only $0.0063 \text{ mm}^2$ (210 µm x 30 µm). It has been measured under 6 different supply voltage conditions to observe the linearity, resolution, power, and RMS jitter in each condition.

#### A-3-1 Coarse-tuned Stage

Figure A-5 shows the period of the coarse-tuned stage of the DCO under different supply voltage conditions. The operating frequency ranges from 3.56 MHz to 153.25 MHz when the supply voltage is 1.0 V, and the mean LSB resolution of the coarse-tuned stage is 2.15 ns. Table A-3 lists the frequency, LSB resolution, and RMS jitter of the DCO under each supply voltage condition, from 1.5 V down to 0.8 V, which shows how the DCO frequency depends on the supply voltage. When the supply voltage is 1.1 V, the period is about 0.7 times that at 1.0 V; when it is 0.9 V, the period is about 1.6 times that at 1.0 V. A small voltage variation therefore causes a large frequency change. A larger supply voltage results in higher frequency, finer resolution, smaller RMS jitter, and higher linearity.

Figure A-6 shows the RMS jitter for different codewords under different supply voltage conditions. The jitter is large under low supply voltage, while it varies little across codewords.



Figure A-5: The period of DCO in coarse-tuned stage

Figure A-6: The RMS jitter of DCO in coarse-tuned stage

| Supply voltage | Operating frequency (MHz) | Resolution max (ns) | Resolution mean (ns) | RMS jitter max (ps) | RMS jitter mean (ps) |
|----------------|---------------------------|---------------------|----------------------|---------------------|----------------------|
| 1.5 V          | 11.56 to 348.44           | 3.74                | 0.66                 | 71.02               | 46.77                |
| 1.2 V          | 6.70 to 238.11            | 6.13                | 1.14                 | 168.87              | 96.27                |
| 1.1 V          | 5.10 to 197.88            | 7.81                | 1.50                 | 222.83              | 148.72               |
| 1.0 V          | 3.56 to 153.25            | 10.62               | 2.15                 | 416.07              | 242.40               |
| 0.9 V          | 2.20 to 110.37            | 20.87               | 3.50                 | 807.97              | 511.73               |
| 0.8 V          | 1.23 to 59.52             | 31.87               | 6.26                 | 1503.2              | 968.03               |
Table A-3: DCO in coarse-tuned stage
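The mean LSB resolution in Table A-3 can be cross-checked against the frequency range, assuming the 7-bit coarse codeword spans the period range in 127 equal steps on average. The sketch below performs this check; the names `period_ns` and `mean_step_ns` are hypothetical, and the data are copied from Table A-3.

```python
# Supply voltage (V) -> (min frequency MHz, max frequency MHz, mean resolution ns)
TABLE_A3 = {
    1.5: (11.56, 348.44, 0.66),
    1.2: (6.70, 238.11, 1.14),
    1.1: (5.10, 197.88, 1.50),
    1.0: (3.56, 153.25, 2.15),
    0.9: (2.20, 110.37, 3.50),
    0.8: (1.23, 59.52, 6.26),
}
STEPS = 2 ** 7 - 1  # 128 codewords give 127 steps


def period_ns(freq_mhz: float) -> float:
    """Clock period in nanoseconds for a frequency in MHz."""
    return 1e3 / freq_mhz


def mean_step_ns(f_min_mhz: float, f_max_mhz: float) -> float:
    """Average period change per codeword step across the tuning range."""
    return (period_ns(f_min_mhz) - period_ns(f_max_mhz)) / STEPS


for volts, (f_min, f_max, reported) in TABLE_A3.items():
    print(f"{volts} V: derived {mean_step_ns(f_min, f_max):.2f} ns, reported {reported} ns")

# Period ratios relative to the 1.0 V case, using the longest period of each row.
ratio_1v1 = period_ns(5.10) / period_ns(3.56)
ratio_0v9 = period_ns(2.20) / period_ns(3.56)
```

The derived step sizes agree with the reported mean resolutions to within about 0.01 ns, and the two period ratios come out close to the 0.7 and 1.6 factors quoted in the text.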



## A-3-2 Fine-tuned Stage

Figure A-7 shows the period of the fine-tuned stage of the DCO under different supply voltage conditions. The mean resolution is 134.5 ps at a supply voltage of 1.0 V. When the supply voltage is 1.1 V, the LSB resolution is about 0.84 times that at 1.0 V; when it is 0.9 V, the LSB resolution is about 1.28 times that at 1.0 V. The effect of voltage variation is thus smaller than in the coarse-tuned stage. The tuning range of the fine-tuned stage can cover the LSB resolution of the coarse-tuned stage. At a supply voltage of 1.5 V, the mean and maximum RMS jitter of the fine-tuned stage are close to those of the coarse-tuned stage, and the overall DCO achieves an LSB resolution of 76 ps. Table A-4 lists the tuning range, LSB resolution, and RMS jitter of the fine-tuned stage under each supply voltage condition.

Table A-5 shows the power consumption under different supply voltage conditions. The higher the frequency, the larger the power consumption. At a supply voltage of 1.0 V, the largest power consumption, 190 $\mu$W, occurs at a frequency of 160 MHz.



Figure A-7: The period of DCO in fine-tuned stage



Figure A-8: The RMS jitter of DCO in fine-tuned stage

Table A-4: DCO in fine-tuned stage

| Supply voltage | Tuning range (ns) | Resolution max (ps) | Resolution mean (ps) | RMS jitter max (ps) | RMS jitter mean (ps) |
|----------------|-------------------|---------------------|----------------------|---------------------|----------------------|
| 1.5 V          | 9.65              | 195.04              | 75.99                | 67.79               | 33.00                |
| 1.2 V          | 12.53             | 421.44              | 98.67                | 119.49              | 39.08                |
| 1.1 V          | 14.47             | 283.65              | 113.29               | 94.46               | 40.36                |
| 1.0 V          | 17.08             | 669.32              | 134.50               | 90.84               | 42.34                |
| 0.9 V          | 21.88             | 1796.25             | 172.26               | 210.36              | 69.32                |
| 0.8 V          | 26.21             | 5275.42             | 198.26               | 2601.23             | 259.86               |
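Two statements about the fine-tuned stage can be checked directly from the measured data: the LSB resolution ratios relative to 1.0 V, and the claim that the fine tuning range covers the coarse LSB. The sketch below does so using the mean values from Table A-3 and Table A-4; comparing against the mean coarse LSB (rather than the maximum) is an assumption of this illustration, and the helper names are hypothetical.

```python
# Per supply voltage (V): mean fine LSB (ps), fine tuning range (ns), mean coarse LSB (ns)
FINE_MEAN_LSB_PS = {1.5: 75.99, 1.2: 98.67, 1.1: 113.29, 1.0: 134.50, 0.9: 172.26, 0.8: 198.26}
FINE_RANGE_NS = {1.5: 9.65, 1.2: 12.53, 1.1: 14.47, 1.0: 17.08, 0.9: 21.88, 0.8: 26.21}
COARSE_MEAN_LSB_NS = {1.5: 0.66, 1.2: 1.14, 1.1: 1.50, 1.0: 2.15, 0.9: 3.50, 0.8: 6.26}


def lsb_ratio(volts: float, reference: float = 1.0) -> float:
    """Mean fine LSB at `volts` relative to the mean fine LSB at `reference`."""
    return FINE_MEAN_LSB_PS[volts] / FINE_MEAN_LSB_PS[reference]


def fine_covers_coarse(volts: float) -> bool:
    """True if the fine tuning range is at least one mean coarse LSB."""
    return FINE_RANGE_NS[volts] >= COARSE_MEAN_LSB_NS[volts]


print(f"1.1 V / 1.0 V: {lsb_ratio(1.1):.2f}, 0.9 V / 1.0 V: {lsb_ratio(0.9):.2f}")
print({volts: fine_covers_coarse(volts) for volts in FINE_RANGE_NS})
```

At 1.0 V, 127 steps of 134.5 ps give about 17.08 ns, which matches the reported tuning range and is several times the 2.15 ns mean coarse LSB.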

Table A-5: DCO power consumption under different supply voltage

| Supply voltage | Working frequency<br>(MHz) | Power<br>consumption (µW) |
|----------------|----------------------------|---------------------------|
| 1.5 V          | 10.09 to 348.87            | 832.5 to 4335             |
| 1.2 V          | 6.02 to 228.33             | 157.2 to 368.4            |
| 1.1 V          | 4.64 to 200.09             | 103.2 to 267.3            |
| 1.0 V          | 3.29 to 159.23             | 62.4 to 190.0             |
| 0.9 V          | 2.06 to 115.40             | 34.5 to 126.0             |
| 0.8 V          | 1.17 to 65.34              | 18.2 to 77.5              |

# A-4 System Comparison

Table A-6 compares the chip measurement results with some conventional approaches [28], [29], [30] in terms of resolution and power consumption. The proposed DCO also has the benefit of portability due to its tiny area and low power consumption.

| Performance<br>Index                 | Proposed<br>DCO  | TCAS2'05<br>[28]  | JSSC'04<br>[29] | JSSC'03<br>[30] |
|--------------------------------------|------------------|-------------------|-----------------|-----------------|
| Process                              | 90nm<br>CMOS     | 0.35µm<br>CMOS    | 0.35µm<br>CMOS  | 0.35µm<br>CMOS  |
| Supply (V)                           | 1.0              | 3.3               | 3.0             | 3.3             |
| Wordlength                           | 14               | 15                | 7               | 12              |
| Operation<br>Range(MHz)              | 3.3~159          | 18~214            | 152~366         | 45~510          |
| LSB resolution                       | 134ps            | 1.55ps            | 10~150ps        | 5ps             |
| Lower freq.<br>(<10 MHz)<br>enabled  | YES              | N/A               | NO              | NO              |
| Power<br>Consumption                 | 190µW<br>@160MHz | 18mW96<br>@200MHz | 12mW<br>@366MHz | 50mW<br>@500MHz |

Table A-6: DCO performance comparison
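To put the power figures of Table A-6 on a common footing, power can be normalized by the operating frequency at which it was measured. The following Python sketch does this for the proposed DCO, [29], and [30]; the entry for [28] is left out because its power figure cannot be read unambiguously from the table. The metric (µW per MHz) and all names in the code are illustrative choices, not part of the original comparison.

```python
# Power and measurement frequency taken from the last row of Table A-6.
# Each entry: (power in watts, frequency in MHz).
DCO_POWER = {
    "Proposed DCO": (190e-6, 160.0),
    "JSSC'04 [29]": (12e-3, 366.0),
    "JSSC'03 [30]": (50e-3, 500.0),
}


def uw_per_mhz(power_w, freq_mhz):
    """Normalized power in microwatts per megahertz (illustrative metric)."""
    return power_w * 1e6 / freq_mhz


POWER_PER_MHZ = {
    name: uw_per_mhz(power_w, freq_mhz)
    for name, (power_w, freq_mhz) in DCO_POWER.items()
}

for name, value in POWER_PER_MHZ.items():
    print(f"{name}: {value:.2f} uW/MHz")
```

Under this metric the proposed DCO comes out at roughly 1.2 µW/MHz, against about 33 µW/MHz for [29] and 100 µW/MHz for [30]. The designs differ in process and supply voltage, so the numbers indicate a trend rather than a like-for-like ranking.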

# A-5 Conclusion

In this section, an ultra-small-area, low-power, large-delay DCO based on a Schmitt-trigger delay element is designed. The coarse-tuned stage provides a large delay and saves considerable area and power. The fine-tuned stage covers the LSB resolution of the coarse-tuned stage and provides good linearity. The test chip measurement results show that the DCO achieves 134 ps resolution and a wide operating frequency range at a supply voltage of 1.0 V. The DCO can operate at low frequencies, which makes it suitable for baseband applications. The DCO is evaluated in a standard CMOS process and implemented with an APR tool.

# Appendix B IP Deliverable

## **B-1** Overall Architecture

This wireless sensor node (WSN) is a baseband platform for a communications system. It contains two major components: a transmitter (WSN\_TX) module and a receiver (WSN\_RX) module. The baseband receiver receives the downlink data from the downlink channel, analyzes the data, and feeds back a frequency error value to tune the baseband clock. After the RX processing, the TX modulates the signal and transmits it to the uplink channel. The overall architecture is shown in Figure B-1.



Figure B-1: Overall architecture of WSN system

# B-2 Architecture of Receiver

## **B-2-1** Components

The baseband receiver is composed of a phase rotator, a downlink synchronizer, and a frequency error estimator. When the receiver receives the downlink data from the downlink channel, the baseband processor analyzes the data and feeds back a frequency error value to the DCO-based clock generator. The baseband clock is then tuned accurately, which reduces the clock mismatch with the central processing node. The components of WSN\_RX are listed in Table B-1.

| RTL code file name | Descriptions                                                                                                 |
|--------------------|--------------------------------------------------------------------------------------------------------------|
| WSN_RX.v           | WSN RX top module                                                                                            |
| Sin13to10.v        | Sin table for phase rotator                                                                                  |
| Cos13to10.v        | Cos table for phase rotator                                                                                  |
| Arctan6to13.v      | Arc tangent table for phase rotator                                                                          |
| MULTI64.v          | The multiplier shared by boundary detection and correlation part.                                            |
| INNER_PRO.v        | The autocorrelation part. The correlation length and the moving average of the correlation sum can be tuned. |
| INNER_PRO2.v       | The phase rotator.                                                                                           |

Table B-1: The component in WSN\_RX
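INNER\_PRO is described above as an autocorrelation with a tunable correlation length and moving average. A generic delay-and-correlate detector with these two knobs can be sketched in Python as follows. This is only an illustration under stated assumptions: the repetition period `delay`, the function names, and the mapping of M to `corr_len` and L to `avg_len` are hypothetical and do not reproduce the RTL.

```python
import cmath
import math


def delay_correlate(samples, delay, corr_len, avg_len=1):
    """Delayed autocorrelation summed over corr_len samples, then
    smoothed by a moving average over avg_len consecutive sums."""
    n_out = len(samples) - delay - corr_len + 1
    raw = []
    for n in range(n_out):
        acc = 0j
        for k in range(corr_len):
            acc += samples[n + k].conjugate() * samples[n + k + delay]
        raw.append(acc)
    return [
        sum(raw[n:n + avg_len]) / avg_len
        for n in range(len(raw) - avg_len + 1)
    ]


def coarse_cfo(corr_value, delay):
    """Frequency offset in cycles per sample from the correlation phase.
    Unambiguous only while |offset * delay| < 0.5."""
    return cmath.phase(corr_value) / (2 * math.pi * delay)


# Hypothetical test signal: a unit-magnitude sequence repeated with period 16,
# rotated by a known frequency offset (all values are assumptions).
PERIOD = 16
TRUE_OFFSET = 0.004  # cycles per sample
base = [cmath.exp(2j * math.pi * k * k / PERIOD) for k in range(PERIOD)]
rx = [
    s * cmath.exp(2j * math.pi * TRUE_OFFSET * n)
    for n, s in enumerate(base * 6)
]

corr = delay_correlate(rx, delay=PERIOD, corr_len=16, avg_len=16)
estimate = coarse_cfo(corr[0], PERIOD)
print(f"estimated offset: {estimate:.6f} cycles/sample")
```

In a noisy channel, a longer correlation window or moving average trades detection latency for a steadier correlation value, which is consistent with M and L being adjustable for different SNR conditions.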

Table B-2 shows the mode of operation in WSN\_RX. The parameters M and L can be adjusted for different SNR conditions, and PARAM\_RF\_TH can be adjusted for different bandwidth and frequency allocation conditions.

| Name          | I/O | Width | Simple description                                      | Default<br>value |
|---------------|-----|-------|---------------------------------------------------------|------------------|
| PARAM_AUTOM   | I   | 1     | Autocorrelation length.<br>0: M=16, 1: M=32             | From 0 to 1      |
| PARAM_AUTOL   | I   | 2     | Moving average.<br>0: L=1, 1: L=16, 2: L=32, 3:<br>L=48 | From 0 to 3      |
| PARAM_AC_TH   | I   | 7     | Autocorrelation threshold                               | 7'h 17           |
| PARAM_BOOK_TH | I   | 19    | Pattern book writing threshold                          | 19'h 50          |
| PARAM_BANK_TH | I   | 9     | Correlation threshold                                   | 9'h 00e          |
| PARAM_CC_TH   | I   | 11    | Boundary detection threshold                            | 11'h 3d          |
| PARAM_RF_TH   | I   | 13    | RF spec. (i.e: BW/RF/16 = 223)                          | 13'd 223         |

Table B-2: Mode of operation in WSN\_RX
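When driving WSN\_RX from a testbench or a software model, it can help to check parameter values against their port widths before applying them. The Python sketch below collects the parameters of Table B-2 in one place; the widths are those listed for the corresponding ports in Table B-3. The class name, field names, and `validate` helper are hypothetical, and because the table gives only a range for PARAM\_AUTOM and PARAM\_AUTOL, those two are left as required arguments.

```python
from dataclasses import dataclass, fields

# Port widths in bits (names shortened from the PARAM_* ports).
WIDTHS = {
    "autom": 1, "autol": 2, "ac_th": 7, "book_th": 19,
    "bank_th": 9, "cc_th": 11, "rf_th": 13,
}


@dataclass
class WsnRxParams:
    autom: int               # 0: M=16, 1: M=32 (no default given)
    autol: int               # 0: L=1, 1: L=16, 2: L=32, 3: L=48 (no default given)
    ac_th: int = 0x17        # 7'h17
    book_th: int = 0x50      # 19'h50
    bank_th: int = 0x00E     # 9'h00e
    cc_th: int = 0x3D        # 11'h3d
    rf_th: int = 223         # 13'd223

    def validate(self):
        """Raise ValueError if any value does not fit its port width."""
        for f in fields(self):
            value = getattr(self, f.name)
            limit = 1 << WIDTHS[f.name]
            if not 0 <= value < limit:
                raise ValueError(f"{f.name}={value} does not fit in {WIDTHS[f.name]} bits")
        return self

    def corr_length(self):
        """Autocorrelation length M selected by PARAM_AUTOM."""
        return (16, 32)[self.autom]

    def avg_length(self):
        """Moving-average length L selected by PARAM_AUTOL."""
        return (1, 16, 32, 48)[self.autol]


params = WsnRxParams(autom=1, autol=3).validate()
print(params.corr_length(), params.avg_length())
```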

## **B-2-2** External Signal Descriptions

The external signals are described in Table B-3. After the reset state, the baseband receiver processes the IN\_I and IN\_Q input signals and performs packet detection and frequency error estimation. The first output, FERR\_COARSE, represents the coarse frequency error estimate. After boundary detection, the second output, FERR\_FINE, is given, which completes the receiver processing.

| name             | I/O | width | Simple description                                                      |
|------------------|-----|-------|-------------------------------------------------------------------------|
| CLK              | I   | 1     | Clock signal. This design operates at positive edge clock               |
| RESET            | I   | 1     | Synchronous, active-high reset signal                                   |
| IN_I             | I   | 6     | I channel input signal                                                  |
| IN_Q             | I   | 6     | Q channel input signal                                                  |
| FUNC_EN          | I   | 1     | Function enable. 1: enabling the overall state diagram                  |
| P_EN             | I   | 1     | Process enable. 1: enabling the initial state diagram                   |
| FERR_COARSE      | O   | 13    | coarse frequency error estimation amount, valid<br>with OUT_VALID_COARSE |
| FERR_FINE        | O   | 13    | fine frequency error estimation amount,<br>valid with OUT_VALID_FINE     |
| OUT_VALID_COARSE | O   | 1     | valid signal for coarse frequency error estimation                      |
| OUT_VALID_FINE   | O   | 1     | valid signal for fine frequency error estimation                        |
| D_STATE          | O   | 4     | debug output (state diagram)                                            |
| PARAM_AUTOM      | I   | 1     | autocorrelation length. 0: M=16, 1: M=32                                |
| PARAM_AUTOL      | I   | 2     | moving average. 0: L=1, 1: L=16, 2: L=32, 3:<br>L=48                    |
| PARAM_AC_TH      | I   | 7     | autocorrelation threshold                                               |
| PARAM_BOOK_TH    | I   | 19    | pattern book writing threshold                                          |
| PARAM_BANK_TH    | I   | 9     | correlation threshold                                                   |
| PARAM_CC_TH      | I   | 11    | boundary detection threshold                                            |
| PARAM_RF_TH      | I   | 13    | RF spec. (i.e: BW/RF/16 = 223)                                          |

Table B-3: External signal description in WSN\_RX

## B-2-3 FSM

The FSM in WSN\_RX is shown in Figure B-2, and the states are described in Table B-4. The states are designed to be simple and easy to control. The hardware can be shared among different states. When one state is active, the other states are inactive and consume less power.



Figure B-2: FSM of WSN\_RX

| state  | state descriptions:                                |
|--------|----------------------------------------------------|
| inist  | initial state                                      |
| autost | autocorrelation state for packet detection.        |
| syncst | synchronization state for frequency tone detection |
| calist | calibration state: calibrate 64 data               |
| cfoest | frequency error estimation state                   |
| bookst | (one cycle) pattern book match                     |
| bdetst | boundary detection state                           |
| finest | fine CFO estimation state                          |
| fishst | (one cycle) finish state                           |

Table B-4: State description in WSN\_RX

## B-2-4 Area

Table B-5: Area reports in WSN\_RX

| module name |             | area (μm²)              |
|-------------|-------------|-------------------------|
| WSN_RX      |             | 344239.967741           |
|             | cos13to10   | 5327.893139             |
|             | sin13to10   | 4857.282127             |
|             | arctan6to13 | 3491.967093             |
|             | MULTI64     | 222041.215886           |
|             | INNER_PRO   | 42006.091436            |
|             | INNER_PRO2  | 11236.061129            |

The area cost is described in terms of synthesis results and post-layout results. The hardware area from the synthesis report is shown in Table B-5. The synthesis tool is Design Compiler. The post-layout results are shown below. The place and route (P&R) tool is Cadence RTL Encounter. The P&R view of WSN\_RX is shown in Figure B-3.



Core area: 770 × 800 μm²

Hardmacro area: 850 × 880 μm² (including ring)

Density: 86.513%

Figure B-3: P&R view of WSN\_RX

#### B-2-5 Power

The power consumption is reported in terms of gate-level simulation results and post-layout simulation results. First, the power classification is defined in Table B-6. The EDA tool used for power analysis is Prime Power. Table B-7 shows the power consumption in gate-level simulation and post-layout simulation.

| Classification  | Definition                                               |
|-----------------|----------------------------------------------------------|
| Total Power     | Dynamic + Leakage                                        |
| Dynamic Power   | Switching + Internal                                     |
| Switching Power | load capacitance charge or discharge power               |
| Internal Power  | power dissipated within a cell                           |
| X-tran Power    | component of dynamic power dissipated into x-transitions |
| Glitch Power    | component of dynamic power dissipated into detectable glitches |
| Leakage Power   | reverse-biased junction leakage + subthreshold leakage   |

Table B-6: Power classification and definition

Table B-7: Power report in WSN\_RX

| Classification  | Gate-level simulation results | Post-layout simulation results |
|-----------------|-------------------------------|--------------------------------|
| Total Power     | 5.056e-04 W (100%)            | 5.728e-04 W (100%)             |
| Dynamic Power   | 4.078e-04 W (80.65%)          | 4.639e-04 W (80.98%)           |
| Switching Power | 6.709e-05 W (16.45%)          | 1.024e-04 W (22.07%)           |
| Internal Power  | 3.407e-04 W (83.55%)          | 3.615e-04 W (77.93%)           |
| X-tran Power    | 0.000e+00 W (0.00%)           | 3.323e-12 W (0.00%)            |
| Glitch Power    | 1.153e-06 W (0.28%)           | 5.391e-06 W (1.16%)            |
| Leakage Power   | 9.782e-05 W (19.35%)          | 1.089e-04 W (19.02%)           |
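The percentages in Table B-7 are not all relative to the same base. Following the definitions in Table B-6, dynamic and leakage power are shares of the total, while switching and internal power are shares of the dynamic power. The short Python sketch below recomputes these shares from the tabulated wattages; the function and key names are illustrative.

```python
def power_shares(total, dynamic, switching, internal, leakage):
    """Percentage shares following the hierarchy of Table B-6:
    total = dynamic + leakage, dynamic = switching + internal."""
    return {
        "dynamic_of_total": 100.0 * dynamic / total,
        "leakage_of_total": 100.0 * leakage / total,
        "switching_of_dynamic": 100.0 * switching / dynamic,
        "internal_of_dynamic": 100.0 * internal / dynamic,
    }


# Wattages copied from Table B-7 (WSN_RX).
gate_level = power_shares(5.056e-4, 4.078e-4, 6.709e-5, 3.407e-4, 9.782e-5)
post_layout = power_shares(5.728e-4, 4.639e-4, 1.024e-4, 3.615e-4, 1.089e-4)

for label, shares in (("gate-level", gate_level), ("post-layout", post_layout)):
    print(label, {k: round(v, 2) for k, v in shares.items()})
```

The recomputed shares agree with the tabulated percentages to within rounding of the four-digit wattages.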

# B-3 Architecture of Transmitter

## **B-3-1** Components

The baseband transmitter is composed of a QPSK mapper, an IFFT processor, and a GI insertion FIFO. After the downlink process has tuned the baseband clock accurately, the transmitter begins to gather the body signal and transmit it to the uplink channel. The components of WSN\_TX are listed in Table B-8.

| RTL code file name    | Descriptions                             |
|-----------------------|------------------------------------------|
| WSN_TX.v              | WSN TX top module                        |
| I_FFT_64p_8in_10out.v | IFFT module (provided by Chen-Fong Shao) |
| BITREV.v              | Bit reversal part in IFFT module         |

Table B-8: The component in WSN\_TX

The only mode-of-operation signal in WSN\_TX is PSDU, which represents the length of the data to be modulated. Table B-9 shows the mode of operation in WSN\_TX.

| PSDU [1:0] | Input data (bytes) | Output IFFT symbol                     |
|------------|--------------------|----------------------------------------|
| 2'b 00     | 128                | 321 + ⌈128 × 8 / 2 / 24⌉ × 66 = 1773   |
| 2'b 01     | 256                | 321 + ⌈256 × 8 / 2 / 24⌉ × 66 = 3159   |
| 2'b 10     | 512                | 321 + ⌈512 × 8 / 2 / 24⌉ × 66 = 5997   |
| 2'b 11     | 1024               | 321 + ⌈1024 × 8 / 2 / 24⌉ × 66 = 11607 |

Table B-9: Mode of operation in WSN\_TX
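The totals in Table B-9 only come out as listed if the division by 24 is rounded up to a whole number before multiplying by 66. The Python sketch below reproduces the four values under that reading. The interpretation of the constants is an assumption made for illustration: 321 is taken as a fixed preamble length, 2 as the bits per QPSK symbol, 24 as the QPSK symbols carried per OFDM symbol, and 66 as the output samples per OFDM symbol. None of these names appear in the RTL.

```python
# Constants read off the formula in Table B-9; their meaning is assumed.
PREAMBLE_SAMPLES = 321
BITS_PER_QPSK_SYMBOL = 2
QPSK_SYMBOLS_PER_OFDM_SYMBOL = 24
SAMPLES_PER_OFDM_SYMBOL = 66

# PSDU[1:0] code -> payload length in bytes (Table B-10).
PSDU_BYTES = {0b00: 128, 0b01: 256, 0b10: 512, 0b11: 1024}


def tx_output_count(psdu_code):
    """Output count for one packet, with the last OFDM symbol rounded up."""
    qpsk_symbols = PSDU_BYTES[psdu_code] * 8 // BITS_PER_QPSK_SYMBOL
    # Integer ceiling division: a partly filled OFDM symbol still counts.
    ofdm_symbols = -(-qpsk_symbols // QPSK_SYMBOLS_PER_OFDM_SYMBOL)
    return PREAMBLE_SAMPLES + ofdm_symbols * SAMPLES_PER_OFDM_SYMBOL


for code in sorted(PSDU_BYTES):
    print(f"PSDU={code:02b}: {tx_output_count(code)}")
```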

## **B-3-2** External Signal Descriptions

The external signals are described in Table B-10. The transmitter sends the preamble and packets after the clock has been tuned accurately by the receiver processing. The data is requested from the sensor memory: the TX sends an IN\_ACK signal to the memory, gets the data, and transmits it to the uplink channel.

| name      | I/O | width | Simple description                                                                      |
|-----------|-----|-------|-----------------------------------------------------------------------------------------|
| CLK       | I   | 1     | Clock signal. This design operates at positive edge clock                               |
| RESET     | I   | 1     | Synchronous, active-high reset signal                                                   |
| IN_BIT    | I   | 2     | input data followed with IN_ACK, for QPSK modulation                                    |
| IN_ACK    | O   | 1     | input requirement signal, tell MCU that the IN_BIT is valid                             |
| FUNC_EN   | I   | 1     | Function enable. 1: enabling the overall state diagram                                  |
| P_EN      | I   | 1     | Process enable. 1: enabling the initial state diagram                                   |
| PSDU      | I   | 2     | Indicate the data amount.<br>0: 128 bytes, 1: 256 bytes, 2: 512 bytes, 3: 1024<br>bytes |
| OUT_I     | O   | 7     | I channel output signal (enabled with OUT_VALID)                                        |
| OUT_Q     | O   | 7     | Q channel output signal (enabled with OUT_VALID)                                        |
| OUT_VALID | O   | 1     | indicate the output is valid now                                                        |

Table B-10: External signal description in WSN\_TX

#### B-3-3 FSM

The FSM in WSN\_TX is shown in Figure B-4.



Table B-11: State description in WSN\_TX

| state | state descriptions:    |
|-------|------------------------|
| inist | initial state          |
| pvast | transmit the preamble  |
| inpst | valid for IN_BIT       |

## B-3-4 Area

We show the area cost in terms of synthesis results and post-layout results. The hardware area from synthesis report is shown in Table B-12. The synthesis tool is Design Compiler.

| module name |                   | area (μm²)   |
|-------------|-------------------|--------------|
| WSN_TX      |                   | 92983.287945 |
|             | IFFT              | 82628.369811 |
|             | GI insertion FIFO | 8403.823131  |

Table B-12: Area report in WSN\_TX

The post-layout results are shown below. The place and route (P&R) tool is Cadence RTL Encounter. The P&R view of WSN\_TX is shown in Figure B-5.



Figure B-5: P&R view of WSN\_TX

## B-3-5 Power

Here the power consumption is described in terms of gate-level simulation results and post-layout simulation results. The power classification is defined in Table B-6 in the previous section. The EDA tool used for power analysis is Prime Power. Table B-13 shows the power consumption in gate-level simulation and post-layout simulation.

| Classification  | Gate-level simulation results | Post-layout simulation results |
|-----------------|-------------------------------|--------------------------------|
| Total Power     | 1.860e-04 W (100%)            | 2.329e-04 W (100%)             |
| Dynamic Power   | 1.547e-04 W (83.19%)          | 1.990e-04 W (85.46%)           |
| Switching Power | 2.182e-05 W (14.10%)          | 4.689e-05 W (23.56%)           |
| Internal Power  | 1.329e-04 W (85.90%)          | 1.521e-04 W (76.44%)           |
| X-tran Power    | 9.944e-08 W (0.06%)           | 1.064e-07 W (0.05%)            |
| Glitch Power    | 4.669e-07 W (0.30%)           | 4.546e-06 W (2.28%)            |
| Leakage Power   | 3.126e-05 W (16.81%)          | 3.386e-05 W (14.54%)           |

Table B-13: Power report in WSN\_TX

