### National Chiao Tung University

Department of Electronics Engineering and Institute of Electronics

### Master's Thesis

Digitally Phase Adjusted Clock Data Recovery for Receiver with Spread Spectrum Clocking

Advisor: Dr. Shyh-Jye Jou

Student: Shu-Rung Li

October 2008

# Digitally Phase Adjusted Clock Data Recovery for Receiver with Spread Spectrum Clocking



A Thesis

Submitted to the Department of Electronics Engineering and Institute of Electronics, College of Electrical and Computer Engineering, National Chiao Tung University, in Partial Fulfillment of the Requirements for the Degree of Master of Science in Electronics Engineering

October 2008

Hsinchu, Taiwan, Republic of China



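#### Digitally Phase Adjusted Clock Data Recovery for Receiver with Spread Spectrum Clocking

Student: Shu-Rung Li

Advisor: Dr. Shyh-Jye Jou

#### National Chiao Tung University

Department of Electronics Engineering and Institute of Electronics, Master's Program

#### Abstract (translated from the Chinese)

A variety of high-performance, low-cost serial transmission technologies are widely used in modern electronic products, and the clock and data recovery (CDR) circuit is the most critical part of the receiver in a high-speed serial link. Current trends in CDR design include the following. As data bandwidth rises and cost falls, multi-channel serial transmission has become mainstream. Digitally designed CDR circuits are often better suited to such applications than analog designs and are insensitive to process/voltage/temperature variations. In addition, spread spectrum techniques are applied to data transmission to counter electromagnetic interference, so the CDR circuit must be able to recover correct data from a spread spectrum clock.

In high-speed CDR circuits, binary phase detection is the mainstream. However, the nonlinear behavior of binary phase detection has several adverse effects on the phase tracking loop, such as a gain that varies with the jitter amplitude and oscillation in steady state. We therefore propose a Multiple Alternating Edge Sampling technique and Gain Compensation, which effectively linearize the gain of the binary phase detector and thereby achieve stable phase tracking.

Spread spectrum clocking slightly modulates the frequency of the clock signal so that its energy is spread over a wider band of the spectrum, which reduces the energy peak caused by the clock. The proposed spread spectrum clock generator is based on a phase-locked loop and is implemented with a sigma-delta modulator and phase rotation. It mainly targets third-generation Serial ATA, with a 5000 ppm down spread, a triangular modulation profile, and a modulation frequency of 30 kHz. One of the topics of this thesis is the effect of sigma-delta modulators of different orders. Our theoretical results show that, as long as the resolution of the phase interpolator is fine enough, the jitter difference among modulators of different orders is small enough to be neglected.

The test chip is fabricated in the UMC 90 nm standard-threshold-voltage CMOS process. In post-layout simulation the data rate ranges from 5.5 Gbps to 6.5 Gbps and the peak-to-peak jitter of the recovered clock is 17.52 ps. The spread spectrum clock has a maximum cycle-to-cycle jitter of 1.13 ps and consumes 7.57 mW at 1.4 GHz. The maximum reduction of the energy peak is 20.6 dB.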





#### Digitally Phase Adjusted Clock Data Recovery for Receiver with Spread Spectrum Clocking

Student: Shu-Rung Li

Advisor: Prof. Shyh-Jye Jou

Department of Electronics Engineering Institute of Electronics National Chiao Tung University

#### Abstract

Recently, many high-speed, low-cost serial link transmission technologies have been developed and are widely used in modern electronic products. The clock and data recovery (CDR) module is the most important component at the receiver end of a high-speed serial link system. Modern trends in CDR circuit design include the following. First, with the increase of transmission bandwidth and the decrease of fabrication cost, multi-channel transmission has become the mainstream. Second, digitally implemented CDRs are often more favorable than analog ones because of their broad applicability and their robustness against PVT (process, voltage, temperature) variations. Finally, spread spectrum clocking is used in data transmission to reduce EMI (electromagnetic interference). The CDR must therefore be able to recover correct data from a spread spectrum clocked transmission.

In high-speed CDRs, binary phase detection is the mainstream. However, the nonlinear characteristic of binary phase detection introduces unwanted effects, such as a PD gain that varies with jitter amplitude and an oscillatory steady state of the phase tracking loop. We therefore propose a Multiple Alternating Edge Sampling (M-AES) scheme and Gain Compensation to linearize the PD gain and achieve stable phase tracking.

The spread spectrum technique slightly modulates the frequency of the clock signal and spreads its energy over a wider bandwidth, which reduces the peak level of the clock energy. A spread spectrum technique using a PLL with a sigma-delta modulator and a phase rotation algorithm is proposed. Our spread spectrum clock generator (SSCG), which targets the Serial ATA III specification, applies a 5000 ppm down spread with a triangular modulation profile and a modulation frequency of 30 kHz. One objective of this thesis is to analyze the effect of the order of the $\Sigma\Delta$ modulation. Our theoretical results show that, once the phase resolution of the interpolator is high enough, the difference in jitter among modulators of different orders is small enough to be neglected.

The test chip is fabricated in the UMC 90 nm regular-Vt CMOS process. In post-layout simulation the data rate ranges from 5.5 Gbps to 6.5 Gbps and the peak-to-peak jitter of the recovered clock is 17.52 ps. The spread spectrum clock has a peak-to-peak cycle-to-cycle jitter of 1.13 ps and consumes 7.57 mW at 1.4 GHz. The EMI reduction of this circuit is about 20.6 dB. The analog circuit power consumption is 55 mW under a 1.0 V supply voltage.

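#### Acknowledgements

First, I thank Professor Shyh-Jye Jou for his careful guidance and encouragement throughout my master's studies. I have been learning research methods and attitude from him since my undergraduate project. The earnest and intelligent way in which he handles everything makes him a role model who inspires me. Thank you for teaching me so patiently; I have gained a great deal during my studies. I wish him a happy family, good health, and happiness. I also thank Professors 蘇朝琴, 陳巍仁, and 蔡嘉明 for taking the time to attend my oral defense and for the valuable comments that made this thesis more complete.

Next, I especially thank my senior 志憲, who, despite his age, is kind, humorous, and easy to get along with. He taught me selflessly and I learned a lot from him. Thank you so much; hurry up and have a little totoro for me to tease, haha. I am most grateful to my senior 彥穎. For more than a year, in both study and attitude toward life, he often gave incisive and valuable advice and patiently answered my questions. You are truly a mentor and a friend! Thanks also to my senior 俊誼, who sat beside me in my first year and warmly helped me with many tedious chores; your nonsensical remarks brought a lot of joy to the lab. Thanks to the legendary perfect man, my senior 俊男, who gave me warm encouragement from time to time, humorous and thoughtful! My senior 國光 has a sharp tongue but is warm most of the time and is a caring senior. My senior 廷禎 not only helped me with my questions in study; our discussions of current events also livened up my dull second year. My classmate 昭安 is funny and warm; your kindness gave me plenty of positive energy. My senior momo is pretty and good-natured; it is because of your company that I could persist through the intellectual property program. 秉威 is a considerate friend; the lab is warmer with you in it. Thanks to 企鵝 for loyal company, so that the road of delayed graduation was not lonely. My classmate 篤雄, thank you for loyally helping 316 so much. My senior 小胖, thank you for generously sharing food and all kinds of stories with us. I sincerely thank all of you. Whether we were chatting, joking on birthdays, watching baseball, playing badminton, staying up late to meet deadlines, going on trips, or hiking, you filled my master's life with laughter and tears and made it colorful.

I also thank 婉清, who besides being a good classmate with whom I could share my thoughts was also my bicycle and badminton coach; funny as you are, you often brought me great joy. Thanks to 康帆, 校慈, and 姿虹; because you listened and encouraged me, I could stay true to my original aspiration and to myself. Thanks to the most important 全雯; because of you, my six years at NCTU were fulfilling and happy. I am lucky to have you.

Finally, I thank my family: my dear father, mother, brother, and 艾艾. Thank you for loving me and for letting me love you; you are the motivation for my efforts. I thank heaven for giving me such a warm and lovely family.

There are too many people to thank along the way. Having come this far, I deeply feel that I am a lucky child. I will keep working hard and do my best to help those in need. I wish everyone peace and happiness!

Shu-Rung Li

NCTU, Hsinchu

2008.10.21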



# Contents

- 1 Introduction
  - 1.1 Background
  - 1.2 Motivations and Goals
  - 1.3 Thesis Organization
- 2 CDR Basics
  - 2.1 Introduction to CDR
    - 2.1.1 Feed-Forward Phase Adjusted Scheme
  - 2.2 Jitter Fundamentals
    - 2.2.1 Jitter
    - 2.2.2 Quantifying Jitter
    - 2.2.3 Sources of Jitter
  - 2.3 Timing and Data Format Specifications
    - 2.3.1 Data Format
    - 2.3.2 Timing and Jitter Performance
- 3 Theoretical Analysis of the Proposed CDR
  - 3.1 Overview
  - 3.2 Phase/Frequency Tracking CDR
    - 3.2.1 Binary Phase Detector
    - 3.2.2 Proportional/Integral Path and Phase Rotation Counter
    - 3.2.3 Phase Selection
  - 3.3 Loop Bandwidth Stabilization
    - 3.3.1 Majority Vote
    - 3.3.2 Multiple Alternating Edge Sampling (M-AES) Scheme
    - 3.3.3 Gain Compensation
  - 3.4 Simulation Results
- 4 Spread Spectrum Clocking
  - 4.1 Background
  - 4.2 Spread Spectrum Mechanism
    - 4.2.1 Introduction to Modulation Mechanism
    - 4.2.2 Noise Transfer Function
    - 4.2.3 Conception of SSCG Using Phase Rotation
    - 4.2.4 Phase Rotation Mechanism
  - 4.3 $\Sigma\Delta$ Modulators
    - 4.3.1 Basic Principles of $\Sigma\Delta$ Modulation
    - 4.3.2 Quantization Noise
    - 4.3.3 Digital Implementation of the $\Sigma\Delta$ Modulator
  - 4.4 Analysis of Quantization Noise
- 5 Experimental Results and Conclusions
  - 5.1 Circuits Simulation
  - 5.2 Layout
  - 5.3 Measurement Environment Setup
- 6 Conclusions
- Bibliography



# List of Figures

- 1.1 High speed link block diagram
- 2.1 Block diagram of the analog PLL based CDR
- 2.2 Block diagram of dual loop CDR
- 2.3 Feed-forward phase adjusted CDR
- 2.4 Example of eye diagram. Vertical axis is voltage and the horizontal axis is time.
- 2.5 Adjacent cycles used to calculate cycle-to-cycle jitter
- 2.6 Total jitter is the sum of several components
- 2.7 Receiver eye diagram [15]
- 2.8 The target jitter tolerance mask [16]
- 3.1 (a) The concept of $2^{nd}$-order CDR (b) The proposed $2^{nd}$-order CDR
- 3.2 Block diagram of proposed CDR
- 3.3 The architecture of Proportional path
- 3.4 Block diagram of a simplified Proportional, Integral and Counter
- 3.5 Block diagram of the multi-phase VCO and phase selection
- 3.6 Binary PD operation under various jitter conditions. (a) Small jitter (b) Large jitter [26]
- 3.7 (a) The ideal operation of data and edge sampling (b) Under jittery condition, the operation of binary PD and Majority Vote
- 3.8 Different jitter sigma conditions for comparison of PD output
- 3.9 The normalized PD output for various jitter conditions (a) without Majority Vote (b) with Majority Vote
- 3.10 The effective PD gain for various jitter conditions
  - (a) Under transition density = 100% condition (without Majority Vote)
  - (b) Under transition density = 100% condition (with Majority Vote)
  - (c) Under transition density = 20% condition (without Majority Vote)
  - (d) Under transition density = 20% condition (with Majority Vote)
- 3.11 The proposed multiple alternating edge sampling
- 3.12 Different data transition density results in different PD gain, $\phi_{e1} = \phi_{e1}$
- 3.13 Normalized gain versus (lead-lag) for Majority Vote and Gain Compensation
- 3.14 The discrete-time model of proposed CDR
- 3.15 Simulation results of periodic jitter with different conditions (without M-AES scheme)
  - (a) Binary PD; Transition density = 100%
  - (b) Binary PD; Transition density = 50%
  - (c) Majority Vote; Transition density = 100%
  - (d) Majority Vote; Transition density = 50%
  - (e) Gain Compensation; Transition density = 100%
  - (f) Gain Compensation; Transition density = 50%
- 3.16 Simulation results of periodic jitter with different conditions (with M-AES scheme)
  - (a) Binary PD; Transition density = 100%
  - (b) Binary PD; Transition density = 50%
  - (c) Majority Vote; Transition density = 100%
  - (d) Majority Vote; Transition density = 50%
  - (e) Gain Compensation; Transition density = 100%
  - (f) Gain Compensation; Transition density = 50%
- 3.17 Simulation results of 100ppm frequency offset under different conditions (with M-AES scheme)
  - (a) Majority Vote; Transition density = 100%
  - (b) Majority Vote; Transition density = 50%
  - (c) Gain Compensation; Transition density = 100%
  - (d) Gain Compensation; Transition density = 50%
- 4.4 Block diagram of a SSCG with modulation on VCO
- 4.5 Block diagram of a SSCG with modulation on input reference clock
- 4.6 Block diagram of a SSCG with modulation on divider
- 4.7 Block diagram of a SSCG with phase selection method
- 4.8 Noise transfer function of each modulation mechanism
- 4.9 Triangular modulation profile for SATA-III (a) Ideal profile (b) Digital approach
- 4.10 Block diagram of SSCG using phase rotation scheme
- 4.11 Timing diagram of the phase rotation
- 4.12 Block diagram of phase rotation and phase rotation control
- 4.13 Randomization and noise shaping to eliminate unwanted spurs
- 4.14 General $\Sigma\Delta$ modulator: (a) block diagram (b) linear model
- 4.15 Definition of phase-noise spectrum
- 4.16 Realization of $1^{st}$-order $\Sigma\Delta$ modulator
- 4.17 Implementation of $2^{nd}$-order $\Sigma\Delta$ modulator
- 4.18 Magnitude spectrum for noise transfer function (log scale)
- 4.19 PLL closed loop response
- 4.20 Different modulation order conditions for comparison of RMS jitter (phase resolution of the interpolator = $\frac{1}{160}$) (a) on the front of PD (b) filter through the loop
- 4.21 Different modulation order conditions for comparison of RMS jitter (phase resolution of the interpolator = $\frac{1}{40}$) (a) on the front of PD (b) filter through the loop
- 4.22 Different modulation order conditions for comparison of RMS jitter (phase resolution of the interpolator = $\frac{1}{10}$) (a) on the front of PD (b) filter through the loop
- 4.23 TIE jitter of VCO in different order modulators conditions
- 4.24 The spectrum of VCO output with different order $\Sigma\Delta$ modulations
- 5.1 (a) K28.5 Input pattern (b) Verification of CDR functionality
- 5.2 The spectrum of recovered clock and receiver clock in SSC simulation
- 5.3 Period of VCO output waveform vs. time
- 5.4 The spectrum of VCO output under different order modulation
- 5.5 Spectrum of VCO output clock with and without spread spectrum modulation
- 5.6 Layout view of test chip
- 5.7 Test Environment Setup

### List of Tables

- 2.1 Generations of SATA [14]
- 2.2 The parameter of receiver eye diagram of Figure 2.7
- 3.1 The operation of gain compensation
- 4.1 The simulated jitter on the front of PD
- 4.2 The simulated jitter filter through the loop
- 4.3 The RMS jitter of the VCO output
- 5.1 Design summary of proposed CDR
- 5.2 Design summary of proposed SSCG
- 6.1 Design summary of proposed SSCG

# Chapter 1 Introduction

#### 1.1 Background

Technology scaling has dramatically increased the amount of computation that can be integrated onto a small piece of silicon. This increased computation has highlighted the requirement for chip I/O that can supply information fast enough to keep the compute engine running. As a result, the design of chip I/O has become increasingly sophisticated, with multi-Gb/s bandwidths now prevalent in high performance computer systems and networks. These high speed links are composed of a transmitter and a receiver communicating over a channel, as shown in Figure 1.1. The transmitter generates a high bandwidth signal, and the receiver must reconstruct the original transmitted bitstream from the received waveform. The system task spans a wide range of disciplines including channel design, package design, signaling methods, equalization, and clock and data recovery (CDR), which is the subject of this thesis.

Another significant problem associated with high speed is electromagnetic interference (EMI). Faster operating speeds of electrical devices result in more EMI at higher frequencies, which may interfere with the operation of the source circuit and of the equipment adjacent to it. Since heavy metal shielding is not a low-cost option in lightweight portable devices, spread spectrum techniques have been frequently used for EMI reduction. However, they pose challenges for circuit designers because the modulated clock is no longer aligned in a synchronous system. The transmitter and receiver run into new difficulties in a spread spectrum clocking (SSC) system because of the deterministic timing jitter that the modulation introduces.



Figure 1.1: High speed link block diagram.

#### 1.2 Motivations and Goals

The requirements of high data rate and high integration in modern high speed serial links motivate the design of a clock and data recovery circuit that works at multi-Gb/s and is suitable for large scale integration. The work of this thesis is motivated by two issues: one is to improve the performance of a conventional binary PD, and the other is to cope with spread spectrum clocking. The proposed CDR is a 2nd-order phase/frequency tracking, feed-forward phase adjusted CDR with a Multiple Alternating Edge Sampling (M-AES) function. The CDR is able to track a spread spectrum clock and is suitable for multi-I/O integration. The SATA generation 3 specification under development [1] provides a good design example because of its high speed and its similarity to a variety of modern high speed serial applications, such as HDMI [2] and PCI-E [3]. This work presents the design and implementation of a CDR applicable to the SATA-III specifications, which are investigated and taken as the design target. The theoretical analysis and behavioral simulation are carried out in MATLAB and Wolfram Mathematica. The functional circuits, designed in HSPICE and Verilog, are simulated in the mixed-signal simulator Nanosim. The test chip was fabricated in the UMC 1P9M 90nm 1.0V Regular-Vt CMOS process.

#### 1.3 Thesis Organization

This chapter has introduced the objectives of CDR and spread spectrum clocking.

Chapter 2 first reviews the tracking type and the phase picker, which are the most commonly used CDR architectures. It then discusses jitter fundamentals and the timing and data format specifications.

Chapter 3 describes the $2^{nd}$-order phase/frequency tracking algorithm for the feed-forward phase adjusted CDR. We then propose several schemes to stabilize the loop bandwidth. Finally, behavioral simulations are carried out.

Chapter 4 describes links that use spread spectrum clocking, which is used in wireline communication to reduce EMI. In our work, we adopt a $\Sigma\Delta$ modulator to control the phase rotation. Although higher-order $\Sigma\Delta$ modulation is commonly used, we analyze the advantages and disadvantages of $\Sigma\Delta$ modulation of different orders and their applications in SSC.

Chapter 5 shows the experimental results and Chapter 6 draws the conclusions.

# Chapter 2 CDR Basics

#### 2.1 Introduction to CDR

In multi-gigabit serial link systems, the data rate is so high that the bit time becomes small compared with the signal propagation time. It is therefore impractical to provide an additional serial clock on a separate wire, because even the slightest difference in length between the data and clock lines introduces significant skew. In modern high speed serial links, the clock is no longer transmitted through the channel but is extracted from the data by the clock and data recovery (CDR) circuit. The CDR must detect the phase and frequency information from the received data transitions and adjust the local clock generator to recover the link clock signal. The objective of the CDR is to recover the data with as few errors as possible. In other words, the goal of the CDR is to minimize the bit error rate (BER) through the channel. The CDR is therefore an important building block in the receiver architecture and is used in many serial link systems, such as Gigabit Ethernet, serial ATA, PCI-Express, HDMI, SONET/SDH, and XAUI.

Traditionally, earlier designs [4]-[6] incorporate a PLL in the CDR loop to track the phase and frequency of the incoming bit stream. As shown in Figure 2.1, the PLL based CDR uses feedback loop control to adjust the phase of the sampling clock for recovering the data in the received stream. Although the PLL based CDR design is straightforward, the direct use of a PLL to recover the clock leads to an undesired bandwidth conflict, because the bandwidth requirements of the PLL loop and the CDR loop may differ. For example, good jitter tolerance requires a high bandwidth so that the CDR can track the jittery input signal and recover the incoming data. However, the bandwidth of the PLL must be set low for stability reasons.



Figure 2.1: Block diagram of the analog PLL based CDR.

This bandwidth issue led to the development of a dual loop configuration. A simplified block diagram is shown in Figure 2.2. A separate feedback phase/frequency recovery loop chooses among multiple phases from the PLL to track the received stream. The dual-loop architecture provides an additional advantage for modern high traffic serial link applications [8]. In modern communication systems, multi-IO systems integrated on a System-On-Chip (SOC) are desired because of the high data rate requirements and the reduced area and power. In multi-IO systems, many dual-loop CDRs can share one common frequency synthesis loop that provides a plesiochronous clock, while each recovery loop is independent of the other IOs and functions individually.



Figure 2.2: Block diagram of dual loop CDR.


#### 2.1.1 Feed-Forward Phase Adjusted Scheme

Because the feedback architecture suffers from the conflicting bandwidth requirements mentioned above, the frequency synthesis loop (PLL) and the clock recovery loop must be independent of each other [7] [9] [10]. As shown in Figure 2.3, the phase selection is moved away from the PLL's feedback clock to the direct output of the multi-phase VCO. The proposed architecture thus avoids the bandwidth conflict and can support multi-IO applications. Because the data are sampled in parallel, multiple phase multiplexers and phase interpolators are used in the phase selection. Figure 2.3 shows the case with parallel sampling of five bits. This requires four extra multiplexer and interpolator blocks, but in exchange offers application flexibility and considerable power and area savings in multi-IO systems.
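The split between a phase multiplexer (coarse step) and a phase interpolator (fine step) can be sketched as a simple code-to-phase mapping. This is only an illustrative model: the class name `PhaseSelector`, the number of VCO phases, and the number of interpolator steps are assumptions chosen for the example, not values taken from the design in this thesis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhaseSelector:
    """Hypothetical model: maps a digital phase code to a sampling phase."""

    num_vco_phases: int = 8   # assumed number of multi-phase VCO outputs
    interp_steps: int = 20    # assumed interpolator steps between two phases

    @property
    def codes_per_period(self) -> int:
        # Total number of distinct sampling phases in one clock period.
        return self.num_vco_phases * self.interp_steps

    def select(self, code: int):
        """Return (mux index, interpolator step, phase as a fraction of a period)."""
        code %= self.codes_per_period                    # rotation wraps around
        mux_index, step = divmod(code, self.interp_steps)
        return mux_index, step, code / self.codes_per_period


if __name__ == "__main__":
    selector = PhaseSelector()
    for code in (0, 19, 20, 161, -1):
        print(code, selector.select(code))
```

Because the code wraps modulo one period, a phase counter that keeps incrementing or decrementing can follow a frequency offset indefinitely without touching the frequency synthesis loop, which is the property the feed-forward scheme relies on.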

Figure 2.3: Feed-forward phase adjusted CDR

#### 2.2 Jitter Fundamentals

In a data link system, the signal integrity is degraded by noise interference. The waveform of the data at the receiving end is slightly different from the waveform of the original data at the transmitting end because of channel imperfections. An eye diagram, which is created by overlaying consecutive bits onto a single bit time, is useful in explaining the function of a CDR (Figure 2.4). The 'eye' shape is created by the four possible transitions. The transitions have uncertainty caused by timing and voltage noise in the system. What complicates CDR design is that the data transitions are not stationary with respect to a timing reference. There are two reasons for this movement. The first is the deterministic phase offset trajectory that results from the frequency offset between the TX and RX reference clocks. The second is timing uncertainty, or noise, which is referred to as jitter. These are explained next.
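The effect of a TX/RX frequency offset on the eye can be illustrated by folding transition times onto one bit time. The sketch below is a simplified model under stated assumptions: transitions are ideal (no jitter), time is normalized so that the receiver's bit time is 1, and the helper names are hypothetical.

```python
def edges_with_offset(num_bits, unit_interval, offset_ppm):
    """Ideal transition times of a transmitter whose rate is off by offset_ppm."""
    tx_unit_interval = unit_interval / (1.0 + offset_ppm * 1e-6)
    return [n * tx_unit_interval for n in range(num_bits)]


def fold_into_eye(edge_times, unit_interval):
    """Fold absolute times onto one receiver bit time, as an eye diagram does."""
    return [t % unit_interval for t in edge_times]


if __name__ == "__main__":
    ui = 1.0  # receiver bit time, normalized
    folded = fold_into_eye(edges_with_offset(5001, ui, offset_ppm=100.0), ui)
    # With a 100 ppm offset the transition position has drifted by roughly
    # half a bit time after 5000 bits.
    print(folded[0], folded[2500], folded[5000])
```

In this model the transition position walks steadily across the eye instead of staying fixed, so a receiver with a fixed sampling phase would eventually sample on the transitions. This is the deterministic phase trajectory that the CDR has to track in addition to random jitter.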

Figure 2.4: Example of eye diagram. Vertical axis is voltage and the horizontal axis is time.

#### 2.2.1 Jitter

Short-term frequency instabilities, seen in the time domain as jitter, can cause problems in both analog and digital signals. As system operating frequencies have increased, these instabilities have gained importance because their size relative to the total period length is larger. Jitter refers to the deviation of the significant instants of a signal from their ideal locations in time. To put it more simply, jitter is how early or late a signal transition is with reference to when it should occur. It is a key performance factor in high-speed data communications. Jitter is not measured simply to create statistics; it is measured because it can cause transmission errors. If jitter places a signal on the wrong side of the transition threshold at the sampling point, the receiving circuit interprets that bit differently than the transmitter intended, causing a bit error.

Jitter and noise are deviations from an ideal signal. Jitter can have many causes. The physical nature of various jitter sources for a communication system can be classified into two major classes: random jitter and deterministic jitter. These types are discussed in detail as noted next.

#### 2.2.2 Quantifying Jitter

- **Cycle-To-Cycle Jitter** Cycle-to-cycle jitter is the difference in period length between adjacent cycles. In the example shown in Figure 2.5, it is calculated by subtracting period $\tau_1$ from period $\tau_2$.
- **Period Jitter** Period jitter compares the length of each period to the average period ($\tau_{ave}$) of an ideal clock at the long-term average frequency of the signal. Each data point is generated by computing $\tau_n - \tau_{ave}$, where $n$ is the index of the period being measured.
- **Time Interval Error (TIE)** TIE is the difference in time between the actual threshold crossing and the expected transition point. The expected transition points are taken either from the actual transmitter clock or from a reconstruction of it from the sampled data set, and the deviations take the form of instantaneous phase variations for each bit period of the captured waveform. This representation of jitter is of most interest for current standards [12].



Figure 2.5: Adjacent cycles used to calculate cycle-to-cycle jitter.
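The three measures above can be computed directly from a list of edge timestamps. The following is a minimal sketch, assuming the edges are the rising-edge times of a clock in arbitrary time units and that the ideal period for TIE is supplied by the caller; the function names are illustrative.

```python
def periods(edges):
    """Period of each cycle: time between consecutive edges."""
    return [later - earlier for earlier, later in zip(edges, edges[1:])]


def period_jitter(edges):
    """tau_n - tau_ave for every period."""
    taus = periods(edges)
    tau_ave = sum(taus) / len(taus)
    return [tau - tau_ave for tau in taus]


def cycle_to_cycle_jitter(edges):
    """Difference between each period and the one before it."""
    taus = periods(edges)
    return [tau_next - tau for tau, tau_next in zip(taus, taus[1:])]


def time_interval_error(edges, ideal_period, t0=0.0):
    """Actual edge time minus the edge time of an ideal clock starting at t0."""
    return [t - (t0 + n * ideal_period) for n, t in enumerate(edges)]


if __name__ == "__main__":
    edges = [0.0, 1.0, 2.5, 3.0, 4.0]  # made-up example data
    print(period_jitter(edges))             # [0.0, 0.5, -0.5, 0.0]
    print(cycle_to_cycle_jitter(edges))     # [0.5, -1.0, 0.5]
    print(time_interval_error(edges, 1.0))  # [0.0, 0.0, 0.5, 0.0, 0.0]
```

In this made-up example a single late edge shows up once in the TIE, twice in the period jitter, and three times in the cycle-to-cycle jitter, which illustrates why the three measures emphasize different frequency content of the same timing error.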

#### 2.2.3 Sources of Jitter

The sources of jitter are often categorized as bounded and unbounded. Bounded jitter sources reach maximum and minimum phase deviation values within an identifiable time interval. This type of jitter is also called deterministic, and results from systematic and data-dependent phenomena. Examples of systematic phenomena include crosstalk and impedance mismatch. Data-dependent sources include intersymbol interference and duty-cycle distortion.

Unbounded jitter sources do not achieve a maximum or minimum phase deviation within any time interval, and jitter amplitude from these sources approaches infinity, at least theoretically. This type of jitter is also referred to as random and results from random noise sources. The total jitter on a signal, specified by the phase error function  $\phi_j(t)$ , is the sum of the deterministic and random jitter components affecting the signal:

$$\phi_j(t) = \phi_j(t)^D + \phi_j(t)^R \tag{2.1}$$

where $\phi_j(t)^D$, the deterministic jitter component, is quantified as a peak-to-peak value, $J_{pp}^D$, which is determined by adding the maximum time advance and the maximum time delay produced by the deterministic jitter sources. $\phi_j(t)^R$, the random jitter component, is quantified as a standard deviation value, $J_{rms}^R$, and is the aggregate of all the random noise sources affecting the signal. Random jitter is assumed to follow a Gaussian distribution and is defined by the mean and sigma of that distribution. To determine the jitter produced by the random noise sources, the Gaussian function representing this random jitter must be determined and its sigma can then be evaluated [12].



Figure 2.6: Total jitter is the sum of several components.

In short, total jitter is composed of a random jitter component and a deterministic jitter component, denoted in Figure 2.6.
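Since the deterministic part is a peak-to-peak value and the random part is an rms value, combining them into a single number requires a target bit error rate. A common convention, which is not taken from this thesis and is used here only as an assumption, is the dual-Dirac estimate $TJ_{pp}(\mathrm{BER}) = J_{pp}^D + 2\,Q(\mathrm{BER})\,J_{rms}^R$, where $Q$ is the Gaussian tail quantile. A minimal sketch with made-up input values:

```python
from statistics import NormalDist


def q_factor(ber):
    """Number of sigmas whose one-sided Gaussian tail probability equals ber."""
    return -NormalDist().inv_cdf(ber)


def total_jitter_pp(dj_pp, rj_rms, ber=1e-12):
    """Dual-Dirac style estimate (assumed convention): DJ_pp + 2 * Q(BER) * RJ_rms."""
    return dj_pp + 2.0 * q_factor(ber) * rj_rms


if __name__ == "__main__":
    # Hypothetical numbers, in picoseconds, chosen only for illustration.
    print(round(q_factor(1e-12), 3))
    print(round(total_jitter_pp(dj_pp=10.0, rj_rms=1.0), 2))
```

The multiplier grows only slowly as the target BER is lowered, which is why the unbounded random component can still be budgeted as a finite peak-to-peak value at a given BER.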

#### 2.3 Timing and Data Format Specifications

Our proposed CDR targets modern multi-gigabit serial transmission systems and is applicable to different standards. One suitable standard is Serial Advanced Technology Attachment Generation 3 (SATA-III) [1]. SATA is a high speed interconnect used in computers and storage devices such as hard disks and optical drives, and it is expected to replace the widely used ATA technology. Although SATA-II is already used in modern hard disk drives and can cover the foreseeable improvement of hard disk drive transfer rates in the near future, SATA-III is still being developed and will be used in port multipliers, solid-state drives, and the continuing evolution of storage based on historic trends [13]. Table 2.1 shows the generations of SATA.

|                              | Generation 1               | Generation 2         | Generation 3                                      |
|------------------------------|----------------------------|----------------------|---------------------------------------------------|
| Approximate speed (8b side)  | 1.2 Gbits/s (150 Mbytes/s) | 2.4 Gbits/s          | $4.8^{1}$ Gbits/s                                 |
| Approximate speed (10b side) | 1.5 Gbits/s                | 3.0 Gbits/s          | $6.0^{1}$ Gbits/s                                 |
| Estimated introduction date  | mid 2001                   | mid 2004             | mid 2007                                          |
| Connector                    |                            | Same as Gen1         | May be upgraded                                   |
| Cable                        |                            | Same as Gen1         | May be upgraded                                   |
| Signaling compatibility      |                            | Compatible with Gen1 | Compatible with Gen2; may be compatible with Gen1 |

Table 2.1: Generations of SATA [14].

NOTE-

1. These speed specifications and schedules are subject to change

#### 2.3.1 Data Format

Because the specification of SATA-III is still under development, our proposed CDR will use the known specifications of SATA-II. According to [1], the data rate is 6 Gb/s. The receiver should be able to detect a differential NRZ stream whose data rate deviates from the nominal rate by $\pm 350 \, ppm$, with an additional $0/-5000 \, ppm$ spread spectrum clock modulation. The minimum and maximum differential input voltages are 275 mV and 750 mV, respectively.

#### 2.3.2 Timing and Jitter Performance

The timing requirements are specified in terms of the eye diagram and the jitter performance. Although the eye diagram is not specified in the SATA documents, it can be taken from the 3 Gb/s standard of Serial Attached SCSI [15], which is capable of interoperating with SATA. Figure 2.7 shows the eye diagram and Table 2.2 lists its parameters.



Figure 2.7: Receiver eye diagram [15].

Table 2.2: The parameters of the receiver eye diagram of Figure 2.7.

|                                    | Units    | Alias | Value |
|------------------------------------|----------|-------|-------|
| Min. Rx differential input voltage | mV (P-P) | Z1    | 275   |
| Max. Rx differential input voltage | mV (P-P) | Z2    | 750   |
| Half of maximum jitter             | UI       | X1    | 0.275 |
| Center UI                          | UI       | X2    | 0.500 |

The jitter performance is specified in [1] and is divided into two categories. The first is random jitter (RJ), normally measured by its standard deviation  $\sigma_{RJ}$; as a rule of thumb, the data transition edges spread over a peak-to-peak range of 14 standard deviations during the transmission of 10<sup>12</sup> bits. The second is deterministic jitter (DJ), which is characterized by a bounded peak-to-peak value. To ensure a BER of 10<sup>-12</sup>, SATA calculates the total jitter (TJ) as

$$TJ = DJ + 14 \cdot \sigma_{RJ}$$

Given TJ=0.60UI and DJ=0.42UI [1], one can calculate that  $\sigma_{RJ}$ =0.013UI, and this will be the target specification.
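As a quick check of this budget, the short Python sketch below solves $TJ = DJ + 14 \cdot \sigma_{RJ}$ for $\sigma_{RJ}$ using the values quoted above. The function and variable names are illustrative only and are not taken from [1].

```python
# Jitter budget check for TJ = DJ + 14 * sigma_RJ (values quoted in the text).
TJ_UI = 0.60          # total jitter, in UI
DJ_UI = 0.42          # deterministic jitter, in UI
RJ_MULTIPLIER = 14    # peak-to-peak multiplier used for BER = 1e-12


def rj_sigma(tj_ui, dj_ui, multiplier=RJ_MULTIPLIER):
    """Solve TJ = DJ + multiplier * sigma for sigma (all quantities in UI)."""
    return (tj_ui - dj_ui) / multiplier


sigma_rj = rj_sigma(TJ_UI, DJ_UI)
print(f"sigma_RJ = {sigma_rj:.4f} UI")  # about 0.013 UI after rounding
```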



Figure 2.8: The target jitter tolerance mask [16].

Jitter tolerance mask is another important measure of CDR systems that describes the frequency response of the CDR loop under the input phase variations. The jitter tolerance mask is not clearly specified in SATA, therefore we reference the tolerance standard of synchronous digital hierarchy (SDH) STM-64 interface [16], whose data rate is 10 Gb/s, as our design target specification. The specifications are shown in Figure 2.8. From the specification we can see that CDR is required to track low frequency jitter to very large amplitude, while high frequency (> 10 MHz) jitter is allowed to pass directly without any tracking.



### Chapter 3

### Theoretical Analysis of the proposed CDR

#### 3.1 Overview

The oversampling clock and data recovery circuits introduced in Chapter 2 use a phase adjusting method, rather than a voltage controlled oscillator, to track the incoming phase and frequency deviation. Therefore, they need an algorithm that calculates the required phase adjustment from the output of the binary phase detector (PD). Tracking both phase and frequency requires a  $2^{nd}$-order algorithm, as reported in [9], [17]-[19]. The theoretical analysis can be found in [20] and is very useful in designing the  $2^{nd}$-order behavioral model.

The s-domain concept of the  $2^{nd}$-order CDR is shown in Figure 3.1(a). The binary PD detects the phase difference  $\phi_e$, which is then scaled by the gain  $G_P$  in the proportional path and integrated with gain  $G_I$  in the integral path. The ratio of the phase adjustment from the proportional path to that from the integral path is defined as the stability factor  $\xi$  [20]. In Figure 3.1(a), the stability factor equals  $G_P/G_I$. In binary phase detection without a dead-zone,  $\xi$  should be greater than two times the loop latency in UI to achieve unconditional stability [20]. However, in our design this constraint may be relaxed because of the dead-zone introduced by M-AES, as will be described later.

The results from the two paths are summed and used to direct the digital phase rotator. The rotator plays the role of the VCO in the s-domain model, that is, an integrator at the filter output: it integrates the phase up/down information and adjusts the phase of the sampling edges. In our proposed architecture, however, the arrangement is modified as in Figure 3.1(b) in order to reduce the hardware overhead of implementing the integral path while maintaining loop stability.

Figure 3.1: (a) The concept of  $2^{nd}$-order CDR (b) The proposed  $2^{nd}$-order CDR.

#### 3.2 Phase/Frequency tracking CDR

The block diagram of the proposed feed-forward phase adjusted CDR is shown in Figure 3.2. When locked at a data rate of  $f_s$ = 6 GHz (6 Gb/s), a 100 MHz reference clock is given to the PLL to generate a 10-phase clock at 1.2 GHz. The phase selection block, controlled by the digital 2<sup>nd</sup>-order algorithm, selects 5 phases for data sampling and 5 phases for edge sampling, which track the incoming stream with a phase resolution of 1/32 UI at the 6 Gb/s data rate. At the sampler, the incoming stream is sampled and synchronized into 5 parallel bits at 1.2 GHz, equivalent to a 6 Gb/s data rate.



Figure 3.2: Block diagram of proposed CDR.

The Phase Detector is a binary PD that extracts the phase lead/lag information from the data and edge samples. A Pre-Filter, composed of a Gain Compensation block and a sliding window, then averages out the effect of random jitter and balances the loop gain across different data transition densities. The sliding window operates at a rate of 600 MHz and its output is used by the Proportional and Integral Paths. Together, the proportional path and the integral path behave like a  $2^{nd}$-order digital loop filter that interprets the Up/Down decisions as phase and frequency adjustments. In order to track not only the phase but also the frequency of the incoming data, the Up/Down information must be integrated to form the frequency information. The integral path is therefore designed to accommodate frequency offset and spread spectrum clocking. At the maximum frequency deviation of the SSC, the maximum phase adjustment of the integral path is

$$PhaseAdjustment_{max} = \frac{5000ppm}{\frac{600MHz}{6GHz} \times \frac{1}{32}UI} \tag{3.1}$$

The phase rotator and decoder then control the phase selection block.

See [21] for further details of architecture design and implementation. This thesis focuses on theoretical analysis and the design tradeoffs.
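Eq. (3.1) can be evaluated numerically. The sketch below assumes that the result is read as the number of 1/32 UI phase steps required per 600 MHz loop update, which is an interpretation rather than a statement from the thesis; all names are hypothetical.

```python
from fractions import Fraction

DATA_RATE_HZ = 6_000_000_000               # 6 Gb/s line rate
UPDATE_RATE_HZ = 600_000_000               # sliding-window (loop update) rate
STEPS_PER_UI = 32                          # phase resolution of 1/32 UI
SSC_DEVIATION = Fraction(5000, 1_000_000)  # 5000 ppm


def max_phase_steps_per_update(deviation, update_rate_hz, data_rate_hz, steps_per_ui):
    """Phase steps per loop update needed to follow a constant frequency deviation."""
    ui_per_update = Fraction(data_rate_hz, update_rate_hz)  # 10 UI between updates
    drift_ui = deviation * ui_per_update                    # phase drift per update, in UI
    return drift_ui * steps_per_ui                          # the same drift in 1/32 UI steps


steps = max_phase_steps_per_update(SSC_DEVIATION, UPDATE_RATE_HZ, DATA_RATE_HZ, STEPS_PER_UI)
print(float(steps))  # 1.6
```

Under this reading, the integral path has to supply more than one phase step per update at the SSC extreme, which is beyond what a single proportional step can deliver.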

#### 3.2.1 Binary Phase Detector

CDRs can be categorized into two groups according to the phase detection method: linear CDRs and binary CDRs. The linear PD detects both the magnitude and the direction of the phase error, whereas the binary PD detects the direction only. The linear PD requires considerable effort in analog building blocks, such as high-speed limiting amplifiers, high-resolution phase comparators, and data retimers [10]. In contrast, most parts of the binary PD are implemented digitally and thus require less circuit complexity. Hence the binary PD is used in our work. Its binary output simplifies integration with the digital loop filter and allows multi-phase operation, so that the CDR can operate beyond the intrinsic speed limit of a flip-flop in a given process.

The binary PD is also referred to as a bang-bang PD. The purely digital nature of its output allows fast and robust digital processing of the timing information, which works well for high-speed applications. However, the loss of error magnitude information makes the clock-recovery loop exhibit highly nonlinear behavior. For example, the feedback loop updates the recovered clock timing by a fixed amount based solely on the error polarity, since it does not know how far off the timing is. This update could be too small for large errors or too large for small errors. In other words, the effective gain or bandwidth of the bang-bang controlled feedback loop depends on the input error magnitude, since the output is constant regardless of the input magnitude [22]. This nonlinearity results in oscillation when phase locked and thus generates intrinsic jitter in the steady state [18] [19]. Another disadvantage of the binary PD is that its gain varies greatly with the jitter conditions [10]: binary detection of a jittery input yields a large PD gain when the jitter is small and a small PD gain when the jitter is large. This further deteriorates the stability of phase detection. Overall, the drawbacks of the binary PD come mainly from its nonlinearity. This thesis therefore presents schemes that make the PD gain more linear, which are described in later sections.

#### 3.2.2 Proportional/Integral Path and Phase Rotation Counter

The gain  $G_P$  in Figure 3.1(b) is implemented by a modified first-order  $\Sigma\Delta$  modulator. The modification adds a sign-bit path so that both positive and negative inputs can be handled. The architecture is shown in Figure 3.3. The input of the proportional path comes from the sliding window, which sums two successive Up/Down decisions. The value of the proportional gain  $G_P$  is determined by the accumulator depth N, that is,  $G_P = 2^{-N}$. In our design, N is programmable from 2 to 5.



Figure 3.3: The architecture of Proportional path.

This implementation produces a time-averaged gain equal to a fractional number, and its output is the phase adjustment step. The phase rotator integrates the steps and tracks the incoming data phase. Because the phase is adjusted continuously, the proportional path also has a very limited frequency tracking capability. For example, in our proposed system with N programmed to 3, the proportional path has a maximum frequency tolerance of

$$\frac{1}{2^N} \times \frac{600MHz}{6GHz} \times \frac{1}{32}UI = 390.625ppm \tag{3.2}$$
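The time-averaged fractional gain and the tolerance of Eq. (3.2) can be illustrated with a small behavioral sketch. The signed accumulator below is an assumed stand-in for the modified first-order $\Sigma\Delta$ modulator (the actual circuit is in Figure 3.3 and [21]); function names and the carry-out convention are hypothetical.

```python
def proportional_path(inputs, n_bits):
    """Signed first-order accumulator: emits +1/-1 on overflow, else 0.

    The long-run ratio of output steps to input counts is 2**-n_bits.
    """
    modulus = 1 << n_bits
    acc = 0
    steps = []
    for x in inputs:
        acc += x
        if acc >= modulus:        # positive overflow: one phase step up
            acc -= modulus
            steps.append(1)
        elif acc <= -modulus:     # negative overflow: one phase step down
            acc += modulus
            steps.append(-1)
        else:
            steps.append(0)
    return steps


def proportional_tolerance_ppm(n_bits, update_rate_hz=600e6, data_rate_hz=6e9, steps_per_ui=32):
    """Evaluate Eq. (3.2): frequency offset the proportional path alone can follow."""
    return 2.0 ** -n_bits * (update_rate_hz / data_rate_hz) / steps_per_ui * 1e6


up_steps = proportional_path([1] * 80, n_bits=3)
print(sum(up_steps) / 80)                      # 0.125, i.e. G_P = 2**-3
for n in range(2, 6):
    print(n, proportional_tolerance_ppm(n))    # N = 3 gives 390.625 ppm
```

Halving the gain (increasing N by one) halves the tolerance, which is why the integral path is needed for the 5000 ppm SSC deviation.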

In order to track not only the phase but also the frequency of the incoming data, the Up/Down information must be integrated to form the frequency information. The integrated signal is then passed through a time-averaged gain element similar to that of the proportional path. It is important to keep the integral gain much smaller than the proportional gain, so that the integral path does not interfere with the proportional path and make the loop unstable.



Figure 3.4: Block diagram of a simplified Proportional, Integral and Counter.

The Phase Rotation is implemented by a 0-159 counter. The counter can count up or down, and the range 0-159 represents 160 phases, namely the 10 phases from the PLL multiplied by the 16 interpolation intervals. Assume the phase detector output is positive, which means that the phase must be counted up. A simplified proportional path and integral path are shown in Figure 3.4. The input is multiplied by  $G_P$  to give the proportional output P, which is also sent to the integral path, where it is integrated and multiplied by  $G_I$. The proportional output, P, and the integral output, I, are then summed and accumulated in the counter that drives the phase rotator.
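The wrap-around behavior of the 0-159 counter can be sketched as follows. This is a minimal behavioral model under the assumption that P and I arrive as signed integer step counts; the class and method names are hypothetical.

```python
NUM_PHASES = 160  # 10 PLL phases x 16 interpolation intervals


class PhaseRotationCounter:
    """Up/down counter over 0..159 that wraps in both directions."""

    def __init__(self, start=0):
        self.code = start % NUM_PHASES

    def update(self, proportional_step, integral_step):
        """Sum the P and I outputs and accumulate them into the phase code."""
        self.code = (self.code + proportional_step + integral_step) % NUM_PHASES
        return self.code


counter = PhaseRotationCounter(start=158)
print(counter.update(1, 1))    # 0: counting up wraps from 159 back to 0
print(counter.update(-1, 0))   # 159: counting down wraps from 0 to 159
```

Because the code wraps, the rotator can follow an unbounded phase ramp, which is what allows a phase-adjusting CDR to track a frequency offset.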

#### 3.2.3 Phase Selection

Phase rotators with phase interpolation are widely used for phase adjustment in modern high speed CDRs [7]-[10], [18]-[25]. The Phase Selection block consists of phase multiplexers and phase interpolators. Of the 10 phases from the PLL, the phase multiplexers choose two adjacent phases, which are interpolated into 16 intervals for finer resolution. As shown in Figure 3.5, because the incoming data is sampled as 5 parallel bits, the 5 data sampling phases and 5 edge sampling phases must be shifted in parallel; therefore we need 5 copies of the multiplexer pair and interpolator. In order to reduce circuit complexity and avoid glitches in the interpolated signal caused by multiplexer switching, we use a zigzag phase selection order instead of one-way selection. Each multiplexer has only the even or only the odd phases as its inputs; therefore we need only 5-to-1 multiplexers rather than 10-to-1 multiplexers. The phase selection circuit achieves 160-phase interpolation of a 1.2 GHz clock, equivalent to 1/32 UI at the 6 Gb/s data rate.
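One way to picture the zigzag order is the decoder sketched below. It is an assumed interpretation, not the decoder of [21]: the even multiplexer and the odd multiplexer take turns holding the "current" phase, and a multiplexer only switches when its interpolation weight is zero. All names are hypothetical.

```python
PLL_PHASES = 10     # phases delivered by the PLL
INTERP_STEPS = 16   # interpolation intervals between adjacent phases


def select_phases(code):
    """Map a 0..159 rotator code to (even_phase, odd_phase, weight_on_odd).

    weight_on_odd is in 0..16; the even phase receives 16 - weight_on_odd.
    """
    coarse, fine = divmod(code % (PLL_PHASES * INTERP_STEPS), INTERP_STEPS)
    nxt = (coarse + 1) % PLL_PHASES
    if coarse % 2 == 0:
        # Moving from an even phase towards the next odd phase.
        return coarse, nxt, fine
    # Moving from an odd phase towards the next even phase (weights run backwards).
    return nxt, coarse, INTERP_STEPS - fine


def mux_switches_only_at_zero_weight():
    """Check that a multiplexer input changes only while that input has zero weight."""
    total = PLL_PHASES * INTERP_STEPS
    for code in range(total):
        prev_even, prev_odd, _ = select_phases((code - 1) % total)
        even, odd, w_odd = select_phases(code)
        if even != prev_even and w_odd != INTERP_STEPS:
            return False
        if odd != prev_odd and w_odd != 0:
            return False
    return True


print(select_phases(15), select_phases(16), select_phases(17))
print(mux_switches_only_at_zero_weight())
# Resolution check: one step of a 1.2 GHz, 160-phase clock equals 1/32 UI at 6 Gb/s.
print(abs(1 / (1.2e9 * 160) - 1 / (6e9 * 32)) < 1e-18)
```

With this ordering each multiplexer only ever needs the five even or the five odd phases, consistent with the 5-to-1 multiplexers described above.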

#### 3.3 Loop Bandwidth Stabilization

The conventional binary PD generates up/down signals corresponding to the phase error between the internal clock and the input data. Since it detects only the direction of the phase error, the binary PD can be implemented with a simple hardware structure. Although the binary PD is conceptually simple, it has several unwanted characteristics, such as poor stabilization, because of its severe nonlinearity. The inherent nonlinearity of the bang-bang control method makes the binary PD characteristics very sensitive to the input data jitter distribution.

Figure 3.5: Block diagram of the multi-phase VCO and phase selection.

Phase decisions in the binary PD are often contaminated by input data jitter. Figure 3.6 illustrates this for two different jitter cases. Both cases assume that the sampling clock lags the data by a small phase error  $\Delta \phi$. Jitter histograms and Up/Down decision rates are depicted in Figure 3.6. The average PD output is proportional to the difference between the up and down probabilities. This implies that the PD output depends on both the phase error and the jitter distribution. Generally, it is proportional to the phase error, but for the same phase error it is inversely proportional to the amount of jitter. As a result, the binary PD shows a linear characteristic near the lock position, and the jitter distribution determines the phase detector gain [26]. In Figure 3.6, case (a) with small jitter has a larger gain than case (b) with large jitter. Because the average decision depends on the difference between the probabilities on the right and left sides of the sampling point, the PD is no longer purely bang-bang controlled unless the system is jitter-free.
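The dependence of the effective gain on jitter can be made concrete with a short sketch. It assumes zero-mean Gaussian jitter and a transition on every bit, in which case the difference between the up and down probabilities is an error function of the phase error; this closed form is a standard result used here for illustration and is not quoted from [26].

```python
import math


def average_pd_output(phase_error_ui, sigma_ui):
    """P(up) - P(down) for an edge with Gaussian jitter, offset by phase_error_ui."""
    return math.erf(phase_error_ui / (math.sqrt(2.0) * sigma_ui))


def effective_pd_gain(sigma_ui):
    """Slope of the average PD output at zero phase error (per UI)."""
    return math.sqrt(2.0 / math.pi) / sigma_ui


for sigma in (0.002, 0.03, 0.1):
    print(sigma, round(effective_pd_gain(sigma), 2))
```

Under these assumptions the gain is inversely proportional to the jitter sigma, so the smallest jitter condition yields the steepest, most bang-bang-like characteristic.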



Figure 3.6: Binary PD operation under various jitter condition. (a)Small jitter (b)Large jitter[26].

In a CDR with a linear PD, the effective PD gain remains constant in a noisy environment. In a bang-bang CDR, however, the effective PD gain changes with the input jitter pdf, which affects the loop characteristics. In other words, when the effective PD gain varies, the loop bandwidth of the CDR also varies. One subject of this thesis is therefore to provide a better scheme for our digital implementation. In order to stabilize the loop bandwidth, the effective PD gain must remain constant. We propose an M-AES scheme to improve the linearity and a gain compensation scheme to stabilize the effective gain. These are explained in the following sections.

#### 3.3.1 Majority Vote

We start by explaining the details of the majority vote used in our digital implementation. As shown in Figure 3.7(a) and (b), binary phase detection is done by taking the exclusive-or of the data samples and edge samples to detect a transition and compare it with the current clock edges. The Pre-Filter can be composed of a Majority Vote and a sliding window. The majority vote sums the 5 lead/lag signals and makes a final decision that represents the current lead/lag, as shown in Figure 3.7(b). The majority vote has two primary contributions. First, the effect of random jitter is averaged, i.e., the randomness is filtered out while the trend of the phase drift is maintained. Second, differences in data transition density often cause a huge variation of the loop gain and result in instability or loss of tracking; the majority vote keeps the gain nearly constant whether data transitions are frequent or rare, and hence preserves a reasonable loop gain.
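The majority vote over five parallel decisions can be sketched in a few lines. The encoding (+1 for lead, -1 for lag, 0 for no transition) and the tie behavior are assumptions made for this illustration.

```python
def majority_vote(decisions):
    """Collapse per-bit PD decisions into one decision.

    decisions: iterable of +1 (lead), -1 (lag) or 0 (no transition).
    Returns +1, -1, or 0 when leads and lags tie or no transition occurred.
    """
    total = sum(decisions)
    return (total > 0) - (total < 0)


print(majority_vote([1, 1, 1, -1, -1]))   # 1: three leads outvote two lags
print(majority_vote([0, 0, 0, 0, -1]))    # -1: a single transition decides
print(majority_vote([1, -1, 0, 0, 0]))    # 0: tie
```

The output magnitude is the same for one transition as for five, which is how the vote removes most of the dependence on transition density.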

Next we compare the PD behavior with and without the majority vote scheme. We compare three different jitter conditions,  $\sigma_{RJ}=0.002$, 0.03, and 0.1 UI, as shown in Figure 3.8. Figure 3.9(a) and Figure 3.9(b) plot the PD output versus phase error without and with Majority Vote, respectively. In order to understand the impact of Majority Vote, both PD outputs are normalized to the range from -1 to 1. After the PD output is determined, the effective PD gain in the locked state can be obtained by calculating the slope of the curve.

Figure 3.7: (a) The ideal operation of data and edge sampling (b) Under jittery condition, the operation of binary PD and Majority vote.

Figure 3.8: Different jitter sigma conditions for comparison of PD output.

The effective PD gain with and without Majority Vote for transition densities of 20% and 100% is shown in Figure 3.10. Figure 3.10(a) and Figure 3.10(c) show the effective PD gain of the conventional binary PD when the transition density is 100% and 20%, respectively. Clearly, its PD gain increases fivefold when the transition density increases fivefold. On the other hand, since the majority vote scheme makes one final decision over 5 incoming bits, a single transition among the five input bits already produces a full-scale output; that is, Majority Vote at a transition density of 20% gives the same result as the conventional binary PD at a transition density of 100%, as shown in Figure 3.10(a) and (d). As shown in Figure 3.10, the effective PD gain with majority vote varies only from 400 to 750 (less than 2 times) when the transition density increases fivefold. In other words, the majority vote maintains more constant characteristics over variations of the input data transition density than the PD without Majority Vote.

#### 3.3.2 Multiple Alternating Edge Sampling (M-AES) Scheme


To overcome the drawbacks of the binary PD, several edge sampling schemes were proposed in [10] [27] [18] [19]. [10] [27] offered frameworks for overcoming PD gain variation and asymmetric jitter distribution by introducing an adaptive dead-zone. [18] [19] provided methods to reduce the intrinsic jitter caused by the oscillatory steady state of the binary PD by introducing dithering in the interpolator control signal, which varies the edge sampling position. However, neither approach is suitable for our application. First, the adaptive dead-zone in [10] [27] is an analog implementation using a PLL tracking type CDR, which is not a dual loop CDR that benefits from bandwidth relaxation. Also, the effect of asymmetric jitter appears only under large jitter conditions ( $\sigma_{RJ} > 0.06UI$ ), which is beyond the SATA specification. Furthermore, the adaptive dead-zone has difficulty in discriminating large periodic jitter or frequency offset from ordinary ISI, and hence is not appropriate for SSC applications. Second, the dithering of the edge sampling signal in [18] [19] requires different interpolator controls for data sampling and edge sampling, which adds considerable circuit complexity, especially in a multi-phase parallel sampling CDR that uses multiple interpolators. We therefore propose an edge sampling scheme that linearizes the PD gain and is suitable for digital implementation with a very simple circuit design. The concept is described below.

Figure 3.9: The normalized PD output for various jitter conditions (a) without Majority Vote (b) with Majority Vote.

Figure 3.10: The effective PD gain for various jitter conditions (a) Under transition density = 100% condition (without Majority Vote) (b) Under transition density = 100% condition (with Majority Vote) (c) Under transition density = 20% condition (without Majority Vote) (d) Under transition density = 20% condition (with Majority Vote).



Figure 3.11: The proposed multiple alternating edge sampling.

The proposed Multiple Alternating Edge Sampling is shown in Figure 3.11. Unlike the 3x over-sampling in [27], which uses two edge samples per UI, a single edge sample alternating between the two sides of the original sampling point is enough to create a dead-zone. Furthermore, since our work samples five bits in parallel, we can alternate the five edge sampling clocks E0 to E4, each by a different amount of phase. This is equivalent to creating eleven different levels of PD gain proportional to the phase deviation. In our design the alternating amounts are chosen to be (0.04/0.06/0.08/0.10/0.12) UI, where 0.08 UI is the intrinsic jitter, that is, the effective loop gain multiplied by the loop latency. The AES PD creates a small dead-zone equal to  $2 \times E0$.
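The eleven output levels and the dead-zone can be reproduced with a jitter-free behavioral sketch. It assumes that each edge sampler's decision is averaged over one pair of alternations (one sample on each side of the nominal edge position) and that a transition is present on every bit; the function names are hypothetical.

```python
EDGE_OFFSETS_UI = (0.04, 0.06, 0.08, 0.10, 0.12)  # E0..E4 offsets quoted in the text


def alternating_edge_decision(phase_error_ui, offset_ui):
    """Decision of one edge sampler averaged over one alternation pair.

    Returns +1 or -1 when both sampling positions agree, and 0 when the
    phase error lies between them (inside this sampler's dead-zone).
    """
    at_minus = 1 if phase_error_ui > -offset_ui else -1
    at_plus = 1 if phase_error_ui > offset_ui else -1
    return (at_minus + at_plus) // 2


def m_aes_output(phase_error_ui, offsets=EDGE_OFFSETS_UI):
    """Sum of the five alternating edge sampler decisions (range -5..5)."""
    return sum(alternating_edge_decision(phase_error_ui, d) for d in offsets)


levels = sorted({m_aes_output(i / 1000.0) for i in range(-200, 201)})
print(levels)               # the eleven levels from -5 to 5
print(m_aes_output(0.0))    # 0: inside the dead-zone of width 2 x E0
print(m_aes_output(0.07))   # 2: only E0 and E1 see the error
```

In this model the output grows in steps as the phase error passes each offset, which approximates a linear PD characteristic over the range of the offsets.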



Figure 3.12: Different data transition density results in different PD gain,  $\phi_{e1} = \phi_{e2}$.

#### 3.3.3 Gain Compensation

Owing to the nonlinear nature of the conventional binary PD, the PD gain varies with the data transition density. The PD produces a nonzero output, either lead or lag, for data transitions and a zero output for non-transitions. Figure 3.12 highlights the difference between two data transition density conditions. Although the phase error is the same in both conditions, the PD gain changes with the transition density. In order to keep the PD gain stable for both high and low data transition densities, we propose a gain compensation scheme. The scheme counts the data transitions and normalizes its output with respect to the transition count. The PD output is therefore less sensitive to the transition density of the data pattern. Table 3.1 provides an example of PD gain decoding, where  $G_n$  represents the normalized gain, (lead-lag)/(transition count).

Although the Majority Vote scheme can improve the performance of the conventional binary PD, it still has some drawbacks. We therefore compare the behavior of Gain Compensation and Majority Vote. Note that the hardware complexity of the two is similar.

Figure 3.13 shows the normalized gain versus (lead-lag) using Gain Compensation and Majority Vote. It can be observed that the PD output with Majority Vote is less sensitive to the value of (lead-lag). On the other hand, Gain Compensation retains more of the actual information in the lead/lag signals. For example, assume that case 1 has 3 leads and 2 lags, and case 2 has 5 leads. The PD outputs for the two cases are 0.2 and 1, respectively, with the Gain Compensation scheme, while both are 1 with Majority Vote. As a result, the loop gain with Gain Compensation is more suitable than that with Majority Vote, especially near the locked state.

| Transitions | lead | lag | lead-lag | $G_n$ |
|-------------|------|-----|----------|-------|
| 5           | 5    | 0   | 5        | 1     |
| 5           | 4    | 1   | 3        | 0.6   |
| 5           | 3    | 2   | 1        | 0.2   |
| 5           | 2    | 3   | -1       | -0.2  |
| 5           | 1    | 4   | -3       | -0.6  |
| 5           | 0    | 5   | -5       | -1    |
| 4           | 4    | 0   | 4        | 1     |
| 4           | 3    | 1   | 2        | 0.5   |
| 4           | 2    | 2   | 0        | 0     |
| 4           | 1    | 3   | -2       | -0.5  |
| 4           | 0    | 4   | -4       | -1    |
| 3           | 3    | 0   | 3        | 1     |
| 3           | 2    | 1   | 1        | 0.33  |
| 3           | 1    | 2   | -1       | -0.33 |
| 3           | 0    | 3   | -3       | -1    |

Table 3.1: The operation of gain compensation.
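The normalization in Table 3.1 and the comparison with Majority Vote can be sketched as follows. This is a behavioral illustration with hypothetical function names; the zero-transition case is an assumption, since Table 3.1 does not list it.

```python
from fractions import Fraction


def gain_compensation(lead, lag):
    """Normalized gain G_n = (lead - lag) / (transition count)."""
    transitions = lead + lag
    if transitions == 0:
        return Fraction(0)   # assumed: no transitions means no phase information
    return Fraction(lead - lag, transitions)


def majority_vote_from_counts(lead, lag):
    """Majority decision from the same counts: +1, -1, or 0 on a tie."""
    return (lead > lag) - (lead < lag)


for lead, lag in [(3, 2), (5, 0), (2, 1), (1, 0)]:
    print(lead, lag, float(gain_compensation(lead, lag)), majority_vote_from_counts(lead, lag))
```

The first two rows correspond to case 1 and case 2 in the text (0.2 versus 1 with Gain Compensation, 1 in both cases with Majority Vote), while the last row shows that the two schemes coincide when only one transition occurs.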

#### 3.4 Simulation Results

The behavior of the CDR can be modeled with a discrete-time closed-loop system. Figure 3.14 shows a conceptual model. Three gain parameters are tunable to fit the jitter specifications: the phase-rotator counter gain  $K_R$, the proportional gain  $G_P$, and the integral gain  $G_I$. The  $z^{-n}$  block models the total loop delay. The loop delay directly affects loop stability and jitter performance, so the design should minimize it. Using this model,  $G_P$  and  $G_I$  can be chosen according to the simulation results.

Figure 3.13: Normalized gain versus (lead-lag) for Majority Vote and Gain Compensation.



Figure 3.14: The discrete-time model of proposed CDR.
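A discrete-time model of this kind can be prototyped in a few lines. The sketch below is not the thesis's behavioral model: it assumes a plain binary PD, the conventional parallel proportional/integral arrangement of Figure 3.1(a), one loop update per 10 UI, and illustrative values for $G_P$, $G_I$, $K_R$ and the delay $n$. All names are hypothetical.

```python
from collections import deque


def simulate_loop(input_phase_ui, g_p=2.0 ** -3, g_i=2.0 ** -10, k_r=1.0 / 32, delay=2):
    """Return the phase error (in UI) seen by the PD at every loop update."""
    in_flight = deque([0.0] * delay)   # z^-n: corrections not yet applied
    recovered = 0.0                    # phase rotator output, in UI
    integral = 0.0                     # integral path state, in phase steps
    errors = []
    for phi_in in input_phase_ui:
        error = phi_in - recovered
        errors.append(error)
        pd = (error > 0) - (error < 0)                 # binary PD: polarity only
        integral += g_i * pd                           # integral path
        in_flight.append(k_r * (g_p * pd + integral))  # P + I, scaled to UI
        recovered += in_flight.popleft()               # rotator integrates delayed steps
    return errors


def frequency_offset_input(ppm, n_updates, ui_per_update=10):
    """Input phase ramp (in UI) produced by a constant frequency offset."""
    drift = ppm * 1e-6 * ui_per_update
    return [drift * k for k in range(n_updates)]


phase_in = frequency_offset_input(100, 5000)
errors = simulate_loop(phase_in)
print(phase_in[-1])                          # open-loop drift of about 5 UI
print(max(abs(e) for e in errors[-1000:]))   # residual error stays a small fraction of a UI
```

The residual error in the last line is the bang-bang limit cycle; increasing `delay` or `g_p` in this sketch enlarges it, which mirrors the remark above that the loop delay should be minimized.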

The PD output and phase step response of the conventional binary PD, the PD with majority vote, and the PD with gain compensation are compared in Figure 3.15. The vertical axis is the time offset in UI and the horizontal axis is the bit cycle number. In addition, the AES scheme is adopted in Figure 3.16. The frequency of the periodic jitter for each case in Figure 3.15 and Figure 3.16 is 3 MHz. Several points are worth noting:

First, since M-AES scheme enhances the linearity of phase detection, tracking ability in Figure 3.16 is obviously improved as compared to Figure 3.15.

Second, as observed in Figure 3.15(a)(b) and Figure 3.16(a)(b), because it lacks good stabilization, the tracking ability of the conventional binary PD becomes poor when the data transition density is halved.

In addition, Majority Vote and Gain Compensation show little difference under this jitter condition. The reason is that the difference in PD gain between the two is not apparent, even though the Gain Compensation scheme provides more lead/lag information than Majority Vote. This is especially obvious at low data transition density. For example, if only one transition occurs among the five incoming data bits, the normalized PD outputs of the two schemes are identical. On the other hand, assume there are 3 leads and 2 lags among the five parallel incoming data bits. In the Majority Vote case the normalized PD output is 1, while in the Gain Compensation case it is 0.2. It can be seen in Figure 3.16 that the tracking ability in (d) and (f) is quite similar, although (f) is better than (d).

As noted above, the integral path provides low frequency phase tracking as well as tracking of frequency offset and spread spectrum. As shown in Figure 3.17, Gain Compensation is the more suitable configuration, tracking a low frequency offset while maintaining good stability. The poor tracking ability of Majority Vote is a consequence of the feedback loop updating the recovered clock timing by a fixed amount based on the error polarity of the majority. Especially when approaching the locked state, the update could be too large for a small error or even for no error. In other words, the effective gain of majority vote is still bang-bang like and therefore generates intrinsic jitter when phase locked. In contrast, Gain Compensation linearizes the effective gain. Its linear-like nature results in good tracking ability and thus generates less intrinsic jitter.

Figure 3.15: Simulation results of periodic jitter with different conditions (without M-AES scheme).

- (a) Binary PD; Transition density=100%
- (b) Binary PD; Transition density=50%
- (c) Majority Vote; Transition density=100%
- (d) Majority Vote; Transition density=50%
- (e) Gain Compensation; Transition density=100%
- (f) Gain Compensation; Transition density=50%





Figure 3.16: Simulation results of periodic jitter with different conditions (with M-AES scheme).

- (a) Binary PD; Transition density=100%
- (b) Binary PD; Transition density=50%
- (c) Majority Vote; Transition density=100%
- (d) Majority Vote; Transition density=50%
- (e) Gain Compensation; Transition density=100%
- (f) Gain Compensation; Transition density=50%



Figure 3.17: Simulation results of 100ppm frequency offset under different conditions (with M-AES scheme).

- (a) Majority Vote; Transition density=100%
- (b) Majority Vote; Transition density=50%
- (c) Gain Compensation; Transition density=100%
- (d) Gain Compensation; Transition density=50%

### Chapter 4

### Spread Spectrum Clocking

#### 4.1 Background

For ease of synchronization, the clock frequencies of both the transmitter and the receiver are usually kept relatively stable. As the data rate of high speed links increases, the electromagnetic interference (EMI) caused by these clock sources becomes a problem as their output spectrum starts to overlap with wireless frequency bands. EMI normally occurs in the range between $10^4$ and $10^{12}$ Hz of the electromagnetic frequency spectrum. As components become smaller and more delicate, it becomes more difficult to keep devices free from EMI pollution. Regulatory bodies such as the FCC (Federal Communications Commission) set maximum limits for peak EMI emissions. The FCC regulations divide electronic products into Class A and Class B: the Class A regulations apply to industrial applications and the Class B regulations apply to residential or consumer applications. Today, FCC regulations are primarily concerned with peak emissions at any given frequency, not the average emissions over a given frequency spectrum. Thus, a circuit designer should focus EMI design efforts on reducing the peak emissions at any given frequency within the spectrum, not the overall average emissions within the spectrum. Figure 4.1 shows an FCC Class B plot of emission level  $(dB\mu V/m)$  versus frequency (MHz) for the peak emission requirements (at 10 meters).

Figure 4.1: FCC Class B Peak Emissions [28].

Spread spectrum clocking (SSC) solves this problem by varying the frequency of the clock so as to spread its power over a range of frequencies such that the average power emitted at a specific frequency is reduced as shown in Figure 4.2.

A widely adopted SSC profile, proposed in the industry standard Serial AT Attachment (SATA), is shown in Figure 4.3. As shown in Figure 4.3, the down spread technique keeps the clock frequency below the nominal frequency, between  $(1-\delta)$   $f_{nom}$  and  $f_{nom}$, where  $f_{nom}$  is the nominal frequency,  $\delta$  is the maximum modulation amount of 5000 *ppm* down spread, and  $f_m$  is the modulation frequency of 30 to 33 kHz. The frequency modulation profile, in the form of a triangular waveform, can be expressed as Eq. (4.1) [1].

$$f = \begin{cases} (1-\delta)f_{nom} + 2f_m \cdot \delta \cdot f_{nom} \cdot t & \text{when } 0 < t < \frac{1}{2f_m} \\ (1+\delta)f_{nom} - 2f_m \cdot \delta \cdot f_{nom} \cdot t & \text{when } \frac{1}{2f_m} < t < \frac{1}{f_m} \end{cases} \tag{4.1}$$
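The triangular profile of Eq. (4.1) translates directly into code. The sketch below extends the formula periodically in time; the default values (a 6 GHz nominal frequency, 5000 ppm, 30 kHz) are illustrative choices for this example, and the function name is hypothetical.

```python
def ssc_frequency(t, f_nom=6.0e9, delta=5000e-6, f_m=30e3):
    """Instantaneous frequency of a down-spread triangular SSC profile, Eq. (4.1)."""
    period = 1.0 / f_m
    t = t % period                       # repeat the profile every modulation period
    if t < period / 2:
        # Rising half: from (1 - delta) * f_nom up to f_nom.
        return (1 - delta) * f_nom + 2 * f_m * delta * f_nom * t
    # Falling half: from f_nom back down to (1 - delta) * f_nom.
    return (1 + delta) * f_nom - 2 * f_m * delta * f_nom * t


period = 1.0 / 30e3
samples = [ssc_frequency(k * period / 1000) for k in range(1000)]
print(min(samples), max(samples))        # stays between (1 - delta) * f_nom and f_nom
```

The frequency never exceeds the nominal value, which is the defining property of down spreading.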



Figure 4.2: Frequency domain view of SSC.

#### 4.2 Spread Spectrum Mechanism

#### 4.2.1 Introduction to Modulation Mechanism

In general, there are four types of modulation mechanism in a phase-locked loop (PLL) based SSCG, that is, modulation on VCO, modulation on input reference clock, modulation on divider and modulation with phase selection method. Figure



Figure 4.3: Triangular modulation profile for SATA-III [1].



Figure 4.4: Block diagram of a SSCG with modulation on VCO.

4.4 shows the first approach, which modulates the output clock of a PLL by applying a periodic drift to the VCO control voltage through an additional charge pump [29]. Because the modulator is analog, many of its parameters are subject to temperature drift and process variation. The analog modulator also introduces a new jitter source, which degrades the performance of the SSCG.

The SSCG technique with modulation on the input reference clock is presented in [30] and shown in Figure 4.5. It is a digital approach, but because the phase selection uses a multiplexer, the glitch problem is very serious. Moreover, the glitch injects noise whose noise transfer function has a very large DC gain. In addition, extra digital processing is needed because the delay between the delay elements is not precise. The delay variation can be so large that trimming becomes necessary in critical applications.

Another SSCG type uses modulation on the divider [31], as shown in Figure 4.6. This method requires a large divider ratio N to achieve high modulation resolution. However, the phase noise of the PLL output is multiplied by N within the loop bandwidth, whereas outside the loop bandwidth it follows that of the VCO [32]. On the other hand, the settling time of the PLL is determined by the inverse of the



Figure 4.5: Block diagram of a SSCG with modulation on input reference clock.



Figure 4.6: Block diagram of a SSCG with modulation on divider.

loop bandwidth. It is therefore always desirable to use a small division ratio N with a large loop bandwidth. Achieving high modulation resolution with a large N instead leads to a low reference clock frequency and a limited loop bandwidth, which makes the filter design more difficult and increases the lock-in time. In addition, the noise produced by the VCO is hard to filter out. A further disadvantage is that the appropriate divider ratio N varies with the spreading requirement.

Finally, phase selection from the coherent multi-phase outputs of a PLL is reported in [33] and shown in Figure 4.7. It has the potential of sharing a single PLL



Figure 4.7: Block diagram of a SSCG with phase selection method.

between the transmitter and the receiver. Also, the SSCG can use the fine multi-phase clocks that are already available for an oversampling-based CDR, and using more phases can lower the output jitter. To avoid the disadvantages of the other three methods, our design is based on the phase selection method published in [33], chosen for its low noise. An additional benefit of this architecture is a reduced divider ratio N.

## 4.2.2 Noise Transfer Function

We derive the noise transfer function of each modulation mechanism to understand its noise performance. With these results, we can choose the modulation method that minimizes the quantization noise introduced by digital signal processing. Figure 4.8 illustrates the equivalent linear model of each modulation mechanism.  $K_{PD}$  is the gain of the phase detector, F(s) is the transfer function of the loop filter,  $K_{VCO}$  is the sensitivity of the voltage-controlled oscillator, and N is the divider ratio. The noise sources of each method are listed below:

- $\phi_{out}$  represents the output noise due to the quantization noise



Figure 4.8: Noise transfer function of each modulation mechanism.

- $\phi_{in}$  models the quantization noise of an SSCG with modulation on input reference clock
- $\phi_c$  models the quantization noise of an SSCG with modulation on VCO
- $\phi_r$  models the quantization noise of an SSCG with phase selection method
- $\phi_d$  models the quantization noise of an SSCG with modulation on divider

The quantization noise and effective phase variation of each modulation method are shown below. The noise transfer function of each modulation method is given by

$$\frac{\phi_{out}}{\phi_{in}} = \frac{K_{PD}F(s)K_{VCO}}{s + \frac{1}{N}K_{PD}F(s)K_{VCO}} \tag{4.2}$$

$$\frac{\phi_{out}}{\phi_c} = \frac{F(s)K_{VCO}}{s + \frac{1}{N}K_{PD}F(s)K_{VCO}} \tag{4.3}$$

$$\frac{\phi_{out}}{\phi_r} = \frac{-\frac{1}{N}K_{PD}F(s)K_{VCO}}{s + \frac{1}{N}K_{PD}F(s)K_{VCO}} \tag{4.4}$$

$$\frac{\phi_{out}}{\phi_d} = \frac{-K_{PD}F(s)K_{VCO}}{s + \frac{1}{N}K_{PD}F(s)K_{VCO}} \tag{4.5}$$

As shown in Eq.(4.2) to (4.5), all of them exhibit a low-pass characteristic. Therefore, the  $\Sigma\Delta$  modulation is usually adopted to implement the modulation signal generator, and the analysis of the quantization noise is almost the same for all of them. Although all the noise transfer functions have the same low-pass character, they have different DC gains, which makes the effect of the quantization noise quite different. Inserting s = 0 in Eq.(4.2) to (4.5), we find the respective DC gains:

$$\frac{\phi_{out}}{\phi_{in}}|_{s=0} = N \tag{4.6}$$

$$\frac{\phi_{out}}{\phi_c}|_{s=0} = \frac{N}{K_{PD}} \tag{4.7}$$

$$\frac{\phi_{out}}{\phi_r}|_{s=0} = -1 \tag{4.8}$$

$$\frac{\phi_{out}}{\phi_d}|_{s=0} = -N \tag{4.9}$$

Note that the magnitudes of the DC gains are greater than one except for the phase selection method. The quantization noise is therefore amplified through the PLL loop by the closed-loop gain. To achieve the same noise level at the output of the PLL, a higher-order  $\Sigma\Delta$  modulation is required when the DC gain is higher. Thus, our design is based on the phase selection method published in [33]. Because all four mechanisms share the same filtering characteristic, the transient response of the spread spectrum behavior of an SSCG can be analyzed with the modulation-on-input model; every other modulation mechanism can be represented by the corresponding input-referred signal at the phase detector input.
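The DC gains can be checked numerically with a short Python sketch. It assumes that all four transfer functions share the denominator $s + \frac{1}{N}K_{PD}F(s)K_{VCO}$, which is the form consistent with Eq.(4.6) to (4.9). The loop filter `F`, the function name `dc_gains`, and the numeric values of $K_{PD}$ and $K_{VCO}$ are placeholders chosen only for illustration.

```python
def dc_gains(N, K_PD, K_VCO, s=1e-9):
    """Evaluate the four noise transfer functions close to s = 0."""
    F = 1.0 + 1e3 * s                    # placeholder loop filter with F(0) = 1 (assumption)
    denom = s + (K_PD * F * K_VCO) / N   # denominator assumed common to all four functions
    return {
        "input": K_PD * F * K_VCO / denom,             # modulation on reference clock
        "vco": F * K_VCO / denom,                      # modulation on VCO
        "rotation": -(K_PD * F * K_VCO / N) / denom,   # phase selection method
        "divider": -K_PD * F * K_VCO / denom,          # modulation on divider
    }


# Illustrative values only: N = 12, K_PD = 0.5, K_VCO = 1e9.
gains = dc_gains(N=12, K_PD=0.5, K_VCO=1e9)
```

With these values the gain magnitudes approach $N$, $N/K_{PD}$, 1, and $N$, so only the phase selection path passes the quantization noise without amplification.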



Figure 4.9: Triangular modulation profile for SATA-III (a)Ideal profile (b)Digital approach.

## 4.2.3 Conception of SSCG Using Phase Rotation

The modulation profile is one of the most important parameters affecting spread spectrum performance. The ideal modulation profile is given in Figure 4.9(a). Because an analog approach suffers from process variation, an accurate modulation profile is difficult to implement with it. Therefore, a digital method is widely used to realize the desired modulation profile. As observed in Figure 4.9(b), the digital method quantizes the straight line into a staircase waveform.

Figure 4.10 shows the block diagram of the SSCG using the phase rotation scheme, where k is half the number of phases from the VCO, I is the interpolation factor, and P is the resulting resolution of the phase rotator. M denotes the control signal of the phase rotation. The key idea of this approach is to change the rising-edge position of the VCO output at each reference clock rising edge. This timing adjustment contributes an average shift from the nominal value over N VCO clock cycles, where N is the divider ratio. Therefore, we can set the amount of timing shift according to the desired frequency deviation of the spread spectrum. The scheme is described below.

Figure 4.11 shows the timing diagram, where  $T_{VCO}$  is the VCO oscillation period,  $T_{ref}$  is the reference clock period, p is the number of phases provided by the phase interpolation, and  $\frac{T_{VCO}}{p}$  is the resolution of the phase rotation. From Figure 4.11 we can derive the steady-state frequency deviation and design our circuit parameters. The VCO output waveform is shifted by the phase rotator by  $\frac{T_{VCO}}{p}$  at every reference clock rising edge. This timing adjustment produces a phase error at the phase detector input. Once steady state is reached, the phase error at the phase detector input is zero. Thus, the reference clock period can be expressed as:

$$T_{ref} = NT_{VCO} - \frac{T_{VCO}}{p} \tag{4.10}$$

We can rewrite Eq.(4.10) as

$$(N - \frac{1}{p})T_{VCO} = T_{ref} \tag{4.11}$$

$$\Rightarrow \frac{1}{(N-\frac{1}{p})T_{VCO}} = \frac{1}{T_{ref}} \tag{4.12}$$

From the equation above, we can find that the spread frequency in steady state condition is

$$\begin{aligned} f_{VCO} &= f_{ispread} \\ &= f_{ref}\left(N - \frac{1}{p}\right) \\ &= f_{nonspread}\left(1 - \frac{1}{N \times p}\right) \end{aligned} \tag{4.13}$$

where  $f_{VCO}$  is the VCO oscillation frequency;  $f_{nonspread}$  is the original (non-spread) frequency of the VCO;  $f_{ref}$  is the reference clock frequency; and  $f_{ispread}$  denotes the ideal spread frequency, that is, the steady-state value without the transient response of the phase rotation process. With an arbitrary phase rotation amount,  $\frac{\alpha}{p} T_{VCO}$ , a general formula is given by

$$f_{ispread} = f_{nonspread} \left(1 - \frac{\alpha}{N \times p}\right) \tag{4.14}$$

It can be concluded that a desired spread frequency can be generated by adjusting the amount of phase rotation.



Figure 4.10: Block diagram of SSCG using phase rotation scheme.



Figure 4.11: Timing diagram of the phase rotation.

## 4.2.4 Phase Rotation Mechanism

As observed in Figure 4.9(b), a larger number of stairs in the modulation profile reduces the quantization error and therefore produces fewer high-frequency terms. Moreover, for a linear time-invariant system, a large phase rotation step contributes a large phase error at the output of the system. In other words, a high phase rotation resolution is required, so a large p helps reduce the phase error. In our work, we adopt a five-stage VCO (2k=10), giving a coherent multi-phase PLL with ten uniformly distributed phases. The interpolation ratio is  $\frac{1}{16}$  (I=16) in our design, so the resolution of the phase rotation is  $\frac{1}{160}T_{VCO}$  (P=160). From Eq.(4.14) and the specification described in [1], the desired maximum spread frequency is  $f_{nonspread}(1 - 5000ppm)$ . Therefore, the maximum amount of phase rotation is  $9.6 \times \frac{1}{160}T_{VCO}$  to create a 5000ppm down-spread frequency deviation. To produce the periodic modulation profile, the control signal moves from 0 to 9.6 and returns to 0 cyclically, as shown in Figure 4.10.
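The relation between the rotation amount and the frequency deviation in Eq.(4.14) can be sketched in a few lines of Python. The divider ratio N = 12 is not stated explicitly in this section; it is an assumption inferred from the quoted maximum rotation of 9.6 with P = 160 and 5000ppm. The function names are hypothetical.

```python
def spread_fraction(alpha, N, p):
    """Fractional down-spread alpha / (N * p) of the VCO frequency, from Eq.(4.14)."""
    return alpha / (N * p)


def rotation_for_deviation(deviation, N, p):
    """Rotation amount alpha (in units of T_VCO / p) needed for a target deviation."""
    return deviation * N * p


p = 10 * 16   # ten VCO phases, each interpolated into 16 steps
N = 12        # assumed divider ratio, inferred from 9.6 = 5000ppm * N * 160
alpha_max = rotation_for_deviation(5000e-6, N, p)   # about 9.6 rotation steps
```

Under this assumption, sweeping the rotation amount between 0 and `alpha_max` moves the output between the non-spread frequency and the 5000ppm down-spread limit.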

Another design consideration is the number of stairs in the modulation profile, denoted by M in Figure 4.10. The number of stairs depends on the hardware implementation and the PLL system parameters. With a small M, the system response converges quickly. As a result, the frequency distribution of the SSCG output concentrates at certain frequencies, and the EMI reduction is poor. Another drawback is that the large input step caused by a small number of stairs produces a large-amplitude response, which introduces unnecessary high-frequency terms in the spectrum and increases the cycle-to-cycle jitter. To smooth the triangular wave, M should be as large as possible. However, an excessive number of stairs does not improve the resolution, because the phase rotation resolution is fixed. As a result, the design of M should



Figure 4.12: Block diagram of phase rotation and phase rotation control.

be optimized together with the other parameters.

Figure 4.12 shows the block diagram of the phase rotator, which consists of two multiplexers, an interpolator, a phase-rotation counter, and a decoder that selects exactly two adjacent phases and converts the interpolation ratio signal to a thermometer code. The thermometer code controls the interpolator to ensure monotonic behavior during phase rotation, which guarantees that the accumulated phase increases or decreases monotonically during the spread spectrum process. In other words, the frequency of the spread spectrum output increases or decreases toward the end of the modulation profile without any spike. If monotonicity cannot be guaranteed, spikes appear in the modulation profile; the high-frequency terms in the modulation profile are then amplified and the cycle-to-cycle jitter increases. As is well known, the amount of EMI reduction relies on the distribution of the spreading frequency. A spike in the modulation profile makes this distribution uneven, so the amount of EMI reduction decreases.



Figure 4.13: Randomization and noise shaping to eliminate unwanted spurs.

As described above, the amount of phase rotation changes from 0 to 9.6. If we use a counter to control the amount of phase rotation, the rotator follows a regular sequence, which results in unwanted spurs in the spectrum. We can eliminate these spurs by randomizing the control signal, that is, by randomizing the choice of the rotation phase such that the average is still the desired value. The  $\Sigma\Delta$  modulation technique is adopted in our work to realize the fractional numbers from 0 to 9.6. Since the amount of phase rotation at each step can only be an integer, quantization noise is unavoidable. As shown in Figure 4.13, the basic idea of noise shaping with  $\Sigma\Delta$  modulation is to shape the spectrum of the quantization noise such that its power within the useful signal band becomes very small. This is described in the next section.

# 4.3 $\Sigma \Delta$ Modulators

 $\Sigma\Delta$  modulators are well known in the field of communication and have been used extensively for A/D and D/A conversion. Their operation relies on shaping the spectrum of the quantization noise so that only a small amount of noise power remains within the useful signal band, while the rest is pushed to higher frequencies. As mentioned in Section 4.2.4, the same principle can be exploited in the phase rotation application. The  $\Sigma\Delta$  modulator used in the phase rotation mechanism randomizes the instantaneous phase jumps and hence pushes the phase noise associated with the quantization error from low frequency to high frequency. The loop filter then filters out the high-frequency phase noise.

## 4.3.1 Basic Principles of $\Sigma\Delta$ Modulation

We start by briefly explaining the basic principles of  $\Sigma\Delta$  modulators. The introduction of the oversampling noise shaping technique in [34] by Inose and Yasuda has resulted in a very popular method for converting signals between the analog and the digital domains. This principle has led to the rapid development of robust converters known as sigma-delta ( $\Sigma\Delta$ ) modulator based converters [35]. In the literature, they are also referred to as delta-sigma ( $\Delta\Sigma$ ) modulators. The fundamental idea of  $\Sigma\Delta$  modulators is to combine oversampling and noise shaping. In a  $\Sigma\Delta$  modulator, negative feedback combined with coarse quantization at a high sampling rate shapes the spectrum of the quantization noise away from the baseband frequencies [36]. The input signal is sampled at a frequency that far exceeds the Nyquist rate, which spreads the quantization noise over a bandwidth much larger than the signal band [37].

Figure 4.14(a) shows a block diagram of a general  $\Sigma\Delta$  modulator that incorporates a quantizer along with an accumulator. If the white noise approximation for the quantizer is used, the linear model of the modulator is as illustrated in Figure 4.14(b). Performing a straightforward z-domain analysis on the linear model, one obtains

$$Y(z) = X(z)\frac{H(z)}{H(z)+1} + E(z)\frac{1}{H(z)+1}$$
(4.15)

where X(z), Y(z) and E(z) are the z-transforms of the input, the output and the quantization error, respectively. In  $\Sigma\Delta$  modulation, the signal that is subject to quantization is not the input signal itself, but a filtered version of the difference between the input signal and the encoded output. The filter H(z), usually called the feedforward filter [38], in a  $1^{st}$ -order  $\Sigma\Delta$  modulator is a discrete-time integrator with transfer function  $H(z) = \frac{z^{-1}}{(1-z^{-1})}$ . Using this, for a  $1^{st}$ -order modulator we can write

$$Y(z) = X(z)z^{-1} + E(z)(1-z^{-1}) \tag{4.16}$$

Figure 4.14: General  $\Sigma\Delta$  modulator: (a) block diagram (b) linear model.

The quantized signal is the integrated (sigma) version of the difference (delta) between input signal and the analog representation of the binary coded output. As a result of noise shaping, only a small portion of the noise power lies within the signal band.
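The behavior described by Eq.(4.16) can be illustrated with a minimal Python sketch of a first-order loop that uses a delayed integrator and an integer quantizer. The function name, the nearest-integer quantizer, and the constant input of 9.6 (taken from the phase rotation range discussed earlier) are assumptions for this example and do not describe the implemented circuit.

```python
import math


def first_order_sigma_delta(x, n_samples):
    """First-order loop: integrator H(z) = z^-1 / (1 - z^-1) followed by an integer quantizer."""
    v = 0.0        # integrator state
    x_prev = 0.0   # previous input sample
    y_prev = 0     # previous quantizer output, fed back to the input
    out = []
    for _ in range(n_samples):
        v += x_prev - y_prev      # delta (difference), then sigma (accumulation)
        y = math.floor(v + 0.5)   # coarse quantizer: round to the nearest integer
        out.append(y)
        x_prev, y_prev = x, y
    return out


y = first_order_sigma_delta(9.6, 2000)
mean = sum(y[1:]) / len(y[1:])    # skip the start-up sample y[0]
```

After the start-up sample the output only takes the values 9 and 10, while its long-term average converges to the fractional input, which is the property the phase rotator relies on.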

## 4.3.2 Quantization Noise

The quantization process involved in the  $\Sigma\Delta$  modulator is an inherently non-linear operation and introduces errors to the phase rotation. From Eq.(4.16), the output of a  $1^{st}$ -order modulator is the superposition of the one-sample-delayed version of the input signal and the quantization noise shaped by a  $1^{st}$ -order differentiation (high-pass filtering). Observe that the quantization error passes through the high-pass filter  $H_{noise}(z) = (1 - z^{-1})$ , which reduces the quantization noise around dc.

With  $z = e^{2j\pi f/f_{ref}}$ , the expression of  $H_{noise}(f)$  is written in the frequency domain to obtain the noise shaping action of the 1<sup>st</sup>-order modulator.

$$\begin{aligned} |H_{noise}(f)| &= \left|1 - e^{-\frac{2j\pi f}{f_{ref}}}\right| \\ &= \left|2\sin\left(\frac{\pi f}{f_{ref}}\right)\right| \quad \text{for } 0 \le f \le \frac{f_{ref}}{2} \end{aligned} \tag{4.17}$$

The quantization error can be approximated as white noise with a uniform spectrum within  $-\frac{f_s}{2}$  to  $+\frac{f_s}{2}$ , where  $f_s$  is the sampling frequency [39]. The power spectral density of the quantization error is then given by

$$S_e(f) = \frac{1}{12} \Delta^2 \frac{1}{f_s}$$
(4.18)

where  $\Delta$  is the difference between adjacent output levels and is called the quantizer bin width [38].

Clock quality is usually described by jitter or phase-noise measurements. The phase-noise spectrum is defined as follows. Let the power spectral density of a clock signal be  $S_C(f)$ . The phase-noise spectrum L(f) is then defined as the attenuation in dB from the peak value of  $S_C(f)$  at the clock frequency,  $f_C$ , to the value of  $S_C(f)$  at f. Figure 4.15 illustrates the definition of L(f). Mathematically, the phase-noise spectrum L(f) can be written as

$$L(f - f_C) = 10 \log \frac{S_C(f)}{S_C(f_C)} \quad \text{in dB} \tag{4.19}$$

Thus, the power spectral density of the phase error resulting from the quantization error through the loop,  $S_{\phi e\_loop}$ , can be written as

$$S_{\phi e\_loop}(f) = S_{\phi e}(f) \times L^2_{loop}(f) \tag{4.20}$$

where  $L_{loop}(f)$  represents the phase-noise spectrum of a loop. Accordingly, we can get the phase error amount as long as we know the  $L_{loop}(f)$  of the used loop.



Figure 4.15: Definition of phase-noise spectrum.

This provides the basis for the analysis of quantization noise described later.

## 4.3.3 Digital Implementation of the $\Sigma\Delta$ Modulator

As shown in Figure 4.16, a  $\Sigma\Delta$  modulator is realized with a digital accumulator whose carry bit is used to vary the amount of phase rotation. The symbol K is the dc input (binary word) to the accumulator, and k is the accumulator size in bits. The accumulator carry flag  $O_k$  is therefore set high K times in every  $2^k$  cycles, so the long-term average of  $O_k$  is the fractional number  $X = \frac{K}{2^k}$.
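The carry behavior of the accumulator can be reproduced with a small Python sketch. The function name and the example values K = 5 and k = 4 are chosen only for illustration; the sketch assumes K is smaller than $2^k$.

```python
def accumulator_carries(K, k_bits, n_cycles):
    """k-bit accumulator with constant input K; returns the carry-out flag of each cycle."""
    modulus = 1 << k_bits
    acc = 0
    carries = []
    for _ in range(n_cycles):
        acc += K
        carries.append(1 if acc >= modulus else 0)   # carry-out flag O_k
        acc %= modulus                               # keep only the k-bit residue
    return carries


flags = accumulator_carries(K=5, k_bits=4, n_cycles=16)
average = sum(flags) / len(flags)   # long-term average of the carry flag
```

Over $2^k = 16$ cycles the carry flag is set exactly K = 5 times, so its average equals $\frac{K}{2^k}$.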

The  $2^{nd}$ -order modulator is used to shape the quantization noise toward higher frequency. Figure 4.17 illustrates the implementation of  $2^{nd}$ -order  $\Sigma\Delta$  modulator. The first accumulator with input K calculates the average fractional number while the second one performs the phase error spectral shaping.

To realize the accumulator digitally, we first have to decide its size. The size of the accumulator is related to the number of stairs in the modulation profile. In our design, a 4-bit accumulator is used, so that the desired K in the  $\Sigma\Delta$  module equals  $2^4 \times 9.6 = 153.6$  and should be truncated



Figure 4.16: Realization of  $1^{st}$ -order  $\Sigma\Delta$  modulator.



Figure 4.17: Implementation of  $2^{nd}$ -order  $\Sigma\Delta$  modulator.

to 153 to ensure that the maximum frequency deviation does not exceed 5000*ppm*. Therefore, the desired amount of phase rotation, from 0 to 9.6, is realized by the accumulator with integer input values from 1 to 153. If the accumulator is too large or too small, the EMI reduction and cycle-to-cycle jitter performance become poor, because a large accumulator means an unnecessarily large number of stairs in the modulation profile and a small one means too few. The impact of the number of stairs on the behavior of the SSCG has been discussed in Section 4.2.4.
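The effect of truncating K can be checked with a short Python calculation based on Eq.(4.14). The divider ratio N = 12 is an assumption inferred from the 9.6-step maximum rotation with P = 160; the variable names are hypothetical.

```python
import math

ACC_BITS = 4
N, p = 12, 160   # N = 12 is an assumption inferred from 9.6 = 5000ppm * N * 160

k_ideal = (1 << ACC_BITS) * 9.6          # ideal control word, about 153.6
k_max = math.floor(k_ideal)              # truncated so the 5000ppm limit is not exceeded
rotation_max = k_max / (1 << ACC_BITS)   # average rotation of 9.5625 steps
deviation_ppm = rotation_max / (N * p) * 1e6
```

Under this assumption the truncated control word gives a maximum deviation of roughly 4980ppm, slightly inside the 5000ppm limit.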

# 4.4 Analysis of Quantization Noise

The modulation mechanisms based on modulation on VCO [29], modulation on input [30], and modulation on divider [31] have DC gains larger than one, and hence the quantization noise is amplified. A higher-order  $\Sigma\Delta$  modulation is used to shape the amplified quantization noise in these mechanisms. Although the  $\Sigma\Delta$  concept has existed for a long time, there are still many problems associated with these modulators. Much research has been done in this field to describe the benefit of the higher-order  $\Sigma\Delta$  modulator [40] [41].

It should be noted, however, that as can be derived from Figure 4.17, the output of the  $2^{nd}$ -order  $\Sigma\Delta$  modulator can be M-1, M, M+1, or M+2, where M represents the integer part of Kave. Hence the maximum phase rotation jump is 3 steps, which may result in large cycle-to-cycle jitter at the VCO output, and the low jitter requirement of our application might not be satisfied. To examine this concern, the following analysis focuses on the performance of simple first-order and higher-order systems.

For an  $n^{th}$ -order accumulator based modulator, the noise transfer function is  $H_{noise}(z) = (1 - z^{-1})^n$ . The magnitude spectrum of the  $1^{st}$ -order noise transfer function is shown in Figure 4.18 along with the second and third-order ones for comparison. According to the well-known Nyquist sampling theorem, the signal must be band limited to half the sampling frequency, so the frequency range of interest lies below half the sampling frequency, which in our work is 100MHz. Notice that a higher-order noise transfer function provides more attenuation at low frequencies and more amplification at high frequencies.
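The trend in Figure 4.18 can be reproduced with a minimal Python sketch that evaluates $|H_{noise}| = |1 - z^{-1}|^n$ on the unit circle. The sampling frequency of 100 MHz and the two probe frequencies are assumptions made for this example.

```python
import cmath
import math


def ntf_magnitude(f, f_s, order):
    """|1 - z^-1|^order evaluated at z = exp(j * 2 * pi * f / f_s)."""
    z_inv = cmath.exp(-2j * math.pi * f / f_s)
    return abs(1 - z_inv) ** order


f_s = 100e6   # sampling frequency, assumed here to be 100 MHz
low = [ntf_magnitude(1e6, f_s, n) for n in (1, 2, 3)]     # well below f_s / 2
high = [ntf_magnitude(45e6, f_s, n) for n in (1, 2, 3)]   # close to f_s / 2
```

At the low probe frequency the magnitude shrinks as the order increases, while near half the sampling frequency it grows, matching the attenuation and amplification described above. The first-order value also agrees with the closed form $|2\sin(\pi f / f_s)|$ of Eq.(4.17).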



Figure 4.18: Magnitude spectrum for noise transfer function. (log scale)

Referring to Eq.(4.14), if the ideal amount of phase rotation is a fractional number X, the ideal frequency deviation can be expressed in Eq.(4.21). However, the practical amount of phase rotation is the sum of an ideal term X and a quantization error term  $Q_e$  as shown in Eq.(4.22).

$$f_{ispread} = f_{nonspread} \left(1 - \frac{X}{N \times p}\right) \tag{4.21}$$

$$f_{spread} = f_{nonspread} \left(1 - \frac{X + Q_e}{N \times p}\right) \tag{4.22}$$

To express the power spectrum density of phase error, we derive the normalized frequency error first. The normalized frequency error can be expressed as:

$$\begin{aligned} \frac{Ideal - practical}{Ideal} &= \frac{\frac{1}{N \times p} Q_e}{1 - \frac{X}{N \times p}} \\ &= \frac{Q_e}{N \times p - X} \\ &\approx \frac{Q_e}{N \times p} \end{aligned} \tag{4.23}$$

The clock period of the phase rotator is different from that of the VCO. Hence we observe the behavior of the phase error at the input of the phase detector. This phase error is represented as  $\phi_e$  in Figure 4.8. In the time domain it can be described by the following equation

$$\begin{aligned} \phi_e(t) &= 2\pi f_{ref} \int \frac{Q_e(t)}{N \times p} dt \\ &= 2\pi \frac{f_{ref}}{N \times p} \int Q_e(t) dt \end{aligned} \tag{4.24}$$

Then the phase error in frequency domain is given by

$$S_{\phi_e}(f) = \left(\frac{f_{ref}}{f \cdot Np}\right)^2 S_{Q_e}(f) \tag{4.25}$$

It can be observed from Eq.(4.25) that the higher the resolution of the phase rotation (the larger p), the lower the PSD of the phase error at the output of the PLL.

We next examine the effect of the resolution of the phase rotation. As mentioned above, the phase error can be obtained from Eq.(4.20). Since we are more concerned with the jitter performance, we convert the frequency domain view to the time domain. The first step in calculating the equivalent RMS jitter is to obtain the integrated phase noise power over the frequency range of interest. Once the integrated phase noise is known, the RMS phase jitter in radians is given by the following equation (see [42] for further details and derivations),

$$Jitter_{RMS}(radians) = \sqrt{2 \times \int S_{\phi e\_loop}(f) df} \tag{4.26}$$

and dividing by  $2\pi f_{VCO}$  converts the jitter in radians to jitter in seconds:

$$Jitter_{RMS}(seconds) = \frac{\sqrt{2 \times \int S_{\phi e\_loop}(f)df}}{2\pi f_{VCO}} \tag{4.27}$$

where  $f_{VCO}$  is the VCO oscillation frequency.
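The chain from Eq.(4.17), (4.18), and (4.25) to Eq.(4.26) and (4.27) can be sketched numerically in Python. This is only an illustration under stated assumptions: the loop response is omitted (that is, $L_{loop}(f) = 1$), the quantizer step is 1, and N = 12, a 100 MHz reference, and a 1.2 GHz clock are assumed values. It is not expected to reproduce the simulated values reported in this chapter.

```python
import math


def phase_error_psd(f, f_ref, N, p, order, delta=1.0):
    """Eq.(4.25) with shaped white quantization noise from Eq.(4.17) and Eq.(4.18)."""
    s_white = delta ** 2 / (12.0 * f_ref)                           # Eq.(4.18) with f_s = f_ref
    ntf_sq = (2.0 * math.sin(math.pi * f / f_ref)) ** (2 * order)   # |H_noise|^2
    return (f_ref / (f * N * p)) ** 2 * s_white * ntf_sq


def rms_jitter_seconds(f_ref, N, p, order, f_clk, n_points=20000):
    """Eq.(4.26) and Eq.(4.27): integrate the PSD up to f_ref / 2, then convert to seconds."""
    df = (f_ref / 2.0) / n_points
    # Midpoint rule, which also avoids evaluating the PSD at f = 0.
    power = sum(phase_error_psd((i + 0.5) * df, f_ref, N, p, order) * df
                for i in range(n_points))
    return math.sqrt(2.0 * power) / (2.0 * math.pi * f_clk)


j160 = rms_jitter_seconds(100e6, 12, 160, order=1, f_clk=1.2e9)
j40 = rms_jitter_seconds(100e6, 12, 40, order=1, f_clk=1.2e9)
```

Because the PSD in Eq.(4.25) scales with $1/p^2$, the jitter in this sketch scales with $1/p$: reducing the resolution from $\frac{1}{160}$ to $\frac{1}{40}$ increases it by a factor of four.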

To verify the theoretical equations, we use the Wolfram Mathematica platform to analyze the jitter behavior under different conditions, with all components of the PLL considered ideal. The unity-gain frequency of the PLL used in the model is set to 2MHz, as shown in Figure 4.19. Figures 4.20-4.22 show the simulated RMS jitter at the input of the phase detector and at the output of the VCO for various modulator orders and phase resolutions. First, for the same phase resolution of the interpolator, as seen from Figure 4.20(a), higher-order modulators produce less jitter at low frequencies and amplify it at high frequencies. On the other hand, since the PLL behaves like a low-pass filter, the high-frequency jitter is attenuated, as shown in Figures 4.20(b), 4.21(b), and 4.22(b). For the  $1^{st}$ -order system, the low-frequency jitter is larger than in the higher-order ones, so the PLL loop cannot filter it out effectively.

| $Jitter_{RMS}$ (s) | Phase resolution of interpolator: $\frac{1}{160}$ | $\frac{1}{40}$ | $\frac{1}{10}$ |
|--------------------|---------------------------------------------------|----------------|----------------|
| $1^{st}$ -order    | 1.23 ps                                           | 4.73 ps        | 19.2 ps        |
| $2^{nd}$ -order    | 1.45 ps                                           | 5.56 ps        | 25.3 ps        |
| $3^{rd}$ -order    | 2.27 ps                                           | 8.87 ps        | 36.2 ps        |

Table 4.1: The simulated jitter on the front of PD

Even so, as clearly shown in Figure 4.20(b), the jitter in the  $1^{st}$ -order modulator case is only 0.4 ps, which is insignificant compared with the jitter generated by other noise sources. However, Figure 4.22(b) highlights significant differences between modulator orders: the RMS jitter is as high as 7 ps in the  $1^{st}$ -order case, while in the  $3^{rd}$ -order case it is smaller than 1 ps.



Figure 4.19: PLL closed loop response.

| $Jitter_{RMS}$ (s) | Phase resolution of interpolator: $\frac{1}{160}$ | $\frac{1}{40}$ | $\frac{1}{10}$ |
|--------------------|---------------------------------------------------|----------------|----------------|
| $1^{st}$ -order    | 0.43 ps                                           | 1.71 ps        | 6.92 ps        |
| $2^{nd}$ -order    | 0.08 ps                                           | 0.34 ps        | 1.23 ps        |
| $3^{rd}$ -order    | 0.05 ps                                           | 0.21 ps        | 0.87 ps        |

Table 4.2: The simulated jitter filtered through the loop

We also summarize these jitter results in Table 4.1 and Table 4.2. These results support the derived formula in Eq.(4.25).

Our theoretical results show that, once the phase resolution of the interpolator, p, is high enough, the difference in jitter between modulators of different order is small enough to be neglected. Since the resolution of the interpolator in our work is high enough, there is no reason to choose a higher-order  $\Sigma\Delta$  modulator to control the phase rotator, and a lower order reduces the hardware complexity.

It must be noted that the output of higher-order  $\Sigma\Delta$  modulators can be negative, and a negative control value results in up-spread clocking even though the average is still down-spread. Since our SSCG is designed for the Serial-ATA



Figure 4.20: Different modulation order conditions for comparison of RMS jitter (phase resolution of the interpolator = $\frac{1}{160}$): (a) on the front of PD (b) filtered through the loop

Specification, which defines EMI reduction using down-spread with 5000*ppm* frequency deviation, the higher-order modulator is not suitable for our application.

It is obvious that a behavioral level simulation model is needed to accurately define the circuit parameters before the real implementation. Meanwhile, a benefit that can be obtained from such a simulation model is that EMI suppression performance of a  $\Sigma\Delta$  modulator with different order can be easily assessed. A PLL-based SSCG model was run for different order  $\Sigma\Delta$  modulations. We developed the behavioral model on a MATLAB platform.

Figure 4.23 illustrates the time interval error (TIE) jitter at the VCO output for different order



Figure 4.21: Different modulation order conditions for comparison of RMS jitter (phase resolution of the interpolator = $\frac{1}{40}$): (a) on the front of PD (b) filtered through the loop

modulation conditions. The jitter in Figure 4.23(a) changes slowly, meaning that it is low-frequency noise. In contrast, the jitter in Figure 4.23(b) and (c) behaves like high-frequency noise. The RMS jitter of these results is also calculated and listed in Table 4.3. Despite the different behavior, the jitter in all these conditions is smaller than 1ps, which is negligible compared with other noise sources. This small amount of jitter results from the high phase resolution used in our work.

Finally, with SSC enabled, Figure 4.24 shows the simulated clock spectrum. As seen from the plot, there is no apparent difference between the cases. The result indicates that, unlike in other cases, the effect of higher-order modulation



Figure 4.22: Different modulation order conditions for comparison of RMS jitter (phase resolution of the interpolator = $\frac{1}{10}$): (a) on the front of PD (b) filtered through the loop

does not apply in our work.



Figure 4.23: TIE jitter of VCO in different order modulators conditions.

| Order of $\Sigma\Delta$ modulation | $1^{st}$ -order | $2^{nd}$ -order | $3^{rd}$ -order |
|------------------------------------|-----------------|-----------------|-----------------|
| $Jitter_{RMS}$ (s)                 | 0.447 ps        | 0.197 ps        | 0.169 ps        |

Table 4.3: The RMS jitter of the VCO output



Figure 4.24: The spectrum of VCO output with different order  $\Sigma\Delta$  modulations.

### Chapter 5

# Experimental Results and Conclusions

#### 5.1 Circuits Simulation

The circuit level simulation is performed using the mixed-mode simulator in Nanosim. In the simulation, GP=1/8 and GI=1/64 are used to ensure a larger jitter tolerance when verifying functionality. The input pattern is K28.5, which is a DC-balanced pattern and includes five successive '1's, five successive '0's, and the successive transitions '01010' and '10101' to test the ISI effect. The K28.5 pattern is '10100 00011 01011 11100' and starts from the LSB. To verify the CDR function, a built-in self-test (BIST) circuit is used. The BIST automatically parallelizes and aligns the serial input, and detects the K28.5 pattern. After the K28.5 pattern is found, the signal bus 'rev\_data' displays the pattern and the signal 'data\_en' is set high. If a bit error occurs, 'rev\_data' no longer shows the K28.5 pattern and 'data\_en' is set low. To prevent performance degradation from process variation, we slightly over-designed the circuit and simulated it at a faster rate of 6.945Gb/s instead of 6Gb/s. That means the local PLL generates 1.389GHz instead of 1.2GHz. The simulation results are shown in Figure 5.1. The clock is set at 1.6GHz with 40ps rise/fall time, corresponding to the simulation result of the sampling clock; the receiver data rate is 8Gb/s with a 200mV swing after the 10M cable model. To test spread spectrum clock functionality, the

receiver local clock generator is a spread-spectrum clock generator that generates -5000ppm, 33KHz modulation frequency SSC. The receiving data is sent at nominal rate, therefore the CDR has to recover the nominal data rate to produce correct data. Figure 5.2 shows the clock spectrum in a SSC simulation. It can be seen that the data clock is recovered from the spread spectrum local clock and is at 1.389GHz.
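The stated properties of the K28.5 test pattern can be checked with a short script. This is a minimal sketch and not part of the thesis test bench: the helper names `longest_run` and `contains_cyclic` are hypothetical, and treating the pattern as cyclically repeated is an assumption about how the input is driven.

```python
# Sanity check of the K28.5 test pattern quoted in the text.
# Helper names are illustrative; cyclic repetition of the pattern is an assumption.

PATTERN = "10100 00011 01011 11100".replace(" ", "")  # as written in the text


def longest_run(bits: str, symbol: str) -> int:
    """Length of the longest run of `symbol` in `bits` (no wrap-around)."""
    best = current = 0
    for b in bits:
        current = current + 1 if b == symbol else 0
        best = max(best, current)
    return best


def contains_cyclic(bits: str, word: str) -> bool:
    """True if `word` occurs somewhere in the endlessly repeated pattern."""
    return word in bits + bits[: len(word) - 1]


ones = PATTERN.count("1")
zeros = PATTERN.count("0")
serial_order = PATTERN[::-1]  # transmission starts from the LSB (rightmost bit)

print("ones, zeros       :", ones, zeros)
print("longest run of 1s :", longest_run(PATTERN, "1"))
print("longest run of 0s :", longest_run(PATTERN, "0"))
print("contains 01010    :", contains_cyclic(PATTERN, "01010"))
print("contains 10101    :", contains_cyclic(PATTERN, "10101"))
print("serial order      :", serial_order)
```

Read from the LSB, the 20 bits are the two standard 8b/10b K28.5 code groups of opposite running disparity, one after the other, which is why the pattern is DC-balanced.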



Figure 5.1: (a) K28.5 Input pattern (b)Verification of CDR functionality.



Figure 5.2: The spectrum of recovered clock and receiver clock in SSC simulation.

The time-domain simulation results of the proposed SSC under different orders of $\Sigma\Delta$ modulation are shown in Figure 5.3. These diagrams show the period of the VCO output versus time. The SSC modulation frequency is 33KHz. No obvious difference between the responses can be seen in our work. The maximum cycle-to-cycle jitter in all of these diagrams is less than 1.5ps during spread spectrum operation. An FFT is also computed in Figure 5.4 to display the frequency-domain behavior of the VCO output clock with different modulation orders. The VCO output clock operates at 1.389GHz, the modulated clock is down-spread by 4983ppm, and the corresponding EMI reduction is 20.6dB, as shown in Figure 5.5.
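The frequency excursion implied by these numbers can be worked out with a short script. The following is a minimal sketch that assumes an ideal triangular down-spread profile with no $\Sigma\Delta$ quantization noise; the function name `ssc_frequency` and the constants are illustrative and are not taken from the simulated circuit.

```python
# Ideal triangular down-spread profile using the values quoted in the text.
# The noise-free triangle and all names below are assumptions for illustration.

F_NOMINAL_HZ = 1.389e9   # VCO output clock
SPREAD_PPM = 4983.0      # simulated down-spread amount
F_MOD_HZ = 33e3          # modulation frequency


def ssc_frequency(t: float) -> float:
    """Instantaneous frequency of an ideal triangular down-spread clock."""
    phase = (t * F_MOD_HZ) % 1.0  # position inside one modulation cycle, 0..1
    depth = 2.0 * phase if phase < 0.5 else 2.0 * (1.0 - phase)  # 0 -> 1 -> 0
    return F_NOMINAL_HZ * (1.0 - SPREAD_PPM * 1e-6 * depth)


f_min = ssc_frequency(0.5 / F_MOD_HZ)      # bottom of the triangle
deviation_hz = F_NOMINAL_HZ - f_min
period_nominal_ps = 1e12 / F_NOMINAL_HZ
period_max_ps = 1e12 / f_min

print(f"minimum frequency : {f_min / 1e9:.6f} GHz")
print(f"peak deviation    : {deviation_hz / 1e6:.3f} MHz")
print(f"period range      : {period_nominal_ps:.3f} ps to {period_max_ps:.3f} ps")
```

Under these assumptions the peak deviation is about 6.9MHz below the 1.389GHz carrier, reached once per 33KHz modulation cycle.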

#### 5.2 Layout

The CDR circuits, together with a spread-spectrum clock generator and a continuous-time equalizer, are implemented in the UMC 90nm 1P9M process. The chip area is $1.25 \times 1.1$ ($mm \times mm$), including 73 bonding pads. The layout floor plan is shown in Figure 5.6. The multi-phase signals from the PLL to the phase selection block and the M-AES block must be routed symmetrically to ensure correct signal timing. Decoupling capacitors for supply and bias points are placed wherever possible, but they need to avoid high-speed signal lines. The control signals from the digital control to the phase selection block and the M-AES block are very dense and need extra caution in layout. The chip summary results of the CDR and SSCG are listed in Table 5.1 and Table 5.2, respectively.

#### 5.3 Measurement Environment Setup

The testing environment setup is shown in Figure 5.7. All DC supplies are provided by a Keithley 2400 Source Meter. An Agilent N4903A Serial J-BERT provides the jittery and spread spectrum clocked receiver data for CDR testing. It also



Figure 5.3: Period of VCO output waveform vs. time. Panel (c): $3^{rd}$-order $\Sigma\Delta$ modulation.



Figure 5.5: Spectrum of VCO output clock with and without spread spectrum modulation.



Figure 5.6: Layout view of test chip.

| Parameter                                                    | Value                                                      |
|--------------------------------------------------------------|------------------------------------------------------------|
| Process                                                      | 90nm 1P9M CMOS                                             |
| Data Rate                                                    | 6 Gb/s                                                     |
| Supply                                                       | 1 V                                                        |
| Power, CDR (digital: PD, filter, proportional/integral path) | 6 mW                                                       |
| Power, CDR (analog: interpolator, sampler, mux)              | 41 mW                                                      |
| Active Area (digital)                                        | $220 \times 320$ ($\mu m \times \mu m$), gate count: 24946 |
| Active Area (analog)                                         | $240 \times 380$ ($\mu m \times \mu m$)                    |
| Recovered Clock Jitter                                       | 54.420 ps @ PJ, Amp = 0.18 UI (P2P), Freq = 1 MHz          |
|                                                              | 17.516 ps @ RJ, $\sigma$ = 0.02 UI                         |
| Frequency Tolerance                                          | +/-1000ppm                                                 |
| SSC Tracking                                                 | +/-5000ppm, 33KHz                                          |

Table 5.1: Design summary of proposed CDR
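A few derived figures help put the entries of Table 5.1 in context. The sketch below is a back-of-envelope calculation only: the unit interval, total power, and energy per bit are computed from the table values and are not results reported in the thesis, and all variable names are illustrative.

```python
# Derived figures from Table 5.1. These are back-of-envelope numbers computed
# from the table entries, not results reported in the thesis.

DATA_RATE_BPS = 6e9       # nominal data rate
POWER_DIGITAL_W = 6e-3    # CDR digital part
POWER_ANALOG_W = 41e-3    # CDR analog part
JITTER_PJ_PS = 54.420     # recovered clock jitter with periodic jitter input
JITTER_RJ_PS = 17.516     # recovered clock jitter with random jitter input

ui_ps = 1e12 / DATA_RATE_BPS                        # one unit interval in ps
total_power_w = POWER_DIGITAL_W + POWER_ANALOG_W    # digital + analog CDR power
energy_per_bit_pj = total_power_w / DATA_RATE_BPS * 1e12

print(f"unit interval      : {ui_ps:.2f} ps")
print(f"total CDR power    : {total_power_w * 1e3:.1f} mW")
print(f"energy per bit     : {energy_per_bit_pj:.2f} pJ/bit")
print(f"jitter (PJ case)   : {JITTER_PJ_PS / ui_ps:.3f} UI")
print(f"jitter (RJ case)   : {JITTER_RJ_PS / ui_ps:.3f} UI")
```

Expressing the recovered clock jitter in UI makes it directly comparable with the input jitter amplitudes, which the table already gives in UI.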

| Parameter                | Value                                                  |
|--------------------------|--------------------------------------------------------|
| Modulation Frequency     | 33KHz                                                  |
| Max. Frequency Deviation | 4983ppm                                                |
| Active Area              | $240 \times 180$ ($\mu m \times \mu m$) (SSCG and PLL) |
|                          | $270 \times 220$ ($\mu m \times \mu m$) (Loop filter)  |
| EMI Reduction            | 20.6dB                                                 |
| Jitter Performance       | 1.3ps (P2P)                                            |
| Power                    | 7.57 mW                                                |

Table 5.2: Design summary of proposed SSCG

provides the reference clock for the PLL in the spread spectrum clock generator. In order to measure BER, we use a BIST in the test chip that generates a waveform whose duty cycle is proportional to the accumulated error bits. This signal is the Error Signal. A Tektronix TDS6124C Digital Storage Oscilloscope is used to measure the waveform of the Error Signal, as well as the waveform and jitter of the CDR recovered clock and recovered data. An Agilent E4440A Spectrum Analyzer is used to measure the spectrum of the CDR recovered clock and the output of the spread spectrum clock generator.




Figure 5.7: Test Environment Setup.

# Chapter 6 Conclusions

Schemes that improve CDR loop bandwidth stabilization, including Majority Vote, M-AES, and Gain Compensation, are proposed. The CDR conforms to the SATA generation 3 specifications. The CDR uses a dual-loop architecture that is suitable for multi-channel integration without the need for extra PLLs for different channels. The 2<sup>nd</sup>-order digitally implemented phase tracking algorithm is programmable for different jitter conditions and can track spread spectrum clock transmission. The proposed Gain Compensation technique eradicates the unwanted side effects of binary phase detection and enhances performance across varying data transition densities. The CDR meets the SATA-III specifications for jitter and spread spectrum clocking and the jitter tolerance mask of the SDH STM-64 interface. The CDR is implemented in UMC 1P8M 90nm 1.0V Regular-Vt CMOS technology.

The other objective of this thesis is to study the effect of different orders of $\Sigma\Delta$ modulation. Our theoretical results show that, once the phase resolution of the interpolator is high enough, the difference in jitter between modulators of different orders is so insignificant that it can be neglected. Since the resolution of the interpolator in our work is high enough, there is no reason to choose a higher-order $\Sigma\Delta$ modulator to control the amount of phase rotation, and a lower-order modulator reduces the hardware complexity. The result indicates that, unlike in other cases, the benefit of higher-order modulation does not apply in our work. Table 6.1 shows the comparison with other SSCGs.

|                      | Proposed SSCG (Simulated) | ISSCC2005 [33]    | ISSCC2005 [41]        | ISSCC2006 [30]      |
|----------------------|---------------------------|-------------------|-----------------------|---------------------|
| Technology           | 90nm                      | $0.18 \mu m$      | $0.15 \mu m$          | $0.15 \mu m$        |
| Modulation Mechanism | Phase Rotation            | Phase Selection   | Modulation on Divider | Modulation on Input |
| Divider Ratio        | 12                        | 60                | 37.5/75               |                     |
| Operating Frequency  | 1.2GHz                    | 1.5GHz            | 1.5GHz                | 27MHz (ref. clock)  |
| Frequency Deviation  | 5000ppm (6 MHz)           | 5000ppm (7.5 MHz) | 5000ppm (7.5 MHz)     | 30000ppm            |
| EMI Reduction        | 20.6dB                    | 9.8dB             | 20.3dB                | 14dB                |
| EMI Reduction/BW     | 3.43dB/MHz                | 1.3 dB/MHz        | 2.7 dB/MHz            |                     |

Table 6.1: Comparison of the proposed SSCG with other SSCGs
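The "EMI Reduction/BW" row of Table 6.1 can be reproduced by dividing each EMI reduction by the spread bandwidth in MHz. The sketch below assumes that the spread bandwidth is the operating frequency multiplied by the frequency deviation in ppm; the function name and dictionary layout are illustrative, and the ISSCC2006 column is omitted because the table gives no bandwidth for it.

```python
# Reproduce the "EMI Reduction/BW" row of Table 6.1.
# The function name and the data layout are illustrative assumptions.


def emi_reduction_per_mhz(emi_reduction_db: float,
                          f_clock_hz: float,
                          spread_ppm: float) -> float:
    """EMI reduction divided by the spread bandwidth, in dB/MHz."""
    spread_bw_mhz = f_clock_hz * spread_ppm * 1e-6 / 1e6
    return emi_reduction_db / spread_bw_mhz


# (EMI reduction in dB, operating frequency in Hz, deviation in ppm)
designs = {
    "Proposed SSCG": (20.6, 1.2e9, 5000),
    "[33]": (9.8, 1.5e9, 5000),
    "[41]": (20.3, 1.5e9, 5000),
}

results = {name: emi_reduction_per_mhz(*values) for name, values in designs.items()}

for name, value in results.items():
    print(f"{name:14s}: {value:.2f} dB/MHz")
```

With this definition, a design that achieves the same EMI reduction over a narrower spread bandwidth scores higher.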



## Bibliography

- [1] Serial ATA Workgroup, SATA: High Speed Serialized AT Attachment, Revision 2.6, Mar. 2007.
- [2] HDMI, High-Definition Multimedia Interface Specification, Revision 1.3a, Nov. 2006.
- [3] PCI-SIG, PCI Express Base Specification, Revision 1.0a, 15 April 2003.
- [4] J. Savoj and B. Razavi, "A 10-Gb/s CMOS Clock and Data Recovery Circuit with a Half-Rate Linear Phase Detector," *IEEE J. Solid-State Circuits*, vol. 36, no. 5, pp.761-767, May 2001.
- [5] J. E. Rogers and J. R. Long, "A 10-Gb/s CDR/DEMUX with LC Delay Line VCO in 0.18-μm CMOS," *IEEE J. Solid-State Circuits*, vol. 37, no. 12, pp. 1781-1789, Dec. 2002.
- [6] S. Y. Sun, "An Analog PLL-Based Clock and Data Recovery Circuit with High Input Jitter Tolerance," *IEEE J. Solid-State Circuits*, vol. 24, no.2, pp. 325-330, Apr. 1989.
- [7] R. F. Rad, A. Nguyen, J. M. Tran, T. Greer, J. Poulton, W. J. Dally, J. H. Edmonson, R. Senthinathan, R. Rathi, M. J. E. Lee, and H. T. Ng, "A 33-mW 8-Gb/s CMOS clock multiplier and CDR for highly integrated I/Os," *IEEE J. of Solid-State Circuits*, vol. 39, no. 9, pp. 1553-1561, Sep. 2004.

- [8] C.-K. K. Yang, R. F. Rad, and M. A. Horowitz, "A 0.5-μm CMOS 4.0-Gbit/s serial link transceiver with data recovery using oversampling," *IEEE J. of Solid-State Circuits*, vol. 33, no. 5, pp. 713-722, May 1998.
- [9] H. T. Ng, R. F. Rad, M. J. E. Lee, W. J. Dally, T. Greer, J. Poulton, J. H. Edmonson, R. Rathi, and R. Senthinathan, "A Second-Order Semidigital Clock Recovery Circuit Based on Injection Locking," *IEEE J. of Solid-State Circuits*, vol. 38, no. 12, pp. 2101-2110, Dec. 2003.
- [10] K. -Y. K. Chang, J. Wei, C. Huang, S. Li, K. Donnelly, M. Horowitz, Y. Li, and S. Sidiropoulos, "A 0.4-4-Gb/s CMOS Quad Transceiver Cell Using On-Chip Regulated Dual-Loop PLLs," *IEEE J. of Solid-State Circuits*, vol. 38, no. 5, pp. 747-754, May 2003.
- [11] S. Kim, K. Lee, D. K. Jeong, D. D. Lee, and A. G. Nowatzyk, "An 800Mbps multi-channel CMOS serial link with 3x oversampling," in *IEEE 1995 CICC Proc.*, p. 451, Feb. 1995.
- [12] J. Hancock, Jitter-Understanding it, Measuring it, Eliminating it, Part1: Jitter Fundamentals, Summit Technical Media, Apr. 2004.
- [13] Semiconductor Industry Association, The National Technology Roadmap for Semiconductors, 1997.
- [14] Serial ATA Workgroup, SATA: High Speed Serialized AT Attachment, Revision 2.5, Oct. 2005.
- [15] American National Standard for Information Technology, Serial Attached SCSI-2, Sep. 2007.
- [16] ITU-T Recommendation G.825, The Control of Jitter and Wander within Digital Networks which are Based on the Synchronous Digital Hierarchy (SDH), Mar. 2000.

- [17] J. Sonntag and J. Stonick, "A digital clock and data recovery architecture for multi-gigabit binary links," *Proc. IEEE Custom Integrated Circuits Conf.*, pp. 537-544, Sep. 2005.
- [18] Y. Tomita, M. Kibune, J. Ogawa, W. W. Walker, H. Tamura, and T. Kuroda, "A 10-Gb/s receiver with series equalizer and on-chip ISI monitor in 0.11-μm CMOS," *IEEE J. Solid-State Circuits*, vol. 40, no. 4, pp. 986-993, Apr. 2005.
- [19] H. Takauchi, H. Tamura, S. Matsubara, M. Kibune, Y. Doi, T. Chiba, H. Anbutsu, H. Yamaguchi, T. Mori, M. Takatsu, K. Gotoh, T. Sakai, and T. Yamamura, "A CMOS multichannel 10 Gb/s transceiver," *IEEE J. Solid-State Circuits*, vol. 38, no. 12, pp. 2094-2100, Dec. 2003.
- [20] R. C. Walker, "Designing Bang-Bang PLLs for Clock and Data Recovery in Serial Data Transmission Systems," *Phase-Locking in High-Performance Systems*, B. Razavi, Ed: IEEE Press, pp. 34-45, 2003.
- [21] Yuan-Pu Cheng, "Clock and Data Recovery for Spread Spectrum Clock using Multiple Alternating Edge Sampling", National Chiao Tung University, 2007.
- [22] Y. Choi, D. K. Jeong, and W. Kim, "Jitter Transfer Analysis of Tracked Oversampling Techniques for Multigigabit Clock and Data Recovery," Invited paper, *IEEE Transactions on Circuit and Systems*, vol. 50, no. 11, pp. 775-783, Nov. 2003.
- [23] C. Kromer, G. Sialm, C. Menolfi, M. Schmatz, Frank Ellinger, H. Jackel, "A 25-Gb/s CDR in 90-nm CMOS for High-Density Interconnects," *IEEE J. Solid-State Circuits*, vol. 41, no. 12, pp. 2921-2929, Dec. 2006.
- [24] J. F. Bulzacchelli, M. Meghelli, S. V. Rylov, W. Rhee, A. V. Rylyakov, H. A. Ainspan, B. D. Parker, M. P. Beakes, A. Chung, T. J. Beukema, P. K. Pepeljugoski, L. Shan, Y. H. Kwark, S. Gowda, and D. J. Friedman, "A 10-Gb/s 5-Tap DFE/4-Tap FFE Transceiver in 90-nm CMOS Technology," *IEEE J. Solid-State Circuits*, vol. 41, no. 12, pp. 2885-2900, Dec. 2006.

- [25] R. Kreienkamp, U. Langmann, C. Zimmermann, T. Aoyama, and H. Siedhoff, "A 10-Gb/s CMOS Clock and Data Recovery Circuit With an Analog Phase Interpolator," *IEEE J. Solid-State Circuits*, vol. 40, no. 3, pp. 736-743, Mar. 2005.
- [26] B. J. Lee, M. S. Hwang, S. H. Lee, and D. K. Jeong, "A 2.5-10-Gb/s CMOS Transceiver With Alternating Edge-Sampling Phase Detection for Loop Characteristic Stabilization," *IEEE J. Solid-State Circuits*, vol. 38, no. 11, pp. 1821-1829, Nov. 2003.
- [27] S. H. Lee, M. S. Hwang, Y. Choi, S. Kim, Y. Moon, B. J. Lee, D. K. Jeong, W. Kim, Y. J. Park, and G. Ahn, "A 5-Gb/s 0.25-μm CMOS Jitter-Tolerant Variable-Interval Oversampling Clock/Data Recovery Circuit," *IEEE J. Solid-State Circuits*, vol. 37, no. 12, pp. 1822-1830, Dec. 2002.
- [28] Ecliptek Corporation. 2005. Programmable Spread Spectrum Quartz Crystal Oscillators Reduce EMI for High Speed Digital Systems. Retrieved Aug. 2007, from http://www.ecliptek.com/tech/ss\_emi\_digi.html
- [29] H. S. Li, Y. C. Cheng, and D. Puar, "Dual-Loop Spread-Spectrum Clock Generator," in *ISSCC Dig. Tech. Papers*, pp. 184-185, Feb. 1999.
- [30] S. Damphousse, K. Ouici, A. Rizki and M. Mallinson, "All digital spread spectrum clock generator for EMI reduction" *IEEE J. Solid-State Circuits*, vol. 42, pp. 145-150, Jan. 2007.

- [31] M. Kokubo, T. Kawamoto, T. Oshima, T. Noto, M. Suzuki, S. Suzuki, T. Hayasaka, T. Takahashi, and J. Kasai, "Spread-Spectrum Clock Generator for Serial ATA using Fractional PLL Controlled by ΔΣ Modulator with Level Shifter," in *ISSCC Dig. Tech. Papers*, pp. 160-161, Feb. 2005.
- [32] B. G. Goldberg, "The evolution and maturity of Fractional-N PLL synthesis," *Microwave Journal*, pp. 124-134, Sept. 1996.
- [33] H. R. Lee, O. Kim, G. Ahn, and D. K. Jeong, "A Low-Jitter 5000ppm Spread Spectrum Clock Generator for Multi-channel SATA Transceiver in 0.18um CMOS," in *ISSCC Dig. Tech. Papers*, pp. 162-163, Feb. 2005.
- [34] H. Inose and Y. Yasuda, "A unity bit coding method by negative feedback," *Proceedings of the IEEE*, vol. 51, pp. 1524-1535, Nov. 1963.
- [35] J. C. Candy, "A use of limit cycle oscillations to obtain robust analog-to-digital converters," *IEEE Trans. Commun.*, vol. COM-22, pp. 298-305, Mar. 1974.
- [36] J. C. Candy, "A use of double integration in sigma-delta modulation," *IEEE Trans. Commun.*, vol.33, pp. 249-258, Mar. 1985.
- [37] P. M. Aziz, H. V. Sorensen, and J. V. D. Spiegel, "An overview of sigma-delta converters," *IEEE Signal Processing Mag.* pp. 61-84, Jan. 1996.
- [38] M. Kozak and I. Kale, Oversampled Delta-Sigma modulators: Analysis, applications and novel topologies, Kluwer Academic Publisher, 2002.
- [39] D.A. Johns and K. Martin, Analog Integrated Circuit Design, John Wiley and Sons, Inc., 1997.
- [40] Y. Matsuya et al., "A 16-bit oversampling A-to-D conversion technology using triple integration noise shaping," *IEEE J. Solid-State Circuits*, vol. SC-22, pp. 921-929, Dec. 1987.

- [41] Tom A. D. Riley, Miles A. Copeland, and Tad A. Kwasniewski, "Delta-Sigma Modulation in Fractional-N Frequency Synthesis," *IEEE J. Solid-State Circuits*, vol. 28, no. 5, pp. 553-559, May 1993.
- [42] Maxim Integrated Products, Dallas Semiconductor. Dec., 2004. Clock (CLK) Jitter and Phase Noise Conversion. Retrieved Sep. 2007, from http://www.maxim-ic.com/appnotes.cfm/an\_pk/3359

