#### 高速、低壓差動擺幅之收發器

學生:蕭聖文 指導教授:吳錦川博士

#### 國立交通大學電子工程學系 電子研究所碩士班

#### 摘要

隨著液晶面版的尺寸加大,還有對於顯示器解析度的要求提高,因此單位時 間內,顯示卡和液晶顯示器間所需傳輸及接收的資料量也大幅提高,而能達到越高 速傳輸的介面電路將為顯示畫面品質高低的關鍵。雖然以往使用並列的方式可達到 高速傳輸的目的,但如此一來將會浪費許多導線的成本及可用空間,且使用便利上 也受到影響,因此小巧、高速序列傳輸方式成為現在主流。在目前高速傳輸介面電 路中,高速、低功率、低雜訊干擾的傳輸條件中又以低電壓差動訊號傳輸 (LVDS) 技術被運用的範圍也最廣。

本論文的研究有傳送和接收兩部分,並使用 tsmc 0.35um2P4M CMOS 製程技術實 現,電壓電源為 3.3V。傳送器主要利用七相位鎖相迴路和多工器將並列資料轉為序 列資料。而量測結果七相位鎖相迴路輸出 142.8 MHz 的時脈訊號之方均根抖動和峰 值抖動分別為 11.7 ps 和 80 ps, 而傳送器傳送出 1 Gb/s 的信號, 消耗功率為 162mW。

接收器主要是利用三倍取樣的機制來解決資料和時脈時差的問題,利用具有磁 滞現象的比較器將由傳送器傳送過來的小信號放大成數位訊號,再利用三組取樣器 取樣。接收器每接收到十六筆資料後,做一次取樣相位校正的動作,已確保接收器 能接受到最正確的資料。

#### **High-Speed LVDS Transceiver**

Student: Hsiao Shen-Wen Advisor: Prof. Jiin-Chuan Wu

Department of Electronics & Institute of Electronics National Chiao-Tung University

#### **Abstract**

As LCD monitors grow larger and the demands on resolution increase, the amount of data transmitted between the display card in a computer and an LCD monitor keeps growing. Display quality therefore depends heavily on the speed of the transmission interface. Although parallel ports can achieve high-speed transmission, they cost more, take more space, and sacrifice convenience. Compact, high-speed serial links have therefore become mainstream. Among high-speed interfaces, LVDS techniques are the most widely used because they offer high speed, low power, and low EMI.

This thesis presents the design of a transceiver for the LVDS transmission interface. The transceiver is fabricated in a TSMC 0.35 µm 2P4M CMOS process with a 3.3 V power supply. The transmitter uses a 7-phase PLL and a multiplexer to convert parallel data to serial data. The measured jitter of the PLL output at 142.8 MHz is 11.7 ps (r.m.s.) and 80 ps (peak-to-peak). The transmitter transmits 1 Gb/s serial data correctly, and its power consumption is 162 mW.

The receiver uses a 3X-oversampling scheme to overcome the timing skew between data and clock. A comparator with hysteresis amplifies the small incoming signal to full swing, and three samplers then sample the data. The receiver calibrates the sampling phase after every sixteen pieces of received data to ensure that the data are received correctly.

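#### Acknowledgments

First, I would like to thank my advisor, Professor 吳錦川 (Jiin-Chuan Wu). Over the past two years he has guided me earnestly with his rich experience and profound knowledge. Within his rigorous approach he often taught us with humor, and I have benefited greatly in my research, coursework, and life.

I would also like to thank my oral defense committee members, Professor 呂良鴻, Professor 邱煥凱, and Professor 陳巍仁, for traveling a long way to offer guidance and many valuable suggestions.

Next, I want to thank my parents and family, whose full support freed me from worry so that I could concentrate on my studies.

I also thank 權哲, 史周, 阿瑞, 棋樺, 凱嵐, 如琳, 秉捷, 偉霆, 丁彥, 旻珓, 大頭, 宗霖, and the others who spent nearly two years with me in this laboratory. We discussed coursework together, helped each other in daily life, and shared many happy days in the lab. I am also grateful to every senior student in the lab for looking after us and for their guidance on lab affairs and coursework, which allowed me to complete my studies smoothly over these two years.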

2004/06/01


# **Chapter 1 Introduction**

#### **1.1 Motivation**

As fabrication processes advance year by year, the operating speed of chips keeps increasing. However, the data transmission speed between chips, or between a computer and its peripheral components, is a limiting factor of overall system performance.

On the other hand, as the system clock frequency rises, so does the system power consumption unless other improvements are made. A variety of interface circuits have been proposed to increase performance, maintain low power consumption, and reduce cost. The goal of this research is to design a transceiver for a serial link that meets the Low-Voltage Differential Signaling (LVDS) specification and transmits data at 1 Gbps.

#### **1.2 LVDS Interface**

The LVDS display interface serializes the parallel data and sends it from the display source to the display device. The architecture of the LVDS display interface is shown in **Fig. 1.1** [1].

The input signals to the transmitter at the display source are pixel data, horizontal synchronization, vertical synchronization, a data enable control, etc. [1]. These signals are serialized and transmitted over LVDS differential pairs. At the display device, the LVDS signals are received, converted to parallel form, and output from the LVDS receiver.

**Fig. 1.1 The architecture of a LVDS display interface**

 Power consumption, transmission rate, bit error rate, and cost are four factors in evaluating transceiver performance, and they often must be traded off against each other. LVDS is characterized by a low-voltage swing and differential signals.

- Low-voltage swing: To minimize power dissipation and enable operation at very high speed, low-swing (400 mV maximum) signals are specified.
- Differential signals: A small signal swing requires differential signaling for adequate noise margin in practical systems.

 The most controversial decision was to use differential signals, which at first appear to double the number of signal lines. However, reliable single-ended schemes require many more ground signals and run significantly slower, so the actual pin-count overhead of differential signaling is much smaller than it first appears. Other design benefits:

- Constant driver current. The transmitter consumes a (near) constant current when driving the links; the current remains the same, but is routed in the opposite direction when the signal value changes. This simplifies the design of power-distribution wiring.
- Constant link current. The net signaling current in a differential link is (nearly) constant, which greatly simplifies the system design. The links are unidirectional.
- Low power. A low signal current can be used, since much of the induced noise and ground bounce appears as a common-mode signal.
- Simple board design. Although differential signals must be carefully routed on adjacent matched tracks, they are usually less sensitive to imperfections in the transmission line environment.
- Low EMI. Differential signals minimize the area between the signal and the return path. In addition, the equal and opposite currents create canceling electromagnetic fields. This dramatically reduces the electromagnetic emissions.
- Low susceptibility to externally generated noise. Though these links generate little noise, other parts of the system may. Differential signals are relatively immune to this noise.

#### **1.3 Thesis Organization**

 This thesis is organized into six chapters. Chapter 1 introduces the LVDS interface. Chapter 2 presents further specifications for LVDS interface communication. Chapter 3 presents the architecture of the transmitter, and Chapter 4 describes the receiver. Chapter 5 gives the measurement results. Chapter 6 summarizes this work and discusses future work.



# **Chapter 2 LVDS Specifications**

### **2.1 LVDS Electrical Specifications**

#### **2.1.1 Overview**

This chapter presents an overview of the LVDS interface and its key concepts. All contents and figures are adopted from [1][2]. An LVDS interface (**Fig. 2-1**) has a low-voltage swing (400 mV single-ended maximum), is connected point-to-point, and achieves a very high data rate at reduced power. Power is low because the signals are small: a minimum of 2.5 mA is sent through a 100 $\Omega$ termination resistor. This sharply reduced power dissipation enables an important advance: integrating the line termination resistor, the interface drivers and receivers, and the processing logic in the same integrated circuit.



**Fig. 2-1 LVDS interface** 

| Symbol              | Parameter                                    | Conditions                     | Min    | Max    | Units |
|---------------------|----------------------------------------------|--------------------------------|--------|--------|-------|
| $V_{oh}$            | Output voltage high, $V_{oa}$ or $V_{ob}$    | $R_{\text{load}} = 100 \Omega$ |        | 1475   | mV    |
| $V_{ol}$            | Output voltage low, $V_{oa}$ or $V_{ob}$     | $R_{\text{load}} = 100 \Omega$ | 925    |        | mV    |
| $\lvert V_{od} \rvert$ | Output differential voltage               | $R_{\text{load}} = 100 \Omega$ | 250    | 400    | mV    |
| $V_{os}$            | Output common mode voltage                   | $R_{\text{load}} = 100 \Omega$ | 1125   | 1375   | mV    |
| $\lvert \Delta V_{od} \rvert$ | Change in $V_{od}$ between "0" and "1" | $R_{\text{load}} = 100 \Omega$ |      | 35     | mV    |
| $\Delta V_{os}$     | Change in $V_{os}$ between "0" and "1"       | $R_{\text{load}} = 100 \Omega$ |        | 35     | mV    |
| $V_{\text{idth}}$   | Input differential threshold                 |                                | $-100$ | $+100$ | mV    |

**Table 2-1 General purpose link** 

Switching speed is high because the driving load is an uncomplicated point-to-point 100 $\Omega$ transmission line environment. Interface devices are all on the same piece of semiconductor material, which reduces the skew between signal pairs due to process, temperature, and supply variation.

LVDS is independent of the physical transmission layer media. As long as the media delivers signals to the receiver with adequate noise margin and the skew tolerance range, the interface will be reliable. This is a great advantage when using the cable to carry LVDS signals. Since all connections are point-to-point, physical links between nodes are independent of other node connections in the same system. This allows for freedom in developing a useful connection that fits the need of the application. Electrical specifications and skew specifications are optimized for 2–5 V supply voltages. The full range of semiconductor process technologies can be used to implement LVDS. It is intended that the specification be interoperable for all these technologies. The rapid trend toward reduced power supply voltage was considered in providing for signals that can be compatible with future system requirements.

 The specifications for the transmitter and receiver parameters are given in **Table 2-1**. Descriptions of these specifications are contained in the following subsections.

#### **2.1.2 Driver Output Levels**

The driver output, when properly terminated, results in a small-swing differential voltage. The relation between the single-ended outputs and the differential signal is shown in **Fig. 2-2**. The differential driver is made up of two single-ended outputs, which alternate between sourcing and sinking a constant current. The differential voltage level is determined by the load resistance. The DC load seen by the driver is the receiver input impedance in parallel with the 100 $\Omega$ differential termination, which dominates. **Fig. 2-2** shows the case where the current source provides 4 mA and the outputs switch the current at a 50% duty cycle. It also shows the receiver threshold limits in relation to the single-ended signals that arrive at the receiver inputs. When the magnitude of the voltage difference exceeds the receiver threshold, the receiver is in a determined logic state. For the purpose of this standard, a differential voltage greater than or equal to $V_{idth}$(max) is a logic high, and one less than or equal to $V_{idth}$(min) is a logic low. Ground shift margin is built in by confining the output to a range of $V_{ol}$ to $V_{oh}$ (e.g., this allows approximately 1 V of ground shift between a driver and receiver that are powered from 2.5 V supplies). The range of allowable DC output levels for the driver output voltages $V_{oa}$ and $V_{ob}$ is illustrated in **Fig. 2-3**. Measurement of the voltages $V_{oa}$, $V_{ob}$, and the differential output voltage $V_{od}$ is illustrated in **Fig. 2-4**. The driver output shall always be terminated in compliance with this specification. The unterminated driver output voltage shall not exceed 2.4 V. Note that the receiver may be exposed to the unterminated driver output voltage briefly when a cable from the driver is being connected to the receiver, because the cable will be charged to the unterminated driver output voltage.
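The relation between drive current, termination, and output levels can be checked numerically. The following is a minimal sketch, assuming an ideal driver whose two outputs sit symmetrically around the common-mode voltage; the function name and the symmetric model are illustrative assumptions, while the 4 mA, 100 Ω, 1.2 V, and limit values are the ones quoted in the text and in Table 2-1.

```python
def driver_levels(i_drive_a, r_load_ohm=100.0, v_os=1.2):
    """Return (v_od, v_high, v_low) in volts for an idealized LVDS driver.

    Assumption: the outputs are perfectly symmetric around v_os.
    """
    v_od = i_drive_a * r_load_ohm   # differential swing across the termination
    v_high = v_os + v_od / 2.0      # output that is sourcing current
    v_low = v_os - v_od / 2.0       # output that is sinking current
    return v_od, v_high, v_low


V_OH_MAX = 1.475  # V, from Table 2-1
V_OL_MIN = 0.925  # V, from Table 2-1

v_od, v_high, v_low = driver_levels(4e-3)
within_limits = (v_low >= V_OL_MIN) and (v_high <= V_OH_MAX)
print(f"Vod = {v_od * 1e3:.0f} mV, outputs = {v_high:.2f} V / {v_low:.2f} V, "
      f"within limits: {within_limits}")
```

With 4 mA the swing reaches the 400 mV maximum while both single-ended levels stay inside the $V_{ol}$ to $V_{oh}$ window.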



**Fig. 2-2 Maximum driver signal levels shown for 1.2 V $V_{os}$**



**Fig. 2-3 Driver signal levels** 



**Fig. 2-4 Reference circuit** 

The following driver DC output voltage limits refer to **Fig. 2-3** and **Fig. 2-4**, and shall apply for a load resistance  $R = 100 \Omega$  connected as shown in Fig. 2-4. Ideally, the amplitude and common-mode voltage of the steady-state differential output would not change, but in practical designs, both change. The output of a driver whose differential voltage (*V*od) and driver offset voltage (*V*os) change when the output changes state is shown in **Fig. 2-5**. The definition for *V*od and *V*os is shown in **Fig. 2-6**.



**Fig. 2-5 Driver signal levels** 



**Fig. 2-6 Reference circuit** 

The definitions of $\Delta V_{os}$ and $\Delta V_{od}$ are stated explicitly by taking into account the varying voltage levels of the single-ended outputs in the different logic states. They can be expressed by equations (1) and (2).

$$
\Delta V_{os} = |V_{os}(\text{high}) - V_{os}(\text{low})| \tag{1}
$$

where $V_{os}(\text{high}) = (V_{oah} + V_{obl})/2$ and $V_{os}(\text{low}) = (V_{oal} + V_{obh})/2$.

$$
|\Delta V_{od}| = |V_{od}(\text{high}) - V_{od}(\text{low})| \tag{2}
$$

where $V_{od}(\text{high}) = V_{oah} - V_{obl}$ and $V_{od}(\text{low}) = V_{obh} - V_{oal}$.
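As a worked illustration of these two definitions, the sketch below evaluates them for one set of output levels. The four voltages are hypothetical example values, not measurements from this work, and $\Delta V_{os}$ is taken as the magnitude of the difference between the two offset levels.

```python
def offset_and_swing_change(v_oah, v_oal, v_obh, v_obl):
    """Return (delta_v_os, delta_v_od) in volts from the four output levels."""
    v_os_high = (v_oah + v_obl) / 2.0   # offset while output a is high
    v_os_low = (v_oal + v_obh) / 2.0    # offset while output a is low
    v_od_high = v_oah - v_obl           # differential swing, logic high
    v_od_low = v_obh - v_oal            # differential swing, logic low
    return abs(v_os_high - v_os_low), abs(v_od_high - v_od_low)


# Hypothetical, slightly mismatched output levels in volts.
delta_v_os, delta_v_od = offset_and_swing_change(
    v_oah=1.40, v_oal=1.01, v_obh=1.38, v_obl=1.02)

LIMIT_V = 0.035  # 35 mV maximum for both quantities, from Table 2-1
print(f"dVos = {delta_v_os * 1e3:.1f} mV, dVod = {delta_v_od * 1e3:.1f} mV, "
      f"pass: {delta_v_os <= LIMIT_V and delta_v_od <= LIMIT_V}")
```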

#### **2.1.3 Driver Short-Circuit Specification**

To ensure that the driver circuit does not damage itself or other parts of the electronics, limits on the output currents when shorted mutually and to ground are imposed. When the driver output terminals are short-circuited to the driver circuit ground, neither current magnitude (*I*sa or *I*sb) shall exceed the specified value. The test circuit is shown in **Fig. 2-7**. When the driver terminals are short-circuited to each other, the current magnitude shall not exceed the specified value. The test circuit is shown in **Fig. 2-8**.



**Fig. 2-8 Short-together test circuit** 

#### **2.1.4 Driver Power-Off Leakage Current**

The driver output leakage currents (*I*xa and *I*xb) are measured under power-off conditions, *V*cc=0 V, as shown in **Fig. 2-9**. With the voltage on the driver output terminals between 0 V and 2.4V, with respect to driver common, these currents shall not exceed the specified value.



**Fig. 2-9 Driver power-off leakage current test circuit**

#### **2.1.5 Receiver Input Levels**

The receiver input signal is measured differentially (**Fig. 2-10**). The receiver output state is determined by a differential input signal greater than $+V_{idth}$ or less than $-V_{idth}$, within the permitted $V_i$ range. The termination is allowed to have greater variance because it is intended to operate in a more controlled environment with less common-mode noise. For simplicity, the remaining receiver specification discussion here will apply directly to the general purpose specification.



**Fig. 2-10 Receiver signal levels, for table 2-1** 

The ability to accept voltages outside the $V_i$ range is desirable because it increases noise immunity to ground potential difference and interconnect-coupled noise. The upper limit on the differential swing ensures that receiver skew specifications are maintained for this range of input signals throughout the receiver common-mode range. The range of allowable DC input levels for the receiver input voltages $V_{ia}$ and $V_{ib}$ is illustrated in **Fig. 2-11**. Measurement of the voltages $V_{ia}$, $V_{ib}$, and the differential input voltage $V_{id}$ is illustrated in **Fig. 2-12**.



**Fig. 2-11 Receiver signal common mode levels, table 2-1**



**Fig. 2-12 Reference circuit** 

The signal common-mode level for **table 2-1** is shown in **Fig. 2-11**. The receiver common-mode input voltage, *V*icm, will be an alternating voltage depending on three superimposed conditions: the driver output condition, voltages induced on the interconnection, and reflections caused by common-mode termination imperfections. This voltage can be expressed by accounting for the varying levels during both logic states. Equation (3) expresses the relationship of the four input voltages resulting from the three conditions previously stated.

$$
V_{icm} = (V_{ia} + V_{ib})/2 \tag{3}
$$

#### **2.1.6 Receiver Threshold Hysteresis**

The threshold hysteresis is important in receiver design to eliminate the possibility of oscillating receiver output when the differential input is undefined (see **Fig. 2-13**). The undefined input can occur when the receiver inputs are unconnected, when the connected driver is powered down, or when transitioning between defined values. The 25 mV minimum hysteresis means that an input signal must change by more than this value to change the receiver output state. A known output condition for an open or shorted receiver input (failsafe) is implementation-dependent and beyond the scope of this standard.
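The effect of hysteresis on a slowly changing or noisy input can be sketched behaviorally. This is a minimal model, assuming the two switching thresholds sit symmetrically at plus and minus half the hysteresis around 0 V; the symmetric placement and the function name are assumptions for illustration, and only the 25 mV figure comes from the text.

```python
def hysteresis_receiver(v_id_samples, hysteresis=0.025, initial_state=0):
    """Behavioral receiver: map differential input samples (V) to logic states."""
    upper = +hysteresis / 2.0   # must be exceeded to switch 0 -> 1
    lower = -hysteresis / 2.0   # must be undershot to switch 1 -> 0
    state = initial_state
    outputs = []
    for v in v_id_samples:
        if state == 0 and v > upper:
            state = 1
        elif state == 1 and v < lower:
            state = 0
        outputs.append(state)   # inside the hysteresis band the state is held
    return outputs


samples = [-0.100, -0.005, 0.005, 0.100, 0.005, -0.005, -0.100]
print(hysteresis_receiver(samples))  # -> [0, 0, 0, 1, 1, 1, 0]
```

The small excursions of ±5 mV around zero do not toggle the output, which is the behavior that prevents oscillation on an undefined input.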



**Fig. 2-13 Receiver hysteresis**

## **2.2 Pixel Formats**

 **Table 2-2** shows the common display resolutions. Many pixel formats are in use today; the following are supported by OpenLDI (Open LVDS Display Interface) [1]. **Fig. 2-14** shows one of them, called "18-bit Single Pixel Transmission, Unbalanced". The others, such as "24-bit Single Pixel", "18-bit Dual Pixel", and "24-bit Dual Pixel", are mostly similar.

| Resolution | Common Name |
|------------|-------------|
| 640×480    | VGA         |
| 800×600    | SVGA        |
| 1024×768   | XGA         |
| 1280×1024  | SXGA        |
| 1600×1024  | SXGAW       |
| 1600×1200  | UXGA        |
| 1920×1080  | HDTV        |
| 1920×1200  | UXGAW       |
| 2048×1536  | QXGA        |

**Table 2-2 Common display resolutions**



**Fig. 2-14 18-bit single pixel transmission, unbalanced** 

In the 18-bit single pixel mode, the RGB and control inputs shall be transmitted as shown in **Fig. 2-14**. Outputs A3 through A7 and CLK2 shall be inactive in this mode and fixed at a single value.

OpenLDI serializes the parallel pixel stream for transmission from the display source to the display device. There are eight serial data lines (A0 through A7) and two clock lines (CLK1 and CLK2) in the OpenLDI interface. The number of serial data lines may vary, depending on the pixel formats supported. For the 18-bit single pixel format, serial data lines A0 through A2 shall be used. For the 24-bit single pixel format, serial data lines A0 through A3 shall be used. For the 18-bit dual pixel format, serial data lines A0 through A2 and A4 through A6 shall be used. For the 24-bit dual pixel format, serial data lines A0 through A7 shall be used. Only those serial data lines required by the formats that an implementation supports need to be active. The serial data stream on each signal line shall be at a bit rate that is seven times the pixel clock. The CLK1 line shall carry the pixel clock. The CLK2 line shall also carry the pixel clock when dual pixel mode is used. When dual pixel mode is not used, or when it is known that the display device does not require the CLK2 signal, CLK2 may be left inactive. The CLK2 signal is provided for compatibility with earlier systems and to support display device designs that use independent receivers for the upper and lower pixels. There are two modes of operation, unbalanced and DC balanced. Each of the unbalanced and DC balanced modes uses one mode for single pixel transmission and a second for dual pixel transmission.
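The lane assignment and the seven-times rate rule above can be summarized in a few lines. This is a small sketch for illustration; the dictionary keys and function name are hypothetical, and the 142.8 MHz pixel clock is simply the PLL reference frequency used later in this thesis.

```python
# Serial data lines used by each pixel format, as listed in the text.
DATA_LINES = {
    "18-bit single": ["A0", "A1", "A2"],
    "24-bit single": ["A0", "A1", "A2", "A3"],
    "18-bit dual": ["A0", "A1", "A2", "A4", "A5", "A6"],
    "24-bit dual": ["A0", "A1", "A2", "A3", "A4", "A5", "A6", "A7"],
}
BITS_PER_PIXEL_CLOCK = 7  # each line runs at seven times the pixel clock


def link_rates(pixel_format, pixel_clock_hz):
    """Return (bit rate per line, aggregate bit rate) in bit/s."""
    lines = DATA_LINES[pixel_format]
    per_line = BITS_PER_PIXEL_CLOCK * pixel_clock_hz
    return per_line, per_line * len(lines)


per_line, total = link_rates("18-bit single", 142.8e6)
print(f"per line: {per_line / 1e6:.1f} Mb/s, aggregate: {total / 1e9:.3f} Gb/s")
```

A 142.8 MHz pixel clock therefore corresponds to roughly 1 Gb/s on each data line.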

# **Chapter 3 Transmitter**

#### **3.1 Architecture of Transmitter**

The block diagram of a serial link transmitter is shown in **Fig. 3-1**. It consists of a PLL, a 7-to-1 data multiplexer, and a data driver [3]. A Pseudo Random Bit Sequence (PRBS) generator produces the data patterns for testing. The PLL provides seven evenly spaced clock phases for the MUX. The parallel data from the data source (PRBS) are serialized by using these multiple clock phases from the PLL. To achieve a high data rate without speed-critical logic on chip, the differential data are multiplexed as they are transmitted. Finally, the data driver outputs the serial data stream to the transmission line.
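The 7-to-1 multiplexing can be modeled behaviorally: each of the seven clock phases selects one bit of the parallel word. The sketch below is only a software analogy of the circuit. The bit order (bit 0 sent on phase 0) is an assumption made for illustration, since the actual order is defined by the pixel format, and the function names are hypothetical.

```python
def serialize(words, n_phases=7):
    """Turn a list of n_phases-bit words into a flat serial bit list."""
    stream = []
    for word in words:
        for phase in range(n_phases):          # phase k selects bit k of the word
            stream.append((word >> phase) & 1)
    return stream


def deserialize(stream, n_phases=7):
    """Inverse of serialize, for checking the round trip."""
    words = []
    for start in range(0, len(stream), n_phases):
        word = 0
        for phase, bit in enumerate(stream[start:start + n_phases]):
            word |= bit << phase
        words.append(word)
    return words


parallel_words = [0b1010011, 0b0000001]
serial_bits = serialize(parallel_words)
print(serial_bits[:7])  # -> [1, 1, 0, 0, 1, 0, 1]
```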



**Fig. 3-1 The block diagram of Transmitter** 



**Fig. 3-2 Seven bits of LVDS in one clock cycle** 

#### **3.2 Phase Locked Loop**

#### **3.2.1 Introduction**

For LVDS-interfaced transmission between TFT-LCD monitors and computers, seven serial bits must be transferred per clock cycle on each channel [1], as shown in **Fig. 3-2**. This section introduces a PLL with a 142.8 MHz reference frequency that produces seven uniformly distributed phases for the multiplexer. A Phase-Locked Loop (PLL) is basically an oscillator whose output is locked onto the phase and frequency of an input signal by a feedback control loop. PLLs are primarily used in communication applications. For example, they recover the clock from digital data signals, recover the carrier from satellite transmission signals, perform frequency and phase modulation and demodulation, and synthesize exact frequencies for receiver tuning.

#### **3.2.2 PLL Architecture**

A simplified block diagram of a PLL is shown in **Fig. 3-3**. It consists of a Phase-Frequency Detector (PFD), a Charge Pump (CP), a Loop Filter (LF), and a Voltage-Controlled Oscillator (VCO). The negative feedback system synchronizes the internal clock Clk[0] from the VCO to the external reference signal F<sub>ref</sub> by comparing their phases. The PFD outputs the control signals Up and Downb according to the phase error between the reference signal F<sub>ref</sub> and the internal clock Clk[0]. The charge pump converts the logic states of the PFD into a control voltage for the VCO by charging or discharging the loop filter. This voltage is proportional to the phase difference and adjusts the VCO frequency. The loop filter introduces extra poles and zeros to suppress high-frequency signals from the PFD and present the DC level to the VCO. When the PLL is "locked", the phase difference between F<sub>ref</sub> and Clk[0] is constant and their frequencies are almost the same.



**Fig. 3-3 Block diagram of a charge-pump PLL**

#### **3.2.3 Circuit Implementation**

#### **3.2.3.1 Phase Frequency Detector**

The PFD is a digital sequential circuit triggered by the rising (or falling) edges of the reference signal (F<sub>ref</sub>) and the feedback signal (Clk[0]) from the VCO. It operates with three states, as shown in **Fig. 3-4**; the state Up=Down=1 never occurs. The tri-state operation allows a wide detection range of $\Delta \phi = \pm 2\pi$ and detects both phase error and frequency difference.



**Fig. 3-4 State diagram of a tri-state PFD** 

In **Fig. 3-4**, Up is used to increase and Down is used to decrease the frequency of the signal Clk[0]. If F<sub>ref</sub> leads Clk[0] or has a higher frequency than Clk[0], Up is set high and Down remains low. Conversely, if F<sub>ref</sub> lags Clk[0] or has a lower frequency than Clk[0], Down is set high and Up remains low. After these operations repeat for a long time, the PLL becomes locked. Up and Down then both remain low, and F<sub>ref</sub> and Clk[0] have the same frequency with their phases aligned.

As shown in **Fig. 3-5**, the PFD is implemented with two D flip-flops and one NOR gate. Its transfer characteristic is shown in **Fig. 3-6**. When the phase error is small, the reset is generated very quickly. The charge pump then does not charge the loop filter, because the very narrow Up and Down pulses may not reach a full VDD swing or last long enough to turn on the charge pump switches. This region is called the dead zone. If a dead zone exists, the phase jitter may increase, because the VCO can accumulate as much random phase error as the extent of the dead zone while receiving no corrective feedback to change the control voltage. The characteristic curve with a dead zone is shown in **Fig. 3-7**. Adding extra delay in the reset path reduces the dead zone, but it also limits the maximum operating frequency of the PFD, which is inversely proportional to the total reset path delay [4]. Therefore, this delay should be kept to a minimum. Eliminating the dead zone gives the PFD an overall linear operating characteristic, especially for input signals with a small but finite phase difference. The TSPC D flip-flop used to implement the PFD is shown in **Fig. 3-8** [5].
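The tri-state behavior can be sketched as a small state machine driven by edge events. This is an idealized model with no reset delay or dead zone; encoding the three states as +1 (Up high), 0 (both low), and -1 (Down high) is a choice made for this illustration, and the function names are hypothetical.

```python
def pfd_step(state, edge):
    """Advance the PFD state on one rising edge.

    state: +1 = Up high, 0 = Up and Down low, -1 = Down high.
    edge:  "ref" for a reference edge, "clk" for a Clk[0] edge.
    """
    if edge == "ref":
        return min(state + 1, 1)    # a reference edge pushes toward Up
    if edge == "clk":
        return max(state - 1, -1)   # a feedback edge pushes toward Down
    raise ValueError(f"unknown edge: {edge!r}")


def run_pfd(edges, state=0):
    """Return the (Up, Down) pair after each edge in the sequence."""
    outputs = []
    for edge in edges:
        state = pfd_step(state, edge)
        outputs.append((int(state == 1), int(state == -1)))
    return outputs


# Reference edges arrive more often than feedback edges (reference is faster).
trace = run_pfd(["ref", "clk", "ref", "ref", "clk", "ref"])
print(trace)  # -> [(1, 0), (0, 0), (1, 0), (1, 0), (0, 0), (1, 0)]
```

With a faster reference only Up is ever asserted, and by construction the pair (1, 1) cannot appear.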



**Fig. 3-6 PFD transfer characteristic curve** 



**Fig. 3-7 PFD transfer characteristic curve with dead zone** 



**Fig. 3-8 TSPC dynamic D Flip-Flop** 

#### **3.2.3.2 Charge Pump**

The charge pump is a circuit that supplies current to the loop filter to produce a control voltage. However, an undesirable feature of charge pumps is the charge injection produced by the overlap capacitance of the switch devices and by the capacitance at the intermediate node between the current source and switch devices.

This charge injection results in a phase offset at the input of the phase-frequency detector when the PLL is locked. This phase offset increases as the charge pump current is reduced. When the order of the current source and switch devices is reversed so that the switch devices connect directly to the output node, as shown in **Fig. 3-9** [6], the output voltage is directly affected by the switching noise from the overlap capacitance of the switch devices. In addition, the intermediate nodes between the current source and switch devices charge toward the supplies while the switch devices are off. When the switch devices turn on, these intermediate nodes must charge toward the output voltage, removing charge from the output node in the process and resulting in a phase offset that depends on the output voltage. To combat the charge injection problem, the intermediate nodes can be switched to the output as in **Fig. 3-10** [7][8].

To lower any switching errors that may affect the sensitive output node Vctrl, switch devices are put on the side of the current source devices. When switch devices are off, the intermediate nodes between each switch device and current source devices will be charged toward the output voltage by the gate overdrive of the current source devices. Therefore, the control voltage Vctrl can be isolated from the switching noise further.



**Fig. 3-9 Charge sharing with charge injection offsets** 



**Fig. 3-10 Schematic of charge pump** 

#### **3.2.3.3 Voltage-Controlled Oscillator**

The VCO consists of a seven-stage differential ring oscillator. Each delay cell is based on symmetric loads, as shown in **Fig. 3-11** [8]. With the supply as the upper swing limit, the lower swing limit is symmetrically opposite at $V_{ctrl}$. The buffer delay changes with the control voltage because the effective resistance of the load changes with the control voltage. To maintain the symmetric I-V characteristics of the loads, the current source bias circuit adjusts the buffer bias current so that the output swing varies with the control voltage rather than being fixed.

The delay per stage can be expressed by the equation:

$$
t_d = R_{\text{eff}} \times C_{\text{eff}} = \frac{1}{g_m} \times C_{\text{eff}} \tag{3.1}
$$

where $C_{\text{eff}}$ is the effective delay cell output capacitance and $R_{\text{eff}}$ is the effective resistance of a delay cell. The drain current for one of the two equally sized devices at $V_{ctrl}$ is given by

$$
I_d = \frac{k}{2} [(V_{dd} - V_{ctrl}) - |V_{tp}|]^2 \tag{3.2}
$$

where k is the device transconductance of the PMOS device. Taking the derivative with respect to $(V_{dd} - V_{ctrl})$, the transconductance is given by

$$
g_m = k[(V_{dd} - V_{ctrl}) - |V_{tp}|] \tag{3.3}
$$

Combining (3.1) with (3.3), the delay of each stage can be written as

$$
t_d = \frac{C_{\text{eff}}}{k[(V_{dd} - V_{ctrl}) - |V_{tp}|]} \tag{3.4}
$$

The period of a ring oscillator with N delay stages is approximately 2N times the delay per stage. This translates to a center frequency of

$$
f_{vco} = \frac{1}{2Nt_d} = \frac{k[(V_{dd} - V_{ctrl}) - |V_{tp}|]}{2NC_{\text{eff}}} \tag{3.5}
$$

The center frequency of the VCO is in direct proportion to (Vdd-Vctrl) and has no relationship with the supply voltage.
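The center-frequency expression can be evaluated numerically to see how the oscillation frequency moves with the control voltage. In the sketch below, the values of k, the PMOS threshold, and the effective capacitance are hypothetical placeholders chosen only to give a frequency of the right order; they are not the device parameters of this design. Only the seven stages and the 3.3 V supply come from the text.

```python
def vco_frequency(v_ctrl, v_dd=3.3, v_tp=0.7, k=1e-4, c_eff=50e-15, n_stages=7):
    """Center frequency in Hz of an N-stage ring oscillator with symmetric loads.

    v_tp, k (A/V^2) and c_eff (F) are illustrative assumptions.
    """
    overdrive = (v_dd - v_ctrl) - abs(v_tp)
    if overdrive <= 0.0:
        return 0.0                        # load devices are off: no oscillation
    t_d = c_eff / (k * overdrive)         # delay per stage
    return 1.0 / (2 * n_stages * t_d)     # period is 2N stage delays


for v_ctrl in (1.1, 1.6, 2.1):
    print(f"Vctrl = {v_ctrl:.1f} V -> f = {vco_frequency(v_ctrl) / 1e6:.1f} MHz")
```

In this model the frequency is linear in $(V_{dd} - V_{ctrl}) - |V_{tp}|$, so shifting the supply and the control voltage by the same amount leaves it unchanged.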



**Fig. 3-11 Schematic of VCO delay cell with symmetric load elements** 

The bias generator for the VCO is shown in **Fig. 3-12** [8]. It consists of a differential amplifier, a half-buffer replica, and a control voltage buffer. The differential amplifier and the half-buffer replica form a negative feedback loop that makes the voltage Va equal to Vctrl, so that the output swing varies with the control voltage rather than being fixed. If the supply voltage changes, the amplifier adjusts to keep the swing, and thus the bias current, constant. The bandwidth of the bias generator is typically set equal to the center frequency of the delay stages so that the bias generator can track all supply and substrate variations at frequencies that can affect the PLL design. The bias generator also provides a control voltage buffer to isolate Vctrl from potential capacitive coupling in the delay stages.

An important issue in supply-independent biasing is the existence of "degenerate" bias points. If all of the transistors carry zero current when the supply is turned on, they may remain off indefinitely because the loop can support zero current. It is therefore necessary to add a start-up circuit that drives the circuit out of the degenerate bias point.



 Two differential outputs of the VCO are converted to the single-ended signal used as input to the phase-frequency detector with the differential-to-single-ended converter shown in **Fig. 3-13** [8]. The two differential amplifiers use the same current source bias voltage, Vbn, generated by the self-biased generator for the VCO. According to Vbn, the circuit corrects the input common-mode voltage level and provides signal amplification. The inverters are added at the output to improve the driving ability.

The simulated VCO transfer characteristic, with a 3.3 V supply, shows that for Vctrl between 1.0 V and 2.0 V the VCO gain, Kvco, is 142 MHz/V at the TT process corner.


**Fig. 3-14 PLL linear model** 

#### **3.2.4 PLL Linear Model**

The transient response of a PLL is generally a nonlinear phenomenon that cannot be formulated easily [9][10]. Some basic knowledge of control loop theory is necessary in order to understand the PLL loop filter. A linear mathematical model representing the phase of the PLL in the locked state is presented in **Fig. 3-14**.

We assume that the loop is locked and that the PFD, represented by a subtractor, has an output proportional to $\theta_e$, the difference in phase between its inputs. The average error current within a cycle is $i_d = I_p \frac{\theta_e}{2\pi}$. The ratio of the current output to the input phase difference, $K_{cp}$, is defined as $\frac{I_p}{2\pi}$ (A/rad). The loop filter has a transfer function $H_{lp}(s)$ (V/A). The ratio of the VCO frequency to the control voltage is $K_v$ (Hz/V). N is the divider ratio. For $N=1$, the output frequency of the VCO is the same as the reference input frequency. Since phase is the integral of frequency over time, $K_v$ (Hz/V) should be changed to $\frac{2\pi K_v}{s} = \frac{K_{vco}}{s}$ (rad/sec·V).

The open-loop transfer function of the PLL can be represented as

$$
G(s) = \frac{\theta_{out}(s)}{\theta_{in}(s)} = \frac{I_p K_v H_{lp}(s)}{sN} \tag{3.6}
$$

From the feedback theory, the close-loop transfer function of the PLL can be found as

$$
H(s) = \frac{\theta_{out}(s)}{\theta_{in}(s)} = N \frac{G(s)}{1 + G(s)}
$$
(3.7)

To keep the mathematics simple, we neglect the parasitic capacitance shunting the loop filter to ground. With $H_{lp}(s) = R_1 + \frac{1}{sC_1}$, the closed-loop transfer function of the PLL can be expressed by the equation

$$
H(s) = N \frac{\left(\frac{I_p K_v}{N C_1}\right)(1 + sR_1C_1)}{s^2 + s\left(\frac{I_p K_v}{N}\right)R_1 + \frac{I_p K_v}{N C_1}} \tag{3.8}
$$

This can be compared with the classical two-pole system transfer function

$$
H(s) = N \frac{\omega_n^2 (1 + \frac{s}{\omega_z})}{s^2 + 2\zeta \omega_n s + \omega_n^2}
$$
 (3.9)

Thus, the parameters natural frequency  $\omega_n$ , zero of the LP  $\omega_z$  and damping factor  $\zeta$ can be derived as

$$
\omega_n = \sqrt{\frac{I_p K_v}{N C1}} = \sqrt{\frac{K_{cp} K_{vco}}{N C1}}
$$
(3.10)

$$
\omega_z = \frac{1}{R1 \times C1} \tag{3.11}
$$

$$
\zeta = \frac{R1}{2} \sqrt{\frac{I_p K_v C1}{N}} = \frac{\omega_n}{2\omega_z} = \frac{R1}{2} \sqrt{\frac{K_{cp} K_{vco} C1}{N}}
$$
(3.12)

In a 2<sub>nd</sub>-order system, the loop bandwidth of the PLL is determined by  $\omega_n$ . But the -3dB bandwidth should be *N*  $k = \frac{IpKvcoCl}{V}$  (Hz).
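As a numerical illustration of equations (3.10) to (3.12), the following minimal sketch evaluates the natural frequency, the zero, and the damping factor. The function name `loop_parameters` is hypothetical, and the example values are borrowed from Table 4-1 (the receiver PLL) purely for illustration; the parameters of the transmitter PLL are those of Table 3-1.

```python
import math

def loop_parameters(ip, kv, n, r1, c1):
    """Return (omega_n, omega_z, zeta) from equations (3.10)-(3.12).

    ip: charge pump current in A, kv: VCO gain in Hz/V,
    n: divider ratio, r1: resistance in ohm, c1: capacitance in F.
    """
    omega_n = math.sqrt(ip * kv / (n * c1))            # (3.10), rad/s
    omega_z = 1.0 / (r1 * c1)                          # (3.11), rad/s
    zeta = (r1 / 2.0) * math.sqrt(ip * kv * c1 / n)    # (3.12), dimensionless
    return omega_n, omega_z, zeta

# Illustrative values taken from Table 4-1 (receiver PLL).
omega_n, omega_z, zeta = loop_parameters(ip=126e-6, kv=823e6, n=7, r1=2e3, c1=90e-12)
print(f"omega_n = {omega_n:.3e} rad/s")
print(f"omega_z = {omega_z:.3e} rad/s")
print(f"zeta    = {zeta:.3f}")
```

With these inputs the damping factor comes out at roughly 1.15, and the two expressions for $\zeta$ in (3.12) agree, which is a quick consistency check on the derivation.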



# **3.2.5 PLL Noise Analysis and Stability**

Timing jitter is the random fluctuation of the phase, viewed in the time domain. Timing jitter of the clock in a system can severely limit the maximum speed of a digital I/O interface and increase the bit error rate of a communication link. The output jitter of the PLL depends on the jitter of the VCO, the jitter of the PLL reference input, and the bandwidth of the loop. The cycle-to-cycle jitter of an oscillator is defined as the r.m.s. variation in its output period. For an oscillator with a nominal period of T0, random fluctuations in the phase cause a timing error, ∆tvco, to accompany each period of oscillation, as illustrated in **Fig. 3-15**.

In **Fig. 3-14**, we can add a noise source at each node of the loop. Following equations (3.6) and (3.7), the transfer function from each noise source to the output can be derived: noise from the PD sees a low-pass function, noise from the CP sees a band-pass function, and noise from the VCO sees a high-pass function [11]. Noise from the VCO is expected to be the main noise source in the whole PLL, so we can increase the loop bandwidth to reduce it. Because the PLL is a negative feedback system, we must also confirm that the whole loop is stable. The loop bandwidth therefore cannot be increased indefinitely; its maximum is restricted by the input reference frequency. The stability limit is derived in [12][13].

As a rule of thumb, the loop bandwidth is kept below 1/10 of the input reference frequency to avoid instability.

The jitter present at the input clock is low-pass filtered by the loop. This means that more phase noise from the input clock will transfer to the output if the loop has a larger loop bandwidth. However, it does not cause a problem when the input is a clean clock source.
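The low-pass and high-pass behavior described above can be checked numerically. The sketch below is an illustration under stated assumptions: it uses the open-loop gain of equation (3.6) with the series R1-C1 loop filter, the standard feedback-theory result that noise injected at the VCO output reaches the output through $1/(1+G)$, and example component values borrowed from Table 4-1. The function names are hypothetical.

```python
import math

def loop_gain(w, ip, kv, n, r1, c1):
    """Open-loop gain G(jw) of equation (3.6) with Hlp(s) = R1 + 1/(s*C1)."""
    s = 1j * w
    return ip * kv * (r1 + 1.0 / (s * c1)) / (s * n)

def input_to_output(w, **p):
    """|H(jw)| / N = |G / (1 + G)|: reference phase noise is low-pass filtered."""
    g = loop_gain(w, **p)
    return abs(g / (1.0 + g))

def vco_to_output(w, **p):
    """|1 / (1 + G)|: VCO phase noise is high-pass filtered (assumed noise model)."""
    return abs(1.0 / (1.0 + loop_gain(w, **p)))

# Illustrative values taken from Table 4-1 (receiver PLL).
params = dict(ip=126e-6, kv=823e6, n=7, r1=2e3, c1=90e-12)
w_n = math.sqrt(params["ip"] * params["kv"] / (params["n"] * params["c1"]))

for scale in (0.01, 1.0, 100.0):
    w = scale * w_n
    print(f"w = {scale} * w_n: input->out {input_to_output(w, **params):.4f}, "
          f"VCO->out {vco_to_output(w, **params):.4f}")
```

Well below the natural frequency the reference noise passes almost unchanged while the VCO noise is strongly suppressed; well above it the situation is reversed, which is why a wider loop bandwidth suppresses more of the VCO noise.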

In addition, jitter in on-chip clocks comes from power supply variations and substrate noise, clock-to-clock variations in IR drops, and LC oscillations. Variations in the supply and substrate voltages typically cause changes in the buffer delays, which lead to output jitter. The contribution of device electronic noise to jitter is typically much less than that due to supply and substrate noise [14].

The loop filter used in this thesis is shown in **Fig. 3-16**. Resistor R1 in series with capacitor C1 provides a zero in the open-loop response that improves the phase margin and the overall stability of the loop. The shunt capacitor C2 is used to avoid discrete voltage steps at the control port of the VCO due to the instantaneous changes in the charge pump output current. However, it can adversely affect the overall stability of the loop. The PLL parameters are listed in **Table 3-1**.





**Fig. 3-17 Block Diagram of PRBS** 

# **3.3 Pseudo Random Bit Sequence (PRBS)**

A pseudo random bit sequence (PRBS) generator is included for testing (DFT). It is built from D flip-flops that can be reset to set the initial condition, as shown in **Fig. 3-17**. As long as the D flip-flops are not all 0, the circuit generates a pseudo random code that is used as the parallel data stream source. The XOR gate is the speed-critical part of this circuit. The circuit uses seven D flip-flop stages and produces a sequence of length $(2^7-1)$ bits.
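The behavior of a seven-stage PRBS generator can be sketched in software. The feedback taps are not stated in the text, so the sketch below assumes the polynomial $x^7 + x^6 + 1$, a commonly used PRBS7 choice; the actual taps of the circuit in Fig. 3-17 may differ. The function names are hypothetical.

```python
def prbs7(seed=0x7F, nbits=127):
    """Generate bits from a 7-stage LFSR (assumed polynomial x^7 + x^6 + 1).

    Returns the output bit list and the final register state.
    """
    state = seed & 0x7F
    out = []
    for _ in range(nbits):
        new_bit = ((state >> 6) ^ (state >> 5)) & 1   # XOR of stages 7 and 6
        out.append(state >> 6)                        # serial output: last stage
        state = ((state << 1) | new_bit) & 0x7F       # shift by one stage
    return out, state

def period(seed):
    """Number of clock cycles before the register returns to its seed."""
    _, state = prbs7(seed, 1)
    count = 1
    while state != (seed & 0x7F):
        _, state = prbs7(state, 1)
        count += 1
    return count

print(period(0x7F))   # sequence length for a non-zero seed
print(period(0x00))   # the all-zero state never leaves zero
```

Any non-zero seed cycles through all $2^7-1 = 127$ non-zero states, while the all-zero state is stuck, which matches the requirement that the flip-flops must not all be 0.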

#### **3.4 Seven to One Multiplexer and Output Driver**

The circuit should transmit seven data bits per cycle per channel, so the PLL produces seven phases with 1 ns phase resolution when the transmitter sends the data stream at 1 Gb/s. In addition, the circuit uses NOR gates to adjust the duty cycle of the PLL output waveforms to 57%. This reduces the output data jitter of the multiplexer. The relationship between the input data $(D0+, D0-) \sim (D6+, D6-)$ and clk0~6 is shown in **Fig. 3-18**. The seven phases divide one cycle into seven equal parts, so that only one data bit can pass through the multiplexer during each one-seventh of the cycle. For example, in the interval between the rising edge of clk0 and the falling edge of clk4, the input signals D0+ and D0- drive the multiplexer outputs Vi+ and Vi-.
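The time-slot selection can be illustrated with a small logic-level model. The text gives the window for D0 only (between the rising edge of clk0 and the falling edge of clk4); the sketch below assumes the same pattern for every input, that is, Dk passes while clk_k and clk_((k+4) mod 7) are both high, with each clock high for 4/7 of the period (about 57%). The names and the time resolution are hypothetical.

```python
PHASES = 7                      # clk0 .. clk6
STEPS_PER_SLOT = 10             # each 1/7-cycle slot is split into 10 time steps
PERIOD = PHASES * STEPS_PER_SLOT
HIGH = 4 * STEPS_PER_SLOT       # high time: 4/7 of the period (about 57% duty)

def clk(k, t):
    """True while clk_k is high; clk_k rises k slots after clk0."""
    return (t - k * STEPS_PER_SLOT) % PERIOD < HIGH

def selected_inputs(t):
    """Indices of the data inputs whose two gating clocks are both high at time t."""
    return [k for k in range(PHASES) if clk(k, t) and clk((k + 4) % PHASES, t)]

for t in range(0, PERIOD, STEPS_PER_SLOT):
    print(f"slot {t // STEPS_PER_SLOT}: D{selected_inputs(t)[0]} drives the output")
```

Under this assumption exactly one input is selected at every instant and the inputs are visited in order D0 to D6, once per clock cycle.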

The multiplexer basically employs dual pseudo-nMOS multiplexers at its input. A source-coupled pre-driver is added, as shown in **Fig. 3-19** [15][16][17]. The multiplexer accepts full-swing inputs D0~D6 and clk0~6, and generates a limited-swing multiplexed signal at the inputs of the source-coupled pair. The swing is limited by a PMOS with its gate grounded. The pre-driver stage then converts the signal to the low-swing outputs Vo+ and Vo-.



**Fig. 3-18 Timing diagram of a 7 to 1 multiplexer** 



# **3.5 Output Driver**

A typical LVDS driver behaves as a current source with switched polarity [18]. The output current flows through the load resistance, generating the differential output swing. For operation in the gigabit-per-second range, an additional termination resistor is placed at the source end to suppress the reflected waves caused by crosstalk or by imperfect termination due to package parasitics and component tolerance. The driver uses the typical configuration with four MOS switches, M1-M4, in a bridge configuration, as shown in **Fig. 3-20** [18]. With M1 and M4 switched on, the polarity of the output current, and with it the differential output voltage, Vout+ - Vout-, is positive. Conversely, if M1 and M4 are switched off, the polarity of the output current and voltage is reversed. In **Fig. 3-20**, the right half of the circuit is the common-mode feedback control, which keeps the output more precisely within the LVDS standard specifications.
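The relation between the driver current and the output swing can be made concrete with a short calculation. This is a sketch under assumptions: the 100 Ω load is the nominal LVDS termination value, the 350 mV target is the design value quoted in Section 5.2, and the current values are derived from those two numbers rather than taken from the design. The function name is hypothetical.

```python
def differential_swing(i_out, r_load, r_source=None):
    """Differential output voltage (V) of a current-mode driver.

    i_out: switched output current in A.
    r_load: far-end termination in ohm.
    r_source: optional source-end termination in ohm, in parallel with r_load.
    """
    if r_source is None:
        r_eff = r_load
    else:
        r_eff = (r_load * r_source) / (r_load + r_source)
    return i_out * r_eff

# Single termination: 3.5 mA into an assumed 100-ohm load.
print(differential_swing(3.5e-3, 100.0))
# Double termination: the parallel 100-ohm source resistor halves the
# effective load, so twice the current is needed for the same swing.
print(differential_swing(7.0e-3, 100.0, 100.0))
```

The second case shows the cost of source-end termination: the reflections are absorbed, but the driver has to supply more current for the same differential swing.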



**Fig. 3-20 The transmitter data driver** 



**Fig. 3-21 The loading schematic of the data driver** 

#### **3.6 Simulation Results**

In a real IC, the die is packaged, so the influence of the package should be taken into consideration. During simulation, the package effects are added at the Vdd, Gnd, and I/O nodes. The output loading of the data driver through the cable should also be considered, as shown in **Fig. 3-21**. In high-speed transmission, an additional resistance is often added at the near end to perform "double termination" and reduce signal reflection through the transmission line.

**Fig. 3-22** shows the simulated Kvco of the VCO at the FF, TT, and SS process corners. **Fig. 3-23** shows the seven-phase clock of the PLL described in this chapter; the clock frequency is 142.8 MHz and the phase resolution is 1.01 ns. **Fig. 3-24** is the differential data output waveform, and **Fig. 3-25** is the "eye pattern" of the data output.



 **Fig. 3-22 Simulation result of the VCO transfer characteristics** 







**Fig. 3-24 Simulation results of the transmitter data output (Vout+ - Vout-)** 



**Fig. 3-25 The simulation "eye pattern" of the transmitter data output** 

# **Chapter 4 Receiver**

## **4.1 Introduction**

The final goal of the timing recovery is to maximize the timing margin, that is, the amount by which a sample position can err while the data is still correctly received. Two sources reduce the timing margin: one is static phase error, and the other is jitter, or dynamic phase error. **Fig. 4-1** [19] illustrates the timing margin $t_{margin} = t_{bit} - t_{os} - t_{jc} - t_{jd}$, where $t_{os}$ is the static sampling error, and $t_{jd}$ and $t_{jc}$ are the jitter on the data transition and on the sampling clock, respectively. Since the sampling position is defined with respect to the data transition, jitter on both the clock and the data additively reduces the timing margin.



**Fig. 4-1 Timing margin** 



 **Fig. 4-2 Clock recovery architectures: (a) data/clock recovery architectures and (b) phase picking block diagram.**



The amount of phase error and the jitter depends on the implementation of the clock recovery circuit. Two techniques are commonly used: phase-locked loop (PLL) and phase picker. A PLL employs a feedback loop that actively servos the sampling phase of an internal clock source based on the phase of the input. **Fig. 4-2(a)** [19] illustrates a common implementation using a voltage-controlled oscillator (VCO) as the clock source, and a charge pump following the phase detector to integrate the phase error. A phase picker, as shown in **Fig. 4-2(b)**, oversamples each bit, and uses the oversampled information to determine the transition position of the data. Based on the transition information, the best sample is then selected as the data value.

The static phase error of a PLL depends mainly on its phase detector design. Ideally, sampling at the middle of the bit window gives the maximum timing margin. However, if the sampler has a setup time, the middle of the effective bit window is shifted by the setup time. Not compensating for this shift causes significant static phase error. Additional phase error occurs due to inherent mismatches within the phase detectors and/or charge pump. Furthermore, any phase detector "dead zone" (a window in which the phase detector does not resolve phase information) limits the phase resolution, increasing the static phase error.

 In a phase-picking architecture, the multiple samples per bit are used to find the transitions, effectively behaving as the phase detector. Sampler uncertainty limits the resolution of the transition detection. Sources of this uncertainty are sampler metastability window and data dependence of the sampler setup time. The uncertainty window for the sampler design is  $\leq 1/10$  the bit time which does not impact performance significantly. More importantly, in this architecture, the phase information is quantized by the oversampling, causing a finite quantization error of 1/2 the phase spacing between samples. For a higher oversampling ratio, this static phase error is less, but it has a significant cost of increasing the number of input samplers, increasing the input capacitance, and hence limiting the input bandwidth. For a 3x oversampling system, the maximum static phase error is 1/6 the bit time.
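The quantization error and its effect on the timing margin can be worked through numerically. The sketch below uses the margin expression given with Fig. 4-1 and the half-spacing rule stated above. The jitter values in the example are placeholders chosen for illustration, not measured data, and the function names are hypothetical.

```python
from fractions import Fraction

def max_static_phase_error(oversampling_ratio):
    """Worst-case quantization error in bit times: half the sample spacing."""
    return Fraction(1, 2 * oversampling_ratio)

def timing_margin(t_bit, t_os, t_jc, t_jd):
    """Timing margin as defined with Fig. 4-1; all arguments share one unit."""
    return t_bit - t_os - t_jc - t_jd

for ratio in (2, 3, 4):
    print(f"{ratio}x oversampling: max static error = "
          f"{max_static_phase_error(ratio)} bit time")

t_bit_ps = 1000.0                                      # bit time at 1 Gb/s
t_os_ps = float(max_static_phase_error(3)) * t_bit_ps  # 3x oversampling
# Placeholder jitter values (ps), for illustration only.
print(timing_margin(t_bit_ps, t_os_ps, t_jc=50.0, t_jd=100.0))
```

The loop over ratios shows the trade-off described in the text: each increase in the oversampling ratio shrinks the static error, but it also adds input samplers and input capacitance.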



**Fig. 4-3 Block diagram of receiver** 

# **4.2 Architecture of Receiver**

#### **4.2.1 Algorithm of 3X-Oversampling**

**Fig. 4-3** shows the block diagram of the receiver. It consists of a sampler bank, a phase selector, a PLL, a control unit, and a synchronizer. The 3x-oversampling receiver operates as follows. First, the data stream and the clock are received by interface circuits. The input clock is used as the reference for the PLL. To realize the 3x-oversampling mechanism, the PLL produces a six-phase clock at 1 GHz, which is seven times the reference clock frequency (142.8 MHz). In other words, the spacing between adjacent sampling clock phases is about 167 ps.

The input data stream, at a data rate of 1 Gbps, is sampled by the sampler bank three times per bit. Four sets of oversampled data are processed by the control unit to detect whether the data and the clock are aligned. As shown in **Fig. 4-4 (a)**, when the sampling phase lags the incoming data, data transitions may appear between the second and third data values within a data information set. The control unit then sends a control signal to the phase selectors to select an earlier phase for the sampler bank. Conversely, as shown in **Fig. 4-4 (b)**, when the sampling phase leads the incoming data, data transitions may appear between the first and second data values within a data information set. The control unit then sends another control signal to the phase selectors to select a later phase for the sampler bank. The adjustment is repeated until no data transition is detected, as shown in **Fig. 4-4 (c)**. When the system is locked, the center data values, sampled by the second sampling phase, are taken as the recovered data.
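The lag, lead, and lock cases can be reproduced with an idealized sampling model. In the sketch below, time is counted in steps of one clock phase (1/6 of a bit), the data waveform is assumed to switch instantly at bit boundaries, and the three samples of a set are two steps apart. The names and the bit pattern are hypothetical.

```python
STEPS_PER_BIT = 6   # six clock phases per bit time (about 167 ps each)

def oversample(bits, phase):
    """Return one (first, second, third) sample triple per bit.

    phase: position of the first sample inside the bit, in 1/6-bit steps.
    phase = 1 places the second sample at the center of the bit.
    """
    triples = []
    for i in range(1, len(bits) - 1):          # skip the ends: neighbors needed
        start = i * STEPS_PER_BIT + phase
        triples.append(tuple(bits[(start + 2 * k) // STEPS_PER_BIT]
                             for k in range(3)))
    return triples

def classify(triple):
    """Describe where a data transition falls inside one sample triple."""
    first, second, third = triple
    if first == second == third:
        return "no transition"
    if first == second:                        # transition between 2nd and 3rd
        return "lag: select an earlier phase"
    if second == third:                        # transition between 1st and 2nd
        return "lead: select a later phase"
    return "ignored"                           # "010" or "101"

bits = [0, 1, 0, 0, 1, 1, 0]
for phase in (3, -1, 1):
    print(phase, [classify(t) for t in oversample(bits, phase)])
```

With `phase = 1` no triple contains a transition, which corresponds to the locked state of Fig. 4-4 (c); a later or earlier phase produces the lag or lead indications at every bit boundary where the data changes.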



**Fig. 4-4 (a) lag, (b) lead, and (c) lock state of the receiver** 

#### **4.2.2 Slicer**



The advantage of this hysteresis comparator is noise immunity: noise below the threshold voltage is cut off, as shown in **Fig. 4-6**. If the size ratio is $A = (W/L)_2 / (W/L)_1$ and the bias current through M5 is $I_B$, the threshold voltage is derived as

$$
\pm Vth = \pm \sqrt{\frac{I_B}{K}} \frac{(\sqrt{A}-1)}{\sqrt{1+A}}
$$
 (4-1)

The threshold voltage depends not only on the bias current but also on the size ratio of the lower two current mirrors. If A<1, there is no hysteresis in the transfer function; when A>1, the hysteresis is as shown in **Fig. 4-6**.
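Equation (4-1) can be evaluated directly to see how the threshold grows with the size ratio. In the sketch below, K is assumed to be a transconductance parameter in A/V² (the text does not define it), and the bias current and K values are placeholders rather than design values. The function name is hypothetical.

```python
import math

def hysteresis_threshold(i_bias, k, a):
    """Threshold magnitude Vth (V) from equation (4-1).

    i_bias: bias current I_B in A.
    k: device parameter K, assumed to be in A/V^2.
    a: size ratio A of the current mirrors.
    Returns 0.0 for a <= 1, where the text states there is no hysteresis.
    """
    if a <= 1.0:
        return 0.0
    return math.sqrt(i_bias / k) * (math.sqrt(a) - 1.0) / math.sqrt(1.0 + a)

# Placeholder values: I_B = 100 uA, K = 200 uA/V^2.
for a in (1.0, 1.5, 2.0, 4.0):
    vth = hysteresis_threshold(100e-6, 200e-6, a)
    print(f"A = {a}: +/-Vth = {vth * 1e3:.1f} mV")
```

The threshold is zero at A = 1 and increases with A, so the ratio of the mirror devices sets how much input noise the slicer rejects.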







**Fig. 4-6 Simulation of Hysteresis comparator** 

The frequency response of the slicer strongly influences the overall response speed of the receiver. The slicer must be fast enough and provide sufficient signal swing for the sampler. **Fig. 4-7** is the frequency response of the slicer when the input nodes have the same voltage.



**Fig. 4-7 The frequency response of the slicer** 



**Fig. 4-8 The block diagram of the receiver PLL** 

#### **4.2.3 Receiver PLL**

The structure of the receiver PLL, as shown in **Fig. 4-8**, is basically the same as that of the transmitter PLL. To accomplish the 3x-oversampling mechanism, the PLL should produce an output clock whose frequency is seven times that of the input clock (142.8 MHz).

As shown in **Fig. 4-9**, the circuit uses a divide-by-7 circuit [21] in the feedback loop. It comprises five cascaded "half transparent" (HT) registers and a TSPC flip-flop that serves as a pre-charge stage. The HT register contains six transistors, forming a p-latch and an n-latch in the TSPC technique. If more HT registers are inserted into the feedback path, higher dividing ratios can be obtained. A quick reset can be made by resetting the registers close to the pre-charge stage (it is not necessary to reset all stages).

In addition, the PLL uses a three-stage VCO to provide six uniformly separated clock phases at 1 GHz. The parameters of the receiver PLL are shown in **Table 4-1**.



**Fig. 4-9 Schematic of the divide-by-seven circuit** 

| Parameter            | Value           |
|----------------------|-----------------|
| Charge Pump Current  | 126uA           |
| VCO Center Frequency | 1 GHz           |
| Kvco                 | 823MHz/V        |
| Divided by N         | $N=7$           |
| Loop Bandwidth       | 14MHz           |
| Phase Margin         | 65 degrees      |
| Loop Filter          | $C1 = 90pF$     |
|                      | $C2 = 2.2pF$    |
|                      | $R1 = 2k\Omega$ |

**Table 4-1 Parameters of the receiver PLL** 





**Fig. 4-10 The demultiplexing sampler** 

#### **4.2.4 Demultiplexing Sampler**

The sampler demultiplexes the input data after it has been amplified by the slicer, as shown in **Fig. 4-10**. It works like a sense amplifier. The output nodes are precharged to Vdd during the hold mode, when the clock is low. The input data stream is sampled on the rising edge of the clock signal, Clk, and the sampled data is then stored in the RS latch. Therefore, this sense amplifier works as an edge-triggered flip-flop [22]. The receiver uses three samplers in the sampler bank, and the three sampling phase signals come from the phase selector.

### **4.2.5 Control Unit**



The main function of the control unit in the receiver is to decide how to adjust the sampling phase according to the previously sampled data. As shown in **Fig. 4-11**, one bit of input data is sampled three times to form an information set, and the sampled data, D1, D2, and D3, are registered by edge-triggered D flip-flops. When four sets of data are ready, the DFF bank synchronizes the twelve data values using the clock signal Clksh. The UP/DN decision circuit then judges from these twelve values whether the sampling phase is correct and decides how to adjust the sampling phases.

**Fig. 4-11 The block diagram of the control unit.**

After synchronization, every set of data is fed into an edge detector to check the state of the sampling phase, as shown in **Fig. 4-12 (a)** [23]. If a transition occurs between the first and second data values within an information set, the edge detector sends a "DN<sub>i</sub>" signal to the UP/DN decision circuit. If a transition occurs between the second and third data values within an information set, the edge detector sends a "UP<sub>i</sub>" signal to the UP/DN decision circuit. The truth table of the edge detector is shown in **Fig. 4-12 (b)**. There are two special cases in the truth table: "010" and "101". These are treated as transient conditions, and the edge detector does not send any "UP<sub>i</sub>" or "DN<sub>i</sub>" signal to the UP/DN decision circuit. In addition, to leave a larger time budget for the UP/DN decision circuit (the digital part) and to reduce the overall system fluctuation, the frequency of the Clksh signal is one-sixteenth that of Clkout, the sampling clock. In other words, the system performs a calibration after receiving sixteen bits of the input data stream.

The UP/DN decision circuit compares the numbers of "UP<sub>i</sub>" and "DN<sub>i</sub>" signals from the four edge detectors. If the total number of "UP<sub>i</sub>" signals is larger than the total number of "DN<sub>i</sub>" signals and is at least two, the circuit sends another set of control signals, C0~C5, to the phase selectors to select an earlier sampling phase, and vice versa. In all other cases, the circuit does not adjust the phases and the system is considered locked. This calibration continues until the system is locked.
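The edge detection and the voting rule can be summarized in a behavioral sketch. It follows the description above: a transition between the second and third samples yields UP, a transition between the first and second yields DN, and "010" and "101" yield neither. The "vice versa" branch is interpreted here as requiring more DN than UP signals and at least two DN signals; that reading, and the function names, are assumptions.

```python
def edge_detect(d1, d2, d3):
    """Return (up, dn) for one information set of three samples."""
    if d1 == d3 != d2:
        return (0, 0)              # "010" or "101": treated as transient
    up = int(d2 != d3)             # transition between 2nd and 3rd sample
    dn = int(d1 != d2)             # transition between 1st and 2nd sample
    return (up, dn)

def decide(sets):
    """Vote over four information sets; return 'earlier', 'later', or 'hold'."""
    ups = sum(edge_detect(*s)[0] for s in sets)
    dns = sum(edge_detect(*s)[1] for s in sets)
    if ups > dns and ups >= 2:
        return "earlier"           # sampling phase lags the data
    if dns > ups and dns >= 2:
        return "later"             # sampling phase leads the data
    return "hold"                  # considered locked

print(decide([(0, 0, 1), (1, 1, 0), (0, 0, 0), (1, 1, 1)]))
print(decide([(0, 1, 1), (1, 0, 0), (0, 1, 1), (1, 1, 1)]))
print(decide([(0, 1, 0), (1, 0, 1), (0, 0, 0), (1, 1, 1)]))
```

Requiring at least two agreeing votes means that a single isolated transition, for example one caused by noise, does not move the sampling phase.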



The clock signal Clksh is produced from the reference clock, the Clkout signals of the phase selector. However, the sampling phases are not fixed and are adjusted dynamically. Therefore, the synchronization clock, Clksh, must also be adjusted according to the state of the data sampling. As mentioned before, the frequency of Clksh is one-sixteenth that of Clkout. As shown in **Fig. 4-13 (a)**, a divide-by-16 counter can be made by cascading four flip-flops. Unfortunately, an asynchronous counter accumulates jitter stage by stage. In **Fig. 4-13 (b)**, a D flip-flop is added at the last stage to re-sample the clock, which eliminates the jitter accumulated in the asynchronous counter.
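The division ratio of the cascaded flip-flops can be checked with a logic-level model. The sketch below models only the counting behavior of four cascaded toggle stages; it does not model delay or jitter, so it illustrates the division by sixteen and not the jitter accumulation or its removal by re-sampling. The function name is hypothetical.

```python
def ripple_divider(n_cycles, stages=4):
    """Logic-level model of cascaded toggle flip-flops (no timing or jitter).

    Each input clock cycle toggles stage 0; a stage toggles the next one
    only when its own output falls from 1 to 0. Returns the value of the
    last stage after every input cycle.
    """
    q = [0] * stages
    out = []
    for _ in range(n_cycles):
        for i in range(stages):
            q[i] ^= 1
            if q[i] == 1:          # output rose: the ripple stops here
                break
        out.append(q[-1])
    return out

out = ripple_divider(32)
print(out[:16])
print(out[16:32])
```

The last stage repeats every sixteen input cycles with a 50% duty cycle. In the real circuit each stage adds its own clock-to-output delay, which is why the final re-sampling flip-flop of Fig. 4-13 (b) is useful.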



**Fig. 4-13 (a) Asynchronous divider circuit. (b) Modified divider.** 



# **4.2.7 Phase Selector**

The phase selector, as shown in **Fig. 4-14**, is basically a pseudo-NMOS circuit. Only one of the control signals, C0-C5, is logic high at a time, and the circuit selects the corresponding phase for the sampler. The receiver uses three phase selectors to produce three different phases, Clkout0, Clkout1, and Clkout2, for the sampler bank. When the sampler needs an earlier phase to sample data, the control unit moves the logic "high" to an adjacent control signal and sets all the others low. For example, if C2 is "high" and C0, C1, C3, C4, and C5 are low at first, then C1 becomes "high" and the others become low. The control unit can also make the phase selector shift phases circularly until the system has a proper sampling phase.
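The circular shifting of the one-hot control word can be sketched as follows. The "earlier" direction follows the C2 to C1 example in the text; mapping "later" to the next higher index is an assumption, as is the function name.

```python
def shift_phase(control, direction):
    """Rotate a one-hot control word [C0, ..., C5] by one position.

    direction = "earlier" moves the high bit to the next lower index
    (C2 -> C1); "later" moves it to the next higher index. Both wrap around.
    """
    if sum(control) != 1:
        raise ValueError("control word must be one-hot")
    n = len(control)
    idx = control.index(1)
    step = -1 if direction == "earlier" else 1
    new = [0] * n
    new[(idx + step) % n] = 1
    return new

print(shift_phase([0, 0, 1, 0, 0, 0], "earlier"))   # C2 -> C1
print(shift_phase([1, 0, 0, 0, 0, 0], "earlier"))   # C0 wraps around to C5
print(shift_phase([0, 0, 0, 0, 0, 1], "later"))     # C5 wraps around to C0
```

Because the six phases span one full clock period, wrapping around from C0 to C5 (or back) is a continuous phase step, which is what allows the circular shifting mentioned above.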



**Fig. 4-14 Phase selector** 

# **4.3 Simulation Results**

**Fig. 4-15** is the simulation result of the divide-by-seven circuit in the PLL. **Fig. 4-16** is the simulation result of the VCO transfer characteristics, Kvco. **Fig. 4-17** shows the Clkout signals of the phase selectors after the system is locked; their frequency is 1 GHz. **Fig. 4-18** shows the UPsh and DNsh signals from system start-up to the locked state, in which the UPsh and DNsh signals no longer change.




**Fig. 4-15 The referenced clock and divide-by-7 output** 





**Fig. 4-16 Simulation results of the VCO transfer characteristics** 

**Fig. 4-17 Clkouts, sampling phases of the phase selector output**



**Fig. 4-18 Waveforms of control signals after power on** 

# **Chapter 5 Experimental Results**

#### **5.1 Layout Consideration**

The transceiver is implemented in the tsmc 0.35um 2P4M CMOS process. **Fig. 5-1** is the layout of the transmitter, and **Fig. 5-2** is the layout of the receiver. The area of the transmitter is $1600 \times 1600\ \mu\text{m}^2$, and the area of the receiver is $1700 \times 1700\ \mu\text{m}^2$.

To reduce power supply noise, as many bypass capacitors as possible are added to the power supply lines. In addition, the devices are surrounded by guard rings to isolate noise, and the bodies of the MOS devices in the digital and analog circuits are separated to reduce noise coupling through the substrate.





**Fig. 5-1 Layout of the transmitter**

**Fig. 5-2 Layout of the receiver**



**Fig. 5-4 Measurement setups of the transmitter** 

#### **5.2 Transmitter Experiment Results**

**Fig. 5-3** is the PCB of the transmitter. **Fig. 5-4** is the complete test setup of the transmitter experiment. We use a Tek DSA 601 to measure the PLL output Clk[0]; its r.m.s. jitter is 11.66 ps and its peak-to-peak jitter is 80 ps, as shown in **Fig. 5-5**. Then, we use a Tek TDS 754D to observe the transmitter output.

**Table 5-1** shows the measurement results of the transmitter. The output common-mode voltage is 1.24 V, slightly different from the design value of 1.25 V. The output differential voltage is about 320 mV, also slightly below the design value of 350 mV.

**Fig. 5-6** is the eye pattern of one bit of the PRBS output. It can be observed that the PRBS circuit works. **Fig. 5-7** is the transmitter output when fixed data pattern "1110010" is fed into the transmitter with data rate 1Gbps.

**Fig. 5-8 ~ Fig. 5-12** show the transmitter output at different data rates. Some of the output eyes are not clear. At a data rate of about 700 Mbps the output eye pattern is better, while the others are not as good. After analysis, we inferred that the cause is that the PRBS layout is far from the multiplexer and that the data paths to the multiplexer differ from one another. The original design omitted flip-flop registers and used only inverters to adjust the timing, so it did not leave enough timing margin for the multiplexer. Consequently, once the process varies, the differing delays of the PRBS data to the multiplexer cause data transitions while the first and the last bits are being transmitted, and some eye patterns are degraded.

To mitigate these problems, we can add a data capture circuit (a data buffer) as close to the multiplexer as possible. However, it may still suffer from insufficient timing margin because of process variation. Alternatively, we can add shift registers before the multiplexer and make use of the multiple phases of the transmitter PLL. This gives the multiplexer more timing margin, but at the cost of increased PLL loading and transmitter power consumption.

| Parameter                                        | Value |
|--------------------------------------------------|-------|
| Vdd                                              | 3.3V  |
| Data Rate                                        | 1Gbps |
| Power (including PRBS)                           | 162mW |
| Output Differential Swing $\lvert V_{od} \rvert$ | 320mV |
| Output Common Mode Voltage $V_{os}$              | 1.24V |

**Table 5-1 Measurement results of the transmitter** 

[Oscilloscope histogram readout: Mean = 451.1 ps, RMSΔ = 11.66 ps, PkPk = 80 ps]

**Fig. 5-5 Jitters of the transmitter PLL**



**Fig. 5-7 The TX output of a fixed input data pattern** 



**Fig. 5-9 500 Mbps data rate of the TX** 



**Fig. 5-11 1 Gbps data rate of the TX** 



**Fig. 5-13 PCB of the receiver** 



**Fig. 5-14 Measurement setups of the receiver** 

# **5.3 Receiver Experiment Results**

 **Fig. 5-13** is the PCB of the receiver. **Fig. 5-14** is the total testing setup of the receiver experiment. First, we used HP 8133A to generate LVDS input signals for the receiver, and HP 8131A to provide the receiver a reference clock. Then we made use of Tek TDS 754 to observe the results of the receiver.

After testing the receiver, we found that it does not work correctly; only the PLL works on its own. **Fig. 5-15** shows the three output clocks, and **Fig. 5-16** shows the jitter of the receiver PLL. Because the original design did not leave enough test pins for measurement, we could not diagnose what the problem in the receiver is.

This experiment shows that the test plan must be designed more carefully to avoid this kind of situation.



**Fig. 5-15 Three output clocks of the receiver with 1 GHz clock rate** 



**Fig. 5-16 Jitters of the receiver PLL**

# **Chapter 6 Conclusions and Future Work**

### **6.1 Conclusions**

To transmit high-speed signals with low power consumption, a transceiver circuit with LVDS interfaces has been designed in this thesis and implemented in a deep-submicron CMOS process. Its major properties are summarized as follows.

The transmitter uses a PLL to generate a 142.8 MHz, uniformly distributed seven-phase clock signal. The multiplexer then uses those clock signals to convert the parallel data to serial data. After that, the LVDS data driver outputs the serial data at a data rate of 1 Gb/s. The whole transmitter is described in Chapter 3 and the experimental results are described in Chapter 5.

The receiver consists of the following main parts: slicer, sampler, PLL, phase selector, synchronizer, and control unit. First, it uses a slicer to amplify the small input signal to full swing. Then, the 3X-oversampling algorithm is implemented by three demultiplexing samplers and a PLL with a 1 GHz clock rate. The PLL provides six phases for the phase selector, and the control unit keeps choosing appropriate sampling phases by judging the previous sampling results. Finally, the synchronizer is used to parallelize the data. The details of the whole design are described in Chapter 4.

# **6.2 Future Work**

The increasing demand for data bandwidth in point-to-point links has driven the development of high-speed and low-cost serial link technology. Possible directions for further work are:

- 1. Reduce PLL output jitter.
- 2. Use some pre-emphasis circuits to reduce the ISI problem of data output.
- 3. Use other structures to design the receiver to improve the immunity to data jitters.



#### **REFERENCES**

- [1]. Electrical characteristics of low-voltage differential-signaling (LVDS) interface circuits, TIA/EIA-644, National Semiconductor Corp., ANSI/TIA/EIA, 1996
- [2]. IEEE Standard for Low-Voltage Differential Signals (LVDS) for Scalable Coherent Interface (SCI), 1596.3 SCI-LVDS Standard, IEEE Std. 1596.3-1996, 1994.
- [3]. K. Lee, Y. Shin, S. Kim, D.-K. Jeong, G. Kim, B. Kim and V. D. Costa, "1.04 GBd Low EMI Digital Video Interface System Using Small Swing Serial Link Technique," *IEEE J. Solid-State Circuits*, vol. 33, no. 5, pp. 816-823, May 1998.
- [4]. M. Soyuer and R.G. Meyer, "Frequency Limitation of a Conventional Phase-Frequency Detector," *IEEE J. Solid-State Circuits*, vol. 25, pp. 1019-1022, Aug. 1990.
- [5]. Joonsuk Lee and Beomsup Kim, "A Low-Noise Fast-Lock Phase-Locked Loop with Adaptive Bandwidth Control," *IEEE J. Solid-State Circuits*, vol. 35, no. 8, pp. 1137-1145, Aug. 2000.
- [6]. M. Johnson, and E. Hudson, "A Variable Delay Line PLL for CPU -Coprocessor Synchronization," *IEEE J. Solid-State Circuits*, vol. SC-23, no. 5, pp. 1218-1223, Oct. 1988.
- [7]. J. G. Maneatis, "Precise Delay Generation Using Coupled Oscillator," *IEEE J.Solid-State Circuits*, vol. 28, no. 12, pp. 1273-1282, Dec. 1993.
- [8]. J. G. Maneatis, "Low-Jitter Process-Independent DLL and PLL Based on Self-Biased Techniques," *IEEE J. Solid-State Circuits*, vol. 31, no. 11, pp. 1723-1732, Nov. 1996.
- [9]. Behzad Razavi, *Design of Integrated Circuits for Optical Communications*, McGraw-Hill, 2003.
- [10]. Behzad Razavi, *RF Microelectronics*, Prentice Hall, Inc. 1998.
- [11]. W. B. Wilson, Un-Ku Moon, K. R. Lakshmikumar and Liang Dai, "A CMOS Self-Calibrating Frequency Synthesizer," *IEEE J. Solid-State Circuits*, vol. 35, no. 10, pp. 1437-1444, Oct. 2000.
- [12]. J. P. Hein, and J. W. Scott, "z-Domain Model for Discrete-Time PLL's," *IEEE Trans. on Circuits and Systems*, vol. 35, no. 11, pp. 1393-1400, Nov. 1988.
- [13]. F. M. Gardner, "Charge-Pump Phase-Lock Loops" *IEEE Trans. On Commun.*, vol. COM-28, pp. 1849-1858, Nov. 1980.
- [14]. Frank Herzel and Behzad Razavi, "A Study of Oscillator Jitter Due to Supply and Substrate Noise," *IEEE Trans. on Circuits and Systems*, vol. 46, no. 1, pp. 56-62, Jan. 1999.
- [15]. Ming-Ju Edward Lee, William J. Dally and Patrick Chiang, "Low-Power Area Efficient High-Speed I/O Circuit Techniques," *IEEE J. Solid-State Circuits*, vol. 35, no. 11, pp. 1591-1599, Nov. 2000.
- [16]. Ming-Ju Edward Lee, William Dally and Patrick Chiang, "A 90mW 4Gb/s Equalized I/O Circuit with Input Offset Cancellation," in *IEEE ISSCC Dig. Tech. Papers*, 2000, pp. 252-253.
- [17]. Alan Fiedler, Ross Mactaggart, James Welch and Shoba Krishnan, "A 1.0625 Gbps Transceiver with 2x-Oversampling and Transmit Signal Pre-Emphasis," in *IEEE ISSCC Dig. Tech. Papers*, Feb. 1997, pp. 238-239.
- [18]. A. Boni, A. Pierazzi, and D. Vecchi, "LVDS I/O interface for Gb/s-per-pin operation in 0.35-µm CMOS," *IEEE J. Solid-State Circuits*, vol. 36, no. 4, pp. 706-711, April 2001.
- [19]. C. K. K. Yang and R. Farjad-Rad, "A 0.5-um CMOS 4.0-Gbit/s serial link transceiver with data recovery using oversampling," *IEEE J. Solid-State Circuits*, vol. 33, pp. 713-722, May 1998.
- [20]. Phillip E. Allen and Douglas R. Holberg, *CMOS Analog Circuit Design*, Oxford, 2002.
- [21]. J.-R. Yuan and C. Svensson, "Fast CMOS nonbinary divider and counter", *Electronics Letters*, vol. 29, pp 1222-1223, June 1993.
- [22]. A. Fiedler *et al.,* "A 1.0625Gb/s transceiver with 2x-oversampling and transmit signal pre-emphasis," in *ISSCC'97 Dig. Tech. Papers,* Feb. 1997, pp. 238–239.
- [23]. K. Lee and Y. Shin, "1.04 Gbd low EMI digital video interface system using small swing serial link technique," *IEEE J. Solid-State Circuits*, vol. 33, pp. 816-823, May 1998.
- [24]. S. Kim *et al.,* "An 800Mbps multi-channel CMOS serial link with 3x\_oversampling," in *IEEE 1995 CICC Proc.,* Feb. 1995, p. 451.
- [25]. William J. Dally and John Poulton, "Transmitter Equalization for 4-Gbps Signaling," *IEEE Micro.*, pp. 48-56, Jan.-Feb. 1997.
- [26]. Behzad Razavi, *Design of Analog CMOS Integrated Circuits*, McGraw-Hill, 2000.
- [27]. W. Dally and J. Poulton, "A tracking clock recovery receiver for 4-Gb/s signaling," in *Hot Interconnect97 Proc.,* Aug. 1997, p. 157.

## **VITA**

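Hsiao Shen-Wen was born in Changhua County on October 17, 1976. He graduated from National Taichung First Senior High School in 1995. In 2000 he graduated from the Department of Electronics Engineering, National Chiao-Tung University, with a bachelor's degree from the College of Electrical Engineering and Computer Science. In 2004 he graduated from the Institute of Electronics, National Chiao-Tung University, with a master's degree from the College of Electrical Engineering and Computer Science. Graduate courses taken: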



bentlygt@yahoo.com.tw
