## National Chiao Tung University

Institute of Electrical and Control Engineering

### Master's Thesis



## A Self-Calibrate All-Digital 3Gbps SATA Driver Design

- Student: Hsin-Wen Wang (王信文)
- Advisor: Prof. Chau-Chin Su (蘇朝琴)
- July 2004

## A Self-Calibrate All-Digital 3Gbps SATA Driver Design (具自我校正功能之全數位 3Gbps SATA 驅動電路設計)

Student: Hsin-Wen Wang (王信文)

Advisor: Prof. Chau-Chin Su (蘇朝琴)

National Chiao Tung University

Institute of Electrical and Control Engineering



Submitted to Department of Electrical and Control Engineering

College of Electrical Engineering and Computer Science

National Chiao Tung University

in partial Fulfillment of the Requirements

for the Degree of

Master

in

Electrical and Control Engineering

July 2004

Hsinchu, Taiwan, Republic of China

July 2004 (Republic of China year 93)

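A Self-Calibrate All-Digital 3Gbps SATA Driver Design

Student: Hsin-Wen Wang Advisor: Prof. Chau-Chin Su

### Institute of Electrical and Control Engineering, National Chiao Tung University



Owing to advances in process technology, the operating frequency and circuit complexity of CMOS integrated circuits keep increasing. As a result, the bandwidth gap between on-chip logic gates and the off-chip input/output interface has reached critical proportions. The interconnections between chips therefore often limit system performance in applications such as network switches, routers, processor-memory interfaces, and multi-processor interconnects.

This thesis covers two research topics. First, we briefly discuss and calculate the noise sources of interface circuits and review printed circuit board fundamentals. Based on this knowledge, we propose a 2.5 Gbps transmitter that suppresses simultaneous switching noise and conforms to the low voltage differential signaling standard. Next, we propose a driver circuit with self-calibration that can be applied to second-generation SATA. A transmitter using this technique can operate at a bit rate of 3 Gbps and automatically adjusts its output voltage level to prevent output level errors caused by process drift or temperature variation. The circuit techniques and design concepts used to realize this transmitter are also described in the thesis.

In this thesis, we implement a 2.5 Gbps transmitter that conforms to the low voltage differential signaling standard. The transmitter is fabricated in a 0.18 µm process, operates at 2.5 Gbps from a 1.8 V supply, and occupies a chip area of 1500×860 µm<sup>2</sup>. The techniques that enable operation at 2.5 Gbps include point-to-point transmission and our proposed mechanism for suppressing simultaneous switching noise.

Keywords: high-speed serial link, low voltage differential signaling standard, simultaneous switching noise suppression, self-calibration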

### A Self-Calibrate All-Digital 3Gbps SATA Driver Design

Student: Hsin-Wen Wang Advisor: Chau-Chin Su

## Institute of Electrical and Control Engineering National Chiao Tung University

## Abstract

As process technologies scale down, the operating frequency and circuit complexity of CMOS VLSI increase. The growing gap between on-chip gate bandwidth and off-chip I/O bandwidth is reaching critical proportions. Therefore, the interconnections between chips often limit system performance in applications such as network switches, routers, processor-memory interfaces, and multi-processor interconnections. For this reason, integrating high-speed serial links on chip can significantly reduce the pin/wire count and power budget of a system.

There are two major topics in this thesis. First, we focus on the study of signaling noise sources and channel (PCB) modeling. Based on these considerations, we propose a 2.5 Gbps transmitter that conforms to the Low Voltage Differential Signaling (LVDS) specifications and reduces simultaneous switching noise (SSN). Second, we propose a self-calibrating driver circuit design that can be applied to second-generation SATA. The driver prevents output voltage errors caused by process or temperature variation. This transmitter for the physical layer of a serial link has a data bandwidth of 3 Gbps. The circuit design and operational concept of the transmitter are described in the thesis.

In this thesis, a 2.5 Gbps transmitter has been implemented. It is compatible with the LVDS standard. In a TSMC 0.18-µm 1P6M CMOS technology, the transmitter circuit operates at 2.5 Gbps from a 1.8 V power supply and occupies an area of $1500 \times 860\ \mu m^2$. The 2.5 Gbps data rate is achieved by using a point-to-point topology and a novel methodology that reduces SSN.

#### Keywords: High-speed serial link, LVDS, SSN reduction, Self-calibration

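#### Acknowledgements

Above all, I want to thank my family. Without everything they have done for me, I would not be who I am today.

I also especially thank my advisor, Professor Chau-Chin Su. Both in research and in everyday conduct I learned a great deal from him, and my attitude toward work matured considerably. He once said that giving up is easy and that only those who persevere succeed; I will always remember this advice.

I also thank the classmates who spent two years with me at NCTU: 朋哥, corgan, 彥呈, and 英廷, with whom I got through a hard first year. Special thanks go to 彥呈 for staying up through the night with me for the tape-out and even buying me breakfast in the morning. I also thank the junior students trash 銘, ku, cgu, 阿達, and 小佑, who, during my hardest year as a second-year master's student, were always there to think through new ideas with me when I had run out of inspiration; 建錫, ku, and 慢車銘 in particular often stayed with me until midnight. The senior students 丸子, 鴻文, 仁乾, 阿亮, 小莘, Borland, 能哥, and others shared their most valuable experience, without which things would not have gone so smoothly.

In June 2004, I would like to thank my father and mother once more. I dedicate this thesis to my mother, whose birthday is approaching.



Hsin-Wen Wang, 2004/6/16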

## **Table of Contents**

- Table of Contents
- Lists of Figures
- Lists of Tables
- Chapter 1 Introduction
  - 1.1 Motivation
  - 1.2 CMOS Serial Links
  - 1.3 Thesis Organization
- Chapter 2 Background Study
  - 2.1 Signaling Techniques
  - 2.2 Noise Source
  - 2.3 Channel Analysis
- Chapter 3 2.5Gbps LVDS Transmitter with SSN Reduction
  - 3.1 Simultaneous Switching Noise Rejection
  - 3.2 Transmitter Architecture
  - 3.3 System Components
  - 3.4 Transmitter Simulation
  - 3.5 Summary
- Chapter 4 A Serial-ATA Driver with Output Self-Calibration
  - 4.1 Motivation
  - 4.2 Overall Architecture
  - 4.3 Building Blocks
  - 4.4 Simulation Result
  - 4.5 Tape Out and Summary
- Chapter 5 Measurement Results
  - 5.1 Tape Out
- Chapter 6 Conclusion
  - 6.1 Conclusion
  - 6.2 Future Work
- Bibliography



## **Lists of Figures**

- Figure 1.1 Basic link components: the transmitter, the wire, and the receiver
- Figure 2.1 A point-to-point, low swing, incident-wave system
- Figure 2.2 Voltage noise and timing noise margin
- Figure 2.3 Two methods of compensated for channel response
- Figure 2.4 Eye diagram showing limitations on signal rate
- Figure 2.5 The channel model that includes termination resistors and packaging parasitic
- Figure 2.6 Lumped LCRG model of a transmission line
- Figure 2.7 Equivalent circuit of a differential length of a two-conductor T-line
- Figure 2.8 Characteristic impedance approximations for microstrip line
- Figure 3.1 Simplified electrical model of chip-package interface
- Figure 3.2 The overall transmitter architecture
- Figure 3.3 The idea of the orderly turn-on buffer
- Figure 3.4 Orderly turn-on buffers
- Figure 3.5 The signal after orderly turn-on buffers
- Figure 3.6 Duty cycle adjust buffer design
- Figure 3.7 The circuit design of duty cycle control
- Figure 3.8 The PLA circuit
- Figure 3.9 Simplified electrical model of chip-package interface
- Figure 3.10 The traditional and proposed driver
- Figure 3.11 Simulation of the orderly turn-on buffer
- Figure 3.12 Simulation of the duty cycle adjust buffer
- Figure 3.13 The relation of SSN and duty cycle phase difference
- Figure 3.14 The overall transmitter SPICE simulation result
- Figure 3.15 The SSN noise comparison
- Figure 3.16 The eye diagram comparison of proposed and conventional transmitter
- Figure 3.17 The process variation verification result
- Figure 4.1 The overall transmitter self-calibration architecture
- Figure 4.2 The architecture of the counter
- Figure 4.3 The function verification of the counter
- Figure 4.4 The Finite State Machine (FSM) in our transmitter
- Figure 4.5 The comparator circuit design in this application
- Figure 4.6 Function and design flow of the digital control circuit
- Figure 4.7 The transmitter circuit example in SATA 1.0a specification
- Figure 4.8 The simulation of the transmitter counter
- Figure 4.9 Simulation of the FSM
- Figure 4.10 Output voltage change with FSM change
- Figure 4.11 The typical-typical case of driver design
- Figure 4.12 The output voltage of the TT, SS and FF cases
- Figure 4.13 Consider the stable of the driver
- Figure 4.14 The stable consideration simulation of $V_{\text{HIGH}}$
- Figure 4.15 The stable consideration simulation of $V_{\text{LOW}}$
- Figure 4.16 Eye diagram of proposed SATA transmitter
- Figure 4.17 Layout of proposed SATA transmitter
- Figure 5.1 The single transmitter layout
- Figure 5.2 The overall chip layout
- Figure 5.3 The overall chip photograph
- Figure 5.4 Experiment setup of the transmitter measurement prototype
- Figure 5.5 The measurement eye diagram of single-ended output
- Figure 5.6 The comparison eye diagram of single-ended and differential output



## **Lists of Tables**

- Table 3-1 Function table of the PLA decoder
- Table 3-2 The process consideration simulation result
- Table 3-3 The Fast-Slow and Slow-Fast case simulation
- Table 4-1 An example of the FSM code
- Table 4-2 Summary of the SATA chip
- Table 5-1 Chip summary of the proposed LVDS transmitter



## **Chapter 1**

## Introduction



### **1.1 Motivation**

The exponential growth in both speed and integration of digital integrated circuits (ICs) has increased the demand for higher inter-chip communication bandwidth to maximize overall system performance. Traditionally, most system designers have satisfied this demand by increasing the number of links, which increases the cost, power consumption, and complexity of the system. To solve this problem, the bandwidth of every single pin interconnection has to scale with the speed and integration level of the IC[1].

There are two dominant approaches to high-speed signaling: source-synchronous parallel channels and point-to-point links. First, in the conventional shared bus model, many links are integrated within a single system to increase the total signaling bandwidth. The large number of such links makes the key constraints (area, power, etc.) more restrictive. Parallel buses have typically been used for short-range interconnections within a single system. They usually work at 100~400 Mbps, as in multi-processor systems[2], processor-to-memory interfaces[3], and network switches[4]. Second, in contrast, the goal of high-speed point-to-point links is to maximize the communication bandwidth and distance on a single cable. Point-to-point links offer an excellent solution for applications that require multi-gigabit per second (multi-Gbps) rates. These serial links can be several meters long, as in computer-to-computer or computer-to-peripheral interconnections, or several kilometers long, as in SONET.

In this thesis, we explore a system architecture that uses *non-return-to-zero* (NRZ) signaling. A 3 Gbps transmitter that uses an orderly turn-on technique and duty cycle modulation to reduce *simultaneous switching noise* (SSN) has been designed and implemented. It is compatible with the *low voltage differential signaling* (LVDS) standard. The architecture of a self-calibrating transmitter has also been investigated. Both design feasibility and system performance are studied and analyzed. Then, a reasonable solution is proposed.

### 1.2 CMOS Serial Links

#### 1.2.1 General Concept

Traditionally, high-speed links in the Gbps range have been implemented in GaAs or bipolar technologies. The primary advantage of those technologies is their faster intrinsic device speed (higher  $f_T$ ). However, CMOS technology is more widely available and allows higher integration than other technologies. Recent developments have shown that CMOS is capable of achieving Gbit/s data rates[5, 6].

Another motivation for CMOS implementation is that CMOS speed improves faster than the speed of other technologies because of the rapid scaling of its feature sizes.  $0.18 \ \mu m$  CMOS technologies have speed comparable to a  $0.5 \ \mu m$  GaAs technology. Although it is always possible to yield inherently better devices in non-traditional technologies, the momentum and investment in CMOS technology development are progressing toward making CMOS the fastest technology[7].

There are two distinct communication channels for serial links: optical fibers and copper cables. Optical fibers provide a large communication bandwidth over very long distances, but the fiber, the necessary optical components, and the terminal electronics make this approach costly and area-inefficient[8, 9]. Optical links are the only solution for gigabit/s communication over very long distances. Copper cables, on the other hand, are a much cheaper solution but suffer from a very limited data bandwidth that decreases with length. Hence copper links, which are the focus of this thesis, are mostly used for short-distance applications, such as system-to-system interconnections within the same room[10].

#### 1.2.2 Basic Link Components

A typical link comprises three primary components: a transmitter, a channel, and a receiver. The transmitter converts digital bits into a signal stream that propagates on the channel to the receiver. The receiver converts this analog signal back into binary data. Figure 1.1 shows these components.



Figure 1.1 Basic link components: the transmitter, the wire, and the receiver

A transmitter sends the data as analog quantities. The analog values are simply either a HIGH level or a LOW level to represent a single bit, a scheme known as *non-return-to-zero* (NRZ). For example, in an optical system, the levels are different amounts of optical power. In electrical systems, the levels are different signal voltages. The channel is the medium on which the data propagates. This medium can be an optical fiber, a coaxial transmission line, an unshielded twisted pair, a *printed-circuit board* (PCB) trace, or the chip package. Channels always attenuate or filter the signals, so the difficulty in transceiver design is to overcome the attenuation and noise induced by the channel while keeping the signal clean at high data rates. To recover the data from the transmitter, the receiver amplifies and samples the analog waveform, and it must be able to resolve the high-speed bits correctly. Finally, the timing recovery circuit properly places the sampling strobe.

### 1.3 Thesis Organization

The choice of a suitable communication architecture is the first essential step in the design of a high-speed serial link. We discuss the trade-offs between different modulation, equalization, and detection methods in Chapter 2 as the background study. We also consider the specified transmission medium, the *Printed-Circuit Board* (PCB) trace, and discuss the limits on data rates in the transmitters and receivers.

Having introduced how the data bandwidth of a link can be pushed, we turn in Chapter 3 to an architecture that reduces *simultaneous switching noise* (SSN). The novel approach decreases ground bounce and thereby improves signal integrity. With the development of *System on Chip* (SOC) designs, all circuits and subsystems share the same power and ground, so power supply noise will be an important issue in the future. We study the system architecture and circuit-level design, with simulation results, in this chapter. Furthermore, we realize a *Low Voltage Differential Signaling* (LVDS) transmitter[11, 12]. The simulation results are later verified in Chapter 5 by measurements on test chips implemented in TSMC 0.18-µm 1P6M CMOS technology.

After proposing the novel SSN reduction technique in Chapter 3, we make the transmitter more complete in Chapter 4. Although the output can be compensated by additional digital control, the transmitter cannot calibrate itself. We therefore present a fully integrated feedback compensation loop inside the transmitter. The feedback automatically adjusts the output voltage to the desired level. Besides reducing SSN, the digitized self-calibrating transmitter has many applications. Following the LVDS design in Chapter 3, this chapter designs a driver for a high-speed serial link named *Serial AT Attachment* (SATA). The Serial-ATA interface represents one of the greatest shifts in storage technology over the past decade. We simulate the circuit design to ensure that the transmitter conforms to the specification.

To validate the simulation data presented in the previous chapters, Chapter 5 discusses the measurement results from the implemented transmitter described in Chapter 3. The chapter begins by characterizing the limitations of the test environment to ensure that the environment does not introduce excessive bandwidth limitation or noise. Then, performance results are presented that show a jitter of less than 98 ps from a test chip in a 0.18-µm CMOS technology.

Finally, Chapter 6 presents the conclusions, discussion, and the future work.


## Chapter 2

## **Background Study**



## 2.1 Signaling Techniques

This section describes two different approaches to high-speed signaling in digital systems. The first method is traditional high-swing signaling, such as TTL or CMOS. It has been used in most computer systems over the past several decades, especially for chip-to-chip communication. These conventional methods limit the speed to 100 MHz, and their frequencies have not scaled with improving process technologies. As the speed of modern digital systems increases, the conventional methods are therefore becoming a major bandwidth bottleneck.

The second method, which is discussed in more detail in this thesis, is point-to-point incident-wave signaling. It does not suffer from the limitations of the conventional method, and its data rate scales with the process technology. Because of this scaling, the new signaling technique is emerging in high-speed systems.

#### 2.1.1 Traditional Large-Swing Signaling

Traditional signaling systems are limited to data rates of about 100 Mb/s per wire and dissipate large amounts of energy per bit transmitted. Because of these limitations, many modern microprocessors operate their external buses at a small fraction of the internal clock rate. In traditional CMOS systems, a CMOS inverter is used as both driver and receiver. The cable or PCB trace usually has a characteristic impedance of about 50 $\Omega$  to 100 $\Omega$ . These signaling systems are slow because the high-impedance driver is unable to switch the line voltage completely on the incident wave[13].

#### 2.1.2 Point-to-Point Low-Swing Signaling

A signaling system that overcomes the limitations of traditional signaling is shown in Figure 2.1. A current-source transmitter drives the line with currents that typically range from a couple of milliamperes to tens of milliamperes, resulting in a voltage swing range of 100mV to about 1V. This line is terminated at both ends in its characteristic impedance. The receiver termination absorbs the incident wave, preventing any reflections. The source termination makes the systems more tolerant of crosstalk and impedance discontinuities.



Figure 2.1 A point-to-point, low swing, incident-wave system
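The relation between drive current and voltage swing described above can be checked with a minimal sketch. It assumes an ideal current-source driver and a 50 Ω line terminated in its characteristic impedance at both ends, so that the driver sees the source termination in parallel with the line. The function names and the 50 Ω default are illustrative assumptions, not values taken from the design in this thesis.

```python
def parallel(r1: float, r2: float) -> float:
    """Equivalent resistance of two resistors in parallel (ohms)."""
    return r1 * r2 / (r1 + r2)


def line_swing(i_drive: float, z0: float = 50.0) -> float:
    """Voltage swing launched onto a doubly terminated line (volts).

    Assumption: an ideal current source drives the source termination (z0)
    in parallel with the line, which also presents z0 to the driver.
    """
    return i_drive * parallel(z0, z0)


# A few milliamperes to tens of milliamperes of drive current.
for i_ma in (4, 20, 40):
    swing_mv = line_swing(i_ma * 1e-3) * 1e3
    print(f"{i_ma} mA -> {swing_mv:.0f} mV")
```

Under these assumptions, 4 mA gives a 100 mV swing and 40 mA gives 1 V, which matches the range quoted in the text.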

A high-gain clocked regenerative receiver amplifier can be both low offset (~30-60mV) and high gain. With the improved receivers with low offset and high sensitivity, this system can operate reliably using very small swings. Therefore this signaling method also offers a considerable reduction in power dissipation of the system. The system described can also operate at data rates independent of the line length. A new symbol can be driven onto the line before the previous symbol arrives at the receiver. This results in a system whose data rate, to a first approximation, scales linearly with the device speed[10].

### 2.2 Noise Source

A good design presents reasonable trade-offs among the performance metrics and cost metrics. The performance metrics include bit rate, latency, and bit error rate (or, more generally, robustness). The cost metrics include power, area, and the number of pins, wires, and other electrical components. All of these metrics affect each other. The most important factor is noise. Noise decreases the performance of a system, and bit errors reduce the effective bandwidth of the links. Most digital transmission systems have some mechanism to handle or correct bit errors. Therefore, the reliability or robustness of a link at the desired operating speed is an important consideration. It is measured by the bit error rate (BER). Different applications have different BER requirements. Bit errors are caused by two main noise sources: voltage (amplitude) noise and timing (phase) noise, as pictured in Figure 2.2.



Figure 2.2 Voltage noise and timing noise margin (figure label: ideal RxClk sampling position)

#### 2.2.1 Voltage Noise

Voltage noise directly reduces the signal voltage margin. It also reduces signal timing margins by shifting signal transition edges. The major voltage noise sources are channel attenuation and inter-symbol interference, fabrication offset, reflections, and power supply noise[14].

#### **Channel Attenuation and Inter-Symbol Interference**

The channel filters the transmitted signal and causes frequency-dependent attenuation and signal distortion. This leads to a reduction in received signal amplitude and to *inter-symbol interference* (ISI). Channel attenuation and ISI are present in all links. Their magnitudes depend on the characteristics of the channel and on the signal frequencies relative to the channel bandwidth. The channel impedance attenuates the traveling signal, and conduction in the surrounding dielectric causes further loss. Furthermore, high-frequency current flows closer to the conductor surface, resulting in higher series resistance (the skin effect, described later). As a result, the channel reduces signal amplitude and adds residual error from previous bits, leading to ISI.

To solve the channel attenuation and ISI problem, equalization has been widely used in communication systems. The basic idea is to insert a filter in the signal path that provides the inverse of the channel's filtering effect. However, with current technologies, the complex equalization schemes used in communication systems cannot operate in the GHz frequency range and hence cannot be used in multi-gigabit link design, where only simple equalization schemes can be applied. Equalizers can be implemented at the transmitter, at the receiver, or at both. The easiest and most common equalization technique used in gigabit links is transmitter pre-emphasis. A short finite impulse response (FIR) filter pre-distorts the transmitted signal to boost the energy of its high-frequency components. However, this method increases power consumption and amplifies high-frequency noise at the same time. Alternatively, the same architecture can be implemented at the receiver.



Figure 2.3 Two methods of compensated for channel response
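As a rough illustration of transmitter pre-emphasis, the sketch below applies a two-tap FIR filter, y[n] = x[n] − α·x[n−1], to an NRZ bit stream mapped to ±1. The tap weight `alpha = 0.25`, the function name, and the assumption that the line idles at the level of the first symbol are all hypothetical choices made for this example; they do not describe an equalizer used in this thesis.

```python
def pre_emphasis(bits, alpha=0.25):
    """Two-tap FIR pre-emphasis: y[n] = x[n] - alpha * x[n-1].

    Bits are mapped to +1/-1 symbols. The line is assumed to sit at the
    first symbol's level before transmission starts (illustrative choice).
    """
    symbols = [1.0 if b else -1.0 for b in bits]
    out = []
    prev = symbols[0]
    for x in symbols:
        out.append(x - alpha * prev)
        prev = x
    return out


# Transitions are boosted to 1.25, repeated bits are reduced to 0.75.
print(pre_emphasis([0, 0, 1, 1, 1, 0]))
```

Bits that follow a transition are sent with a larger amplitude than repeated bits, which is how the filter shifts energy toward the high-frequency components that the channel attenuates most.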

We can also explore multi-level signaling, where the transmitter sends more than one bit at a time. The simplest multi-level transmission scheme, called *pulse amplitude modulation* (PAM), encodes N data bits into a symbol that takes one of  $2^{N}$  voltage levels. 4-PAM signaling has been demonstrated to increase the achievable data rate over band-limited channels.
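The PAM mapping can be sketched in a few lines. The example groups N bits into one symbol and maps it onto one of 2^N equally spaced voltage levels. The natural binary mapping, the voltage range, and the function name are assumptions made for illustration; practical links often use Gray coding instead.

```python
def pam_encode(bits, n_bits=2, v_min=0.0, v_max=1.0):
    """Map groups of n_bits bits onto 2**n_bits equally spaced levels.

    Uses natural binary weighting, most significant bit first
    (an illustrative choice, not a requirement of PAM).
    """
    if len(bits) % n_bits:
        raise ValueError("bit count must be a multiple of n_bits")
    levels = 2 ** n_bits
    step = (v_max - v_min) / (levels - 1)
    symbols = []
    for i in range(0, len(bits), n_bits):
        value = 0
        for b in bits[i:i + n_bits]:
            value = (value << 1) | b
        symbols.append(v_min + value * step)
    return symbols


# 4-PAM: eight bits become four symbols on four levels.
print(pam_encode([0, 0, 0, 1, 1, 0, 1, 1], n_bits=2, v_min=0.0, v_max=3.0))
```

With N = 2, the symbol rate is half the bit rate, which is why 4-PAM can raise the achievable data rate on a band-limited channel at the cost of smaller spacing between levels.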

#### Offsets

Even in a carefully matched layout, transistor mismatches in the transmitter and receiver can induce voltage offsets. The amplitude of the offset is independent of the transmitted signal swing and is instead determined by the transistor sizes and process parameters. Mismatches in the transmitter cause the actual output signal swing to differ from the nominal swing. Mismatches in the receiver increase the minimum transmitted signal swing required for accurate signal detection. Offset cancellation, commonly used in op-amp design, can be applied to reduce the effects of circuit mismatches.

#### Reflections

Reflections reduce signal margins. Reflections at mismatched terminations and impedance discontinuities come back as noise signals and add to future signal bits on the line. Reflection noise is therefore another type of ISI noise. The reflected voltage is given by (1), where  $\Gamma$ , the reflection coefficient, is related to  $Z_L$ , the load impedance at the reflection point, and  $Z_0$ , the characteristic impedance of the line, by (2). We will discuss signal reflection in later sections.

$$V_{reflected} = \Gamma \times V_{incident} , \tag{1}$$

$$\Gamma = \frac{Z_L - Z_0}{Z_L + Z_0} \,. \tag{2}$$
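Equations (1) and (2) can be evaluated directly. The sketch below is a minimal example; the 50 Ω line impedance and the load values are illustrative assumptions rather than values from the thesis, and an ideal open circuit (infinite load) is not handled.

```python
def reflection_coefficient(z_load: float, z0: float) -> float:
    """Reflection coefficient, equation (2): (Z_L - Z_0) / (Z_L + Z_0)."""
    return (z_load - z0) / (z_load + z0)


def reflected_voltage(v_incident: float, z_load: float, z0: float) -> float:
    """Reflected voltage, equation (1): Gamma * V_incident."""
    return reflection_coefficient(z_load, z0) * v_incident


z0 = 50.0  # assumed characteristic impedance in ohms
for z_load in (50.0, 75.0, 0.0):
    gamma = reflection_coefficient(z_load, z0)
    print(f"Z_L = {z_load:5.1f} ohm -> Gamma = {gamma:+.2f}")
```

A matched load gives Γ = 0 and no reflection, a 75 Ω load on a 50 Ω line reflects 20% of the incident voltage, and a short circuit reflects the full wave with inverted polarity.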

#### **Power Supply Noise**

Power supply noise is induced by switching large currents in short durations across the parasitic inductance in the power distribution network. It is therefore also called L·dI/dt noise. It is a problem in almost all digital systems. This issue becomes more serious as modern chips integrate more gates that switch at high frequency. The noise amplitude from digital logic is independent of the I/O driver output signal swing. However, the noise can be proportional to the output signal swing if it is caused by the large switching current of the output drivers. Although power supply noise affects different systems to different degrees, its omnipresence in digital systems has stimulated much research into techniques to reduce dI/dt noise. Such techniques include minimizing the inductance of power distribution networks, employing constant-current drivers or, more generally, keeping the total current drawn from each supply constant, increasing bypass capacitance both on the chip and on the board, using separate power supplies for noise-sensitive circuit blocks, using slew-rate control, and using coding schemes that reduce the switching frequency of signals. Power supply noise is the main issue in this thesis. We describe the method for reducing SSN in detail in Chapter 3.
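A first-order estimate of this noise follows from V = L·dI/dt. The sketch below assumes a linear current ramp and that all drivers switch at the same instant through one shared supply inductance. The 1 nH inductance, 4 mA per driver, and 100 ps switching time are illustrative assumptions, not measurements from this work.

```python
def supply_bounce(l_supply: float, delta_i: float, t_switch: float,
                  n_drivers: int = 1) -> float:
    """First-order supply noise V = L * dI/dt (volts).

    Assumes a linear current ramp of delta_i amperes per driver over
    t_switch seconds, with n_drivers switching simultaneously through
    a shared inductance of l_supply henries.
    """
    return l_supply * n_drivers * delta_i / t_switch


# One driver versus eight drivers switching together.
for n in (1, 8):
    v = supply_bounce(1e-9, 4e-3, 100e-12, n_drivers=n)
    print(f"{n} driver(s): {v * 1e3:.0f} mV of supply bounce")
```

The estimate grows linearly with the number of simultaneously switching drivers and inversely with the switching time, which is why staggering the turn-on of drivers or slowing their edges reduces the noise.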

### 2.2.2 Timing Noise

The data rate of a signaling system is limited both by the electronics used to generate and receive the signal and by the medium over which the signal propagates. The following two parts examine these two limitations in more detail. As shown in Figure 2.4, limits on the signaling rate arise from rise time, aperture time, and timing uncertainty[13]. The time for the bit cell,  $t_{bit}$ , must exceed the sum of  $t_r$  (the time required for the signal to slew from one logic level to another),  $t_s$  (the time for the receiver to sample the signal while it is stable), and  $t_j$  (the jitter between the signal and the sampling clock). This is the first limitation.



Figure 2.4 Eye diagram showing limitations on signal rate
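The first limitation, that the bit time must exceed the sum of rise time, sampling time, and jitter, can be turned into a small budget calculation. The numbers used below (150 ps rise time, 50 ps sampling window, 100 ps jitter) are purely illustrative assumptions and are not measurements from this thesis.

```python
def min_bit_time(t_rise: float, t_sample: float, t_jitter: float) -> float:
    """Smallest usable bit time: t_r + t_s + t_j (seconds)."""
    return t_rise + t_sample + t_jitter


def max_data_rate(t_rise: float, t_sample: float, t_jitter: float) -> float:
    """Upper bound on the bit rate implied by the timing budget (bit/s)."""
    return 1.0 / min_bit_time(t_rise, t_sample, t_jitter)


def timing_margin(bit_rate: float, t_rise: float, t_sample: float,
                  t_jitter: float) -> float:
    """Slack left in one bit cell at bit_rate; negative means the budget fails."""
    return 1.0 / bit_rate - min_bit_time(t_rise, t_sample, t_jitter)


budget = (150e-12, 50e-12, 100e-12)  # assumed t_r, t_s, t_j
print(f"max rate: {max_data_rate(*budget) / 1e9:.2f} Gbps")
print(f"margin at 3 Gbps: {timing_margin(3e9, *budget) * 1e12:.1f} ps")
```

With these assumed values the budget supports about 3.33 Gbps, leaving roughly 33 ps of slack in a 3 Gbps bit cell.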

The second limitation is about the transmission medium. PC board traces and coaxial and twisted-pair cables behave as transmission lines that store and propagate signal energy. This part will be discussed in detail in the next section.

## 2.3 Channel Analysis

Concern about the performance of wires in scaled technologies has led to research exploring other communication methods[15]. The relationship between wire delays and gate delays raises many interesting problems. Understanding the wires is therefore the first step in researching high-speed links.

In our design, the system transmission line model is fixed. However, we must account for its characteristics in our signaling system. The channel is the entire path from the output of the transmitter circuit to the input of the receiver circuit. This includes the connections from the die input and output pads to the package pins on both sides and the PCB trace that connects them. A termination resistor that matches the characteristic impedance of the channel is used to minimize signal reflection. After the transmitter circuit launches a signal, the signal runs through the pad, bonding wire, package pin, PCB trace, and the receiver package. A signal continues to propagate along a transmission line as long as the impedance remains constant. Changes in the impedance along the line cause part of the signal energy to be reflected, and the reflected part propagates in the opposite direction (back toward the transmitter). If the signal is reflected again, the second reflection interferes with the signal that is transmitted after the round-trip propagation delay. It appears as pattern-dependent noise. A common source of impedance mismatch is the discontinuity between the chip package and the PCB trace.

Consequently, the package parasitics should be taken into account in the design of high-speed interface circuits from the very beginning. Thus, simulations must include a reasonable circuit model of the package and a transmission line model of the PCB trace. Figure 2.5 illustrates the channel model that includes termination resistors and packaging parasitics.

In the following sections, we will identify the parasitic parameters of the package and build the equivalent transmission line model for the PCB trace between chips.



Figure 2.5 The channel model that includes termination resistors and packaging parasitic

#### 2.3.1 Channel Medium

PC board traces and coaxial and twisted-pair cables behave as transmission lines that store and propagate signal energy. These lines can be modeled by a series of lumped LCRG elements as shown in Figure 2.6. The loss in transmission is primarily due to the series resistive component of the copper (R) and the parallel conductive component of the dielectric (G).



Figure 2.6 Lumped LCRG model of a transmission line

The major frequency-dependent loss in many electrical links using conductors is due to skin-effect resistance. This effect can be modeled as an increase in the series resistance of the wire shown in Figure 2.6. Higher-frequency signals propagate closer to the surface of the conductor. The resulting signal current conducts within a limited depth on the conductor surface, the skin depth, which is given by (3)[16].

$$\delta = \frac{1}{\sqrt{\frac{\pi \times \mu_0}{\rho} \times f}},\tag{3}$$

where  $\rho$  is the resistivity of the conductor,  $\mu_0$  is the permeability of free space, and f is the frequency of the signal. Hence, the effective series resistance ( $R_{skin}$ ) of the cable corresponding to this depth increases with the square root of frequency [17]:

$$R_{skin} = \frac{1}{2 \cdot r} \sqrt{\frac{\rho \cdot \mu_0}{\pi} \cdot f} , \tag{4}$$

for a round conductor with radius r. This relation shows a frequency dependent resistive loss. Note that the above equation is valid only for frequencies well above the skin frequency,  $f_{skin}$ , where the skin depth is smaller than the radius of the wire:

$$f_{skin} = \frac{\rho}{\pi \cdot \mu_0 \cdot r^{2}}.\tag{5}$$
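The three skin-effect relations can be checked numerically. The sketch below assumes a copper resistivity of 1.7e-8 Ω·m and a hypothetical wire radius of 0.25 mm; neither value comes from the thesis. The skin frequency is computed as the frequency at which the skin depth equals the radius, i.e. $\rho/(\pi \mu_0 r^2)$.

```python
import math

MU_0 = 4e-7 * math.pi   # permeability of free space, H/m
RHO_CU = 1.7e-8         # assumed copper resistivity, ohm*m

def skin_depth(f, rho=RHO_CU):
    """Equation (3): skin depth in metres at frequency f (Hz)."""
    return 1.0 / math.sqrt(math.pi * MU_0 * f / rho)

def r_skin(f, r, rho=RHO_CU):
    """Equation (4): series resistance per unit length (ohm/m)
    of a round conductor of radius r (m)."""
    return math.sqrt(rho * MU_0 * f / math.pi) / (2.0 * r)

def f_skin(r, rho=RHO_CU):
    """Frequency (Hz) at which the skin depth equals the radius r (m)."""
    return rho / (math.pi * MU_0 * r * r)

radius = 0.25e-3  # hypothetical wire radius, m
print(f"skin frequency      : {f_skin(radius):.3e} Hz")
print(f"skin depth at 1.25G : {skin_depth(1.25e9):.3e} m")
print(f"R_skin at 1.25G     : {r_skin(1.25e9, radius):.3f} ohm/m")
```

Because $R_{skin}$ grows with $\sqrt{f}$, quadrupling the frequency doubles the series resistance.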

With some insulating materials, dielectric absorption also causes frequency dependent attenuation. This loss can be modeled as a conductance G between the signal wire and ground. This effect can be mitigated by using a low-loss dielectric material. However, because of certain restrictions on PC board materials, the choice of dielectric is limited. PCB trace usually demonstrates a considerably higher dielectric loss compared to cables. Dielectric loss for each material is usually expressed in terms of a parameter, called the loss tangent defined as (6).

$$\tan\delta_D = \frac{G}{\omega \cdot C},\tag{6}$$

where C is the capacitance per unit length as shown in Figure 2.6. The loss tangent is approximately constant with frequency, so the dielectric conductance G increases approximately linearly with frequency. Based on the above equations, the approximate cable frequency response in dB, accounting for both skin effect and dielectric loss, is given by (7) [13], where $\ell$ is the length of the cable and $h_s$ and $h_d$ are the skin-effect and dielectric-loss coefficients, respectively. For cables, $h_s$ is usually much larger than $h_d$, so the dielectric loss may be neglected. For a PCB trace, on the other hand, $h_s$ is only marginally larger than $h_d$, so the dielectric loss must be considered.

$$H(f,\ell)\big|_{dB} = -\left(h_s \cdot \sqrt{f} + h_d \cdot f\right) \cdot \ell . \tag{7}$$
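As a minimal sketch of how (7) is evaluated, the function below returns the channel attenuation in dB. The coefficients `H_S` and `H_D` are hypothetical placeholders, not measured values for any particular cable or board; in practice they would be fitted to measurements or to a field-solver result.

```python
import math

def channel_loss_db(f_hz, length_m, h_s, h_d):
    """Equation (7): channel response in dB (zero or negative).
    h_s is in dB/(m*sqrt(Hz)) and h_d is in dB/(m*Hz)."""
    return -(h_s * math.sqrt(f_hz) + h_d * f_hz) * length_m

# Hypothetical coefficients, for illustration only.
H_S = 4e-5   # skin-effect coefficient
H_D = 1e-9   # dielectric-loss coefficient

trace_len = 21 * 0.0254            # a 21-inch trace expressed in metres
for f in (0.5e9, 1.25e9, 2.5e9):
    loss = channel_loss_db(f, trace_len, H_S, H_D)
    print(f"{f / 1e9:4.2f} GHz: {loss:6.2f} dB")
```

The skin-effect term grows with $\sqrt{f}$ and the dielectric term grows with f, so the dielectric term eventually dominates as the frequency rises.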

#### 2.3.2 Characteristic Impedance of the Channel

The characteristic impedance  $Z_0$  of the transmission line is defined by the ratio of the voltage and current waves at any point of the line; thus,  $V/I = Z_0$ . Figure 2.7 shows the equivalent electric circuit of such a line segment. The quantities v(z,t) and  $v(z + \Delta z, t)$  denote the instantaneous voltages at z and  $z + \Delta z$  respectively. Similarly, i(z,t) and  $i(z + \Delta z, t)$  denote the instantaneous currents at z and  $z + \Delta z$ , respectively. Applying Kirchhoff's voltage law, we obtain[18]


$$v(z,t) - R\Delta z \cdot i(z,t) - L\Delta z \cdot \frac{\partial i(z,t)}{\partial t} - v(z + \Delta z,t) = 0, \qquad (8)$$

In the limit as $\Delta z \rightarrow 0$, (8) becomes

$$-\frac{\partial v(z,t)}{\partial z} = R \cdot i(z,t) + L \frac{\partial i(z,t)}{\partial t}.\tag{9}$$

Similarly, applying Kirchhoff's current law to the node N in Figure 2.7, we have (10) by the same method above.

$$-\frac{\partial i(z,t)}{\partial z} = G \cdot v(z,t) + C \frac{\partial v(z,t)}{\partial t}.\tag{10}$$

Equations (9) and (10) are a pair of first-order partial differential equations in v(z,t) and i(z,t). They are the general transmission-line equations. For harmonic time dependence, the use of phasors simplifies the transmission-line equations to ordinary differential equations (ODEs).



Figure 2.7 Equivalent circuit of a differential length of a two-conductor T-line

For a cosine reference we write

$$v(z,t) = \Re \left[ V(z)e^{j\omega t} \right], \qquad i(z,t) = \Re \left[ I(z)e^{j\omega t} \right], \tag{11}$$

where phasors V(z) and I(z) are functions of the space coordinate z only and both may be complex. Substitution of (11) in (9) and (10) yields the following differential equations for phasors V(z) and I(z):

$$-\frac{dV(z)}{dz} = (R + j\omega L) \cdot I(z), \qquad -\frac{dI(z)}{dz} = (G + j\omega C) \cdot V(z). \tag{12}$$

Equations (12) are the coupled *time-harmonic transmission-line equations*. They can be combined to solve for V(z) and I(z), which yields the following one-dimensional second-order ordinary differential equations:

$$\frac{d^2 V(z)}{dz^2} = \gamma^2 V(z), \qquad \frac{d^2 I(z)}{dz^2} = \gamma^2 I(z), \tag{13}$$

where

$$\gamma = \alpha + j\beta = \sqrt{(R + j\omega L)(G + j\omega C)} \qquad (m^{-1}) \tag{14}$$

is the propagation constant, whose real and imaginary parts, $\alpha$ and $\beta$, are the attenuation constant (Np/m) and the phase constant (rad/m) of the line, respectively.

We have derived the governing equations (13) for the time-harmonic V(z) and I(z) on a transmission line. Let us now examine their characteristics on an infinite line. The solutions of (13) are

$$\begin{split} V(z) &= V^{+}(z) + V^{-}(z) = V_{0}^{+}e^{-\gamma \cdot z} + V_{0}^{-}e^{\gamma \cdot z}, \\ I(z) &= I^{+}(z) + I^{-}(z) = I_{0}^{+}e^{-\gamma \cdot z} + I_{0}^{-}e^{\gamma \cdot z}, \end{split} \tag{15}$$

where the plus and minus superscripts denote waves traveling in the +z and -z directions, respectively. The wave amplitudes $(V_0^+, I_0^+)$ and $(V_0^-, I_0^-)$ are related by (12), and we can verify that

$$\frac{V_0^+}{I_0^+} = -\frac{V_0^-}{I_0^-} = \frac{R + j\omega L}{\gamma} \,. \tag{16}$$

For an infinite line (actually a semi-infinite line with the source at the left end), the terms containing the $e^{\gamma \cdot z}$ factor must vanish. If not, these terms would increase indefinitely with z, a physical impossibility. There are no reflected waves; only the waves traveling in the +z direction exist. Thus [18, 19]

$$\begin{split} V(z) &= V^{+}(z) = V_{0}^{+} e^{-\gamma \cdot z}, \\ I(z) &= I^{+}(z) = I_{0}^{+} e^{-\gamma \cdot z}. \end{split} \tag{17}$$

The ratio of the voltage and current at any z for an infinitely long line, $V^+(z)/I^+(z) = V_0^+/I_0^+$, is independent of z and is called the *characteristic impedance* of the line:

$$Z_0 = \frac{R + j\omega L}{\gamma} = \frac{\gamma}{G + j\omega C} = \sqrt{\frac{R + j\omega L}{G + j\omega C}} \qquad (18)$$

Note that  $\gamma$  and  $Z_0$  are characteristic properties of a transmission line whether or not the line is infinitely long. They depend on R, L, G, C and  $\omega$  but not on the length of the line. An infinite line simply implies that there are no reflected waves[19].
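Equations (14) and (18) are straightforward to evaluate with complex arithmetic. The sketch below uses hypothetical per-unit-length values (L = 250 nH/m, C = 100 pF/m, which give a 50 Ω lossless line); they are illustrative and are not the parameters of the thesis channel.

```python
import cmath
import math

def line_constants(r, l, g, c, f):
    """Return (gamma, z0) for per-unit-length R (ohm/m), L (H/m),
    G (S/m), C (F/m) at frequency f (Hz), following (14) and (18)."""
    w = 2.0 * math.pi * f
    z_series = complex(r, w * l)    # R + jwL
    y_shunt = complex(g, w * c)     # G + jwC
    gamma = cmath.sqrt(z_series * y_shunt)
    z0 = cmath.sqrt(z_series / y_shunt)
    return gamma, z0

# Lossless case: Z0 reduces to sqrt(L/C) and alpha is zero.
gamma, z0 = line_constants(0.0, 250e-9, 0.0, 100e-12, 1.25e9)
print(f"lossless: Z0 = {z0.real:.2f} ohm, alpha = {gamma.real:.3e} Np/m")

# Adding hypothetical loss makes Z0 slightly complex and alpha positive.
gamma, z0 = line_constants(20.0, 250e-9, 1e-3, 100e-12, 1.25e9)
print(f"lossy   : Z0 = {z0:.3f} ohm, alpha = {gamma.real:.4f} Np/m")
```

Neither result depends on the length of the line, consistent with the remark that $\gamma$ and $Z_0$ are properties of the line itself.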

#### **2.3.3 SPICE Model of the Channel (PCB Trace)**

In an inter-chip communication system, the signaling medium is usually a *printed circuit board* (PCB) or *multi-chip module* (MCM) trace. In terms of circuit simulation, the problem is how to model the traces for high-speed transmission. The response of any conductor to an incoming signal depends greatly on the effective length of the fastest electrical feature in the signal. The effective length of the signal's rising edge is [16]

$$\ell = \frac{T_r}{D},\tag{19}$$

where $\ell$ is the effective length of the rising edge in inches, $T_r$ is the rise time of the signal in ps, and D is the propagation delay of the conductor in ps/inch.

For example, in a 2.5 Gbps serial link the bit time is 400 ps, and the rise time is often $100\text{ps} \sim 200\text{ps}$, because transmitter designs usually control the rise time to be around $1/4 \sim 1/2$ of the bit time. In the worst case (the fastest edge), a rising edge propagating along a microstrip on an FR4 board has an effective length of 0.56 inch. In our system, the trace length on the board is assumed to be less than 21 inches. Evidently, the potential of a signal pulse propagating along the trace is not uniform at all points. This type of system is called a *distributed* system. A rule of thumb is that a circuit behaves mostly in a distributed fashion when the wire is longer than one-sixth of the effective length of the rising edge [16].
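The arithmetic behind this example can be sketched as follows. The propagation delay of 180 ps/inch is an assumption: it is a typical figure for an FR4 microstrip and is the value that reproduces the 0.56 inch quoted above for a 100 ps edge, but the text does not state it explicitly.

```python
def effective_edge_length_in(rise_time_ps, delay_ps_per_in):
    """Equation (19): effective length of the rising edge in inches."""
    return rise_time_ps / delay_ps_per_in

def is_distributed(trace_len_in, rise_time_ps, delay_ps_per_in):
    """Rule of thumb: distributed behaviour when the trace is longer
    than one-sixth of the effective length of the rising edge."""
    return trace_len_in > effective_edge_length_in(rise_time_ps, delay_ps_per_in) / 6.0

FR4_DELAY = 180.0   # assumed microstrip delay on FR4, ps/inch

edge = effective_edge_length_in(100.0, FR4_DELAY)
print(f"effective edge length: {edge:.2f} inch")
print("21-inch trace distributed?", is_distributed(21.0, 100.0, FR4_DELAY))
```

With these numbers, any trace longer than roughly 0.09 inch already has to be treated as a transmission line, so a board-level trace of many inches clearly qualifies.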

The metal in a typical PCB is usually copper and the dielectric is FR4, a type of fiberglass. The two most common types of transmission lines are microstrips and striplines. In our design, we use the microstrip for simplicity. Its structure is shown in Figure 2.8, and the corresponding impedance formula is given in (20) [18, 20].

$$Z_{0} \cong \frac{87}{\sqrt{\varepsilon_{r} + 1.41}} \ln \left( \frac{5.98H}{0.8W + T} \right) \ (\Omega), \quad \text{valid when } 0.1 < W/H < 2 \text{ and } 1 < \varepsilon_{r} < 15. \tag{20}$$



Figure 2.8 Characteristic impedance approximations for microstrip line
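A minimal sketch of the approximation in (20) is given below. The board dimensions and the dielectric constant of 4.5 are hypothetical values typical of FR4, chosen for illustration and not taken from the thesis layout; H, W, and T only need to share the same unit.

```python
import math

def microstrip_z0(h, w, t, eps_r):
    """Equation (20): approximate microstrip characteristic impedance in ohms.
    h = dielectric height, w = trace width, t = trace thickness (same unit)."""
    if not (0.1 < w / h < 2.0 and 1.0 < eps_r < 15.0):
        raise ValueError("outside the validity range of the approximation")
    return 87.0 / math.sqrt(eps_r + 1.41) * math.log(5.98 * h / (0.8 * w + t))

# Hypothetical stack-up in mils: 10 mil dielectric, 1.4 mil copper.
for width in (10.0, 14.0, 18.0):
    z0 = microstrip_z0(10.0, width, 1.4, 4.5)
    print(f"W = {width:4.1f} mil -> Z0 = {z0:5.1f} ohm")
```

Widening the trace lowers the impedance, which is how a target such as 50 Ω is usually reached for a fixed dielectric height.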

The SPICE Devices Models Manual [20] presents the method to describe a transmission line. After running HSPICE, the output lists six parameters, L, C, $R_0$, $G_0$, $R_s$, and $G_D$, where $R_s$ is the skin-effect parameter in $\Omega/(m\sqrt{Hz})$, $G_0$ is the DC conductance of the dielectric material in S/m, and $G_D$ is the dielectric-loss parameter in $S/(m \cdot Hz)$.

## **Chapter 3**

## **2.5Gbps LVDS Transmitter with SSN Reduction**



## 3.1 Simultaneous Switching Noise Rejection

*Simultaneous switching noise* (SSN), or ground bounce, is caused by many electrical and packaging properties. SSN has become a major bottleneck in high-speed digital design. For future systems, modeling SSN can be complex because of the thousands of interconnects that need to be analyzed [21, 22]. Today, SSN is studied in many different fields.

The output pad driver is the main source of SSN because of its large transient currents during switching. SSN may appear even on low-frequency signals that have sharp transitions. The peak of the SSN usually occurs at the beginning of a transition. Different pad and package structures have different parasitic inductance and capacitance values, so this part of the design requires care.

SSN, or ground bounce, depends on many interrelated electrical and packaging properties. The same phenomenon on the power rail is called $V_{DD}$ bounce. Both ground bounce and $V_{DD}$ bounce are important noise sources. Because devices near the high-voltage level tend to have more noise margin than those near the low-voltage level, ground bounce is considered more often. The SSN can still be low if a large number of supply bonds are used. For low-SSN digital system designs, we must take all the relevant parameters into account and find the most effective approaches to reduce the SSN. In this section, we discuss some techniques for SSN reduction. Before that, we first calculate the amount of SSN.



Figure 3.1 Simplified electrical model of chip-package interface

The last stage of the output buffer is the main source of SSN during transients because of its high driving capability. Therefore, this section focuses on the SSN caused by the last stage of the buffer. Conventionally, Shockley's square-law model is used in the analysis of MOSFET circuits because of its simple closed-form equation. However, because it excludes the velocity saturation effects of sub-micron technologies, the Shockley model cannot reproduce the voltage-current characteristics of short-channel MOSFETs. Therefore, the alpha-power law MOSFET model [23], which includes the velocity saturation effects, is used here. Using the alpha-power law model, the drain current of a MOSFET is given as

$$\begin{split} i_{\rm D} &= 0 \qquad , V_{\rm GS} \leq V_{\rm TH}, \text{ cutoff} \\ i_{\rm D} &= k_{l} \left( V_{\rm GS} - V_{\rm TH} \right)^{\frac{\alpha}{2}} \cdot V_{\rm DS} \quad , V_{\rm DS} < V_{\rm D0}', \text{ linear} \\ i_{\rm D} &= k_{s} \left( V_{\rm GS} - V_{\rm TH} \right)^{\alpha} \qquad , V_{\rm DS} > V_{\rm D0}', \text{ saturation} \end{split} \tag{21}$$

where $k_l$ and $k_s$ are the drivability factors in the linear and saturation regions, $V_{TH}$ is the threshold voltage, $\alpha$ is the velocity saturation index, and $V'_{D0}$ is the drain saturation voltage. Typical values of $\alpha$ range from 1.0 to 1.3 for an NMOS transistor.

When the output changes from high to low, velocity saturation keeps the NMOS transistors in the saturation region for the duration of the input transition. It can therefore be assumed that the SSN reaches its maximum value when the ramp input signal ($V_{in}$) reaches $V_{DD}$. For n output drivers that share a common ground line and switch simultaneously, the discharging current flowing through $L_{gnd}$ is given by [24]

$$i(t) = n \cdot i_d(t) = n \cdot k_{sn} \left( V_{in}(t) - V_n(t) - V_{tn} \right)^{\alpha_n}, \tag{22}$$

where $V_{tn}$ is the NMOS threshold voltage. The switching noise $V_n$, which is the voltage $V_X$ built up at node X in Figure 3.1, can be written as

$$V_{n}(t) = L_{gnd} \frac{di(t)}{dt} = nk_{sn}L_{gnd} \frac{d}{dt} \left(V_{in}(t) - V_{n}(t) - V_{tn}\right)^{\alpha_{n}}. \tag{23}$$

The ramp input $V_{in}$ can be expressed as $V_{in} = S_r t$, where $S_r = \frac{V_{DD}}{t_r}$. Solving the nonlinear differential equation with the initial condition $V_n = 0$ at $t = t_t$, where $t_t = \frac{V_{tn}}{V_{DD}} t_r = \frac{V_{tn}}{S_r}$, we obtain

$$\frac{\left[\left(V_{in}(t) - V_{n}(t) - V_{tn}\right)/V_{DD}\right]^{\alpha_{n}} V_{DD}^{\alpha_{n}}}{V_{n}(t)} = \frac{t - t_{t}}{2nk_{sn}L_{gnd}}. \tag{24}$$

Taking the power series of $\left[ (V_{in} - V_n - V_{tn}) / V_{DD} \right]^{\alpha_n}$ and neglecting terms of order higher than two, we obtain the closed-form equation

$$V_{n}(t) = \frac{1}{\alpha_{n} - 1} V_{k} + \frac{A(t - t_{t})}{\alpha_{n} (\alpha_{n} - 1)} V_{k}^{2 - \alpha_{n}} \left( 1 - \sqrt{1 + \frac{2\alpha_{n}}{A(t - t_{t})} V_{k}^{\alpha_{n} - 1} + \frac{2\alpha_{n} - \alpha_{n}^{2}}{A^{2}(t - t_{t})^{2}} V_{k}^{2\alpha_{n} - 2}} \right), \tag{25}$$

where

$$V_{k} = V_{in} - V_{tn}, \qquad A = \frac{1}{2nk_{sn}L_{gnd}}. \tag{26}$$

The maximum value of the SSN occurs at $t = t_r$:

$$V_{n,max} = \frac{V_{DD} - V_{tn}}{\alpha_n - 1} + \frac{A(t_r - t_t)}{\alpha_n (\alpha_n - 1)} \left( V_{DD} - V_{tn} \right)^{2 - \alpha_n} \left( 1 - \sqrt{1 + \frac{2\alpha_n}{A(t_r - t_t)} \left( V_{DD} - V_{tn} \right)^{\alpha_n - 1} + \frac{2\alpha_n - \alpha_n^2}{A^2(t_r - t_t)^2} \left( V_{DD} - V_{tn} \right)^{2\alpha_n - 2}} \right). \tag{27}$$

The $V_{n,max}$ estimated by (27) agrees with other prediction equations and with SPICE simulation results; the error relative to SPICE is below 5%. Note that, because of the negative feedback effect of $L_{gnd}$, the noise is not a linear function of the number of drivers.
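The closed-form estimate can be compared with a direct numerical solution of the relation it approximates, $(V_k - V_n)^{\alpha_n} = A(t - t_t)\,V_n$ from (24), evaluated at $t = t_r$. The sketch below reads the square root in the closed form as covering all three terms, which is what solving the second-order expansion as a quadratic gives. All device values (k_sn, L_gnd, V_tn, alpha, t_r) are hypothetical and are not the thesis parameters.

```python
import math

def ssn_closed_form(v_k, a_tau, alpha):
    """Closed-form SSN, with A*(t - t_t) passed in as a_tau."""
    root = math.sqrt(1.0
                     + 2.0 * alpha / a_tau * v_k ** (alpha - 1.0)
                     + (2.0 * alpha - alpha ** 2) / a_tau ** 2 * v_k ** (2.0 * alpha - 2.0))
    return (v_k / (alpha - 1.0)
            + a_tau / (alpha * (alpha - 1.0)) * v_k ** (2.0 - alpha) * (1.0 - root))

def ssn_numeric(v_k, a_tau, alpha, iters=200):
    """Bisection on (v_k - v_n)**alpha = a_tau * v_n for 0 <= v_n <= v_k."""
    lo, hi = 0.0, v_k
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if (v_k - mid) ** alpha - a_tau * mid > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def max_ssn(n, k_sn, l_gnd, v_dd, v_tn, t_r, alpha, solver):
    """Peak SSN at t = t_r for n drivers sharing one ground inductance."""
    a = 1.0 / (2.0 * n * k_sn * l_gnd)
    t_t = v_tn / v_dd * t_r
    return solver(v_dd - v_tn, a * (t_r - t_t), alpha)

# Hypothetical device and package values, for illustration only.
params = dict(k_sn=2e-3, l_gnd=2e-9, v_dd=1.8, v_tn=0.5, t_r=100e-12, alpha=1.2)
for n in (2, 4, 8):
    cf = max_ssn(n, solver=ssn_closed_form, **params)
    nm = max_ssn(n, solver=ssn_numeric, **params)
    print(f"n = {n}: closed form {cf:.3f} V, numeric {nm:.3f} V")
```

Doubling n in this sketch raises the noise by less than a factor of two, which illustrates the negative feedback effect of $L_{gnd}$ noted above. The closed form is undefined for $\alpha_n = 1$, so the sketch assumes $\alpha_n > 1$.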

This completes the analysis and calculation of the SSN. Next, we discuss some techniques that reduce SSN [24].

#### 3.1.1 Optimal Rise/Fall Time

In the sub-nanosecond region, unnecessarily fast rise/fall edges should be avoided in order to achieve low SSN. An optimal driver size can be estimated from the required rise/fall time, loading, and pulse swing. Many semiconductor foundries offer ASIC designers a variety of I/O buffer cells, from 2mA to 24mA, in both slew-rate-controlled and non-slew-rate-controlled versions for different requirements. These buffers should be chosen carefully to be just sufficient for the specification.

It is the edge rate, not the frequency, that affects the ground bounce. The slew rate dV/dt of the output affects ground bounce more than any other parameter: the slower the output slew, the lower the ground bounce. Slowing the edges therefore becomes a trade-off between performance and signal integrity, so it is not the best way to solve the SSN problem.

### 3.1.2 Reducing Inductance

Zero inductance in the power supply network would result in zero SSN. The parasitic inductance is mainly related to packaging, e.g., the wire bonds or the package itself. Advanced packaging techniques, such as *ball-grid array* (BGA) packages, *multi-chip module* (MCM) substrates with full supply planes, or flip-chip mounting on an MCM, will therefore reduce SSN. In addition, the arrangement of supply paths has a strong influence on SSN, since switching noise is directly caused by the effective inductance of the power supply network.

### 3.1.3 Reducing Signal Swing and Using Differential Drivers

A small signal swing leads to low SSN. Since a significant part of the SSN is generated by large output buffers, using LVDS outputs instead of full-swing CMOS outputs reduces SSN. The LVDS technique is already widely used in many high-speed ICs in BiCMOS as well as CMOS technologies. The drawback of reduced-swing signals, a smaller noise margin, is offset by using differential signals, with which common-mode noise is effectively rejected. Moreover, differential outputs driving impedance-matched, coupled differential lines result in reduced overall noise generation and increased noise immunity. The side effect is steady-state power consumption, because a constant current flows through the termination resistors.

### 3.1.4 Separate Power Supply Network

Since SSN is mainly generated by large bus/clock drivers and output buffers, it should be regarded as a design rule to separate the supply for internal logic from the supply for output drivers, especially when many of them switch simultaneously. Power supply planes on an MCM substrate might be split into several parts in order to eliminate noise coupling between different chips or different functional blocks. However, the potential stability problem at the system level requires a common ground plane on the PCB that supports the chips and MCM modules.

### 3.1.5 Decoupling Capacitors

In MCM and advanced PCB designs, metal planes are used for the power supply and ground. Although the inductance of these planes is small, the inductance associated with vias and bonding leads (or wires) from an MCM substrate to a board (or package) is often considerable. If meshed supply planes are used, the inductance is not negligible, especially at high speeds, because the supply and return current paths are strongly influenced by the meshed planes. To be effective, decoupling capacitors have to be mounted with multiple vias close to the chip concerned in order to reduce parasitic inductance. Surface-mounted capacitors with large capacitance values (e.g. > 100nF) have large parasitic inductance. Therefore, it is preferable to use medium-value capacitors (~1-10nF) with a high self-resonance frequency and a high Q value. It is also important to distribute them over the entire module. Capacitors formed between two closely placed power and ground planes on an MCM or PCB provide high-quality decoupling with a very low parasitic inductance.
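The preference for medium-value capacitors can be illustrated with the series self-resonance frequency $f_{res} = 1/(2\pi\sqrt{L_{ESL}\,C})$, a standard relation that is not given in the text. The parasitic inductance of 1 nH used below is an assumed, illustrative value applied to every capacitor; larger capacitors tend to have more inductance, which would widen the gap further.

```python
import math

def self_resonant_freq(c_farads, esl_henries):
    """Series self-resonance frequency (Hz) of a capacitor with
    capacitance c_farads and parasitic series inductance esl_henries."""
    return 1.0 / (2.0 * math.pi * math.sqrt(esl_henries * c_farads))

ESL = 1e-9   # assumed parasitic inductance, identical for all parts

for c in (100e-9, 10e-9, 1e-9):
    f_res = self_resonant_freq(c, ESL)
    print(f"C = {c * 1e9:5.0f} nF -> f_res = {f_res / 1e6:6.1f} MHz")
```

Above its self-resonance frequency a capacitor behaves inductively, so a 1 nF part remains useful about ten times higher in frequency than a 100 nF part with the same parasitic inductance.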

In addition, varying the capacitive load affects both the amplitude and the width of the noise pulse. The amplitude tends to decrease with increasing capacitive load, whereas the pulse width increases. The increased capacitive load tends to reduce the slew rate on the outputs and thereby reduces the amplitude.

## 3.2 Transmitter Architecture

The proposed transmitter contains a single-ended-to-differential buffer, a controlled predriver module, a fixed LVDS driver, and a programmable LVDS driver. In Figure 3.2, the first buffer converts the single-ended data input to a differential signal. The upper signal path is for the fixed driver and the lower one is for the programmable driver. The upper signals pass to the first six orderly turn-on buffer blocks. Note that the signals are also duty-cycle controlled to reduce SSN. In addition, because of process variation or layout mismatch, the output is not guaranteed to reach the expected voltage level. The additional programmable LVDS drivers in Figure 3.2 can therefore increase the output driver current to compensate the output voltage swing.



Figure 3.2 The overall transmitter architecture

The orderly turn-on block converts one differential signal pair into six. All six differential pairs go through the duty cycle modulation block independently. Finally, the signals turn on the gates of the drivers in sequence. The control logic, in turn, decides whether each signal passes or not. Based on the PMOS and NMOS driving capability, the number of compensation drivers can be set by external control signals.

## 3.3 System Components

### 3.3.1 Orderly Turn-On Buffer

The idea of the orderly turn-on buffer is shown in Figure 3.3. The conventional driver has a current spike, as shown in the figure. With the orderly turn-on architecture, the dI/dt is much lower than that of the conventional driver. The big driver is divided into several small drivers, as the figure shows, and the small drivers drive the load one after another, separated by short time intervals. As a result, the slew rate is slowed down and the dI/dt is reduced.



Figure 3.3 The idea of the orderly turn-on buffer
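The effect of spreading the turn-on instants can be illustrated with an idealized model in which each small driver ramps its share of the output current linearly. This is only a sketch of the principle: the 3.5 mA total current, the 30 ps ramp, and the 12 ps spacing are hypothetical values, and real drivers do not produce linear current ramps.

```python
def peak_didt(n_drivers, i_total, t_edge_ps, stagger_ps, t_end_ps=400):
    """Peak slope (A/ps) of the total current when n_drivers equal drivers
    each ramp linearly from 0 to i_total/n_drivers over t_edge_ps, with
    successive drivers starting stagger_ps apart."""
    slope_each = (i_total / n_drivers) / t_edge_ps
    peak = 0.0
    for t in range(t_end_ps):                      # 1 ps time grid
        active = sum(1 for k in range(n_drivers)
                     if k * stagger_ps <= t < k * stagger_ps + t_edge_ps)
        peak = max(peak, active * slope_each)
    return peak

I_TOTAL = 3.5e-3   # hypothetical total output current, A

conventional = peak_didt(1, I_TOTAL, 30, 0)
staggered = peak_didt(6, I_TOTAL, 30, 12)
print(f"conventional peak dI/dt: {conventional * 1e3:.4f} mA/ps")
print(f"staggered peak dI/dt   : {staggered * 1e3:.4f} mA/ps")
```

In this model at most three of the six small drivers are ramping at any instant, so the peak dI/dt is halved, at the cost of a total edge that is stretched from 30 ps to 90 ps.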

It is very difficult to generate a series of signals whose intervals are less than ten picoseconds. In the TSMC 0.18-um technology, the gate delay of an inverter is at least $20 \sim 30$ picoseconds. Even with a differential-pair inverter approach, the gate delay can only be reduced to about 20 picoseconds. In the 2.5 Gbps transmitter design, the rise and fall times should not be longer than 100 picoseconds in order to maintain a stable eye diagram. Since thirteen PMOS or NMOS transistors need to be turned on in the worst case, the interval between successive signals cannot be longer than 15 picoseconds. Designing such a small delay circuit is therefore a very challenging issue.



Some techniques suppress the SSN by separating the gate signals [25]. However, they delay each signal with inverters or buffers, so the interval between the signals can be larger than 100ps. This works if the bit rate is about several hundred Mbps. When the bit rate reaches gigabits per second, this approach is no longer feasible.

We propose a new approach to generate signals that are separated by only $10\sim15$ ps. These signals turn on the drivers in sequence to suppress the SSN. The method is shown in Figure 3.4. After the input passes the first two inverters, it fans out into several paths that have different loading. Because of the difference in loading, the signals V1, V2, etc., rise at different times. We then add waveform-shaping buffers, each consisting of two inverters, to shape the signals. The resulting signals have the same rise time and are separated from each other by $10\sim15$ ps, as shown in Figure 3.5.



Figure 3.5 The signal after orderly turn-on buffers

### 3.3.2 Duty Cycle Adjust Buffer

The second block of the system is the duty cycle adjust buffer. Besides the orderly turn-on buffers, the second method to reduce the SSN is duty cycle modulation. This technique is borrowed from conventional I/O pad designs for reducing SSN.

As shown in Figure 3.6, in the normal operating mode of LVDS, (MP2, MN1) and (MP1, MN2) are turned on and off alternately. This means that a large change in circuit current appears during a signal transition. We therefore propose a new method to reduce this large current change. When the signal changes, the turn-off of MP2 and MN1 is delayed. In other words, all four transistors of the driver are partially turned on at the same time during the transition. MP1 and MN2 are turned on gradually while MP2 and MN1 are being turned off. On a microscopic level, the current through MP2 and MN1 is steered to MP1 and MN2 when the signal changes. After the input signal exceeds the threshold voltage, the current is switched completely. This minimizes the current change on the power and ground lines and reduces the SSN.



Figure 3.6 Duty cycle adjust buffer design

We control the turn-on or turn-off delay by adjusting the length and width of the inverters in the predriver block. When a transistor's size changes, its charging ability also changes. This modulation generates the signals shown in Figure 3.6.



Figure 3.7 The circuit design of duty cycle control

As shown in Figure 3.7, inverter1 to inverter4 compose the duty cycle modulation circuit. The driving ability is weakened if we increase the channel length of an inverter. With this method, we can adjust the time difference between Vip01+ and Vip01- to be as small as 15~30ps. The control pin in Figure 3.7 acts as a tri-state switch that controls whether the signal passes or not. When the driving MOS is a PMOS (Vip), we use a NAND gate to control the tri-state switch. When it drives an NMOS (Vin), we use a NOR gate.

The concept is as follows. To disable the path of a PMOS driver, the PMOS must be turned off, so the control pin should be LOW: the output of a NAND gate is HIGH if any of its inputs is LOW. Conversely, to enable the path, the control pin should be HIGH. To sum up, a NAND gate operates like an inverter for one input when its other input is HIGH. By the same reasoning, the NAND gate is replaced with a NOR gate for the NMOS path. Based on this circuit design, the duty cycle buffer has two functions: duty cycle modulation and switching. The switch decides whether the signal passes through the buffer or not.
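The gating behaviour can be written out as a small truth-table model. The NAND path follows the description above directly. For the NOR path the text does not state the control polarity, so the sketch assumes that the NOR gate receives the complement of the enable signal; that detail is an assumption, as are the function names.

```python
def nand(a, b):
    return int(not (a and b))

def nor(a, b):
    return int(not (a or b))

def pmos_gate(data, enable):
    """NAND path: enable LOW forces the gate HIGH (PMOS off);
    enable HIGH passes the inverted data."""
    return nand(data, enable)

def nmos_gate(data, enable):
    """NOR path, assuming the NOR sees the complement of enable:
    enable LOW forces the gate LOW (NMOS off);
    enable HIGH passes the inverted data."""
    return nor(data, 1 - enable)

print("data enable | PMOS gate NMOS gate")
for enable in (0, 1):
    for data in (0, 1):
        print(f"   {data}      {enable}  |     {pmos_gate(data, enable)}         {nmos_gate(data, enable)}")
```

With the path disabled, both driver transistors are held off regardless of the data; with the path enabled, each gate acts as an inverter.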

#### 3.3.3 Control Logic Circuit Design

In Figure 3.2, the lower path of the system compensates the output by controlling the number of driving MOS transistors. When the output level is too high or too low, we turn some drivers on or off to change the output voltage. We therefore need a digital circuit to select the number of compensation transistors. We choose a simple Programmable Logic Array (PLA) circuit, like the one shown in Figure 3.8.



Figure 3.8 The PLA circuit

The control signals (A, B, C) are decoded into the seven outputs (7 to 13) listed in Table 3-1. The outputs connect to the control pins of the duty cycle adjust buffers, so the number of compensation drivers is selected by the inputs A, B, and C. There are two decoders, which control the NMOS and PMOS drivers independently.

| A | B | C | 7 | 8 | 9 | 10 | 11 | 12 | 13 |
|---|---|---|---|---|---|----|----|----|----|
| 0 | 0 | 0 | 0 | 0 | 0 | 0  | 0  | 0  | 0  |
| 0 | 0 | 1 | 1 | 0 | 0 | 0  | 0  | 0  | 0  |
| 0 | 1 | 0 | 1 | 1 | 0 | 0  | 0  | 0  | 0  |
| 0 | 1 | 1 | 1 | 1 | 1 | 0  | 0  | 0  | 0  |
| 1 | 0 | 0 | 1 | 1 | 1 | 1  | 0  | 0  | 0  |
| 1 | 0 | 1 | 1 | 1 | 1 | 1  | 1  | 0  | 0  |
| 1 | 1 | 0 | 1 | 1 | 1 | 1  | 1  | 1  | 0  |
| 1 | 1 | 1 | 1 | 1 | 1 | 1  | 1  | 1  | 1  |

Table 3-1 Function table of the PLA decoder
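Table 3-1 is a binary-to-thermometer code conversion: the 3-bit value ABC gives the number of enabled compensation drivers, counting from driver 7. A behavioural sketch is shown below; the function name is illustrative and the code models only the logic function, not the PLA circuit itself.

```python
def pla_decode(a, b, c):
    """Return the enable bits for drivers 7..13 (as a list of seven 0/1
    values) for control inputs A (MSB), B, and C (LSB)."""
    code = (a << 2) | (b << 1) | c          # number of drivers to enable
    return [1 if code > i else 0 for i in range(7)]

print(" A B C | 7 8 9 10 11 12 13")
for code in range(8):
    a, b, c = (code >> 2) & 1, (code >> 1) & 1, code & 1
    outs = pla_decode(a, b, c)
    print(f" {a} {b} {c} | " + " ".join(str(o) for o in outs))
```

Each increment of the control word enables exactly one more driver, so the output swing changes monotonically with the code.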

#### 3.3.4 Driver Design

Finally, the predriver and compensation circuits are integrated to form the final driver, which is studied in this section. The power supply and ground of an IC are not ideal: they are combined with the equivalent inductance and capacitance of the power distribution network, as shown in Figure 3.9. The load capacitor $C_L$ is charged and discharged during signal transitions, and the switching noise increases as the frequency increases. Conventionally, two current sources are connected to power and ground, respectively, to minimize the current change and hence reduce the simultaneous switching noise, as shown in Figure 3.10.



Figure 3.9 Simplified electrical model of chip-package interface

Unfortunately, these two current sources create a large voltage drop and limit the output voltage swing. In order to meet the LVDS standard, the size of the four switching transistors needs to be increased significantly, because the $V_{DS}$ available to the four switches decreases. The sizes of the pre-drivers must then also be increased considerably, which seriously increases the area and power of the pre-driver. It is a trade-off between performance (noise) and cost (area). As device sizes scale down rapidly, the power supply voltage decreases at the same rate, and this problem becomes a challenge for designers.

If the current sources connected to power and ground are simply removed, the SSN will be much larger than the LVDS standard allows. We therefore propose a novel architecture that has no current source at the power supply or ground. As shown in Figure 3.10, instead of using current sources, it uses the turn-on spreading technique to reduce the SSN and avoid signal errors. The overall driver thus becomes an "all-digital" version of the standard LVDS driver.

This methodology has several advantages. First, the driver needs no current source, so the output signal swing can be enlarged. This also greatly reduces the size of the switch transistors, and therefore the sizes of the pre-drivers are reduced as well. The overall chip area and the power consumption are both reduced substantially. Furthermore, the driver architecture looks like two inverters connected back to back, which makes the control and layout easier. Note that the common-mode voltage is lowered from 1.25V to 0.9V because the design uses a 1.8V supply, whereas the power supply in the IEEE standard is 2.5V.



Figure 3.10 The traditional and proposed driver


## **3.4 Transmitter Simulation**

### 3.4.1 The Building Blocks Simulation

In order to verify the circuit design, we simulate the circuit with HSPICE. In this section, we show the simulation results of every building block in the transmitter. The most important functional block is the orderly turn-on buffer, which is mainly responsible for the SSN reduction.



Figure 3.11 Simulation of the orderly turn-on buffer

In Figure 3.5, the orderly rising signal generator works as intended, as the simulation results show. The time interval between successive rising signals is about 10~20ps. A concern is that, under process variation or parasitic effects, the signals might not rise as desired: the time intervals may become narrower or wider, and in the worst case the signals may overlap. This could cause incorrect operation or a failure to reduce the SSN. In Figure 3.11, we check the effect of process variation on the circuit by comparing two cases, one with equal input time differences and one with unequal input time differences of 8ps or 24ps. The simulated output eye diagrams of the two cases are the same; the right side of Figure 3.11 compares them.

According to the simulation results above, we can assume that process variation will not affect the operation of the orderly turn-on buffer.

The orderly turn-on buffer design has several advantages. First, its area is smaller than that of other designs or patents. The conventional method of generating closely spaced signals uses a resistor and the gate parasitic capacitance [26], that is, an R-C delay, but the resistor consumes a lot of chip area. The inverter-loading approach we propose uses very little area, since the whole circuit uses only minimum-size CMOS inverters. The second advantage is the power consumption: the all-digital circuit design makes the average current very small.



Figure 3.12 Simulation of the duty cycle adjust buffer

After the simulation of the delay buffer, we check the duty cycle modulation function. Figure 3.12 shows the simulation result. As in Figure 3.6, $V_{in+}$ and $V_{ip-}$ have a phase difference of about 15ps~25ps; it is about 20ps in our SPICE simulation. $V_{in+}$ rises to turn on the NMOS in Figure 3.6, and 20 ps later $V_{ip-}$ rises to turn off the PMOS. This function minimizes the current change on the power and ground lines and reduces the SSN.



Figure 3.13 The relation of SSN and duty cycle phase difference

In Figure 3.13, we simulate several phase differences of the duty cycle circuit (e.g., the 20ps case of Figure 3.12) and observe the SSN. The results show that the SSN is minimal when the phase difference is about 20~30ps, which is why we design the phase difference to be 20ps.

#### 3.4.2 The Driver and Noise Simulation

After the predriver simulation, the overall transmitter simulation is discussed in this section. The simulation environment is shown in Figure 3.14. The power and ground include the inductance caused by the bonding wires, and the output pin also includes a 2nH inductance. The output pads are replaced by a 1pF capacitance to account for the measurement probe. The internal $100\Omega$ resistor is implemented in polysilicon, and the far-end resistor is built inside the receiver. The input is random data generated by a C program at a bit rate of 2.5Gbps with 20ps of jitter. The output eye diagram shows a jitter of about 50ps.



Figure 3.14 The overall transmitter SPICE simulation result
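The C program that generated the random stimulus is not reproduced in this thesis. As a rough illustration only, the following Python sketch produces a random bit sequence at 2.5Gbps and displaces each bit boundary by up to ±10ps, so that the peak-to-peak edge jitter is 20ps. The uniform jitter distribution, the function name `jittered_edges`, and the seed are assumptions made for this example.

```python
import random

BIT_RATE = 2.5e9        # bits per second, from the text
UI = 1.0 / BIT_RATE     # unit interval (400 ps at 2.5 Gbps)
JITTER_PP = 20e-12      # peak-to-peak input jitter, from the text


def jittered_edges(n_bits, seed=0):
    """Return (bits, edge_times) for a random NRZ pattern.

    edge_times[k] is the start time of bit k, displaced from its ideal
    position k * UI by a uniformly distributed offset. The uniform
    distribution is an assumption; the thesis does not state one.
    """
    rng = random.Random(seed)
    bits = [rng.randint(0, 1) for _ in range(n_bits)]
    edges = [k * UI + rng.uniform(-JITTER_PP / 2, JITTER_PP / 2)
             for k in range(n_bits)]
    return bits, edges


bits, edges = jittered_edges(1000)
worst = max(abs(t - k * UI) for k, t in enumerate(edges))
print(f"UI = {UI * 1e12:.0f} ps, worst edge offset = {worst * 1e12:.2f} ps")
```

The edge list can be turned into a piecewise-linear source for a circuit simulator; how that is done depends on the simulator and is outside this sketch.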

The differential output eye height is +400mV~-400mV, which conforms to the IEEE LVDS standard specification. However, we cannot guarantee that the output will stay at this value under process variation. If the output differential swing is too high, we can turn off some of the programmable drivers to decrease the driver current. We simulate the effect of process variation in the next section.

Besides the output eye diagram, another important simulation is the SSN effect. With SPICE, we can easily observe the power supply noise and the ground bounce. In Figure 3.15, the power supply noise decreases by 31% and the ground bounce by about 43%. With our design, the SSN is about 100mV.



Figure 3.15 The SSN noise comparison

We have verified by SPICE simulation that our design decreases the SSN. However, we cannot measure the power noise directly; the only signal we can measure is the output. We therefore use the output eye diagram to analyze the effect of the proposed circuit. In Figure 3.16, the upper eye diagram is that of the proposed transmitter and the lower one is that of the conventional transmitter.



Figure 3.16 The eye diagram comparison of proposed and conventional transmitter

The most obvious characteristic in the figure above is that the slew rate of the proposed transmitter is lower than that of the conventional one. This is expected because of the orderly turn-on driver. Although the slew rate is lower, the signal integrity is better: the rising edge is a smooth curve with the orderly turn-on driver, whereas the overshoot of the conventional transmitter makes the signal more vulnerable to noise during data transmission.

### 3.4.3 Process Variation Simulation

Due to process variation and temperature change, the transistors may operate in the fast-fast or slow-slow corner, which may push the output out of specification. We therefore need a rule for sizing the LVDS driver that covers process variation and changes in conditions outside the chip. In our transmitter design, there are thirteen driving transistors in total. Six transistors are always turned on in order. The remaining seven transistors compensate the output level and are controlled by pins outside the chip.

Table 3-2 and Table 3-3 show the LVDS transmitter output swing and common-mode voltage for the combinations of PMOS and NMOS corners. The typical-typical (TT), fast-fast (FF), and slow-slow (SS) cases are shown in Table 3-2. In the TT case, we turn on four compensating transistors in both the PMOS and the NMOS branch. The transistor size is designed for an output swing of 450mV and a common-mode voltage of 0.9V. We keep the electrical specifications the same for the TT, FF, and SS cases. All seven compensating transistors are turned on in the slow-slow case, and all of them are turned off in the fast-fast case. In this way we can meet the electrical specification (400mV swing and 0.9V common-mode voltage) in any process corner.

| **TT**               |     |     |     |     |     |     |     |     |
|----------------------|-----|-----|-----|-----|-----|-----|-----|-----|
| N                    | 0   | 1   | 2   | 3   | 4   | 5   | 6   | 7   |
| P                    | 0   | 1   | 2   | 3   | 4   | 5   | 6   | 7   |
| V<sub>CM</sub> (mV)  | 920 | 925 | 915 | 895 | 900 | 888 | 875 | 880 |
| Swing (mV)           | 335 | 360 | 385 | 440 | 438 | 463 | 445 | 435 |
| **SS**               |     |     |     |     |     |     |     |     |
| N                    | 4   | 5   | 6   | 7   | 7   | 6   | 6   | 5   |
| P                    | 4   | 5   | 6   | 7   | 6   | 7   | 5   | 6   |
| V<sub>CM</sub> (mV)  | 910 | 900 | 900 | 890 | 840 | 950 | 840 | 960 |
| Swing (mV)           | 370 | 390 | 415 | 435 | 430 | 420 | 410 | 397 |
| **FF**               |     |     |     |     |     |     |     |     |
| N                    | 0   | 1   | 2   | 3   | 3   | 2   | 2   | 1   |
| P                    | 0   | 1   | 2   | 3   | 2   | 3   | 1   | 2   |
| V<sub>CM</sub> (mV)  | 940 | 920 | 910 | 900 | 840 | 970 | 850 | 980 |
| Swing (mV)           | 390 | 425 | 455 | 470 | 470 | 460 | 445 | 425 |

 Table 3-2 The process consideration simulation result
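To make the corner-by-corner compensation concrete, the following Python sketch stores the entries of Table 3-2 and looks up the settings that the text names for each corner (four compensating transistors in TT, seven in SS, none in FF). The dictionary layout and the names `TABLE_3_2`, `CHOSEN`, and `chosen_levels` are illustrative assumptions; the numbers are copied from the table.

```python
# (N, P) -> (V_CM in mV, swing in mV), copied from Table 3-2
TABLE_3_2 = {
    "TT": {(0, 0): (920, 335), (1, 1): (925, 360), (2, 2): (915, 385),
           (3, 3): (895, 440), (4, 4): (900, 438), (5, 5): (888, 463),
           (6, 6): (875, 445), (7, 7): (880, 435)},
    "SS": {(4, 4): (910, 370), (5, 5): (900, 390), (6, 6): (900, 415),
           (7, 7): (890, 435), (7, 6): (840, 430), (6, 7): (950, 420),
           (6, 5): (840, 410), (5, 6): (960, 397)},
    "FF": {(0, 0): (940, 390), (1, 1): (920, 425), (2, 2): (910, 455),
           (3, 3): (900, 470), (3, 2): (840, 470), (2, 3): (970, 460),
           (2, 1): (850, 445), (1, 2): (980, 425)},
}

# Number of compensating (NMOS, PMOS) transistors the text assigns per corner
CHOSEN = {"TT": (4, 4), "SS": (7, 7), "FF": (0, 0)}


def chosen_levels():
    """Return {corner: (V_CM, swing)} for the settings named in the text."""
    return {corner: TABLE_3_2[corner][setting]
            for corner, setting in CHOSEN.items()}


levels = chosen_levels()
for corner, (vcm, swing) in levels.items():
    print(f"{corner}: V_CM = {vcm} mV, swing = {swing} mV")

swings = [swing for _, swing in levels.values()]
print("swing spread across corners:", max(swings) - min(swings), "mV")
```

With these settings, the common-mode voltage stays within a few tens of millivolts of 0.9V in all three corners, which is the behavior the compensation is meant to provide.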


Table 3-3 The Fast-Slow and Slow-Fast case simulation

| Corner               | SF  | SF  | SF  | SF  | FS  | FS  | FS  | FS  |
|----------------------|-----|-----|-----|-----|-----|-----|-----|-----|
| N                    | 4   | 5   | 6   | 6   | 1   | 2   | 2   | 2   |
| P                    | 1   | 2   | 3   | 4   | 4   | 5   | 4   | 6   |
| V<sub>CM</sub> (mV)  | 890 | 890 | 900 | 890 | 950 | 920 | 850 | 890 |
| Swing (mV)           | 390 | 420 | 445 | 445 | 400 | 420 | 420 | 445 |

Table 3-3 shows the other two process corners, fast-slow and slow-fast. In Table 3-2, because NMOS and PMOS drift in the same direction, the numbers of additionally turned-on NMOS and PMOS transistors are usually the same. In these two special cases, however, the numbers of NMOS and PMOS transistors differ. As shown in Table 3-3, the combination of turned-on transistors must differ from case to case in order to hold the swing and the common-mode voltage.

In Figure 3.17, we plot the data in Table 3-2 to observe the margin of this design. The X axis indicates the number of turned-on transistors. The common-mode voltage is shown by the upper three lines in the figure. Whatever the process corner, the common-mode voltage can be set to 0.9V by our compensation circuit. In the same way, the output swing can be set to 400mV in the TT, FF, and SS cases. This is the main guideline we use to size the transistors of the compensation circuit.



Figure 3.17 The process variation verification result

The simulation results of the proposed transmitter show that the architecture works for LVDS. The layout of the circuit and the chip measurement results are presented in Chapter 5.

# 3.5 Summary

In this chapter, we first introduce and derive the SSN effect and compare different techniques for reducing the power noise. In the second section, we propose an LVDS transmitter with an SSN-reducing technique. We simulate and verify all the building blocks so that they can be applied to other high-speed links. Every building block is simulated by SPICE in Section 3.3. In the last section, process variation is considered in our design, and the simulated eye diagram shows that the design conforms to the IEEE LVDS standard.



# **Chapter 4**

# **A Serial-ATA Driver with Output Self-Calibration**



# 4.1 Motivation

The output and common-mode voltages in Chapter 3 are controlled by off-chip control pins. In practice, the transmitter has no additional control pins once it is integrated with other system blocks. We therefore have to make the output level self-adjusting so that it works under any process and temperature variation. We realize this idea with a high-speed serial link specification, Serial AT Attachment (Serial ATA).

# 4.2 Overall Architecture

The transmitter architecture is based on the LVDS transmitter. Figure 4.1 shows all the building blocks of the proposed transmitter. The driver is the same as the LVDS driver described in Chapter 3, but the control logic is replaced by a finite state machine (FSM). The FSM controls the compensating driver by means of another digital circuit (an up-down counter). The inputs to the up-down block come from the output detection circuit, which is composed of the four simple comparators shown in the figure. This architecture forms a feedback loop that holds the output within the desired region. The self-calibration feedback does not run all the time: we add a slow counter to decide when calibration takes place. When the transmitter is in calibration mode, the input is tied to the power supply (high level). When the calibration is complete, the transmitter transmits data. We describe the function of each block in the following sections.



Figure 4.1 The overall transmitter self-calibration architecture

# 4.3 Building Blocks

## 4.3.1 Counter

The counter is the simplest block. Its operating speed is only in the kHz range, so we choose the simple architecture shown in Figure 4.2. It consists of six JK flip-flops and a NOR gate. All J and K inputs are connected to the power supply, so the JK flip-flops behave as T flip-flops. When the clock triggers, S1~S6 trigger in sequence. Initially the Tx signal is LOW and S6 is HIGH. When S5 triggers the NOR gate, S6 falls to LOW and triggers the last JK flip-flop. The Tx signal then rises and sets one input of the NOR gate (the Tx input) to HIGH. This keeps S6 LOW, which holds the Tx signal steady until the next reset.



Figure 4.2 The architecture of the counter

Figure 4.3 shows the functional verification of the counter. After the reset, S1~S5 count until S6 changes. The Tx signal then rises from LOW to HIGH and holds until the next reset. We use this counter to decide when the transmitter transmits and when it performs self-calibration; the Tx signal represents the operating mode of the SATA transmitter.



Figure 4.3 The function verification of the counter
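The latching behavior of the counter can be sketched behaviorally in Python. This is not the gate-level circuit of Figure 4.2: the model assumes that S1~S5 form a 5-bit binary count, that S6 is the NOR of S5 and Tx, and that the last flip-flop toggles on the falling edge of S6. The class name `CalibrationTimer` and the resulting number of clock edges are therefore illustrative only.

```python
class CalibrationTimer:
    """Behavioral sketch of the mode counter (assumed wiring, see lead-in)."""

    def __init__(self):
        self.reset()

    def reset(self):
        self.count = 0   # S1..S5 viewed as a 5-bit value, S1 = LSB
        self.tx = 0      # Tx is LOW after reset
        self.s6 = 1      # NOR(S5=0, Tx=0) is HIGH after reset

    def clock(self):
        """Apply one clock edge and return the Tx level."""
        self.count = (self.count + 1) % 32
        s5 = (self.count >> 4) & 1
        new_s6 = int(not (s5 or self.tx))
        if self.s6 == 1 and new_s6 == 0:
            # Falling edge of S6 toggles the last flip-flop (Tx)
            self.tx ^= 1
        # Once Tx is HIGH the NOR output is forced LOW, so Tx cannot toggle again
        self.s6 = int(not (s5 or self.tx))
        return self.tx


timer = CalibrationTimer()
trace = [timer.clock() for _ in range(40)]
print("Tx first goes HIGH on clock edge", trace.index(1) + 1)
```

In this model Tx stays HIGH for every later clock edge and returns to LOW only when `reset()` is called, which mirrors the hold-until-reset behavior described above.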

## 4.3.2 Finite State Machine (FSM)

The finite state machine (FSM) sets the strength of the compensating output drivers. S0~S7 connect to the gates of the driver transistors to turn them on or off. The FSM functions like a bidirectional shift register. When the up or down signal changes (the two signals are complementary whenever a shift is required), S0~S7 shift left or right depending on whether the up signal is HIGH or LOW. The up signal controls the multiplexers, and the shift takes place when the enable signal is triggered. The enable signal is the AND of the clock, txb, and the XOR of "up" and "down". This ensures that the shifter is synchronized with the clock and txb.



Figure 4.4 The Finite State Machine (FSM) in our transmitter.

We connect the first and last multiplexers to the power supply and ground because we need the code to shift as in Table 4-1. For example, if the shifter outputs (S0~S7) are initially all zero, S0 goes high when a right-shift command occurs. Conversely, when we shift left, the state values go low one by one from $S_7$ to $S_0$. If we keep shifting right, all state values eventually become high; when a left-shift command then occurs, $S_7$ must become zero. That is why the power supply and ground take the place of the vacant inputs of the first and last multiplexers.

Table 4-1 An example of the FSM code



The flip-flops of the finite state machine are static. We cannot use dynamic flip-flops, because the output of each D flip-flop must hold its value indefinitely after the self-calibration.
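The shifting behavior described in this section can be sketched as a thermometer-code shift register. In the Python sketch below, the function name `fsm_step` is hypothetical, and the choice that up = HIGH means "shift right" is an assumption; the text only states that the direction depends on the up signal. The enable condition follows the text: clock AND txb AND (up XOR down).

```python
def fsm_step(state, up, down, clk=1, txb=1):
    """Return the next [S0, ..., S7] state of the bidirectional shifter.

    Shift right: S0 takes 1 (power supply), every other bit takes its left
    neighbor. Shift left: S7 takes 0 (ground), every other bit takes its
    right neighbor. Mapping up=1 to a right shift is an assumption.
    """
    enable = bool(clk) and bool(txb) and (bool(up) != bool(down))
    if not enable:
        return list(state)          # hold: static flip-flops keep the code
    if up:
        return [1] + list(state[:-1])
    return list(state[1:]) + [0]


state = [0] * 8
state = fsm_step(state, up=1, down=0)
print(state)                         # one driver on: S0 is HIGH

for _ in range(7):
    state = fsm_step(state, up=1, down=0)
print(state)                         # all drivers on

state = fsm_step(state, up=0, down=1)
print(state)                         # S7 is the first to go LOW

held = fsm_step(state, up=1, down=1)
print(held == state)                 # up = down: no shift
```

Because the vacant end inputs are tied to 1 and 0, the register can only hold codes of the form 1...10...0, so the number of turned-on drivers changes by exactly one per shift.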

## 4.3.3 Digital Control Circuit and Comparator

The digital control circuit and the comparator circuit are described in this section. The FSM receives its input from the comparators through the digital control circuit. As shown in Figure 4.5, we use a simple two-stage operational amplifier (OpAmp) to compare the output voltage with the reference voltages. Because the comparator output is a digital level, we neglect the stability of the OpAmp and add some inverters to drive the signal to the digital control block.



Figure 4.5 The comparator circuit design in this application

Comparing the output voltage levels with the reference voltages gives four digital outputs. In Figure 4.6, $V_{HIGH}$ and $V_{LOW}$ each connect to two comparators. The outputs (A, B, C, D) take nine combinations instead of sixteen, because neither (A, B) nor (C, D) can be "10". To decide from the four outputs whether the FSM shifts left or right, we follow table (iii) in the figure. The minterm list is realized by a simple circuit consisting only of inverters and transmission gates, as in (iv) in the figure. Finally, the Up/Down outputs are the inputs of the FSMs.



Figure 4.6 Function and design flow of the digital control circuit
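The count of nine reachable comparator codes can be checked with a short Python sketch. It assumes that each output level is compared against an upper and a lower reference, with A (or C) taken from the upper comparator and B (or D) from the lower one; the reference values of 370/380mV and 120/130mV and the function names are hypothetical and are not taken from Figure 4.6.

```python
def comparator_pair(v_mv, ref_upper_mv, ref_lower_mv):
    """Two-bit code for one level: (above upper ref, above lower ref)."""
    return (int(v_mv > ref_upper_mv), int(v_mv > ref_lower_mv))


def reachable_codes():
    """Sweep both output levels and collect every (A, B, C, D) code seen."""
    codes = set()
    for v_high in range(350, 401, 5):        # mV, sweep around 375 mV
        for v_low in range(100, 151, 5):     # mV, sweep around 125 mV
            a, b = comparator_pair(v_high, 380, 370)
            c, d = comparator_pair(v_low, 130, 120)
            codes.add((a, b, c, d))
    return sorted(codes)


codes = reachable_codes()
print(len(codes), "codes:", codes)
```

A level that is above the upper reference is necessarily above the lower one, so each pair can only be 00, 01, or 11, and the two pairs together give 3 × 3 = 9 combinations.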

## 4.3.4 SATA Driver



The transmitter in this chapter targets Serial ATA, a specification established by several companies such as Dell and Intel. The SATA 1.0a specification gives the reference transmitter circuit example shown in Figure 4.7. Note that the pass-gate impedance plus the resistor should be set to $50\Omega$. We design the circuit as on the right side of the figure. Each of the four switches can be set to $50\Omega$ by our digitized driver. Because the switches are all ideally $50\Omega$, the voltage divider gives an output high level of 375mV and an output low level of 125mV. The differential output is therefore 250mV, as the SATA specification requires.

As in the previous LVDS driver, removing the two current sources exposes the driver to SSN, so the SSN rejection circuit is also added to this SATA driver. The duty cycle modulation must be modified because the switches are all NMOS, instead of the CMOS switches of the LVDS driver. The control signals must be handled carefully to avoid functional errors.



Figure 4.7 The transmitter circuit example in SATA 1.0a specification
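The 375mV and 125mV levels follow from a three-resistor divider: the pull-up switch, the $100\Omega$ termination, and the pull-down switch in series. The Python sketch below works through the arithmetic. The 500mV driver supply is an assumption inferred from the "500mV" mentioned with Table 4-2, and the function name `divider_levels` is hypothetical.

```python
def divider_levels(v_supply_mv, r_up, r_term, r_down):
    """Return (V_HIGH, V_LOW) in mV for a series R_up / R_term / R_down divider."""
    total = r_up + r_term + r_down
    v_high = v_supply_mv * (r_term + r_down) / total
    v_low = v_supply_mv * r_down / total
    return v_high, v_low


# Ideal case: both switches are 50 ohm, termination is 100 ohm
v_high, v_low = divider_levels(500, r_up=50, r_term=100, r_down=50)
print(v_high, v_low, v_high - v_low)   # 375.0 125.0 250.0
```

If either switch resistance deviates from $50\Omega$, both levels move, which is why the calibration loop adjusts the switch strength until the measured levels match the references.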

## 4.4 Simulation Results

#### 4.4.1 The Simulation of Counter

The simulation of the transmitter counter is shown in Figure 4.8. The second reset resets the counter and the tx signal. Without the second reset, the tx signal would hold its value at the high level; we add the second reset to test the reset function. According to the simulation result, the tx signal returns to zero and the counter counts again from its initial value. This confirms the function of the counter and the reset circuit.





### 4.4.2 FSM and Digital Control Circuit Simulation

To verify the design of the FSM and the digital control circuit, the SPICE simulation results of the transmitter circuit are shown in this section. Figure 4.9 shows the simulation of the FSM. When the up and down signals are both high, the FSM stops shifting, so S0~S7 are stable until up or down falls to zero. Because the FSM uses static flip-flops, S0~S7 hold their values.



Figure 4.9 Simulation of the FSM

The FSM outputs, S0~S7, rise to HIGH in order until the up and down signals are both high. S0~S7 are connected to the gates of the SATA drivers. The input of the transmitter is connected to the power supply during self-calibration, so the outputs settle at the resistor-divider levels ($V_{HIGH}$=375mV; $V_{LOW}$=125mV). Figure 4.10 shows how the $V_{LOW}$ output changes as the FSM control signals change. After the reset, the output is adjusted step by step by the control signals from the FSM.



Figure 4.10 Output voltage change with FSM change

The difference between successive steps is about 4mV, which is determined by the transistor size of the output driver. How do we choose the transistor size, and which conditions must it satisfy? We describe the design flow in the next section.

#### 4.4.3 SATA Driver Design

The turned-on transistors must have an equivalent resistance of $50\Omega$, as described in Figure 4.7. How can we ensure that the equivalent resistance is $50\Omega$? We simply control the output voltage to the desired level; the resistance is then equal to the required $50\Omega$. If the output voltage is wrong, the detection circuit (the comparators) triggers the digital control circuit, which changes the FSM output. The feedback forces the output level to the reference voltage. The important design decisions are the size of the small drivers, the number of drivers needed, and the voltage difference contributed by each additional driver.

We design the SATA driver to work under any process variation, as in the previous LVDS driver design. We therefore simulate every process corner and ensure that the output voltage complies with the SATA specification. Figure 4.11 shows the typical-typical case. We set the initial value of the FSM to 11110000 (half turned on) and size the driver so that the output level is correct when four compensating drivers are on. The output voltage should be 375mV at high and 125mV at low.



Figure 4.11 The typical-typical case of driver design

For the other corners, we tune the size to guarantee a correct output. The two extreme cases are the SS and FF corners, and the design must satisfy two conditions: all transistors are turned on in the SS case and all are turned off in the FF case. The simulation results are collected in Figure 4.12. The left plot is the $V_{HIGH}$ case and the right plot is the $V_{LOW}$ case.

As in the last chapter, these simulation results guarantee that the driver works under every environment and temperature variation. In any case, we can adjust the X-axis value (the number of turned-on MOS transistors) to change the Y-axis value (the output voltage) in Figure 4.12. The SF and FS cases are considered in the same way in our SATA driver design.



Figure 4.12 The output voltage of the TT, SS and FF cases

However, a problem may arise in this design: how can we guarantee that the driver is stable when we change the equivalent resistance of the switches? Consider Figure 4.13. We cannot ensure that $V_{LOW}$ remains unchanged while the value of $R_{up}$ is modified. In reality, $V_{LOW}$ and $V_{HIGH}$ are coupled through (28). Nevertheless, $R_{up}$ mainly affects $V_{HIGH}$ and $R_{down}$ mainly affects $V_{LOW}$; the effect of each resistance on the other level is slight.

$$\begin{aligned} V_{HIGH} &= \frac{R_{down} + 100}{R_{up} + R_{down} + 100} \\ V_{LOW} &= \frac{R_{down}}{R_{up} + R_{down} + 100} \end{aligned} \tag{28}$$
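The strength of this coupling can be estimated by differentiating (28). The short derivation below reads (28) as levels normalized to the driver supply, which is an assumption, and writes $R_T = R_{up} + R_{down} + 100$ as shorthand. The evaluation at $R_{up} = R_{down} = 50\Omega$ is an illustrative operating point, not a simulated result.

$$\frac{\partial V_{HIGH}}{\partial R_{up}} = -\frac{R_{down} + 100}{R_T^2}, \qquad \frac{\partial V_{LOW}}{\partial R_{up}} = -\frac{R_{down}}{R_T^2}$$

$$\frac{\partial V_{HIGH}}{\partial R_{down}} = \frac{R_{up}}{R_T^2}, \qquad \frac{\partial V_{LOW}}{\partial R_{down}} = \frac{R_{up} + 100}{R_T^2}$$

At the nominal point $R_T = 200\Omega$, so the magnitudes are $150/200^2$ for each resistance acting on its own level and $50/200^2$ for each resistance acting on the other level. Under these assumptions, a change in $R_{up}$ moves $V_{HIGH}$ three times as much as it moves $V_{LOW}$, and a change in $R_{down}$ moves $V_{LOW}$ three times as much as it moves $V_{HIGH}$.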

So when we change $R_{up}$ and $R_{down}$, $V_{HIGH}$ and $V_{LOW}$ are both affected at the same time. One case deserves attention because it may occur in practice. If $V_{HIGH}$ is below 375mV, we can reduce $R_{up}$ to raise it, but $V_{LOW}$ rises too. The circuit then detects that $V_{LOW}$ is too high and reduces $R_{down}$, which in turn disturbs $V_{HIGH}$ again. If this cycle repeats indefinitely, the circuit is unstable. We therefore have to check by simulation that this situation does not occur.



Figure 4.13 Consider the stability of the driver
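The interaction between the two calibration loops can be explored with a simple behavioral model before turning to circuit simulation. The Python sketch below is not the SPICE test bench: it applies (28) with an assumed 500mV driver supply, changes each resistance by a hypothetical fixed step of $2\Omega$ per FSM shift, and treats each level as correct when it lies within an assumed ±5mV window. All names and starting values are illustrative.

```python
V_SUPPLY = 500.0   # mV, assumed driver supply
R_TERM = 100.0     # ohm, termination resistance appearing in (28)
STEP = 2.0         # ohm per FSM shift (hypothetical)
TOL = 5.0          # mV, half-width of the comparator window (hypothetical)


def levels(r_up, r_down):
    """Output levels in mV from (28), scaled by the assumed supply."""
    total = r_up + r_down + R_TERM
    return (V_SUPPLY * (r_down + R_TERM) / total,
            V_SUPPLY * r_down / total)


def calibrate(r_up, r_down, max_steps=50):
    """Adjust both resistances until both levels sit inside their windows."""
    history = []
    for _ in range(max_steps):
        v_high, v_low = levels(r_up, r_down)
        history.append((r_up, r_down, v_high, v_low))
        high_ok = abs(v_high - 375.0) <= TOL
        low_ok = abs(v_low - 125.0) <= TOL
        if high_ok and low_ok:
            return history, True
        if not high_ok:
            # Lower R_up raises V_HIGH; higher R_up lowers it
            r_up += -STEP if v_high < 375.0 else STEP
        if not low_ok:
            # Lower R_down lowers V_LOW; higher R_down raises it
            r_down += -STEP if v_low > 125.0 else STEP
    return history, False


history, converged = calibrate(r_up=60.0, r_down=50.0)
for r_up, r_down, v_high, v_low in history:
    print(f"R_up={r_up:.0f} R_down={r_down:.0f} "
          f"V_HIGH={v_high:.1f} V_LOW={v_low:.1f}")
print("converged:", converged)
```

In this run $V_{LOW}$ keeps drifting upward while only $R_{up}$ is being adjusted, which is the coupling discussed above, yet both levels settle because each step is smaller than the comparator window. A step larger than the window could make the model oscillate, which is why the driver size needs to be checked.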

SPICE simulation shows that the unstable case does not occur. As shown in Figure 4.14, after S0~S3 turn on, the $V_{HIGH}$ output voltage falls from 389mV to 378mV. This is the first stable point. $V_{LOW}$ is not yet stable, however, and continues to calibrate, which shifts $V_{HIGH}$ by 1mV. Finally, both the high and the low level are stable because the driver size has been fine-tuned. Although $V_{HIGH}$ is affected when $V_{LOW}$ changes, the circuit remains stable in this case.

If the voltage change were 5mV instead of the small 1mV change in the case above, the calibration would run again until both output voltages are stable. This case is simulated in Figure 4.15. When the first calibration completes, $V_{LOW}$ holds at its first stable point, 121mV, but it is then disturbed by $V_{HIGH}$, which is still calibrating. As a result, $V_{LOW}$ drops by 2mV, from 121mV to 119mV. The sense circuit detects this voltage and turns the calibration circuit on again. After the second calibration, both the high and the low voltage are stable, and the calibration circuit finally turns off.





### 4.4.4 Output Eye Diagram Simulation

After checking the FSM, the detection circuit, and the digital control circuit, we simulate the output eye diagram of the SATA driver in this section. In Figure 4.16, the input is 3.125Gbps random data generated by a C program. The output jitter is about 40ps in post-simulation. The height of the eye diagram is about $\pm 250$mV, which equals the value we designed for SATA. The eye mask is taken from the SATA specification. The figure shows that the circuit meets the second-generation SATA specification. Because of the orderly turn-on buffer, the rise and fall times are longer than in a conventional transmitter design: the rise time is about 160ps and the fall time is about 150ps. Nevertheless, the post-simulation output eye diagram of the driver satisfies the SATA eye mask at 3.125Gbps, so the simulation results indicate that the driver can operate at 3Gbps.



Figure 4.16 Eye diagram of proposed SATA transmitter
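To put the simulated timing numbers in proportion, they can be expressed as fractions of the bit period (unit interval, UI). The Python sketch below does this arithmetic for the values quoted in this section; the helper name `fraction_of_ui` is hypothetical.

```python
def fraction_of_ui(t_ps, bit_rate_gbps):
    """Express a time in picoseconds as a fraction of one unit interval."""
    ui_ps = 1000.0 / bit_rate_gbps   # 1 Gbps corresponds to a 1000 ps bit period
    return t_ps / ui_ps


BIT_RATE_GBPS = 3.125
ui_ps = 1000.0 / BIT_RATE_GBPS
print(f"UI         = {ui_ps:.0f} ps")
print(f"jitter     = {fraction_of_ui(40, BIT_RATE_GBPS):.3f} UI")
print(f"rise time  = {fraction_of_ui(160, BIT_RATE_GBPS):.3f} UI")
print(f"fall time  = {fraction_of_ui(150, BIT_RATE_GBPS):.3f} UI")
```

At 3.125Gbps the unit interval is 320ps, so the 160ps rise time takes up half of a bit period while the 40ps jitter is one eighth of it.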

# 4.5 Tape Out and Summary

## 4.5.1 Tape Out

The proposed SATA transmitter is fabricated through the National Chip Implementation Center (CIC) with T18-93B. The chip area is 0.6mm\*0.6mm, as shown in Figure 4.17. The orderly turn-on buffer and the duty cycle modulation circuit are in the middle of the chip. The all-digital counter is at the top and the analog comparators are at the bottom, which avoids noise coupling. In addition, because the two FSMs are near the analog comparators, we add a 15um guard ring around the FSMs to isolate the purely digital circuit from the sensitive circuit. The chip is expected back from fabrication in July 2004.



Figure 4.17 Layout of proposed SATA transmitter

The post-simulation results of the SATA driver are summarized in Table 4-2. The extremely low power results from the all-digital design and the low 500mV supply of the driver.

| Function              | 3Gbps SATA Driver           |
|-----------------------|-----------------------------|
| Technology            | 0.18um 1P6M CMOS            |
| Supply Voltage        | 1.8V                        |
| Chip size             | 600*600 $\mu\mathrm{m}^2$   |
| Transistor/Gate Count | 2531 / 966                  |
| Power Dissipation     | ~10.8mW                     |
| Jitter (pk-pk)        | ~20ps @3.125Gbps            |
| Output Swing          | 250mV                       |

Table 4-2 Summary of the SATA chip

### 4.5.2 Summary

In this chapter, we propose another driver that is compatible with SATA and operates at 3Gbps. The driver includes an additional circuit that adjusts the output voltage by itself. The chip detects the output level, and the calibration is started by a reset. After several clock cycles, the calibration is complete and the driver begins to transmit data. The target output voltage can be chosen to suit the application. With this function, we can guarantee that the chip operates within the SATA specification.

# **Chapter 5**

# **Measurement Results**



## 5.1 Tape Out

In this chapter, we describe the implementation and measurement of the chip from Chapter 3, the LVDS transmitter with SSN rejection. First, we show the chip layout. Second, we introduce the chip performance and give the performance table. Finally, we describe the measurement environment and discuss the test board considerations. We show the measured eye diagram of the LVDS transmitter, which has a jitter of 98ps at 2.5Gbps.

#### 5.1.1 Layout

Figure 5.1 shows the layout of a single transmitter. The decoder and pre-driver are all-digital circuit blocks. There are two power supplies, one for the digital circuit and the other for the transistors of the LVDS driver. The internal $100\Omega$ resistor is implemented in polysilicon, as shown on the left of the layout. The transmitter area is 210um\*210um. Because the driver is designed with an all-digital approach, there is no current source in the circuit, so the area is very small.

Figure 5.1 The single transmitter layout

We tape out the test chip by integrating four transmitters in a single chip. As shown in Figure 5.2, the four transmitter copies are placed on the left side. The chip contains two 100pF decoupling capacitors for each power supply. The chip area is 1500um\*860um and the gate count is about 1316. The chip consumes about 78.8mW when the four transmitters operate at the same time, so a single transmitter consumes less than 20mW at 2.5Gbps. This is very low compared with other LVDS transmitter designs. The chip summary is given in Table 5-1.



The chip was fabricated through the National Chip Implementation Center (CIC) in T18-92E 0.18um CMOS 1P6M technology. The chip photograph is shown in Figure 5.3.


Figure 5.3 The overall chip photograph

#### 5.1.2 Measurement

Figure 5.4 shows a picture of the test board and the test setup. A common FR-4 dielectric material with a loss tangent of 0.035 is used. The trace thickness is 1 oz. (1.37 mils), and the separation of the LVDS differential signal lines is adjusted to 75 mils so that the 22-mil traces have a $100\Omega$ differential impedance. The power supplies are heavily bypassed both on chip and off chip. Separate power supplies are used for the I/O circuit, which includes the pre-driver and the LVDS driver, in order to minimize noise coupling and to ease the power measurement. Off-chip SMD capacitors of 1nF to 1000nF are placed in the vicinity of the core of the chip. The SMD capacitors bypass the differential outputs. The capacitor values increase gradually with distance from the chip. Large aluminum electrolytic capacitors are placed near the power supply connectors.

Probes for measuring LVDS signals have special requirements because of the low drive level of LVDS (8mA in our work). Either a high-impedance probe ($100k\Omega$ or greater) or a differential probe with a bandwidth greater than 1 GHz must be used. The capacitive loading of the probe should be kept in the low-pF range, and the probe bandwidth should be at least 1 GHz; 4 GHz is preferable to acquire the waveform properly.



Figure 5.4 Experiment setup of the transmitter measurement prototype

Transmitter measurements are made with an Agilent 86100B wide-band oscilloscope. The high-impedance probe of the SD-14 head is used to measure the LVDS signal. This probe presents a load of 100k$\Omega$ and 0.4pF and has a bandwidth of 4 GHz. An Agilent 81300A pulse data generator is used to drive the test chip. The measured differential output has an RMS jitter of 14.25ps at 2.5Gbps (the peak-to-peak jitter is about 98ps).



Figure 5.5 The measurement eye diagram of single-ended output

To compare single-ended and differential signaling, we measure the output eye diagram in both modes. Figure 5.5 shows the single-ended output eye diagram. The RMS jitter is 18ps at 1.25Gbps and the peak-to-peak jitter is 120ps. At 2.5Gbps, the jitter is about 220ps. When we measure the output in differential mode, the jitter decreases greatly, to 47ps, as shown in Figure 5.6. Differential output thus greatly reduces the timing noise, which is why differential signaling is widely used in modern high-speed link design. Note that the height of the eye diagram is about 400mV, which conforms to the LVDS specification.



Figure 5.6 The comparison eye diagram of single-ended and differential output

# **Chapter 6**

# **Conclusion**



### 6.1 Conclusion

In this thesis, we proposed a 2.5 Gbps transmitter architecture that uses a novel SSN reduction scheme for inter-chip communication. Unlike the earlier techniques described in Chapter 2 and Chapter 3, the proposed orderly turn-on and duty cycle modulation methods reduce SSN effectively and resolve the power noise and signal integrity problems. Furthermore, a 2.5Gbps transmitter compatible with the LVDS standard and a 3 Gbps driver compatible with the SATA standard have been described in Chapter 3 and Chapter 4, respectively. The corresponding transmitter has been implemented in TSMC digital 1P6M 0.18-µm CMOS technology through CIC. The measurement results show that the first driver, for LVDS, has a peak-to-peak jitter of about 98 ps with a 1.8V power supply. The second driver, which conforms to the SATA specification, has a peak-to-peak jitter of less than 50ps in SPICE post-simulation. The proposed SATA driver can operate at 3.125 Gbps, and its power is very low because of the all-digital driver design. The SATA driver reduces the SSN effect by the same method as the LVDS driver. In addition, it can calibrate its own output level through the feedback loop inside the chip. Thus, a self-calibrating, all-digital 3Gbps SATA driver is implemented in this thesis.

### 6.2 Future Work

Because the transmitter is digitized, the driver is split into several small drivers. We can take advantage of these separate transistors by controlling them with a digital circuit that acts on the information from the detection circuit. In this way, an all-digital pre-emphasis technique can be realized.

The third generation of SATA will run at 6Gbps. As chip-to-chip interconnections become faster and faster, equalization and pre-emphasis techniques are expected to become the most popular design methods. Building on our SSN reduction and self-calibration techniques, pre-emphasis can be realized in our driver design by controlling the number of turned-on transistors.

# **Bibliography**

- [1] C.-K. K. Yang, R. Farjad-Rad and M. A. Horowitz, "A 0.5-um CMOS 4.0-Gbit/s Serial Link Transceiver with Data Recovery Using Oversampling," *IEEE Journal of Solid-State Circuits*, vol. 33, pp. 713-722, May, 1998.
- [2] R. Mooney, C. Dike and S. Borkar, "A 900 Mb/s Bidirectional Signaling Scheme," *IEEE Journal of Solid-State Circuits*, vol. 30, pp. 1538-1543, Dec, 1995.
- [3] N. Kushiyama, S. Ohshima, D. Stark, et al., "A 500-Megabyte/s Data-Rate 4.5M DRAM," *IEEE Journal of Solid-State Circuits*, vol. 28, pp. 490-498, Apr, 1993.
- [4] K. K.-Y. Chang, W. Ellersick, S.-T. Chuang, et al., "A 2 Gb/s Asymmetric Serial Link for High-Bandwidth Packet Switches," presented at Hot Interconnects Symposium V, Stanford University, Stanford, CA, 1997.
- [5] C.-K. K. Yang and M. A. Horowitz, "A 0.8um CMOS 2.5Gbps Oversampling Receiver and Transmitter for Serial Links," *IEEE Journal of Solid-State Circuits*, vol. 12, pp. 2015-2023, Dec, 1996.
- [6] A. X. Widmer, K. Wrenner, H. A. Ainspan, et al., "Single-Chip 4 x 500-MBd CMOS Transceiver," *IEEE Journal of Solid-State Circuits*, vol. 31, pp. 2004-2014, Dec, 1996.
- [7] C.-K. K. Yang, *Design of High Speed Serial Links in CMOS*, PhD Dissertation, Stanford University, 1998.
- [8] B. Razavi, "Prospects of CMOS Technology for High-Speed Optical Communication Circuits," *IEEE Journal of Solid-State Circuits*, vol. 37, pp. 1135-1145, Sep, 2002.
- [9] T. H. Hu and P. R. Gray, "A monolithic 480 Mb/s parallel AGC/decision/clock-recovery circuit in 1.2-um CMOS," *IEEE Journal of Solid-State Circuits*, vol. 28, pp. 1314-1320, Dec, 1993.
- [10] R. Farjad-Rad, *A CMOS 4-PAM Multi-Gbps Serial Link Transceiver*, PhD Dissertation, Stanford University, 2000.

- [11] *IEEE Standard for Low-Voltage Differential Signals (LVDS) for Scalable Coherent Interface (SCI)*, IEEE Std. 1596.3-1996, 1994.
- [12] A. Boni, A. Pierazzi and D. Vecchi, "LVDS I/O Interface for Gb/s-per-Pin Operation in 0.35-um CMOS," *IEEE Journal of Solid-State Circuits*, vol. 36, pp. 706-711, Apr, 2001.
- [13] W. J. Dally and J. W. Poulton, *Digital System Engineering*: Cambridge University Press, 1998.
- [14] E. F.-Y. Yeung, *Design of High-Performance and Low-Cost Parallel Links*, PhD Dissertation, Stanford University, 2002.
- [15] R. Ho, K. W. Mai and M. Horowitz, "The Future of Wires," presented at Proceedings of the IEEE, 2001.
- [16] H. Johnson and M. Graham, *High-Speed Signal Propagation: Advanced Black Magic*: Prentice Hall Professional Technical Reference, 2002.


- [17] D. A. Johns and D. Essig, "Integrated circuits for data transmission over twisted-pair channels," *IEEE Journal of Solid-State Circuits*, vol. 32, pp. 398-406, Mar, 1997.
- [18] S. H. Hall, G. W. Hall and J. A. McCall, *High-Speed Digital System Design: A Handbook of Interconnect Theory and Design Practices*: John Wiley & Sons, Inc., 2000.
- [19] D. K. Cheng, *Fundamentals of Engineering Electromagnetics*: Prentice-Hall, Inc, 1993.
- [20] *True-HSPICE Device Models Reference Manual*, Release 2002.2, June, 2002.
- [21] S. Chun, M. Swaminathan, L. D. Smith, et al., "Modeling of Simultaneous Switching Noise in High Speed Systems," *IEEE Journal of Solid-State Circuits*, vol. 24, pp. 132-142, May, 2001.
- [22] N. Na, M. Swaminathan, J. Libous, et al., "Modeling and Simulation of Core Switching Noise on a Package and Board," presented at IEEE Electronic Components and Technology Conference, 2001.
- [23] T. Sakurai and A. R. Newton, "Alpha-Power Law MOSFET Model and its Applications to CMOS Inverter Delay and Other Formulas," *IEEE Journal of Solid-State Circuits*, vol. 25, pp. 584-594, Apr, 1990.

- [24] S.-J. Jou, W.-C. Cheng and Y.-T. Lin, "Simultaneous Switching Noise Analysis and Low Bouncing Buffer Design," presented at IEEE Custom Integrated Circuits Conference, 1998.
- [25] S.-J. Jou, S.-H. Kuo, J.-T. Chiu, et al., "Low Switching Noise and Load-Adaptive Output Buffer Design Techniques," *IEEE Journal of Solid-State Circuits*, vol. 36, pp. 1239-1249, Aug, 2001.
- [26] H.-C. Chow, "CMOS Output Buffer Having a High Current Driving Capability with Low Noise," U.S. Patent, # 5854560, 1998.

