# **5-Gb/s Low Power On-Chip Pulse Signaling Interface**

*Yinglin Fang, Hungwen Lu\*, and Chauchin Su* 

Dept. of Electrical and Control Engineering, National Chiao Tung University, Hsin-Chu, Taiwan, R.O.C. ylf.ece94g@nctu.edu.tw; ccsu@cn.nctu.edu.tw

\*Dept. of Electrical Engineering, National Central University, Chung-Li, Taiwan, R.O.C. s9521011@cc.ncu.edu.tw

**Abstract:** This paper proposes an on-chip pulse signaling interface for long-distance, low-power interconnection in SOC designs. The interface consists of a transmitter, an on-chip transmission line, and a receiver. Increasing the termination resistance at the near end increases the amplitude of the transmitted pulse signal, and de-emphasis circuits in both the transmitter and the receiver reduce the resulting ISI. The design uses a TSMC 0.13µm RF process. In simulation, 5Gbps signal transmission is achieved over a 5mm differential interconnect. The transmitter and the receiver consume 3.4mW and 3.2mW respectively, for a total power consumption of 6.6mW.

**Keywords:** AC coupled, pulse signaling, capacitive coupling, on-chip communication, driver, receiver, de-emphasis, equalization.

## **1. INTRODUCTION**

With the advance of CMOS technology in recent years, there has been great interest in SOC design, which leads to large chip sizes and high power consumption. In conventional chip design, the overall system efficiency depends on the performance of the individual modules. However, as the distance between modules increases, the module-to-module communication bandwidth becomes an important issue in SOC design, because long-distance communication not only attenuates the signal amplitude but also requires high power to transmit the signal. A long on-chip transmission line has a large parasitic resistance, which is frequency dependent because of the skin effect. The large parasitic resistance and parasitic capacitance attenuate the signal greatly. Furthermore, the power consumption of an electrical signal in an SOC is governed by two components. The first, known as static power, is due to leakage current and any DC path between VDD and GND; the second, known as dynamic power, is due to switching transient current and short-circuit transient current [1]. Conventional high-speed links mostly use *current mode logic* (CML), *positive emitter coupled logic* (PECL), or *low voltage differential signaling* (LVDS) [2]. They all require a current source to drive the data, and the current source increases the power consumption, especially when high-speed data is transmitted.

In this paper, we explore pulse signaling [3][4] to improve the power efficiency of long-distance on-chip interconnection. Pulse signaling is based on the fact that the AC component carries all the information of a digital signal, so the DC component can be treated as redundant. Pulse signaling transmits data through AC coupling and therefore consumes only dynamic power, which reduces the power consumption of on-chip data communication.

Figure 1 shows high-speed data communication in an SOC design. The distance from the driver end to the receiver end is in the range of 1000µm to 5000µm. The long transmission line is routed through the space between modules to save chip area, so a long-distance link with small area overhead is required. In addition, the pulse signaling design has two targets. First, the structure must be simple and easy to implement. Second, the circuit should operate at high speed and consume low power for SOC applications.



Figure 1. On-chip long distance interconnection

## **2. ON-CHIP PULSE SCHEME**

Figure 2 shows the on-chip pulse signaling scheme. The transmitter and the receiver are coupled to the on-chip transmission line through the coupling capacitors $C_{\text{ctx}}$ and $C_{\text{crx}}$. The transmitter and receiver ends of the data bus are terminated by the impedances $Z_{tx}$ and $Z_{rx}$ respectively, to the termination voltage $\hat{V}_{term}$. The transmission line acts as a distributed RLC network that attenuates the transmitted data. The coupling capacitor turns the link into a high-pass filter that passes the transient part of the input data and blocks the DC component. Because the AC component carries all the information of a digital signal, the DC component is treated as redundant. The pulse data therefore behaves like *return to zero* (RZ) signaling. In contrast to *non-return to zero* (NRZ) signaling, pulse signaling reduces the power consumption by dissipating only dynamic power during the transitions.



Figure 2. Pulse signaling scheme

The step response is important in analyzing pulse signaling. A step input voltage at node A results in a transient at node C, transforming a square wave into a short triangular pulse on the transmission line. The pulse amplitude $(V_p)$ becomes

$$
V_{p} = R_{\text{eff}} I_{c} = (Z_{\text{rx}} \parallel Z_{t})\, C_{c} \frac{dV_{A}}{dt} \tag{1}
$$

and the voltage value at node B is



Figure 3. (a) On-chip pulse signaling model (b) Transient response of the transmitter

where $R_{\text{eff}} = (Z_{\text{rx}} \parallel Z_t)$, $C_{\text{eff}} = C_{\text{cx}} C_d / (C_{\text{cx}} + C_d)$, and $\tau = R_{\text{eff}} C_{\text{eff}}$. Equation (1) shows that the pulse amplitude is proportional to three factors: the termination resistance in parallel with the characteristic resistance, the coupling capacitance, and the slew rate at the transmitter output. As illustrated in Fig. 3(b), the transmitter needs to provide a large pulse amplitude to compensate for the attenuation in the transmission line. After the transition, however, the transmitted waveform settles and the coupled pulse decays with the RC time constant. The pulse width roughly equals the rise time of the transition plus the RC decay time. If the amplitude is too large, the pulse develops a tail, which leads to ISI. The ISI not only limits the communication speed but also increases the jitter at the receiver end. In other words, a large coupling capacitance and a large termination resistance help transfer the pulse signal but also create ISI. In this paper, we propose an equalization method to reduce the ISI.
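As a rough numerical illustration of Equation (1), the following Python sketch evaluates the ideal pulse amplitude for several termination resistances. The 75Ω line impedance, the 280fF coupling capacitor, and the 1.2V over 40ps edge are taken from later sections of this paper; the list of termination resistances is a hypothetical sweep chosen only for illustration, and the ideal formula ignores capacitive division and line loss, so the numbers are upper-bound estimates rather than simulated results.

```python
def parallel(r1, r2):
    """Equivalent resistance of two resistors in parallel."""
    return r1 * r2 / (r1 + r2)


def pulse_amplitude(r_term, z_line, c_couple, slew_rate):
    """Ideal peak amplitude from Eq. (1): V_p = (R_term || Z_line) * C_c * dV/dt."""
    return parallel(r_term, z_line) * c_couple * slew_rate


Z_LINE = 75.0          # ohm, characteristic impedance quoted in the paper
C_COUPLE = 280e-15     # F, transmitter coupling capacitor quoted in the paper
SLEW = 1.2 / 40e-12    # V/s, 1.2 V swing over a 40 ps rise time

# Hypothetical sweep of near-end termination resistances (assumption).
R_TERMS = [75.0, 150.0, 300.0, 1000.0]

amplitudes = [pulse_amplitude(r, Z_LINE, C_COUPLE, SLEW) for r in R_TERMS]

for r, v in zip(R_TERMS, amplitudes):
    print(f"R_term = {r:7.1f} ohm -> ideal V_p = {v * 1e3:6.1f} mV")
```

The sweep shows the trend stated above: a larger termination resistance raises the effective resistance and therefore the pulse amplitude, at the cost of a longer RC tail.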

## **3. ON-CHIP TRANSMISSION LINE**

Figure 4(a) shows the proposed on-chip differential transmission line, designed in TSMC 0.13µm RF technology. The co-planar transmission line is placed in Metal 6, and Metal 5 below it is reserved for ground shielding. The micro-strip structure uses a *GSGSG* arrangement, where '*S*' denotes signal and '*G*' denotes ground. The ground path serves both as the signal return and as ground shielding. The transmission line is analyzed and its distributed RLC parameters are extracted with PTM [5]. Figure 4(b) illustrates the cross section of the differential transmission line. The total length (*l*) of the line is 5mm, the line width (*w*) is 2.3µm, and the line-to-line space (*s*) is 1.5µm.



Figure 4. (a) Micro-strip structure and the parasitic effect (b) Cross section of the on-chip transmission line

Table 1 shows the data obtained from PTM. The dimensions are taken from the TSMC 0.13µm RF technology document. The thickness (*t*) of Metal 6 is 0.37µm, the height (*h*) from Metal 5 to Metal 6 is 0.45µm, and the dielectric constant (*k*) is 3.9. Dividing the total parasitic RLC values by the total length of the transmission line gives the parasitic RLC per unit length. The distributed parasitic parameters are *Rul*=25.85Ω/mm, *Lul*=1.75nH/mm, and *Cul*=306.70fF/mm.

Table 1. Parasitic RLC of the on-chip transmission line

| Dimension        | RLC (total, 5mm)    | RLC (per mm)              |
|------------------|---------------------|---------------------------|
| $w = 2.3 \mu m$  | $R = 129.259\Omega$ | Rul = $25.8518 \Omega/mm$ |
| $s = 1.5 \mu m$  | $L = 8.728nH$       | Lul = $1.7456$ nH/mm      |
| $l = 5000 \mu m$ | $Cc = 1255.44$ fF   | $Cg = 251.088$ fF/mm      |
| $t = 0.37 \mu m$ | $Cg = 139.035$ fF   | $Ce = 27.807$ fF/mm       |
| $h = 0.45 \mu m$ | $Ct = 1533.51$ fF   | Cul = $306.702$ fF/mm     |
| $k = 3.9$        |                     |                           |

According to the data above, the characteristic impedance $(Z_0)$ of the transmission line is

$$
Z_0 = \sqrt{\frac{R_{ul} + j\omega L_{ul}}{j\omega C_{ul}}} \approx \sqrt{\frac{L_{ul}}{C_{ul}}} \tag{3}
$$

The design with $w=2.3\mu m$ and $s=1.5\mu m$ gives a characteristic impedance of about 75Ω. There are three main reasons for this value. First, a 75Ω characteristic impedance keeps the parasitic product $R_{total} C_{total}$ below $2\times10^{-10}$ Ω·F, which attenuates the signal by roughly 18dB. Second, 75Ω is close to 77Ω, which theoretically gives minimum attenuation. Third, these values of width and spacing reduce the layout area as well as the cost.
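The figures quoted above can be checked directly from Table 1. The following Python sketch is only a sanity check under the stated model: it evaluates the lossless approximation of Equation (3), the total RC product, and the lossy form of Equation (3) at one frequency. The 9GHz evaluation frequency is an assumption borrowed from the edge-rate estimate in Section 4.2.

```python
import cmath
import math

# Per-unit-length parameters from Table 1.
R_UL = 25.8518       # ohm/mm
L_UL = 1.7456e-9     # H/mm
C_UL = 306.702e-15   # F/mm
LENGTH_MM = 5.0      # total line length


def z0_lossless(l_ul, c_ul):
    """High-frequency approximation of Eq. (3): Z0 ~ sqrt(L/C)."""
    return math.sqrt(l_ul / c_ul)


def z0_lossy(r_ul, l_ul, c_ul, freq_hz):
    """Full form of Eq. (3): Z0 = sqrt((R + jwL) / (jwC))."""
    w = 2 * math.pi * freq_hz
    return cmath.sqrt((r_ul + 1j * w * l_ul) / (1j * w * c_ul))


z0 = z0_lossless(L_UL, C_UL)
rc_total = (R_UL * LENGTH_MM) * (C_UL * LENGTH_MM)
z0_at_edge = z0_lossy(R_UL, L_UL, C_UL, 9e9)  # 9 GHz is an assumed test point

print(f"Z0 (lossless)     = {z0:.2f} ohm")
print(f"R_total * C_total = {rc_total:.3e} ohm*F")
print(f"|Z0| at 9 GHz     = {abs(z0_at_edge):.2f} ohm")
```

With the Table 1 values, the lossless estimate comes out at about 75.4Ω and the RC product at about $1.98\times10^{-10}$ Ω·F, consistent with the 75Ω and $2\times10^{-10}$ figures in the text.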

## **4. TRANSMITTER/RECEIVER CIRCUIT DESIGN**

#### 4.1 Transmitter Circuit with de-emphasis

The high-pass characteristic of the transmitter passes the transient part of the input data and blocks the DC component. Equations (1) and (2) also imply that a large termination resistor gives a better high-pass characteristic: the larger the termination resistance, the larger the amplitude of the delivered pulse signal. The trade-off is that a large termination resistance introduces a pulse tail within the bit time. This tail not only limits the maximum transmission speed but also increases ISI. We introduce an equalization method to solve this problem.

Figure 5 shows the de-emphasis structure used for equalization. The structure consists of a buffer and a de-emphasis coupling capacitor $(C_{\text{cde}})$ connected to the differential node of the transmitter output. The main idea is to generate a complementary pulse and add it to the original signal. In this way, the output pulse signal at the near end can be described as



Figure 5. Voltage mode transmitter with de-emphasis

where $x(n)$ is the full-swing data at the last stage of the transmitter, $x(n)$' is the complement of $x(n)$, and k is a weighting factor that depends on the delay of the buffer and the value of the de-emphasis capacitor. Figure 6 illustrates the circuit behavior of the transmitter. Taking the positive terminal as an example, $V_{op}$ is the positive full-swing data at the transmitter output and *x* is the coupled pulse. $V_{on}$ is the complement of $V_{op}$. The buffer at the negative terminal delays the full-swing data, and the de-emphasis capacitor then adds the complementary pulse $x'$ to the positive terminal to remove the pulse tail. The de-emphasis circuit thus reduces ISI and increases the maximum transmission speed.



Figure 6. Coupled pulse signal with de-emphasis circuit

#### 4.2 Receiver End Termination

To minimize reflections, one or both ends of the transmission line should be impedance matched. In this paper, receiver-end termination is implemented. Terminating at the receiver side reduces reflections and allows the transmitter side to keep its high-pass behavior, since the transmitted pulse sees a high-impedance transmitter. Reflection noise appears at the far end of the transmission line because of the forward reflection at the un-terminated near end. This forward reflection noise is absorbed by the far-end termination resistor, so no further backward reflection noise appears at the near end. Figure 7 illustrates the receiver-end termination. $C_{eq}$ is the equivalent capacitance of the receiver-end coupling capacitor $(C_{crx})$ in series with the receiver-end gate capacitance $(C_g)$. In general, the node capacitance of a digital circuit is roughly 20fF, and $C_{crx}$ in our design is 240fF. Thus, the equivalent capacitance is $C_{eq} = C_{crx}C_g/(C_{crx} + C_g) \approx 18.5$ fF, i.e. roughly 20fF. The input impedance at the receiver end therefore becomes

$$
Z_{rx} = R_{rx} \parallel \frac{1}{sC_{eq}} = \frac{R_{rx}}{1 + sR_{rx}C_{eq}} = \frac{R_{rx}}{1 + j2\pi fR_{rx}C_{eq}} \tag{5}
$$

where $R_{rx}$ is the receiver-end termination resistance, which equals the characteristic resistance of the transmission line (75 $\Omega$ ). The magnitude of $Z_{rx}$ is

$$
|Z_{rx}| = \frac{R_{rx}}{\sqrt{1 + (2\pi f R_{rx} C_{eq})^2}} \approx R_{rx} . \tag{6}
$$

Equation (6) shows that $Z_{rx}$ is approximately equal to $R_{rx}$ in the low-frequency range, where $2\pi f R_{rx} C_{eq} \ll 1$. In TSMC 0.13µm RF technology, the maximum rise time $(T_r)$ of a single inverter is roughly 40ps over 1.2V power rails, so the edge rate of a pulse signal is $f \approx 0.35/T_r \approx 9$ GHz. The termination resistance in our design matches well over the frequency range of interest, including the data rate (5Gbps) and the signal edge rate (9GHz).
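To make the matching claim concrete, the following Python sketch evaluates the magnitude expression of Equation (6) with the values given in this section ($R_{rx}$ = 75Ω, $C_{crx}$ = 240fF, $C_g$ = 20fF). The three evaluation frequencies are chosen here for illustration; the function names are not from the paper.

```python
import math

R_RX = 75.0       # ohm, receiver-end termination resistance
C_CRX = 240e-15   # F, receiver-end coupling capacitor
C_G = 20e-15      # F, approximate gate (node) capacitance


def series_capacitance(c1, c2):
    """Equivalent capacitance of two capacitors in series."""
    return c1 * c2 / (c1 + c2)


def z_rx_magnitude(r_rx, c_eq, freq_hz):
    """Eq. (6): |Z_rx| = R_rx / sqrt(1 + (2*pi*f*R_rx*C_eq)^2)."""
    x = 2 * math.pi * freq_hz * r_rx * c_eq
    return r_rx / math.sqrt(1 + x * x)


C_EQ = series_capacitance(C_CRX, C_G)

# Illustrative test points: clock fundamental, bit rate, and edge rate.
z_mag = {f: z_rx_magnitude(R_RX, C_EQ, f) for f in (2.5e9, 5e9, 9e9)}

print(f"C_eq = {C_EQ * 1e15:.2f} fF")
for f, z in z_mag.items():
    print(f"f = {f / 1e9:4.1f} GHz -> |Z_rx| = {z:.2f} ohm")
```

Even at the 9GHz edge rate the magnitude stays within about 0.5Ω of 75Ω, which is why the capacitive loading can be neglected in the termination design.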



Figure 7. Receiver end termination scheme

#### 4.3 Receiver Circuit: Self-Bias And Equalization

Figure 8 shows the self-bias and equalization circuit. The circuit not only generates the receiver-end common-mode voltage automatically but also reduces the low-frequency component of the incoming pulse signal. As shown in the dashed-line area, an inverter with its input connected to its output generates the $V_{dd}/2$ common-mode voltage, which is then connected to the receiver differential inputs through two transmission gates. For matching, the self-bias inverter in our design has the same size as the pre-amplifier in the next stage. In addition, the transmission line has a low-pass response due to the skin effect, which produces a long tail on the pulse signal. Without equalization, the tail causes ISI and reduces the timing margin at the receiver end. In Figure 8, the dotted-line area indicates the equalization circuit, which consists of an inverter and a de-emphasis capacitor. The circuit de-emphasizes the low-frequency components by generating a small complementary pulse and adding it to the original signal. The output pulse signal of the self-bias and equalization circuit can be expressed as

$$
y(n) = x(n) + kx'(n)
$$

Where  $x(n)$  is the pulse signal at the output of the self-bias and equalization circuit,  $x(n)$ ' is the complement of  $x(n)$ , and k is the weighting factor which depends on the delay of the inverter and the size of the de-emphasis capacitor.
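The effect of the relation $y(n) = x(n) + kx'(n)$ on a pulse tail can be sketched numerically. The model below is an assumption made for illustration, not the circuit itself: the pulse is an ideal exponential decay, and the complementary pulse $x'(n)$ is modelled as an inverted copy of $x(n)$ delayed by a fixed number of samples. The time constant, delay, and weighting factor k are hypothetical values.

```python
import math


def exp_tail_pulse(n_samples, dt, tau):
    """Ideal coupled pulse: unit peak followed by an exponential RC tail."""
    return [math.exp(-i * dt / tau) for i in range(n_samples)]


def de_emphasize(x, k, delay_samples):
    """y(n) = x(n) + k * x'(n), with x'(n) modelled as -x(n - delay)."""
    y = []
    for n, xn in enumerate(x):
        complement = -x[n - delay_samples] if n >= delay_samples else 0.0
        y.append(xn + k * complement)
    return y


DT = 10e-12            # s, sample spacing (assumption)
TAU = 100e-12          # s, tail time constant (assumption)
SAMPLES_PER_BIT = 20   # 200 ps bit period at 5 Gbps

x = exp_tail_pulse(40, DT, TAU)
y = de_emphasize(x, k=0.5, delay_samples=5)  # k and delay are hypothetical

tail_before = x[SAMPLES_PER_BIT]   # residue left at the start of the next bit
tail_after = y[SAMPLES_PER_BIT]

print(f"peak: {x[0]:.3f} -> {y[0]:.3f}")
print(f"tail at one bit period: {tail_before:.4f} -> {tail_after:.4f}")
```

In this simplified model the peak is unchanged because the complementary pulse arrives after the delay, while the residue one bit period later shrinks by more than a factor of five; choosing $k = e^{-d/\tau}$ for a delay $d$ would cancel the exponential tail entirely.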



Figure 8. Receiver end: Self-bias and equalization circuit

#### 4.4 Receiver Circuit: Inductive Peaking Amplifier

An inductive peaking amplifier is composed of parallel inverters as the gain stage, with the output connected to an inductive peaking cell, as shown in Figure 9. The main idea is to use the inverter gain for signal amplification and the inductive peaking to improve the high-frequency performance. Each inductive peaking cell consists of a small inverter and a resistor. The inverter is diode connected through the resistor, which is implemented with a transmission gate. The diode-connected inductive peaking cell lowers the output resistance of the gain-stage inverter, and hence its gain; as a result, the 3dB frequency increases as the output resistance decreases. At low frequency, the output resistance is roughly $1/g_m$, while at high frequency it roughly equals the resistance of the transmission gate. The resistance of the transmission gate is designed to be larger than $1/g_m$, so that the gain increases at high frequency and the bandwidth is extended. The receiver sensitivity depends on the equalization as well as on the peaking ability of the amplifier stage.
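The frequency behavior described above can be illustrated with the standard first-order small-signal model of a diode-connected device with a resistor in series with its gate, $Z(s) = (1 + sRC_{gs})/(g_m + sC_{gs})$. This model is an assumption introduced here (it neglects output resistance and other parasitics), and the values of $g_m$, the transmission-gate resistance, and $C_{gs}$ below are hypothetical.

```python
import math


def peaking_cell_impedance(gm, r_gate, c_gs, freq_hz):
    """First-order active-inductor model: Z = (1 + s*R*Cgs) / (gm + s*Cgs)."""
    s = 1j * 2 * math.pi * freq_hz
    return (1 + s * r_gate * c_gs) / (gm + s * c_gs)


GM = 2e-3      # S, so 1/gm = 500 ohm (hypothetical)
R_TG = 2e3     # ohm, transmission-gate resistance, chosen > 1/gm (hypothetical)
C_GS = 10e-15  # F, gate capacitance of the small inverter (hypothetical)

freqs = [1e6, 1e9, 1e10, 1e11, 1e12]
mags = [abs(peaking_cell_impedance(GM, R_TG, C_GS, f)) for f in freqs]

for f, m in zip(freqs, mags):
    print(f"f = {f:8.1e} Hz -> |Z| = {m:7.1f} ohm")
```

The magnitude rises from about $1/g_m$ at low frequency toward the transmission-gate resistance at high frequency, which is the inductive behavior that extends the bandwidth of the gain stage.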



Figure 9. Receiver end: Inductive peaking amplifier

#### 4.5 Receiver Circuit: Non-Clock Latch With Hysteresis

The non-clock latch transforms the RZ pulse into NRZ data, so that the recovered NRZ data can be fed to a conventional clock and data recovery circuit, which generates the receiver-end clock and re-samples the NRZ data. Figure 10 illustrates the non-clock latch, which is built from four inverters. Two small inverters are connected back to back at the differential output to generate the hysteresis range. A properly designed hysteresis range filters out incoming noise and interference.
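The RZ-to-NRZ conversion with hysteresis can be sketched behaviorally. The Python model below is a simplified abstraction, not the four-inverter circuit: a positive differential pulse above the threshold sets the output, a negative pulse below the negative threshold clears it, and anything inside the hysteresis range leaves the state unchanged. The threshold and the sample values are hypothetical.

```python
def rz_to_nrz(samples, threshold, initial=0):
    """Behavioral hysteresis latch: hold state unless |v| exceeds the threshold."""
    state = initial
    out = []
    for v in samples:
        if v > threshold:
            state = 1       # positive pulse marks a rising data edge
        elif v < -threshold:
            state = 0       # negative pulse marks a falling data edge
        out.append(state)   # small residue or noise is ignored
    return out


# Hypothetical differential pulse samples (volts), with small noise in between.
pulses = [0.0, 0.3, 0.05, 0.0, -0.3, -0.02, 0.04, 0.3, 0.0, -0.3]
nrz = rz_to_nrz(pulses, threshold=0.1)

print(nrz)  # [0, 1, 1, 1, 0, 0, 0, 1, 1, 0]
```

The small samples at 0.05, -0.02, and 0.04 fall inside the hysteresis range and do not toggle the output, which is the noise-filtering role of the hysteresis described above.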



Figure 10. Receiver end: Non-clock latch with hysteresis

## **5. SIMULATION AND IMPLEMENTATION**

We use a large termination resistor to obtain a better high-pass characteristic. Figure 11(a) shows the pulse signals for different termination resistor values. The larger the termination resistance, the larger the amplitude of the delivered pulse signal. The trade-off is that a large termination resistance introduces a pulse tail within the bit time. This tail not only limits the maximum transmission speed but also increases ISI. The equalization method of Section 4.1 is used to solve this problem.

Figure 11(b) shows the simulated pulse signal at the transmitter output with the de-emphasis circuit included. The diamond-shaped eye shows that the de-emphasis circuit cuts the pulse tail and reduces ISI within one data period (200ps). The voltage swing at the near end is $400 \text{mV}_{\text{pp}}$ with a coupling capacitor of 280fF and a de-emphasis capacitor of 140fF.



Figure 11. (a) Voltage mode driver with different termination resistance, (b) Voltage mode driver with de-emphasis



Figure 12. System simulation results

Figure 12 shows the system simulation results. The transmitter end sends the differential pulse signal with an amplitude of $400 \text{mV}_{\text{pp}}$. After the 5mm long on-chip transmission line, the pulse amplitude decays to $50 \text{mV}_{\text{pp}}$ at the receiver front end. The pre-amplifier stage amplifies the pulse amplitude to $400 \text{mV}_{\text{pp}}$ and feeds it to the hysteresis latch. The latch then turns the RZ pulse signal into full-swing NRZ data. After an open-drain output circuit, the oscilloscope can measure the received data with a swing of $240 \text{mV}_{\text{pp}}$. The eye diagram of the receiver shows that the peak-to-peak jitter is 43.7ps (0.218UI) in the TT case. The worst case occurs in the SS corner, where the jitter is 82.7ps (0.413UI).
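The amplitudes and jitter values above translate into decibels and unit intervals with simple arithmetic. The following Python sketch performs that conversion using only the numbers quoted in this paragraph; the helper name is chosen here for illustration.

```python
import math


def ratio_db(v_out, v_in):
    """Voltage ratio expressed in decibels."""
    return 20 * math.log10(v_out / v_in)


DATA_RATE = 5e9                 # bit/s
BIT_PERIOD = 1 / DATA_RATE      # 200 ps

channel_loss_db = ratio_db(50e-3, 400e-3)    # 400 mVpp in, 50 mVpp out
preamp_gain_db = ratio_db(400e-3, 50e-3)     # 50 mVpp in, 400 mVpp out

jitter_ui_tt = 43.7e-12 / BIT_PERIOD         # TT corner
jitter_ui_ss = 82.7e-12 / BIT_PERIOD         # SS corner

print(f"channel loss  = {channel_loss_db:.2f} dB")
print(f"pre-amp gain  = {preamp_gain_db:.2f} dB")
print(f"jitter (TT)   = {jitter_ui_tt:.4f} UI")
print(f"jitter (SS)   = {jitter_ui_ss:.4f} UI")
```

The 400mVpp to 50mVpp decay corresponds to about 18dB of attenuation, in line with the roughly 18dB channel loss estimated in Section 3.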



Figure 13. Layout

The proposed on-chip pulse signaling interface is implemented through the National Chip Implementation Center (CIC) in TSMC 0.13µm RF technology. The core area is 584.7 $\mu$ m×411.5 $\mu$ m, including a transmitter of 96.1µm×57.8µm, a receiver of 80.1µm×52.1µm, and a 5mm long on-chip transmission line. The total area is $884\mu m \times 644\mu m$, as shown in Figure 13. Table 2 lists the chip summary. At a data rate of 5Gbps, the transmitter consumes 3.4mW and the receiver consumes 3.2mW.

**Table 2. Specification**

| Item                               | Specification (unit)                          |
|------------------------------------|-----------------------------------------------|
| Process                            | TSMC 0.13µm RF                                |
| Supply Voltage                     | 1.2V                                          |
| Data Rate                          | 5Gb/s / channel (at 2.5GHz)                   |
| BER                                | $10^{-12}$                                    |
| Coupling Caps                      | Tx (280+140) fF; Rx (240+5) fF                |
| Link                               | 5mm and 75 $\Omega$ on-chip micro-strip line  |
| Jitter of receiver data (pk-to-pk) | 43.7ps (0.218UI)                              |
| Transmitter End Layout Area        | 57µm × 96µm                                   |
| Receiver End Layout Area           | 52µm × 78µm                                   |
| Core Layout Area                   | 884µm × 644µm                                 |
| Power dissipation, Pulse Driver    | 3.402mW                                       |
| Power dissipation, Pulse Receiver  | 3.213mW                                       |
| Power dissipation, Total           | 6.615mW                                       |

Table 3 compares the proposed pulse signaling with other on-chip communication schemes. Our work uses TSMC 0.13µm RF technology to implement 5Gbps on-chip pulse signaling. The total communication distance is 5mm over a differential transmission line with a width of 2.3µm and a line-to-line spacing of 1.5µm. The line has a small area overhead compared with other on-chip differential lines [8][9]. Compared with the twisted differential wire method [10], our work uses a wider line and a wider line-to-line space. However, the twisted differential wire method has only a $40mV_{pp}$ voltage swing at the far end, whereas our work delivers full-swing data at the receiver output, which can be used directly by subsequent receiver-end circuits. The energy consumption of the transmitter in our work is 0.68pJ/bit and the total is 1.32pJ/bit, which is the lowest among the compared designs.
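The energy-per-bit figures follow from dividing the power values in Table 2 by the 5Gbps data rate. The short Python check below reproduces that arithmetic; the function name is chosen here for illustration.

```python
def energy_per_bit_pj(power_w, data_rate_bps):
    """Energy per transmitted bit in picojoules: E = P / f_bit."""
    return power_w / data_rate_bps * 1e12


DATA_RATE = 5e9  # bit/s

# Power values from Table 2.
tx_pj = energy_per_bit_pj(3.402e-3, DATA_RATE)
rx_pj = energy_per_bit_pj(3.213e-3, DATA_RATE)
total_pj = energy_per_bit_pj(6.615e-3, DATA_RATE)

print(f"transmitter: {tx_pj:.3f} pJ/bit")
print(f"receiver:    {rx_pj:.3f} pJ/bit")
print(f"total:       {total_pj:.3f} pJ/bit")
```

The results, about 0.68pJ/bit for the transmitter and 1.32pJ/bit in total, match the values quoted in the comparison.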





## **6. CONCLUSION**

In this paper, we have proposed a 5Gbps on-chip pulse signaling interface. Unlike previous work on pulse signaling and other on-chip communication schemes, our near-end architecture combines a high termination resistance with a de-emphasis scheme to reduce ISI and to increase the maximum data rate. At the far end, the self-bias circuit, the pre-amplifier, and the non-clock latch form the receiver circuit. The self-bias circuit generates the common-mode voltage for the receiver so that the pulse-mode data can be received. The amplifier stage and the non-clock latch increase the amplitude of the pulse signal and then convert the RZ pulse signal into NRZ data. The latch also has an input hysteresis range that filters out incoming noise from imperfect termination and common-mode disturbances such as ground bounce. The receiver circuit is simple in structure and easy to implement.

We have analyzed the pulse signaling as well as the on-chip channel model. We have also designed a 5mm on-chip differential transmission line in our chip. The characteristic impedance of the line is $75\Omega$ to minimize the attenuation. Furthermore, the geometry of the line is chosen with a width of 2.3µm and a spacing of 1.5µm. Our on-chip transmission line has a small area overhead compared with other works.

The simulation results show that the receiver has a peak-to-peak jitter of 43.7ps with a 1.2V supply. The energy consumption of the transmitter is 0.68pJ/bit and the total is 1.32pJ/bit. This on-chip pulse signaling interface is fabricated in TSMC 0.13µm RF technology. The total chip occupies 884µm×644µm of area, including a transmitter of 57µm×96µm, a receiver of 52µm×78µm, and an on-chip transmission line.

## **REFERENCES**

- [1] Hamid Hatamkhani and Chih-Kong Ken Yang, "Power Analysis for High-Speed I/O Transmitters," *IEEE Symposium on VLSI Circuits Digest of Technical Papers*, pp. 142-145, Jan. 2004.
- [2] "Introduction to LVDS, PECL, and CML," *MAXIM High-Frequency/Fiber Communications Group Application Note* HFAN-1.0 (Rev. 0, 9/00). Some parts of this application note first appeared in Electronic Engineering Times on July 3, 2000, Issue 1120.
- [3] Min Chen and Yu Cao, "Analysis of pulse signaling for low-power on-chip global bus design," *Proceedings of the 7th International Symposium on Quality Electronic Design*, Mar. 2006.
- [4] Jongsun Kim, Jung-Hwan Choi, Chang-Hyun Kim, A. F. Chang, and I. Verbauwhede, "A low power capacitive coupled bus interface based on pulsed signaling," *Custom Integrated Circuits Conference*, pp. 35-38, Oct. 2004.
- [5] http://www-device.eecs.berkeley.edu/~ptm
- [6] J. Kim, I. Verbauwhede, and M.-C. F. Chang, "A 5.6-mW 1-Gb/s/pair pulsed signaling transceiver for a fully AC coupled bus," *IEEE J. Solid-State Circuits*, vol. 40, no. 6, pp. 1331-1340, Jun. 2005.
- [7] L. Luo, J. M. Wilson, S. E. Mick, J. Xu, L. Zhang, and P. D. Franzon, "A 3 Gb/s AC coupled chip-to-chip communication using a low swing pulse receiver," *IEEE J. Solid-State Circuits*, vol. 41, no. 1, pp. 287-296, Jan. 2006.
- [8] H. Ito, H. Sugita, K. Okada, and K. Masu, "4 Gbps On-Chip Interconnection using Differential Transmission Line," *Asian Solid-State Circuits Conference*, pp. 417-420, Nov. 2005.
- [9] Takahiro Ishii, Hiroyuki Ito, Makoto Kimura, Kenichi Okada, and Kazuya Masu, "A 6.5-mW 5-Gbps On-Chip Differential Transmission Line Interconnect with a Low-Latency Asymmetric Tx in a 180 nm CMOS Technology," *IEEE Asian Solid-State Circuits Conference*, pp. 131-134, Jul. 2006.
- [10] Daniël Schinkel, Eisse Mensink, Eric A. M. Klumperink, Ed (A. J. M.) van Tuijl, and Bram Nauta, "A 3-Gb/s/ch Transceiver for 10-mm Uninterrupted RC-Limited Global On-Chip Interconnects," *IEEE J. Solid-State Circuits*, vol. 41, no. 1, pp. 297-306, Jan. 2006.