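# National Science Council, Executive Yuan: Research Project Report

Subproject 4: Analog Front-End ICs for Optical-Fiber Transmission (3/3)

Project type: Integrated project

Project number: NSC94-2220-E-009-005-

Project period: August 1, 2005 to July 31, 2006

Institution: Department of Electronics Engineering and Institute of Electronics, National Chiao Tung University

Principal investigator: Jieh-Tsorng Wu (吳介琮)

Report type: Full report

Availability: This project report is open to public access

June 19, 2006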

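## Analog Front-End ICs for Optical-Fiber Transmission

Project number: NSC-94-2220-E-009-005

Project period: August 1, 2005 to July 31, 2006

Principal investigator: Jieh-Tsorng Wu (吳介琮), Institute of Electronics, National Chiao Tung University

Ph.D. students: 周儒明, 王仲益, 陳自強, 張智閔, Institute of Electronics, National Chiao Tung University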

Email: jtwu@mail.nctu.edu.tw http://www.cc.nctu.edu.tw/~jtwu

## I. Summary

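This project develops high-speed analog front-end integrated circuits for broadband transmission systems. The optical-fiber systems under consideration have a data rate of at least 10 Gb/s. To make it practical to improve transmission performance with digital signal processing techniques, the front-end circuits must have sufficient dynamic range and must convert the input signal into multi-bit digital data. The planned circuit specifications are a signal bandwidth above 6 GHz, a resolution above 4 bits, and an effective sampling rate above 10 GSamples/s. This year's focus is on the analog-to-digital converter and the clock generator. The circuits will be fabricated in an advanced CMOS process, such as 0.13 µm, for verification. Because the sample-and-hold circuits and the analog-to-digital converters use a parallel architecture to raise the effective sampling rate, the project also develops the calibration techniques needed to correct gain mismatch, offset mismatch, and sampling-phase mismatch. The fabricated chips will be packaged with Silicon-on-Package (SoP) technology to facilitate high-frequency measurement.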

Keywords: mixed-signal integrated circuits, analog front-end circuits, optical-fiber transmission, CMOS.

### Abstract

This project is to design and realize analog front-end integrated circuits for high-speed optical fiber communication systems. In addition to supporting a data transmission rate above 10 Gb/s, the front-end, with sufficient dynamic range, will convert the received optical signals into multi-bit digital data, so that signal processing techniques can be used to improve transmission efficiency. The specifications for the circuits are at least 6-GHz signal bandwidth, at least 4-bit resolution, and at least 10-GSamples/s effective sampling rate. This year, we focused on the design of the analog-to-digital converters and clock generators. All circuits will be realized using advanced fabrication technologies, e.g., 0.13 µm CMOS. Since a parallel architecture will be applied in the sample-and-hold and analog-to-digital functions, calibration techniques will also be developed to correct errors such as gain mismatch, offset mismatch, and sampling-phase mismatch. In addition, silicon-on-package (SoP) technology will be used for packaging the fabricated chip, so as to facilitate high-frequency characterization.

Key Words: Mixed-Signal Integrated Circuits, Analog Front-End Circuits, Optical Fiber Transmission, CMOS.

## II. Background and Objectives

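Optical fiber is currently the communication medium with the highest transmission capacity. It offers wide bandwidth, low-loss signal transmission, and good immunity to external noise. However, when the data rate exceeds 10 Gb/s and the transmission distance reaches tens to hundreds of kilometers, signal transmission over fiber can no longer be regarded as ideal. The main impairments are attenuation, differential group delay, and chromatic dispersion. Although different types of fiber can be combined for a first-order compensation, using an equalizer in the receiver circuit to further equalize the signal yields more accurate and stable transmission performance. In addition, multi-level signaling uses the limited bandwidth more efficiently and is another way to overcome circuit bandwidth limits and the non-idealities of the optical channel. For long-distance, ultra-high-speed transmission, channel coding with forward error correction, which corrects transmission errors and improves transmission efficiency, will inevitably become a trend. All of these developments affect receiver design considerations. As shown in Fig. 1, advanced receivers will adopt digital signal processing (DSP) techniques. The received signal can no longer be treated as having only the two levels 0 and 1. Besides meeting bandwidth and noise requirements, the circuit design must also provide linearity and sufficient dynamic range. For clock and data recovery, a multi-level analog-to-digital converter (ADC) is used together with DSP techniques to obtain the best performance.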



Fig. 1: Advanced receiver architecture.

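This project studies the CMOS analog circuit design techniques required for optical-fiber receivers. The circuits include the transimpedance amplifier (TIA), the programmable-gain amplifier (PGA), the sample-and-hold (S/H) circuit, the analog-to-digital converter (ADC), and the clock generator. The targets are a signal bandwidth above 6 GHz, a resolution above 4 bits, and an effective sampling rate greater than 10 GSamples/s. The project examines the fundamental limits of amplifiers, S/H circuits, ADCs, and clock generators in high-speed applications. Deep-submicron device characteristics, including bandwidth limits, linearity, and noise, are also studied in depth. All designed circuits will be fabricated in the most advanced CMOS processes, such as 0.13 µm.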

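In high-speed circuit design, passive components and the parasitic effects they introduce face severe challenges, and the most important technical bottleneck is IC packaging. In conventional IC packages, thin gold bond wires connect the chip to the external pins or external components. In high-speed applications, the parasitic inductance of the bond wires and the parasitic capacitance of the on-chip bond pads make the frequency response highly non-ideal. Even at the design limit, the cut-off frequency of the equivalent LC circuit is only about 10 to 20 GHz.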
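As a rough check on the 10 to 20 GHz figure, the sketch below evaluates the standard LC resonance formula $f = 1/(2\pi\sqrt{LC})$. The inductance and capacitance values are illustrative assumptions, not numbers from this report, and the function name `lc_cutoff_hz` is hypothetical.

```python
import math


def lc_cutoff_hz(inductance_h, capacitance_f):
    """Cut-off (resonant) frequency of an LC network: 1 / (2*pi*sqrt(L*C))."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))


# Assumed, illustrative parasitics (not taken from the report)
bond_wire_l = 1e-9     # 1 nH bond-wire inductance
bond_pad_c = 100e-15   # 100 fF bond-pad capacitance

f_cut = lc_cutoff_hz(bond_wire_l, bond_pad_c)
print(f"{f_cut / 1e9:.1f} GHz")  # prints 15.9 GHz
```

With these assumed values the result falls inside the quoted range. Halving both L and C would double the cut-off frequency, which is the motivation for removing the bond wire altogether.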

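For this reason, this project adopts flip-chip bonding in place of bond wires. Another silicon substrate serves as the carrier, and several chips with different functions are flipped and bonded onto it for interconnection. Flip-chip bonding needs no bond wires and therefore has almost no parasitic inductance, and its bond pads can be made smaller, which greatly reduces their parasitic capacitance. Besides meeting the needs of high-speed signals, flip-chip bonding also allows chips of different natures to be properly isolated, which avoids crosstalk caused by coupling through a shared silicon substrate. At the system level, flip-chip bonding is a key technology for System-in-a-Package (SIP) integration.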

## III. Results

### A. Time-Interleaved ADC



Fig. 1: Time-interleaved ADC.

A time-interleaved analog-to-digital converter (ADC) employs multiple analog-to-digital conversion (A/D) channels to increase the achievable sampling rate for a given IC technology. Fig. 1 shows a time-interleaved ADC consisting of M A/D channels, i.e., $ADC_1 \cdots ADC_M$. Each A/D channel, including a sample-and-hold amplifier (SHA) and a quantizer (QTZ), is driven by an $f_c$ clock with a different phase. The M clock phases, $\phi_1 \cdots \phi_M$, are equally spaced over one clock period. The entire ADC achieves an equivalent sampling rate of $f_s = f_c \times M$ and N-bit resolution. For this project, we want $N \ge 6$, $f_c \ge 1$ GHz, and $M \ge 10$, to yield a 6-bit 10-GSamples/s ADC system.
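The relation $f_s = f_c \times M$ and the round-robin use of the channels can be sketched as follows. This is a minimal illustration; the function name `interleave_schedule` and the choice of 12 printed samples are assumptions made for the example.

```python
def interleave_schedule(f_c_hz, m_channels, n_samples):
    """Return (f_s, schedule) for an M-way time-interleaved ADC.

    schedule is a list of (sample index n, channel number 1..M, sampling time in s).
    """
    f_s = f_c_hz * m_channels          # equivalent sampling rate
    t_s = 1.0 / f_s                    # spacing between adjacent clock phases
    schedule = []
    for n in range(n_samples):
        channel = n % m_channels + 1   # ADC_1 ... ADC_M take turns
        schedule.append((n, channel, n * t_s))
    return f_s, schedule


f_s, sched = interleave_schedule(1e9, 10, 12)
print(f_s)
for n, channel, t in sched:
    print(n, channel, t)
```

Each channel still converts at only $f_c$, so the per-channel circuits see a relaxed speed requirement while the array as a whole delivers one sample every $1/f_s$.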

Although the parallel architecture shown in Fig. 1 can achieve a very high sampling rate, mismatches among the A/D channels introduce additional errors in normal A/D conversion. These mismatches, including sampling phase mismatch, gain mismatch, and offset mismatch, must be removed or calibrated in order to attain high resolution.
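To give a feel for how channel mismatch limits resolution, the following sketch samples a sine wave with an M-channel interleaved array and compares it with ideal sampling. It is a simplified behavioral model written for illustration: quantization is ignored, and the function name, input frequency, and mismatch values are all assumptions.

```python
import math


def ti_adc_mismatch_snr_db(m, f_in_norm, gains, offsets, skews, n_samples=4096):
    """SNR limited only by channel mismatch (no quantization).

    f_in_norm: input frequency divided by f_s.
    skews: per-channel sampling-time error in units of T_s = 1/f_s.
    """
    sig_pow = 0.0
    err_pow = 0.0
    for n in range(n_samples):
        ch = n % m
        ideal = math.sin(2 * math.pi * f_in_norm * n)
        actual = gains[ch] * math.sin(2 * math.pi * f_in_norm * (n + skews[ch])) + offsets[ch]
        sig_pow += ideal * ideal
        err_pow += (actual - ideal) ** 2
    if err_pow == 0.0:
        return math.inf
    return 10 * math.log10(sig_pow / err_pow)


m = 4
matched = ti_adc_mismatch_snr_db(m, 0.23, [1.0] * m, [0.0] * m, [0.0] * m)
skewed = ti_adc_mismatch_snr_db(m, 0.23, [1.0] * m, [0.0] * m, [0.0, 0.01, 0.0, -0.01])
print(matched, skewed)
```

In this model a sampling-time error of only 1% of $T_s$ on two of the four channels already brings the SNR down to roughly the level of an ideal 6-bit converter, which is why calibration is needed.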

### B. Background Calibration for Flash ADC



Fig. 2: Flash A/D architecture.

In this project, each A/D channel employs the flash A/D architecture shown in Fig. 2. The N-bit ADC uses $2^N - 1$ comparators to simultaneously compare the input, $V_i$, with $2^N - 1$ references, $V_{R,j}$, where $j = 1, 2, \ldots, 2^N - 1$. The overall digital output $D_o$ is obtained by encoding the binary outputs of the comparators. The flash architecture has the highest A/D conversion speed at a given N for a given technology.
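A behavioral sketch of such a flash converter is given below. It assumes a uniform reference ladder between `v_min` and `v_max` and a ones-counting encoder; both are illustrative choices, and the optional `offsets` argument models per-comparator input offsets.

```python
def flash_adc(v_in, n_bits, v_min=0.0, v_max=1.0, offsets=None):
    """Behavioral N-bit flash ADC.

    Returns (thermometer code, digital output D_o) using 2^N - 1 comparators.
    """
    n_cmp = 2 ** n_bits - 1
    lsb = (v_max - v_min) / 2 ** n_bits
    if offsets is None:
        offsets = [0.0] * n_cmp
    # References V_R,j for j = 1 .. 2^N - 1 (uniform ladder, an assumption)
    refs = [v_min + j * lsb for j in range(1, n_cmp + 1)]
    # All comparators decide at the same time
    thermometer = [1 if v_in > refs[j] + offsets[j] else 0 for j in range(n_cmp)]
    d_out = sum(thermometer)  # ones-counting encoder
    return thermometer, d_out


code, d_o = flash_adc(0.55, 3)
print(code, d_o)
```

Because every comparator has its own offset, a large offset on one comparator shifts one code transition and directly degrades linearity, which is the problem addressed next.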

For high-speed CMOS flash ADCs, it is the random input-referred offset voltages of the comparators that determine the ADC's linearity. The offset of a comparator with a symmetric circuit configuration is caused by device mismatches. There exists a fundamental trade-off among the speed, power, and accuracy of a CMOS flash ADC. To overcome this speed-power-accuracy limitation, we have developed a new background calibration technique that can automatically trim the comparator's input offset voltage according to the statistical characteristics of the input signals [1] [2].

Fig. 3 shows the block diagram of the proposed background-calibrated comparator (BCC), which is composed of a random chopping comparator (RCC) and a calibration processor (CP). The entire CP is realized in the digital domain. The ACC1 accumulator records the difference between the number of $D_a[k]=1$ occurrences for $q[k]=+1$ and for $q[k]=-1$. The bilateral peak detector (BPD) monitors the ACC1 output, $R[k]$, and generates a corresponding triple-valued output, $S[k] \in \{+1,-1,0\}$. The BPD has two thresholds, $+N_C$ and $-N_C$. When $R[k] \ge +N_C$ or $R[k] \le -N_C$, we have $S[k] = +1$ or $S[k] = -1$, respectively; otherwise, $S[k] = 0$. In addition, the ACC1 accumulator is reset to zero whenever $S[k] = +1$ or $S[k] = -1$. The $S[k]$ sequence is integrated by the ACC2 accumulator. Its output, $T[k]$, controls the comparator's input offset voltage, which can be expressed as $V_{OS}[k] = V_{OS}[0] + \Delta V \times T[k]$, where $V_{OS}[0]$ is the initial offset and $\Delta V$ is the offset quantization step of the comparator.



Fig. 3: A background-calibrated comparator (BCC).

There are two design parameters in this calibration scheme, i.e., $\Delta V$ and $N_C$, both of which affect the convergence speed as well as the variation of the offset. A large $\Delta V$ and a small $N_C$ result in fast convergence but large fluctuation in $V_{OS}$. On the other hand, a small $\Delta V$ and a large $N_C$ result in small $V_{OS}$ fluctuation but also slow convergence.
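The calibration loop and the trade-off between $\Delta V$ and $N_C$ can be explored with a small behavioral simulation. The sketch below is not the circuit of Fig. 3. It assumes that the input minus the reference is uniformly distributed, that the comparator outputs 1 when its chopped input exceeds $V_{OS}$, and that an output chopper restores the polarity before accumulation. These modeling choices, the function names, and the parameter values are assumptions made so that the update law $V_{OS}[k] = V_{OS}[0] + \Delta V \times T[k]$ forms a negative-feedback loop.

```python
import random


def simulate_bcc(v_os0, delta_v, n_c, n_samples, seed=1):
    """Behavioral sketch of the background offset calibration; returns V_OS over time."""
    rng = random.Random(seed)
    r = 0   # ACC1 output R[k]
    t = 0   # ACC2 output T[k]
    history = []
    for _ in range(n_samples):
        v_os = v_os0 + delta_v * t         # V_OS[k] = V_OS[0] + dV * T[k]
        q = rng.choice((+1, -1))           # random chopping sequence q[k]
        v = rng.uniform(-1.0, 1.0)         # assumed input minus reference
        raw = 1 if q * v > v_os else 0     # comparator sees the chopped input
        d = raw if q == +1 else 1 - raw    # output chopper restores polarity
        r += q * d                         # ones for q=+1 minus ones for q=-1
        if r >= n_c:                       # bilateral peak detector
            s = +1
        elif r <= -n_c:
            s = -1
        else:
            s = 0
        if s != 0:
            r = 0                          # ACC1 is reset when the BPD fires
            t += s                         # ACC2 integrates S[k]
        history.append(v_os)
    return history


def rms(values):
    return (sum(x * x for x in values) / len(values)) ** 0.5


fast = simulate_bcc(0.5, 0.04, 4, 200_000)    # large step, small threshold
slow = simulate_bcc(0.5, 0.01, 16, 200_000)   # small step, large threshold
print(rms(fast[100_000:]), rms(slow[100_000:]))
```

In this model both settings drive the offset toward zero, and the residual RMS offset of the second setting is noticeably smaller, at the cost of a longer settling time. The loop uses only counters and comparisons, so it needs no multiplier.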

To further reduce $V_{\rm OS}$ fluctuation in flash ADC applications, the windowed BCCs shown in Fig. 4 can be used. In this architecture, all comparator outputs $D_{\rm c}$ are fed into a thermometer-code edge detector (TCED) to generate an edge code $D_{\rm e}$. Only one $D_{\rm e}$ output is active at a time: the one located at the 1-0 transition edge of the comparator outputs $D_{\rm c}$. The CP then uses $D_{\rm e}$ as its input instead of $D_{\rm c}$. In this arrangement, the $P_{\rm l}$ probability perceived by the BCC is drastically reduced, and thus the $V_{\rm OS}$ fluctuation is reduced.
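One plausible way to express the edge detection is shown below. The sketch assumes the comparator outputs are ordered from the lowest reference to the highest and that the level above the top comparator is treated as 0; the function name `thermometer_edge` is hypothetical, and the actual TCED logic in Fig. 4 may differ.

```python
def thermometer_edge(d_c):
    """Mark the 1-0 transition of a thermometer code.

    d_c[j] is the output of comparator j, ordered from lowest to highest reference.
    Returns D_e with a 1 only where d_c[j] = 1 and the next comparator reads 0.
    """
    n = len(d_c)
    d_e = []
    for j in range(n):
        upper = d_c[j + 1] if j + 1 < n else 0   # above the top comparator: 0
        d_e.append(1 if d_c[j] == 1 and upper == 0 else 0)
    return d_e


print(thermometer_edge([1, 1, 1, 0, 0, 0, 0]))  # [0, 0, 1, 0, 0, 0, 0]
```

For a clean thermometer code the result is one-hot, so each calibration processor is updated only when the input lies next to its own threshold. A code with bubble errors would produce more than one active output in this simple form.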



Fig. 4: A flash ADC employing windowed BCC.

Fig. 5 shows the $N_C$ and $\Delta V$ dependence of the BCC $V_{OS}$ standard deviation in the 6-bit ADC of Fig. 4. For a 6-bit ADC design, one can choose $\Delta V = 1/4$ LSB and $N_C = 16$, which lead to a $\sigma(V_{OS})$ of 0.13 LSB.



Fig. 5: BCC  $V_{OS}$  standard deviation in a 6-bit ADC with various  $N_C$  and  $\Delta V$  values.

### C. Background Timing-Skew Calibration

Referring to Fig. 1, we can assume that the M clocks from the local clock generator, $\phi_1, \phi_2, \cdots, \phi_M$, are accurate and have uniformly spaced phases. The mismatches among the clock routes from the clock generator to the SHAs cause timing skews that must be corrected to avoid A/D resolution degradation. We have developed a new background calibration technique that can automatically detect and trim timing skews by observing the inter-channel zero crossings [3].

The upper half of Fig. 6 illustrates the proposed timing-skew detection scheme. Two choppers, a clock chopper and a data chopper, are placed at the outputs of the clock generator and at the outputs of the A/D channels. The choppers are controlled by a binary-valued random sequence, $q[k] \in \{+1,-1\}$. When $q = +1$, the choppers' outputs are the same as their corresponding inputs. When $q = -1$, the choppers' outputs are exchanged. The timing skew between $x_i$ and $x_{i+1}$ is then $q \times (\tau_i - \tau_{i+1})$. The polarity of the timing skew, $\tau_i - \tau_{i+1}$, can be detected by observing the change of zero-crossing probability between $x_i$ and $x_{i+1}$ due to the different states of $q$. Once the polarity of the timing skew is obtained, the timing delay $\tau_{i+1}$ can be adjusted to minimize the skew. The calibration processor employed in Fig. 6 is identical to that in Fig. 3. It has a bilateral peak detector (BPD) with a threshold of $N_C$. The ACC2 accumulator digitally controls the $\tau_{i+1}$ delay as $\tau_{i+1}[k] = \tau_{i+1}[0] + \mu_t \times T[k]$. As in the case of comparator calibration in Section III.B, the two design parameters, $N_C$ and $\mu_t$, determine the convergence speed and timing jitter of the calibration process.
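The zero-crossing idea can be illustrated with a two-channel behavioral simulation. The sketch below is a simplified model under stated assumptions, not the circuit of Fig. 6: times are in units of the nominal spacing $T_s$, the input is a sine wave with a random phase at every observation, the spacing between the two sampling instants is taken as $1 - q(\tau_i - \tau_{i+1})$, and the sign of the accumulation is chosen so that the update $\tau_{i+1}[k] = \tau_{i+1}[0] + \mu_t \times T[k]$ gives negative feedback. The value `n_c=64` is an assumption chosen to keep the run short.

```python
import math
import random


def simulate_skew_calibration(tau_i, tau_next0, mu_t, n_c, f_in, n_samples, seed=1):
    """Return the residual skew tau_i - tau_{i+1} after calibration.

    All times are in units of T_s; f_in is in cycles per T_s (assumed below 0.5).
    """
    rng = random.Random(seed)
    r = 0   # ACC1 output R[k]
    t = 0   # ACC2 output T[k]
    for _ in range(n_samples):
        tau_next = tau_next0 + mu_t * t              # digitally controlled delay
        q = rng.choice((+1, -1))                     # chopper state q[k]
        spacing = 1.0 - q * (tau_i - tau_next)       # assumed sampling spacing
        phase = rng.uniform(0.0, 2.0 * math.pi)      # asynchronous sine input
        x_a = math.sin(phase)
        x_b = math.sin(phase + 2.0 * math.pi * f_in * spacing)
        z = 1 if x_a * x_b < 0.0 else 0              # zero crossing in between
        r -= q * z                                   # assumed sign convention
        if r >= n_c:
            s = +1
        elif r <= -n_c:
            s = -1
        else:
            s = 0
        if s != 0:
            r = 0                                    # reset ACC1
            t += s                                   # integrate S[k] in ACC2
    return tau_i - (tau_next0 + mu_t * t)


residual = simulate_skew_calibration(
    tau_i=0.0, tau_next0=0.1, mu_t=1.0 / 256, n_c=64, f_in=0.31, n_samples=300_000
)
print(residual)
```

A wider spacing makes a zero crossing between the two samples more likely, so the difference in zero-crossing counts between the two chopper states reveals the polarity of the skew without any knowledge of the input waveform.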



Fig. 6: Timing-skew detection and calibration for two channels.

Fig. 7 shows the timing-skew calibration scheme for the entire M-channel time-interleaved ADC. The clock generator produces M clocks with an identical frequency of $f_c$ and equally spaced phases. The clocks pass through the clock choppers and the digitally controlled delay units to generate $CK_1, CK_2, \ldots, CK_M$, which control the sampling timing of $ADC_1, ADC_2, \ldots, ADC_M$, respectively. The calibration processor (CP) adjusts the digitally controlled delay units to minimize the timing skews among the A/D channels. The timing skews are caused by mismatches among the clock routes from the outputs of the clock choppers to the sample-and-hold amplifiers in the A/D channels. The CP is purely digital and operates at a clock rate of $f_c$. It consists of only comparators, adders, and registers, and requires no multi-bit multiplier.



Fig. 7: Full-system timing-skew calibration.

As an example, we want to design a 6-bit 16-channel ADC, i.e., $M = 16$. The sampling interval between adjacent channels is $T_s$. Fig. 8 shows the signal-to-noise ratio of the ADC output due to timing jitter, $SNR_{\tau}$, for various values of $N_C$ and $\mu_t$. The input is assumed to be an asynchronous sine wave. To achieve 6-bit resolution, we choose $N_C = 2^9$ and $\mu_t = T_s/2^8$, which lead to $SNR_{\tau} \approx 37$ dB.
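The 37 dB target can be related to 6-bit resolution with two textbook formulas: the ideal quantization SNR of an N-bit converter for a full-scale sine wave, and the jitter-limited SNR $-20\log_{10}(2\pi f_{in}\sigma_\tau)$ for a sine input. The sketch below uses these standard expressions only as a sanity check. The values in Fig. 8 come from the report's own analysis, and the 8-GHz input frequency in the example is an assumption.

```python
import math


def ideal_sqnr_db(n_bits):
    """Ideal quantization SNR of an N-bit ADC with a full-scale sine input."""
    return 20 * math.log10(2 ** n_bits) + 10 * math.log10(1.5)


def jitter_snr_db(f_in_hz, sigma_t_s):
    """SNR limited by rms sampling jitter sigma_t for a sine input at f_in."""
    return -20 * math.log10(2 * math.pi * f_in_hz * sigma_t_s)


def max_rms_jitter_s(f_in_hz, snr_db):
    """Largest rms jitter that still gives the requested jitter-limited SNR."""
    return 10 ** (-snr_db / 20) / (2 * math.pi * f_in_hz)


print(round(ideal_sqnr_db(6), 2))       # about 37.88 dB
print(max_rms_jitter_s(8e9, 37.0))      # allowed rms jitter for an assumed 8-GHz input
```

Under these assumptions the allowed rms jitter at an 8-GHz input is a small fraction of a picosecond, which indicates how fine the delay step $\mu_t$ must be.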



Fig. 8: The  $SNR_{\tau}$  versus  $N_{C}$  and  $\mu_{t}$  for a 6-bit 16-channel ADC.

### D. A 6-Bit 16-GS/s Time-Interleaved ADC

We have designed a 6-bit 16-GS/s time-interleaved (TI) ADC in a 0.13 $\mu m$ CMOS technology. The ADC consists of 8 A/D channels. Each A/D channel is a 6-bit 2-GS/s flash ADC using the architecture of Fig. 4 and the background-calibrated comparators (BCC) of Fig. 3. Operating from a single 1.2 V supply, the power consumption of each A/D channel is expected to be less than 90 mW. This low power consumption is made possible by the BCC's digital background calibration.

The architecture of the TI ADC is similar to the one shown in Fig. 7, which includes timing-skew calibration for multi-channel input sampling. The calibration scheme is simplified from the technique described in subsection C. Fig. 9 shows the layout of this TI ADC. The chip area is 2.01 × 2.83 mm<sup>2</sup>. Operating from a single 1.2 V supply, the ADC's total power consumption is expected to be less than 800 mW.
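The headline numbers can be cross-checked with simple arithmetic. The sketch below only combines the upper bounds quoted in this subsection; the function name `ti_adc_budget` is hypothetical, and what the remaining power covers (for example clock generation and calibration logic) is an assumption, since the report does not break it down.

```python
def ti_adc_budget(channels, rate_per_channel, channel_power_mw, total_power_mw):
    """Combine per-channel figures into array-level figures."""
    total_rate = channels * rate_per_channel               # samples per second
    channels_power_mw = channels * channel_power_mw        # all A/D channels
    remaining_mw = total_power_mw - channels_power_mw      # left for everything else
    energy_per_sample_pj = total_power_mw * 1e-3 / total_rate * 1e12
    return total_rate, channels_power_mw, remaining_mw, energy_per_sample_pj


rate, ch_mw, rest_mw, pj = ti_adc_budget(8, 2e9, 90, 800)
print(rate, ch_mw, rest_mw, pj)
```

With the quoted bounds, the eight channels account for at most 720 mW of the 800-mW budget.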



Fig. 9: Time-interleaved ADC layout.

## IV. Conclusions and Self-Assessment

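The focus of this project in fiscal year 94 (2005 to 2006) was the design of a high-speed time-interleaved ADC. The main problems of this architecture include sampling-time errors and gain/offset mismatches among the sub-ADCs. We use the flash A/D architecture for the sub-ADCs, which avoids the gain/offset mismatch problem.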

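For the design of a single flash sub-ADC, we invented a method that automatically trims the offsets of the comparators inside the ADC. This makes it possible to design ADCs with smaller area and lower power. The method is fully digital and does not interfere with the normal operation of the ADC. The new technique will be used to design a 6-bit 2-GSamples/s ADC.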

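In addition, for the time-interleaved ADC, we developed a method that detects sampling-time errors and uses them to control a multi-phase clock generator, so that the sampling-time errors are removed automatically. Applying this technique, we will design a digitally controlled multi-phase clock generator circuit. It outputs 8 clocks of different phases simultaneously, at a frequency that can exceed 2 GHz. This clock generator automatically adjusts its output phases to eliminate A/D sampling-time errors.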

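Both automatic calibration techniques are supported by detailed theoretical analysis, have been published as IEEE journal papers [1] [3], and have patent applications pending.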

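To verify these calibration techniques, we have designed a time-interleaved ADC chip with 8 flash sub-ADCs. The chip includes the required multi-phase clock generator and automatically corrects sampling-time errors. The design is currently being fabricated in a 0.13 μm CMOS process. The final goal is an 800-mW 6-bit 16-GS/s ADC chip.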

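The academic papers published under this project are listed in the next section. After formal publication, they will also be posted on the principal investigator's web page: http://www.cc.nctu.edu.tw/~jtwu.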

## V. References

- [1] C-C Huang and J-T Wu, "A Background Comparator Calibration Technique for Flash Analog-to-Digital Converters," *IEEE Transactions on Circuits and Systems I: Regular Papers*, Vol. 52, No. 9, pp. 1732-1740, Sept. 2005.
- [2] C-C Huang and J-T Wu, "A Statistical Background Calibration Technique for Flash Analog-to-Digital Converters," *2004 IEEE International Symposium on Circuits and Systems*, pp. I-125-I-128, May 2004.
- [3] C-Y Wang and J-T Wu, "A Background Timing-Skew Calibration Technique for Time-Interleaved Analog-to-Digital Converters," *IEEE Transactions on Circuits and Systems II: Express Briefs*, Vol. 53, No. 4, pp. 299-303, April 2006.