# A CMOS 6-mW 10-bit 100-MS/s Two-Step ADC

Yung-Hui Chung, Student Member, IEEE, and Jieh-Tsorng Wu, Senior Member, IEEE

*Abstract*: A 10-bit 100-MS/s two-step ADC was fabricated using a 90 nm CMOS technology. To reduce power consumption, the ADC uses latch-type comparators for signal digitization and an open-loop amplifier for residue amplification. The accuracy of the comparators is improved by offset calibration. The gain accuracy and the linearity of the residue amplifier are enhanced by digital background calibration. The ADC consumes 6 mW from a 1 V supply. Measured SNR and SFDR are 58.2 dB and 75 dB respectively. Measured ENOB is 9.34 bits. The FOM is 100 fJ·V per conversion-step.

*Index Terms*: Analog-digital conversion, calibration, comparators (circuits), subranging ADC, two-step ADC.

## I. INTRODUCTION

A Nyquist-rate analog-to-digital converter (ADC) samples and digitizes an analog signal by using a combination of comparators, amplifiers, analog switches, and digital circuits. Many factors are considered in choosing an ADC architecture, including sampling rate, resolution, power consumption, input loading, chip area, and fabrication technology. In this paper, we examine the two-step ADC architecture and demonstrate its performance in nanoscale CMOS technology.

The subranging conversion architecture has been used in high-speed ADCs [1]-[8]. Fig. 1 shows a conventional 10-bit subranging ADC. It contains a 5-bit coarse ADC (CADC) and a 6-bit fine ADC (FADC), both of which are flash ADCs comprising only comparators. A resistor string generates voltage references for both ADCs. The CADC compares the analog input $V_1$ with 31 $V_{RC}$ references to determine which subrange $V_1$ lies in. A multiplexer (MUX) then selects 63 references within that subrange to serve as the $V_{\rm RF}$ references for the FADC. The 1-bit redundancy of the FADC provides over-range protection [2], so that the accuracy requirement for the CADC can be relaxed.
The FADC still needs 10-bit accuracy. The spatial averaging technique has been used to improve the accuracy of high-speed comparators [9], [10]. For this 10-bit subranging ADC, the main critical delay path is the MUX. It receives the digital output $D_1$ from the CADC, and then employs analog switches to select 63 $V_{\rm RF}$ references out of 1023 dc voltages generated by the resistor string. The complexity of the MUX can be mitigated by using the interpolation technique [3]–[8]. Both the averaging and the interpolation techniques require amplifiers.

Fig. 1. A 10-bit subranging ADC architecture.

Fig. 2. A 10-bit two-step ADC architecture.

Manuscript received January 21, 2010; revised April 23, 2010; accepted June 01, 2010. Date of current version October 22, 2010. This paper was approved by Guest Editor Mototsugu Hamada. This work was supported by the National Science Council (Grant NSC-98-2221-E-009-131-MY2) of Taiwan, R.O.C., and the MediaTek Research Center at National Chiao-Tung University.

Y.-H. Chung is with the Department of Electronics Engineering and Institute of Electronics, National Chiao-Tung University, Hsin-Chu 300, Taiwan (e-mail: p9211830@alab.ee.nctu.edu.tw).

J.-T. Wu is with the Department of Electronics Engineering and Institute of Electronics, National Chiao-Tung University, Hsin-Chu, Taiwan (e-mail: jtwu@mail.nctu.edu.tw).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/JSSC.2010.2063590

To mitigate the complexity of the MUX, the two-step ADC architecture has been proposed [11]-[13]. Fig. 2 shows a 10-bit two-step ADC, which is a variation of the subranging architecture. Similar to the subranging ADC shown in Fig. 1, it contains a 5-bit CADC and a 6-bit FADC. However, the $V_{\rm RF}$ references for the FADC are fixed. The MUX only needs to select one voltage out of the 32 possible dc voltages from the resistor string.
The MUX functions as a digital-to-analog converter (DAC), which is denoted as the resistor-string DAC (RDAC). The RDAC output $V_{da}$ represents an estimate of $V_1$ made by the CADC and the RDAC. In Fig. 2, the residue amplifier (RAMP) amplifies the difference $V_1 - V_{da}$. Its output $V_2$ is then digitized by the FADC. The RAMP is a linear amplifier. Its linearity must meet the resolution requirement of the FADC. However, its amplification gain also relaxes the accuracy requirement for the FADC. The principle of residue amplification is identical to that of pipelined ADCs [14].

In this paper, we describe a 10-bit 100-MS/s two-step ADC. The ADC was fabricated using a 90 nm CMOS technology [15]. To take advantage of the nanoscale CMOS transistors, we minimize the use of amplifiers, and employ latches and digital circuits to compensate for the analog functions otherwise provided by the amplifiers. We use latch-type comparators to construct the CADC and the FADC. The offsets of the comparators are reduced by a mixed-signal offset calibration scheme. We use a simple single-stage amplifier to realize the RAMP. The gain error and nonlinearity of the RAMP are corrected by digital calibration, which operates continuously in the background.

The rest of this paper is organized as follows. Section II shows the architecture of the two-step ADC. Circuit designs of key functional blocks are discussed in Section III. Section IV describes the digital calibration scheme. Section V shows the experimental results. Section VI draws conclusions. In addition, Appendix A details the theory and derivation of the digital calibration described in Section IV.

## II. ADC ARCHITECTURE

Fig. 3. Proposed two-step ADC architecture.

The proposed 10-bit two-step ADC architecture is shown in Fig. 3. The ADC operates with two non-overlapping clocks, $\phi_1$ and $\phi_2$. The duty cycles for $\phi_1$ and $\phi_2$ are 25% and 75% respectively.
During $\phi_1=1$, the coarse ADC (CADC) compares the analog input $V_1$ with 33 coarse references $V_{RC}$ to estimate the magnitude of $V_1$, yielding the 5-bit digital output $D_1$. The $V_{RC}$ references are generated from a resistor string. The $D_1$ code drives an analog multiplexer (MUX) to select a voltage from the resistor string. The MUX is called the resistor-string digital-to-analog converter (RDAC). Its output, $V_{\rm da}$, is an estimate of $V_1$. During $\phi_1=1$, the analog input $V_1$ is also sampled onto the sampling capacitor $C_s$.

During $\phi_2=1$, the residue amplifier (RAMP) amplifies the difference between $V_1$ and $V_{\rm da}$, yielding the residue signal $V_2$. The RAMP is an open-loop amplifier with a nominal voltage gain of 8. The fine ADC (FADC) then compares the residue $V_2$ with 65 fine references $V_{\rm RF}$ to estimate the magnitude of $V_2$, yielding the 6-bit digital output $D_2$. The FADC has an input range of 64 steps. In an ideal two-step ADC, the FADC needs only an input range of 32 steps. The 1-bit redundancy is added to tolerate the gain error and offset of the RAMP, the comparator offset and metastability of the CADC, and the offset and nonlinearity of the RDAC. It is also used to accommodate the extra signal range required by the RAMP digital calibration. The RAMP voltage gain relaxes the FADC resolution requirement.

The analog signal path of the ADC is fully differential. The ADC differential input range is 2 V. One LSB is 1.95 mV. The FADC has a differential input range of 1 V and a step size of 8 LSB. The CADC output $D_1$ is an integer between -16 and +15, and the FADC output $D_2$ is an integer between -32 and +31.

To reduce power consumption, the RAMP is a simple open-loop amplifier. Its gain error and nonlinearity are corrected by the digital calibration processor (DCP) shown in Fig. 3. The DCP receives the $D_2$ code from the FADC and generates a corrected $D_2^c$ code.
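The coarse/fine split and the role of the 1-bit redundancy can be illustrated with a small behavioral sketch. It is not the chip's logic: it assumes ideal blocks, works in units of the 10-bit LSB, refers the FADC thresholds back to the RAMP input (one fine step per LSB), omits the calibration signal, and lumps all CADC comparator errors into one hypothetical threshold shift `cadc_offset_lsb`. The function name and the reconstruction $32 \cdot D_1 + D_2$ are illustrative choices.

```python
import math

COARSE_STEP = 32          # LSBs per CADC code (5-bit coarse in a 10-bit ADC)
D1_MIN, D1_MAX = -16, 15  # CADC output range stated in the text
D2_MIN, D2_MAX = -32, 31  # FADC output range (64 steps, 1-bit redundancy)


def two_step_convert(v1_lsb, cadc_offset_lsb=0.0):
    """Ideal two-step conversion of an input expressed in LSB units."""
    # Coarse decision; a comparator offset shifts the decision thresholds.
    d1 = math.floor((v1_lsb + cadc_offset_lsb) / COARSE_STEP + 0.5)
    d1 = max(D1_MIN, min(D1_MAX, d1))
    # Residue seen by the RAMP: input minus the RDAC estimate.
    residue = v1_lsb - COARSE_STEP * d1
    # Fine decision referred to the RAMP input; it clips outside 64 steps.
    d2 = max(D2_MIN, min(D2_MAX, math.floor(residue)))
    return COARSE_STEP * d1 + d2


if __name__ == "__main__":
    print(f"LSB = {2.0 / 1024 * 1e3:.2f} mV")
    expected = list(range(-512, 512))
    for offset in (-12.0, 0.0, 12.0):
        codes = [two_step_convert(v + 0.25, offset) for v in range(-512, 512)]
        print(f"offset {offset:+.0f} LSB -> codes correct: {codes == expected}")
```

In this model a coarse threshold error is absorbed as long as the residue stays inside the 64-step FADC range; a shift larger than 16 LSB pushes some residues outside it and the output code becomes wrong.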
An encoder then combines $D_1$ and $D_2^c$ to produce the final ADC digital output $D_o$. The DCP also generates a digital random sequence $q \in \{-1,0,+1\}$. The $q$ sequence also drives the RDAC so that a random signal is injected into the RAMP. The DCP uses this random signal to calibrate the RAMP in the background.

## III. CIRCUIT DESIGN

### A. Comparator

The CADC is a 5-bit flash ADC consisting of 33 comparators, a thermometer-code decoder, and a dynamic encoding ROM. Fig. 4 shows the architecture of the comparator in the CADC. It compares the input $V_1$ with a reference $V_{RC}[n]$, where $n$ is an integer between 1 and 33 that indexes one of the $V_{RC}$ coarse references. The comparator includes a regenerative latch with an offset calibration control loop. To reduce power consumption, there is no conventional preamplifier. The $V_{OS}$ in front of the latch represents the input-referred offset of the latch due to device mismatches. The $V_{cm}$ represents the input common-mode voltage. The latch is triggered by the clock $\phi_c$. Comparisons are made near the beginnings of both $\phi_1$ and $\phi_2$ periods. The comparison determines the polarity of the differential voltage at the input port $V_a$, but with an equivalent input offset of $V_{OS}+V_c-V_{cm}$.

Fig. 4. CADC comparator architecture.

In Fig. 4, the switch S3 is controlled by the clock $\phi_{1a}$, which is an advanced version of the clock $\phi_1$. The switch S3 is opened before the switch S1 so that the bottom-plate sampling operation is enabled. The $C_1$ sampling capacitor is a 25 fF metal-oxide-metal capacitor. The ac coupling of the $C_1$ input network causes 10% signal loss.

In Fig. 4, the effect of $V_{OS}$ is removed by the offset-calibration charge pump (OCCP), similar to a prior design [16]. During $\phi_1=1$, the ADC analog input $V_1$ is sampled onto the capacitor $C_1$ and the latch input $V_a$ is connected to the common-mode voltage $V_{cm}$.
The latch then makes a calibration comparison. If the comparison result $D_c$ is 1, an up pulse is generated in the OCCP, and $V_c$ is increased by charging the capacitor $C_2$. If $D_c$ is 0, a down pulse is generated in the OCCP, and $V_c$ is decreased by discharging the capacitor $C_2$. Voltage $V_c$ eventually converges to $V_{cm}-V_{OS}$. The effect of $V_{OS}$ is then cancelled. During $\phi_2=1$, the capacitor $C_1$ is connected to $V_{RC}[n]$. The latch then makes a conversion comparison, and the output $D_c$ represents the polarity of $V_1-V_{RC}[n]$.

Fig. 5 shows the latch schematic. There are two input ports. One port receives the differential input $V_a$. The other port receives the difference between $V_c$ and $V_{cm}$, where $V_{cm}$ is a common-mode reference and $V_c$ adjusts the offset of the latch. Transistors M3 and M7 are added to reduce the conducting currents when the latch is turned on. Kickback noise at the inputs of the latch is also reduced. The sizes of these transistors are minimized without considering the matching requirements. From Monte Carlo simulation results, the offset standard deviation $\sigma(V_{OS})$ is about 50 mV.

Fig. 5. Schematic of the latch in CADC comparator.

Fig. 6. FADC comparator architecture.

The $C_2$ capacitor in the OCCP is realized using an nMOS transistor. Its capacitance is 1 pF. The output currents of the charge-pump current sources, $I_p$ and $I_n$, are 1 $\mu$A. The width of the up and down pulses is 1 ns. Thus, in each calibration step, $V_c$ is changed by 1 mV, which is about 1/2 LSB. After the offset calibration settles, $V_c$ may vary in the same direction for at most two consecutive calibration steps, yielding a worst-case fluctuation of $\pm 1$ mV. In other words, the comparator offset is reduced to less than 1 mV by the OCCP. The $V_c$ fluctuation can be affected by $I_p$, $I_n$, $C_2$, and the width of the up and down pulses.
Their variations are tolerated due to the FADC 1-bit redundancy. The matching between $I_p$ and $I_n$ is not crucial. It affects only the ratio of the up and down pulses. Operating at 100 MHz clock rate, each CADC comparator consumes 18 $\mu$W. The entire CADC, including comparators, decoder, ROM, and clock buffers, consumes 0.8 mW. The total input capacitance of the CADC is 0.8 pF.

The FADC is a 6-bit flash ADC. It includes 65 comparators. Fig. 6 shows the architecture of the FADC comparator. It compares the RAMP output $V_2$ with a reference $V_{\rm RF}[n]$, where $n$ is an integer between 1 and 65 that indexes one of the $V_{\rm RF}$ fine references. Similar to the CADC comparator, it includes a regenerative latch and an offset-calibration charge pump (OCCP). The latch is triggered by the clock $\phi_f$. Comparisons are made near the ends of both $\phi_1$ and $\phi_2$ periods. During $\phi_1 = 1$, both input ports $V_a$ and $V_r$ are connected to the $V_{RF}[n]$ reference. The latch makes a calibration comparison, and the OCCP then adjusts $V_c$ to minimize the input offset. The $V_c$ fluctuation for the FADC comparator should be less than $\pm 8$ mV, i.e., $\pm 1/2$ of the FADC input step size. Near the end of $\phi_2 = 1$, the latch makes a conversion comparison, and the resulting $D_f$ represents the polarity of $V_2 - V_{RF}[n]$. Unlike the CADC comparator, the FADC comparator does not employ a switched-capacitor network to perform the $V_2 - V_{RF}[n]$ subtraction. This avoids extra capacitive loading for the RAMP.

Fig. 7. Schematic of the latch in FADC comparator.

Fig. 8. Schematic of the residue amplifier (RAMP).

Fig. 7 shows the schematic of the latch in the FADC comparator. It has three input source-coupled pairs. The M1-M2 pair is connected to the positive terminals of the input ports $V_a$ and $V_r$, while the M5-M6 pair is connected to the negative terminals.
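The behavior of the offset-calibration charge pump used in both comparator types can be sketched as a bang-bang loop. This is a minimal model under stated assumptions: the step size follows $\Delta V = I \cdot t / C$ with the CADC values quoted above (1 $\mu$A, 1 ns, 1 pF), the common-mode voltage of 0.5 V and the starting point of $V_c$ are hypothetical, the latch is noiseless, and the polarity of $D_c$ is chosen so that the loop has negative feedback.

```python
def occp_step(i_pump=1e-6, pulse_width=1e-9, c2=1e-12):
    """Voltage change on C2 per up/down pulse: dV = I * t / C."""
    return i_pump * pulse_width / c2


def calibrate_offset(v_os, v_cm=0.5, n_cycles=500):
    """Bang-bang loop: V_c moves by one step per calibration comparison."""
    step = occp_step()
    v_c = v_cm  # assumed starting point, not taken from the paper
    for _ in range(n_cycles):
        # Assumed polarity: D_c = 1 while V_c is below its target V_cm - V_OS.
        d_c = 1 if (v_cm - v_os) - v_c > 0 else 0
        v_c += step if d_c else -step
    return v_c


if __name__ == "__main__":
    print(f"step = {occp_step() * 1e3:.2f} mV")
    for v_os in (-0.15, -0.05, 0.0373, 0.15):  # up to 3 sigma of 50 mV
        residual = calibrate_offset(v_os) - (0.5 - v_os)
        print(f"V_OS = {v_os * 1e3:+7.1f} mV, residual = {residual * 1e3:+.3f} mV")
```

Once the loop has settled, the modeled $V_c$ toggles around $V_{cm}-V_{OS}$ and never moves more than one step away from it, consistent with the $\pm 1$ mV fluctuation stated in the text.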
Operating at 100 MHz clock rate, each FADC comparator consumes 22 $\mu$W. The entire FADC, including comparators, decoder, ROM, and clock buffers, consumes 1.7 mW.

### B. Residue Amplifier (RAMP)

Fig. 8 shows the residue amplifier (RAMP) schematic. It comprises a switched-capacitor input network and a single-stage differential amplifier. The input sampling switches S1–S6 are nMOS transistors with constant-$V_{gs}$ bootstrapped gate drive [17]. pMOS transistors M3 and M4 are current sources. Resistors $R_1$ and $R_2$ are realized with polysilicon and have a resistance of 5 k$\Omega$. They are used as passive loads to provide better RAMP linearity. Their resistance is close to the output resistance of M1 and M2. pMOS transistor M5 is added to improve the power supply rejection ratio. Half of the tail current in M0 is controlled by a switched-capacitor common-mode feedback (CMFB).

During $\phi_1=1$, the differential input $V_1$ is sampled onto the $C_{s1}$ and $C_{s2}$ capacitors. At the same time, the inputs and the outputs of the differential amplifier are shorted for offset cancellation. This offset cancellation reduces the variation of the RAMP output voltage range. During $\phi_2=1$, the residue $V_1-V_{\rm da}$ is amplified by the differential amplifier in open-loop configuration. The input capacitors, $C_{s1}$ and $C_{s2}$, are metal-oxide-metal capacitors with a capacitance of 250 fF. The ac coupling of the $C_s$ input network causes 20% signal loss. The entire RAMP provides a nominal voltage gain of 8 for residue amplification. The RAMP consumes a total power of 1.1 mW at 100 MS/s sampling rate.

### C. Resistor-String DAC (RDAC)

The resistor string shown in Fig. 3 provides 33 differential $V_{RC}$ references for the CADC and 65 differential $V_{RF}$ references for the FADC. All differential references are in fact generated from two parallel resistor strings with currents flowing in opposite directions.
To improve the linearity, these two resistor strings are tied together by metal wires connecting the $V_{RC}$ taps of identical voltages. The resistor strings are implemented with nonsalicide polysilicon. Each string has a total resistance of 2 k$\Omega$. The width of each string is 20 $\mu$m. The ADC linearity is determined ultimately by the resistor string linearity. The reference voltages generated by the resistor string must have 10-bit accuracy.

To generate the RDAC output $V_{\rm da}$, the resistor string also generates a set of 3 different references separated by 8 LSB for each of the 32 $D_1$ codes. The RDAC shown in Fig. 3 comprises a digital decoder and a MUX. The decoder combines the digital inputs $D_1$ and $q$ to drive the analog switches in the MUX. The MUX selects one voltage out of the 96 references generated by the resistor string. The RDAC output $V_{\rm da}$ can be expressed as

$$V_{\text{da}} = (32 \cdot D_1 - 8 \cdot q) \times \text{LSB}. \tag{1}$$

A random signal $q$ with a magnitude of 8 LSB is injected into the analog signal path to enable the digital background calibration described in Section IV.

### D. Distributed Input Track-and-Hold

Fig. 9. Distributed input track-and-hold.

The ADC does not have a single dedicated input sampler. As shown in Fig. 9, the analog input $V_1$ is sampled by the passive samplers in the RAMP and in the CADC comparators. The clocks $\phi_1$ and $\phi_{1a}$ control the samplers. The timing skews of the clocks are minimized by carefully matching the delays of the clock buffers. The matching of the $V_1$ signal paths is also critical. Due to the resistance of metal wires and analog switches, the transfer function from the $V_1$ input to each sampling capacitor in the sampling mode is a low-pass filter. The transfer functions should be identical. As shown in Fig. 9, a tree-like routing scheme is used to connect the $V_1$ input to the CADC comparators.
In addition, the transfer function from $V_1$ to the RAMP input, $V_1(R)$, is made to match the transfer function from $V_1$ to the middle of the CADC, $V_1(16)$. The $V_1$ signal paths are routed using the top two metal layers shorted as a single wire.

## IV. DIGITAL CALIBRATION

The RAMP described in Section III-B amplifies the residue $V_1-V_{\rm da}$. It also exhibits gain error and nonlinearity. Several digital calibration schemes have been proposed to correct both gain error and nonlinearity of a residue amplifier [18]–[21]. Both [19] and [20] are foreground calibration schemes. To enable background calibration, [19] requires a sample-and-hold of different sampling rate, and [20] requires an interpolation filter which limits the bandwidth of the ADC input. Both [18] and [21] are correlation-based background calibration schemes. The [18] scheme requires a busy input to be effective. When applied to a two-step ADC, all the above schemes require substantial modification to the analog signal path.

In this paper, we propose a new digital background calibration scheme to correct both gain error and nonlinearity of the RAMP. It requires only a minor modification to the RDAC. It does not depend on the statistics of the ADC input. The calibration process can be simplified to reduce power dissipation without losing its effectiveness.

Fig. 10. Digital correction of gain error and nonlinearity.

As illustrated in Fig. 10, the RAMP output $V_2$ is digitized by the FADC, yielding the digital code $D_2$. The conversion function from the residue to $D_2$ is not linear and its slope is not exactly as designed. We use a digital calibration processor (DCP) to correct the non-ideal behavior. The signal compensator shown in Fig. 10 is part of the DCP. It corrects both the gain error and the 3rd-order nonlinearity in $D_2$, so that the conversion function from the residue to the corrected $D_2^c$ is linear and has the correct slope. The digital correction shown in Fig.
10 requires a coefficient for gain correction, $b_1$, and a coefficient for 3rd-order correction, $b_3$. The DCP is designed to automatically generate these two coefficients in the background.

Fig. 11 shows the principle of gain error detection and nonlinearity detection. Assume the analog input $V_1$ is fixed and the FADC does not introduce quantization errors. As shown in (1), the RDAC output $V_{\rm da}$ is embedded with a random signal of $q\times 8$ LSB, where $q\in \{-1,0,+1\}$. Ideally, the differences of the three corresponding $D_2^c$ codes are 8. We define the actual differences of the corresponding $D_2^c$ codes as $H_1$ and $H_2$. Gain error is detected if $H_1+H_2\neq 16$. Nonlinearity is detected if $H_1\neq H_2$.

Fig. 11. Gain error and nonlinearity detection.

Fig. 12. Coefficient generator for $b_1$ and $b_3$.

Fig. 12 shows the generator for both $b_1$ and $b_3$ coefficients. Under normal ADC operation with varying analog input $V_1$, averaging is used to extract information from the $D_2^c$ codes. Each $D_2^c$ code corresponds to a residue $V_1 - V_{\rm da}$, where $V_{\rm da}$ includes a random number $q$ as shown in (1). The ADC input $V_1$ is assumed to be uncorrelated with $q$. The tri-level random signal $q$ is constructed by combining two uncorrelated binary pseudo-random sequences. Each binary random sequence has a length of $2^{14}$ and equal numbers of zeros and ones. The generator first receives the $D_2^c$ codes from the compensator, and sorts the data according to the associated $q$. The generator then averages the sorted data, and applies subtraction to acquire $H_1$ and $H_2$. The generator also acquires the averages of $D_2^c - q \times 8$ and $(D_2^c - q \times 8)^2$, denoted as $M$ and $S$ respectively. The acquired data $H_1$, $H_2$, $M$ and $S$ are updated once every $2^{14}$ samples.

Two error terms $E_1$ and $E_2$ are defined as

$$E_1 = H_1 + H_2 - 16, \qquad E_2 = H_1 - H_2. \tag{2}$$

Thus, $E_1$ reveals the gain error, and $E_2$ reveals the nonlinearity. Combining both error terms, a single error function $L$ is defined as

$$L = \frac{1}{2}E_1^2 + \frac{1}{2}E_2^2. \tag{3}$$

Employing the Lyapunov second theorem on stability [22] to ensure that $L$ approaches zero asymptotically, we can find the following equations to estimate $b_1$ and $b_3$:

$$b_1[k+1] = b_1[k] - \mu_1 \times \text{sgn}(Z_1) \tag{4}$$

$$b_3[k+1] = b_3[k] - \mu_3 \times \text{sgn}(Z_3) \tag{5}$$

where

$$Z_1 = E_1 \left( b_1^3[k] - (3S + 64)b_3[k] \right) - E_2(24Mb_3[k]) \tag{6}$$

$$Z_3 = E_1(3S + 64) + E_2(24M). \tag{7}$$

The value of $\text{sgn}(x)$ is +1 if $x>0$, 0 if $x=0$, and -1 if $x<0$. Employing the sgn function simplifies the DCP hardware and reduces its power consumption. Derivation of the above equations is included in Appendix A.

The $S$ variable in (6) and (7) can be further simplified by replacing it with a constant 256/3, which is $E[(D_2^c - q \times 8)^2]$ when $D_2^c - q \times 8$ is uniformly distributed between -16 and +16. Simulations show that using a constant $S$ does not affect the calibration process.

The updating factors $\mu_1$ and $\mu_3$ are two positive constants. Smaller $\mu_1$ and $\mu_3$ result in less fluctuation in $b_1$ and $b_3$, but also slower settling. We want faster settling when errors are large and less fluctuation when errors are small. Thus, the DCP sets $\mu_1 = 1/64$ when $|E_1| \geq 1$ and $\mu_1 = 1/256$ when $|E_1| < 1$. The DCP also chooses $\mu_3 = \mu_1/1024$.

Fig. 13. Transient behavior of the digital calibration.

Fig. 13 shows the transient behavior of the calibration. In the simulation, the RAMP is followed by an ideal 6-bit FADC. The RAMP transfer function is obtained from SPICE simulation, and is modeled as

$$y_d = a_1 \cdot y + a_3 \cdot y^3 + a_5 \cdot y^5 + a_7 \cdot y^7 \tag{8}$$

where $y$ is the FADC output of an ideal RAMP, and $y_d$ is that of a real RAMP.
The values of the coefficients are $a_1=0.8$, $a_3=-5.5\times 10^{-5}$, $a_5=-1\times 10^{-7}$ and $a_7=3\times 10^{-10}$. The initial values for $b_1$ and $b_3$ are set to 1.0 and $5.5\times 10^{-5}$ respectively. The ADC input is a full-scale sine wave. The coefficients $b_1$ and $b_3$ settle to 1.225 and $2.75\times 10^{-4}$ respectively. The error function $L$ approaches zero asymptotically. The overall convergence time is about 30 iteration cycles. Each iteration cycle is $2^{14}$ sampling periods. At 100 MS/s sampling rate, the calibration convergence time is about 5 ms.

The decoder shown in Fig. 3 combines the 5-bit $D_1$ from the CADC and the 6-bit $D_2^c$ from the DCP to produce the final ADC digital output $D_o$. The $q$ signal injected from the RDAC must be removed from $D_o$. Thus, $D_o$ is calculated as

$$D_o = D_1 \times 2^5 + D_2^c - q \times 8. \tag{9}$$

The full range of the 10-bit output code $D_o$ is from -512 to +511.

## V. EXPERIMENTAL RESULTS

The ADC was fabricated using a 90 nm digital CMOS technology with one layer of polysilicon and six layers of metal. The ADC chip micrograph is shown in Fig. 14. It occupies an active area of 0.36 mm². Operating at 100 MHz sampling frequency, the ADC core consumes a total power of 6 mW from a 1 V supply. The single-ended input swing range can be as high as 1 V, equal to the supply voltage VDD. The input capacitance of this ADC is about 1.2 pF, which includes the input capacitances of the RAMP and the CADC, and other parasitic capacitances. All digital circuits, including the calibration processor, clock generator and clock buffers, dissipate 1.4 mW at 100 MS/s sampling rate.

Fig. 14. ADC chip micrograph.

The differential nonlinearity (DNL) and integral nonlinearity (INL) are measured by using a code-density testing setup with a 1 MHz sine wave input. Fig. 15 shows the measured DNL of the CADC before and after comparator offset calibration.
The DNL is -1/+1 LSB before calibration, and becomes -0.25/+0.25 LSB after calibration. The offset calibration effectively reduces the offsets of the latch comparators. Figs. 16 and 17 show the measured DNL and INL of the ADC before and after the RAMP digital calibration. The DNL is improved from -1/+4 LSB to -0.5/+0.6 LSB by the calibration. The INL is improved from -17/+18 LSB to -0.9/+0.9 LSB by the calibration.

Fig. 15. Measured CADC differential nonlinearity (DNL).

Fig. 16. Measured ADC differential nonlinearity (DNL).

Fig. 17. Measured ADC integral nonlinearity (INL).

Fig. 18 shows the ADC output spectra before and after the RAMP digital calibration at 100 MS/s sampling rate. The input is a 1 MHz sine wave. The measured SNDR is improved from 35 dB to 58 dB by the calibration. The measured SFDR is improved from 43 dB to 75 dB by the calibration.

Fig. 18. Measured output spectrum at 100 MS/s before and after RAMP's calibration.

Fig. 19 shows the ADC dynamic performance versus input frequency at 100 MHz sampling rate. The measured SFDR degrades gradually towards higher input frequencies. This is caused by the mismatch between the CADC distributed input track-and-holds and the RAMP track-and-hold. The SNDR degradation at higher input frequencies is due to the sampling clock jitter. The effective resolution bandwidth (ERBW) is about 46 MHz.

Fig. 19. Dynamic performance versus input frequency.

Fig. 20 shows the ADC dynamic performance versus sampling rate. The input is a 1 MHz sine wave. The SFDR is higher than 66 dB up to 160 MS/s sampling rate, and the SNDR can maintain 56 dB up to 150 MS/s sampling rate. The SFDR begins to degrade for sampling rates higher than 100 MS/s. This is mainly due to the incomplete settling of the RAMP.

Fig. 20. Dynamic performance versus sampling frequency.
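The figures of merit listed in Table I and defined in (10) and (11) can be cross-checked from the measured numbers. The sketch below assumes the standard relation ENOB = (SNDR − 1.76)/6.02, which the paper does not state explicitly, and uses the measured low-frequency SNDR of 58 dB, the ERBW of 46 MHz, 6 mW and a 1 V supply. The function names are illustrative.

```python
def enob_from_sndr(sndr_db):
    """Effective number of bits from SNDR (assumed standard relation)."""
    return (sndr_db - 1.76) / 6.02


def fom1(power_w, enob, fs_hz):
    """FOM1 = Power / (2^ENOB * F_S), in joules per conversion-step."""
    return power_w / (2 ** enob * fs_hz)


def fom2(power_w, enob, fs_hz, erbw_hz, vdd):
    """FOM2 = Power / (2^ENOB * min(2*ERBW, F_S)) * VDD."""
    return power_w / (2 ** enob * min(2 * erbw_hz, fs_hz)) * vdd


if __name__ == "__main__":
    enob = enob_from_sndr(58.0)
    print(f"ENOB = {enob:.2f} bits")
    print(f"FOM1 = {fom1(6e-3, enob, 100e6) * 1e15:.1f} fJ/conv-step")
    print(f"FOM2 = {fom2(6e-3, enob, 100e6, 46e6, 1.0) * 1e15:.1f} fJ*V/conv-step")
```

With these inputs the results land within about 1% of the reported 9.34 bits, 92 fJ/conv-step and 100 fJ·V/conv-step. Because 2·ERBW = 92 MHz is below the 100 MHz sampling rate, FOM2 is set by the ERBW term.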
TABLE I PERFORMANCE SUMMARY

| Parameter | Value |
|-----------|-------|
| Technology | 90 nm CMOS |
| Supply Voltage (V) | 1.0 |
| Resolution (bit) | 10 |
| Sampling Rate (MHz) | 100 |
| Input Range (V<sub>pp</sub> differential) | 2.0 |
| Input Loading (pF) | 1.2 |
| DNL (LSB) | +0.6/-0.5 |
| INL (LSB) | +0.9/-0.9 |
| SNDR (dB) ($F_{in}$ = 1 MHz) | 58 |
| SNDR (dB) ($F_{in}$ = 50 MHz) | 53.7 |
| SFDR (dB) ($F_{in}$ = 1 MHz) | 75 |
| SFDR (dB) ($F_{in}$ = 50 MHz) | 64 |
| THD (dB) ($F_{in}$ = 1 MHz) | -70 |
| THD (dB) ($F_{in}$ = 50 MHz) | -60 |
| Power Consumption (mW) | 6 |
| FOM1 (fJ/conv-step) | 92 |
| FOM2 (fJ·V/conv-step) | 100 |
| Active Area (mm<sup>2</sup>) | 0.36 |

TABLE II 10-BIT ADCS COMPARISON

| Design | [24] | [25] | [26] | [27] | [28] | This work |
|--------|------|------|------|------|------|-----------|
| Technology (nm) | 90 | 90 | 130 | 65 | 90 | 90 |
| Supply (V) | 0.8 | 1.0 | 1.2 | 1.2 | 1.2 | 1.0 |
| Power (mW) | 6.5 | 33 | 19.2 | 1.78 | 1.44 | 6 |
| $F_S$ (MHz) | 80 | 100 | 60 | 26 | 50 | 100 |
| SNDR (dB) | 55 | 55.3 | 56 | 54.3 | 49.4 | 58 |
| FOM1 (fJ/conv-step) | 176 | 694 | 621 | 162 | 119 | 92 |
| FOM2 (fJ·V/conv-step) | 162 | 694 | 497 | 195 | 144 | 100 |

Table I summarizes the measured specifications of this ADC chip. Two types of figure-of-merit (FOM) for the ADC are defined as

$$\text{FOM1} = \frac{\text{Power}}{2^{\text{ENOB}} \times F_S} \tag{10}$$

$$\text{FOM2} = \frac{\text{Power}}{2^{\text{ENOB}} \times \min(2 \cdot \text{ERBW}, F_S)} \times \text{VDD} \tag{11}$$

where ENOB is the effective number of bits at low input frequency. FOM1 is a general figure-of-merit definition for most ADCs. FOM2 considers the importance of the input frequency towards the Nyquist frequency and the design contribution due to low supply voltage (VDD) in nanoscale CMOS technologies [23]. Table II compares this work with other 10-bit ADCs published in recent years.

## VI. CONCLUSIONS

A 10-bit 100-MS/s two-step ADC fabricated in a 90 nm CMOS technology is presented.
It takes advantage of the nanoscale technology to achieve low power dissipation. Its internal coarse ADC and fine ADC are realized with latch-type comparators whose accuracy is enhanced by offset calibration. The gain error and nonlinearity of the open-loop residue amplifier are corrected in the digital domain with a calibration processor. The ADC consumes only 6 mW from a single 1 V supply.

## APPENDIX: LYAPUNOV-BASED CALIBRATION

Considering the signal path in Fig. 10, the residue $V_1-V_{\rm da}$ is amplified by the RAMP, and then digitized by the FADC. As shown in (1), the RDAC output $V_{\rm da}$ is embedded with a random sequence $q\times 8$ LSB, where $q\in \{-1,0,+1\}$. Assume both the RAMP and the FADC are ideal and the quantization errors of the FADC are neglected. The outputs of the FADC are denoted as $y$ if $q=0$, $y+8$ if $q=+1$, and $y-8$ if $q=-1$. If the RAMP exhibits gain error and nonlinearity, the outputs of the FADC become $y_{d,0}$, $y_{d,+1}$, and $y_{d,-1}$ for $q=0$, $+1$, and $-1$ respectively. They can be expressed as

$$\begin{aligned} y_{d,0} &= a_1 \cdot y + a_3 \cdot y^3 \\ y_{d,+1} &= a_1 \cdot (y+8) + a_3 \cdot (y+8)^3 \\ y_{d,-1} &= a_1 \cdot (y-8) + a_3 \cdot (y-8)^3 \end{aligned} \tag{12}$$

where $a_1$ and $a_3$ are coefficients for gain error and nonlinearity. As shown in Fig. 10, the $y_d$ data are corrected by the signal compensator using the $b_1$ and $b_3$ coefficients. The corrected data can be expressed as

$$\begin{aligned} y_{c,0} &= b_1 \cdot y_{d,0} + b_3 \cdot y_{d,0}^3 \\ y_{c,+1} &= b_1 \cdot y_{d,+1} + b_3 \cdot y_{d,+1}^3 \\ y_{c,-1} &= b_1 \cdot y_{d,-1} + b_3 \cdot y_{d,-1}^3. \end{aligned} \tag{13}$$

The correction makes $y_{c,0} = y$. Thus, we have

$$b_1 = \frac{1}{a_1} \quad \text{and} \quad b_3 = -\frac{a_3}{a_1^4}. \tag{14}$$

We neglect other high-order terms, which are treated as disturbances in the estimation process. In Fig. 12, the coefficient generator collects the $y_c$ data to extract information.
The acquired variables are

$$M = E[y_c - q \times 8] \approx E[y] \tag{15}$$

$$S = E[(y_c - q \times 8)^2] \approx E[y^2] \tag{16}$$

$$\begin{aligned} E_1 &= H_1 + H_2 - 16 \\ &= E[y_{c,+1}] - E[y_{c,-1}] - 16 \\ &\approx 16(b_1 a_1 - 1) + 16(b_1 a_3 + b_3 a_1^3)(3S + 64) \end{aligned} \tag{17}$$

$$\begin{aligned} E_2 &= H_1 - H_2 \\ &= (E[y_{c,+1}] - E[y_{c,0}]) - (E[y_{c,0}] - E[y_{c,-1}]) \\ &\approx 384M(b_1 a_3 + b_3 a_1^3). \end{aligned} \tag{18}$$

A positive semi-definite function $L$ is defined as

$$L = \frac{1}{2}E_1^2 + \frac{1}{2}E_2^2. \tag{19}$$

By the Lyapunov second theorem on stability, if $L$ satisfies

$$L \ge 0 \quad \text{and} \quad \frac{dL}{dt} < 0, \tag{20}$$

then $L$ is called a Lyapunov function candidate and the system is asymptotically stable. From (19), the condition $L \ge 0$ is always true. Since $E_1$ and $E_2$ are functions of $b_1$ and $b_3$, which vary with time, the condition $dL/dt < 0$ can be rewritten as

$$\frac{dL}{dt} = \frac{\partial L}{\partial b_1} \cdot \frac{db_1}{dt} + \frac{\partial L}{\partial b_3} \cdot \frac{db_3}{dt} < 0. \tag{21}$$

One sufficient condition to satisfy the above inequality is

$$\frac{\partial L}{\partial b_1} \cdot \frac{db_1}{dt} < 0 \quad \text{and} \quad \frac{\partial L}{\partial b_3} \cdot \frac{db_3}{dt} < 0. \tag{22}$$

For a discrete-time system with slowly varying $b_1$ and $b_3$, the above conditions become

$$\frac{\partial L}{\partial b_1} \cdot (b_1[k+1] - b_1[k]) = -\alpha_1 < 0 \tag{23}$$

$$\frac{\partial L}{\partial b_3} \cdot (b_3[k+1] - b_3[k]) = -\alpha_3 < 0 \tag{24}$$

where $\alpha_1$ and $\alpha_3$ are two positive variables.
From (23), the difference equation for $b_1$ estimation can be expressed as

$$\begin{aligned}
b_1[k+1] &= b_1[k] - \alpha_1 \left(\frac{\partial L}{\partial b_1}\right)^{-1} \\
&= b_1[k] - \left[\alpha_1 \left(\frac{\partial L}{\partial b_1}\right)^{-2}\right] \times \frac{\partial L}{\partial b_1}.
\end{aligned} \tag{25}$$

Since $\alpha_1(\partial L/\partial b_1)^{-2}$ is always positive, the above equation can be represented by

$$b_1[k+1] = b_1[k] - \mu_1 \times \frac{\partial L}{\partial b_1} \tag{26}$$

where $\mu_1$ is a positive constant, denoted as the updating factor for $b_1$. With a similar procedure, the estimation equation for $b_3$ can also be obtained as

$$b_3[k+1] = b_3[k] - \mu_3 \times \frac{\partial L}{\partial b_3} \tag{27}$$

where $\mu_3$ is a positive constant, denoted as the updating factor for $b_3$. From (19), $\partial L/\partial b_1$ and $\partial L/\partial b_3$ are

$$\frac{\partial L}{\partial b_1} = E_1 \cdot \frac{\partial E_1}{\partial b_1} + E_2 \cdot \frac{\partial E_2}{\partial b_1} \tag{28}$$

$$\frac{\partial L}{\partial b_3} = E_1 \cdot \frac{\partial E_1}{\partial b_3} + E_2 \cdot \frac{\partial E_2}{\partial b_3}. \tag{29}$$

Applying (14), (17), and (18), the above equations can be rewritten as

$$\frac{\partial L}{\partial b_1} = \frac{16}{b_1^4} \times \left[ E_1 \left( b_1^3 - (3S + 64)b_3 \right) - E_2(24Mb_3) \right] \tag{30}$$

$$\frac{\partial L}{\partial b_3} = \frac{16}{b_1^3} \times \left[ E_1(3S + 64) + E_2(24M) \right]. \tag{31}$$

Equations (4) and (5) shown in Section IV are the simplified implementations of (26) and (27), respectively. The $Z_1$ and $Z_3$ variables defined in (6) and (7) are the simplified $\partial L/\partial b_1$ and $\partial L/\partial b_3$ shown in (30) and (31). The $16/b_1^4$ term in (30) is neglected in calculating $Z_1$ since the term is always positive and thus does not affect the final value of $\mathrm{sgn}(Z_1)$. The $16/b_1^3$ term in (31) is also neglected in calculating $Z_3$.
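A single coefficient update built from (26), (27), (30), and (31) might look like the sketch below. It assumes a sign-based step of fixed size, which is our reading of the $\mathrm{sgn}(Z_1)$ remark above and not a reproduction of (4) to (7); the function names and the step sizes `mu1` and `mu3` are hypothetical.

```python
def gradient_terms(E1, E2, M, S, b1, b3):
    """Return (Z1, Z3): the bracketed parts of (30) and (31).

    The positive factors 16/b1**4 and 16/b1**3 are dropped because they
    do not change the sign of either term.
    """
    Z1 = E1 * (b1 ** 3 - (3 * S + 64) * b3) - E2 * (24 * M * b3)
    Z3 = E1 * (3 * S + 64) + E2 * (24 * M)
    return Z1, Z3


def sign(x):
    """Return -1, 0, or +1 according to the sign of x."""
    return (x > 0) - (x < 0)


def update(b1, b3, E1, E2, M, S, mu1, mu3):
    """One sign-based step along the negative gradient of L (assumed form)."""
    Z1, Z3 = gradient_terms(E1, E2, M, S, b1, b3)
    return b1 - mu1 * sign(Z1), b3 - mu3 * sign(Z3)


if __name__ == "__main__":
    # Hypothetical measurement: the gain is too low, so E1 is negative.
    b1, b3 = update(1.0, 0.0, E1=-0.8, E2=0.0, M=-12.0, S=284.0,
                    mu1=2 ** -10, mu3=2 ** -20)
    print(b1, b3)
```

Using only the signs keeps the update free of multiplications by the gradient magnitude, at the cost of a residual dither of about one step size around the converged values. When $E_1 = E_2 = 0$ both $Z_1$ and $Z_3$ vanish and the coefficients are left unchanged.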
#### ACKNOWLEDGMENT

The authors would like to thank Faraday Technology Corporation (FTC), Hsin-Chu, Taiwan, and United Microelectronics Corporation (UMC), Hsin-Chu, Taiwan, for their technical support and chip fabrication.

#### REFERENCES

- [1] A. G. F. Dingwall and V. Zazzu, "An 8-MHz CMOS subranging 8-bit A/D converter," *IEEE J. Solid-State Circuits*, vol. SC-20, pp. 1138–1143, Dec. 1985.
- [2] T. Matsuura, T. Tsukada, S. Ohba, E. Imaizumi, H. Sato, and S. Ueda, "An 8 b 20 MHz CMOS half-flash A/D converter," in *IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers*, Feb. 1988, pp. 220–221.
- [3] K. Kusumoto, A. Matsuzawa, and K. Murata, "A 10-b 20-MHz 30-mW pipelined interpolating CMOS ADC," *IEEE J. Solid-State Circuits*, vol. 28, no. 12, pp. 1200–1206, Dec. 1993.
- [4] B. P. Brandt and J. Lutsky, "A 75-mW, 10-b, 20-MSPS CMOS subranging ADC with 9.5 effective bits at Nyquist," *IEEE J. Solid-State Circuits*, vol. 34, no. 12, pp. 1788–1795, Dec. 1999.
- [5] J. Mulder, C. M. Ward, C.-H. Lin, D. Kruse, J. R. Westra, M. Lugthart, E. Arslan, R. J. van de Plassche, K. Bult, and F. M. L. van der Goes, "A 21-mW 8-b 125MSample/s ADC in 0.09-mm² 0.13-μm CMOS," *IEEE J. Solid-State Circuits*, vol. 39, no. 12, pp. 2116–2125, Dec. 2004.
- [6] D. J. Huber, R. J. Chandler, and A. A. Abidi, "A 10 b 160 MS/s 84 mW 1 V subranging ADC in 90 nm CMOS," in *IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers*, Feb. 2007, pp. 454–455.
- [7] Y. Shimizu, S. Murayama, K. Kudoh, and H. Yatsuda, "A split-load interpolation-amplifier-array 300 MS/s 8 b subranging ADC in 90 nm CMOS," in *IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers*, Feb. 2008, pp. 552–553.
- [8] K. Ohhata, K. Uchino, Y. Shimizu, K. Oyama, and K. Yamashita, "Design of a 770-MHz, 70-mW, 8-bit subranging ADC using reference voltage precharging architecture," *IEEE J. Solid-State Circuits*, vol. 44, no. 12, pp. 2881–2890, Dec. 2009.
- [9] K. Kattmann and J.
Barrow, "A technique for reducing differential non-linearity errors in flash A/D converters," in *IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers*, Feb. 1991, pp. 170–171.
- [10] H. Pan and A. A. Abidi, "Spatial filtering in flash A/D converters," *IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process.*, vol. 50, pp. 424–436, Aug. 2003.
- [11] D. A. Kerth, N. S. Sooch, and E. J. Swanson, "A 12-bit 1-MHz two-step flash ADC," *IEEE J. Solid-State Circuits*, vol. 24, pp. 250–255, Apr. 1989.
- [12] M. Yotsuyanagi, H. Hasegawa, M. Yamaguchi, M. Ishida, and K. Sone, "A 2 V, 10 b, 20 Msample/s, mixed-mode subranging CMOS A/D converter," *IEEE J. Solid-State Circuits*, vol. 30, no. 12, pp. 1533–1537, Dec. 1995.
- [13] H. van der Ploeg and R. Remmers, "A 3.3-V, 10-b, 25-Msample/s two-step ADC in 0.35-μm CMOS," *IEEE J. Solid-State Circuits*, vol. 34, no. 12, pp. 1803–1811, Dec. 1999.
- [14] S. H. Lewis and P. R. Gray, "A pipelined 5-Msample/s 9-bit analog-to-digital converter," *IEEE J. Solid-State Circuits*, vol. SC-22, pp. 954–961, Dec. 1987.
- [15] Y.-H. Chung and J.-T. Wu, "A CMOS 6-mW 10-bit 100-MS/s two-step ADC," in *Proc. IEEE Asian Solid-State Circuits Conf.*, Nov. 2009, pp. 137–140.
- [16] P. M. Figueiredo, P. Cardoso, A. Lopes, C. Fachada, N. Hamanishi, K. Tanabe, and J. Vital, "A 90 nm CMOS 1.2 V 6 b 1 GS/s two-step subranging ADC," in *IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers*, Feb. 2006, pp. 568–569.
- [17] M. Dessouky and A. Kaiser, "Very low-voltage digital-audio $\Delta\Sigma$ modulator with 88-dB dynamic range using local switch bootstrapping," *IEEE J. Solid-State Circuits*, vol. 36, no. 3, pp. 349–355, Mar. 2001.
- [18] B. Murmann and B. E. Boser, "A 12-bit 75-MS/s pipelined ADC using open-loop residue amplification," *IEEE J. Solid-State Circuits*, vol. 38, no. 12, pp. 2040–2050, Dec. 2003.
- [19] C. R. Grace, P. J. Hurst, and S. H. Lewis, "A 12-bit 80-MSample/s pipelined ADC with bootstrapped digital calibration," *IEEE J.
Solid-State Circuits*, vol. 40, no. 5, pp. 1038–1046, May 2005.
- [20] B. D. Sahoo and B. Razavi, "A 12-Bit 200-MHz CMOS ADC," *IEEE J. Solid-State Circuits*, vol. 44, no. 9, pp. 2366–2380, Sep. 2009.
- [21] A. Panigada and I. Galton, "A 130 mW 100 MS/s pipelined ADC with 69 dB SNDR enabled by digital harmonic distortion correction," *IEEE J. Solid-State Circuits*, vol. 44, no. 12, pp. 3314–3328, Dec. 2009.
- [22] P. C. Parks, "A. M. Lyapunov's stability theory—100 years on," *IMA J. Math. Control Inform.*, vol. 9, pp. 275–303, 1992.
- [23] Y. Chiu, P. R. Gray, and B. Nikolić, "A 14-b 12-MS/s CMOS pipeline ADC with over 100-dB SFDR," *IEEE J. Solid-State Circuits*, vol. 39, no. 12, pp. 2139–2151, Dec. 2004.
- [24] M. Yoshioka, M. Kudo, T. Mori, and S. Tsukamoto, "A 0.8 V 10 b 80 MS/s 6.5 mW pipelined ADC with regulated overdrive voltage biasing," in *IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers*, Feb. 2007, pp. 452–453.
- [25] K. Honda, M. Furuta, and S. Kawahito, "A low-power low-voltage 10-bit 100-MSample/s pipeline A/D converter using capacitance coupling techniques," *IEEE J. Solid-State Circuits*, vol. 42, no. 4, pp. 757–765, Apr. 2007.
- [26] H.-C. Choi, Y.-J. Kim, S.-W. Yoo, S.-Y. Hwang, and S.-H. Lee, "A programmable 0.8-V 10-bit 60-MS/s 19.2-mW 0.13-μm CMOS ADC operating down to 0.5 V," *IEEE Trans. Circuits Syst. II*, vol. 55, no. 4, pp. 319–323, Apr. 2008.
- [27] S.-K. Shin, Y.-S. You, S.-H. Lee, K.-H. Moon, J.-W. Kim, L. Brooks, and H.-S. Lee, "A fully-differential zero-crossing-based 1.2 V 10 b 26 MS/s pipelined ADC in 65 nm CMOS," in *VLSI Circuits Symp. Dig.*, Jun. 2008, pp. 218–219.
- [28] J. Hu, N. Dolev, and B. Murmann, "A 9.4-bit, 50-MS/s, 1.44-mW pipelined ADC using dynamic source follower residue amplification," *IEEE J. Solid-State Circuits*, vol. 44, no. 4, pp. 1057–1066, Apr. 2009.

**Yung-Hui Chung** (S'04) received the B.S. and M.S.
degrees in control engineering from National Chiao-Tung University, Hsin-Chu, Taiwan, in 1992 and 1994, respectively, and the Ph.D. degree in electronics engineering from National Chiao-Tung University, Hsin-Chu, Taiwan, in 2010. From 1994 to 1998, he worked at OES/ITRI on the development of optical disk drives (ODD) and system emulation. From 1998 to 1999, he was an engineer working on analog circuits at ERSO/ITRI. From 1999 to 2003, he worked on clock generation circuits at Global Unichip Corporation and Faraday Technology Corporation. His current research interests include clock generation circuits and high-speed low-power data converters.

**Jieh-Tsorng Wu** (S'83–M'87–SM'06) was born in Taipei, Taiwan. He received the B.S. degree in electronics engineering from National Chiao-Tung University, Hsin-Chu, Taiwan, in 1980, and the M.S. and Ph.D. degrees in electrical engineering from Stanford University, Stanford, CA, in 1983 and 1988, respectively. From 1980 to 1982, he served in the Chinese Army as a Radar Technical Officer. From 1982 to 1988, at Stanford University, he focused his research on high-speed analog-to-digital conversion in CMOS VLSI. From 1988 to 1992, he was a Member of Technical Staff at Hewlett-Packard Microwave Semiconductor Division in San Jose, CA, and was responsible for several linear and digital gigahertz IC designs. Since 1992, he has been with the Department of Electronics Engineering, National Chiao-Tung University, Hsin-Chu, Taiwan, where he is now a Professor. His current research interests are high-performance mixed-signal integrated circuits.

Dr. Wu is a member of Phi Tau Phi.