# An On-Chip Test Structure and Digital Measurement Method for Statistical Characterization of Local Random Variability in a Process

Saibal Mukhopadhyay, Member, IEEE, Keunwoo Kim, Senior Member, IEEE, Keith A. Jenkins, Senior Member, IEEE, Ching-Te Chuang, Fellow, IEEE, and Kaushik Roy, Fellow, IEEE

Abstract—This paper presents an on-chip characterization method for random variation in minimum sized devices in nanometer technologies, using a sense amplifier-based test circuit. Instead of analog current measurements required in conventional techniques, the presented circuit operates using digital voltage measurements. Simulations of the test structure using predictive 70 nm and hardware based 0.13  $\mu$ m CMOS technologies show good accuracy (error ~ 5%-10%) in the prediction of random variation even in the presence of systematic variations. A test chip is fabricated in 0.13  $\mu$ m bulk CMOS technology and measured to demonstrate the operation of the test structure.

*Index Terms*—Characterization, digital measurement, on-chip test structure, random variation, sense amplifier.

# I. INTRODUCTION

Local random variation in transistor parameters, particularly threshold voltage (Vt), increases with technology scaling and can degrade circuit robustness [1]–[3]. For small transistors in nanometer technologies, intrinsic fluctuation in Vt due to effects such as random dopant fluctuations (RDF) or line edge roughness (LER) can dominate the mismatch in neighboring devices [4]. The effect of this local randomness is most pronounced in area-constrained circuits, such as static random access memory (SRAM) cells, and limits density scaling [3], [4]. Hence, measurement, characterization, and estimation of local random variability in a process are crucial for yield learning and enhancement in nanoscale technologies, particularly for SRAM design.

Conventionally, differential current measurement between identical neighboring devices is used to characterize local random Vt mismatch [4]–[7]. However, measurement of small currents through minimum size transistors requires sophisticated analog measurement techniques. Moreover, complex data manipulation and analysis is required to extract Vt differences

Manuscript received March 28, 2007; revised May 6, 2008. Current version published September 10, 2008. Test chip fabrication funded by MOSIS.

S. Mukhopadhyay is with the School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA 30332 USA (e-mail: saibal@ece.gatech.edu).

K. Kim and K. A. Jenkins are with the IBM T. J. Watson Research Center, Yorktown Heights, NY 10598 USA (e-mail: kkim@us.ibm.com; jenkinsk@us.ibm.com).

C.-T. Chuang is with the National Chiao Tung University, Hsinchu, Taiwan 300, R.O.C. (e-mail: chingte.chuang@gmail.com).

K. Roy is with the School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN 47907 USA (e-mail: kaushik@ecen.purdue.edu).

Digital Object Identifier 10.1109/JSSC.2008.2001896

from current differences. Hence, this method is unwieldy for on-chip characterization of local mismatches. On-chip characterization can significantly reduce the time and cost of collecting a large volume of variability data. This paper demonstrates a sense-amplifier-based test circuit and measurement method to characterize local random variation in a process. In this method, the offset voltage of the sense amplifier is used to measure device mismatch. Further, a built-in self-test scheme for on-chip measurement of device mismatch is proposed. The primary advantages of the presented test structure and measurement scheme over conventional methods are that:

- it provides a direct measurement of complete probability distribution of local mismatch;
- it provides a simple digital measurement technique instead of complex analog voltage-current measurements;
- the possibility of digital measurement suggests a fast, on-chip self-characterization scheme to measure random variability.

The test structure is designed and simulated in predictive 70 nm technology [8], hardware-based 0.13  $\mu$ m bulk CMOS and sub-90 nm silicon-on-insulator (SOI) technologies, to show its accuracy. A test chip was designed in 0.13  $\mu$ m bulk CMOS technology and fabricated through MOSIS services. The measurement of the test chip successfully demonstrates the operation of the test structure in measuring local random variation in a process.

The rest of the paper is organized as follows. Section II presents the theoretical analysis of the test circuit. Section III describes the test structure and the measurement methods. Section IV presents the statistical simulation results to verify the operation of the test circuit. Section V presents the test chip design and measurement results. Section VI draws the conclusions.

### II. MISMATCH MEASUREMENT METHOD

This mismatch characterization scheme uses a test circuit based on a current latch-type sense amplifier (CLSA) [9] to measure the local random variability of a process. Fig. 1(a) shows the circuit schematic and basic operation of the sense-amplifier circuit. When the sense amplifier enable signal (SAE) is low, the nodes OUT and OUTB are pre-charged to V<sub>DD</sub>. When SAE is raised high, if V<sub>IN</sub> = V<sub>DD</sub> > V<sub>INB</sub> (= V<sub>DD</sub> -  $\Delta$ ), OUT discharges at a rate faster than OUTB. When OUT falls below the trip-point of the inverter  $P_{\text{INVB}} - N_{\text{INVB}}$ , the node OUTB switches back to "1" and OUT goes to "0". However, if there is a mismatch in



Fig. 1. Current latch sense-amplifier based test structure for local variability measurement. (a) Circuit schematic and waveform of CLSA. (b) Effect of random and correlated variation (Vt, L, and W) on offset voltage.

the threshold voltage of the transistors, it is possible that even if  $V_{IN} > V_{INB}$ , node OUT can become "1" while OUTB goes to "0", resulting in an incorrect operation. A higher value of  $\Delta$ is required to avoid this incorrect operation. The offset voltage (V<sub>os</sub>) of this circuit is defined as the minimum voltage difference  $(\Delta)$  between V<sub>IN</sub> and V<sub>INB</sub> required for correct sensing. Let us now investigate the effect of process variation on offset voltage. First, random Vt variations were applied independently to all the transistors in the circuit and Monte Carlo simulation was performed using predictive 70 nm devices [8] to extract the offset voltage. Next, a correlated component was added to the Vt variations. Fig. 1(b) shows that the offset voltage is a strong function of random Vt variation, while correlation does not significantly impact its distribution. An increase in the random mismatch increases the spread of the offset voltage. This shows that the offset voltage of CLSA eliminates the effect of systematic variation and depends only on the random components.

# A. Analysis of Offset Voltage

To understand how the CLSA can be used for measurement of Vt mismatch, consider the origin of the offset voltage. In this analysis, we assume that all the different sources of random local variation are lumped into a single parameter, i.e., threshold voltage Vt. This is a reasonable assumption for narrow-width devices in nanometer technologies (such as the ones used in SRAM), as the local variation is dominated by intrinsic fluctuations in Vt due to effects such as random dopant fluctuations and line edge roughness. If all devices in the CLSA on the two sides of the symmetry line " $l_{sym}$ " are identical, any voltage difference between the inputs can be sensed correctly (i.e.,  $V_{os} = 0$ ). Assume a Vt difference between the driver transistors ( $N_{DR}, N_{DRB}$ ) such that  $Vt_{NDR} > Vt_{NDRB}$ . Then, although  $V_{INB} < V_{IN}$ , it is possible that  $I_{OUT} < I_{OUTB}$ , as follows:

$$I_{OUT} \propto (V_{IN} - Vt_{NDR})^2 \text{ and } I_{OUTB} \propto (V_{INB} - Vt_{NDRB})^2;$$
  
$$\therefore (Vt_{NDR} - Vt_{NDRB}) > (V_{IN} - V_{INB}) \Rightarrow I_{OUTB} > I_{OUT} \quad (1)$$

where  $I_{\rm OUT}$  is the discharging current for node OUT and  $I_{\rm OUTB}$  is the discharging current for node OUTB. If  $I_{\rm OUTB} > I_{\rm OUT}$ , node OUTB discharges at a rate faster than OUT resulting in an incorrect sensing. Hence, for proper sensing,

$$V_{\rm os} = \Delta V_{\rm IN} = (V_{\rm IN} - V_{\rm INB}) > (V t_{\rm NDR} - V t_{\rm NDRB}).$$
(2)

From (2) we can observe that Vt mismatch in the driver transistors results in a non-zero offset voltage. Similarly, a difference between the trip-points of the two cross-coupled inverters in the latch  $(P_{INV} - N_{INV}, P_{INVB} - N_{INVB})$  can also increase the offset voltage. The total offset voltage is a linear combination of the offset due to Vt mismatch only in the driver FETs  $(V_{os-driver})$  and that due to Vt mismatch only in the latch FETs  $(V_{os-latch})$ . To verify this, worst-case Vt mismatch was applied only to the latch FETs, then only to the driver FETs, and finally to all the devices. Simulation using predictive 70 nm devices shows that the total offset is a linear combination of  $V_{os-latch}$  and  $V_{os-driver}$  [Fig. 2(a)]. The offset voltage due to the driver FETs is given by the input voltage difference required to make the currents through  $N_{\rm DR}$  and  $N_{\rm DRB}$  equal. From (2) it can be concluded that  $V_{os-driver}$  is the same as the Vt mismatch between the driver FETs. Hence, we obtain

$$V_{\rm os-driver} = V t_{\rm NDR} - V t_{\rm NDRB} = \Delta V t_{\rm DR}.$$
 (3)
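The relation between driver mismatch and offset in (1)–(3) can be checked numerically. The following is a minimal sketch, assuming the ideal square-law currents of (1) with a common gain factor and mismatch in the driver FETs only; the function names and the voltages are illustrative and are not taken from the paper.

```python
def discharge_currents(v_in, v_inb, vt_ndr, vt_ndrb, k=1.0):
    """Square-law discharge currents of (1); k is an arbitrary common gain factor."""
    i_out = k * max(v_in - vt_ndr, 0.0) ** 2
    i_outb = k * max(v_inb - vt_ndrb, 0.0) ** 2
    return i_out, i_outb


def senses_correctly(v_in, v_inb, vt_ndr, vt_ndrb):
    """With driver mismatch only, correct sensing requires I_OUT > I_OUTB."""
    i_out, i_outb = discharge_currents(v_in, v_inb, vt_ndr, vt_ndrb)
    return i_out > i_outb


# Hypothetical values in volts: 30 mV driver mismatch with Vt_NDR > Vt_NDRB.
VDD, VT_NDR, VT_NDRB = 1.0, 0.33, 0.30
for delta_mv in (0, 20, 40):
    delta = delta_mv / 1000.0
    print(delta_mv, senses_correctly(VDD, VDD - delta, VT_NDR, VT_NDRB))
```

With these assumed values, sensing fails for input differences of 0 mV and 20 mV and succeeds at 40 mV, which brackets the 30 mV mismatch as (3) suggests.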

The offset voltage due to the latch is more difficult to estimate. To understand the latch offset, consider that  $I_{OUT} > I_{OUTB}$ . Hence, the time required for node OUT to reach the trip-point of the inverter  $P_{INVB} - N_{INVB}$  (say,  $T_{OUT}$ ) should be less than the time required for node OUTB to reach the trip-point of the inverter  $P_{INV} - N_{INV}$  (say,  $T_{OUTB}$ ). Assuming a constant discharging current until this time and a step input to SAE, we can obtain

$$T_{OUT} = \frac{C_{L}(V_{DD} - V_{TRIPB})}{I_{OUT}} \quad \text{and} \quad T_{OUTB} = \frac{C_{L}(V_{DD} - V_{TRIP})}{I_{OUTB}}$$
(4)

where  $C_L$  is the load capacitance,  $V_{TRIP}$  is the trip-point of the inverter associated with  $P_{INV} - N_{INV}$ , and  $V_{TRIPB}$  is the trip-point associated with  $P_{INVB} - N_{INVB}$ . For correct sensing,  $T_{OUT} < T_{OUTB}$ , and for incorrect sensing,  $T_{OUT} > T_{OUTB}$ . Hence, the latch offset is given by the input voltage required to have  $T_{OUT} = T_{OUTB}$ . Considering Vt mismatch in the latch (Vt<sub>NINV</sub> = Vt<sub>NINVB</sub> +  $\Delta$ Vt<sub>N</sub> and Vt<sub>PINV</sub> = Vt<sub>PINVB</sub> -  $\Delta$ Vt<sub>P</sub>) we get

$$T_{OUT} - T_{OUTB} = \frac{C_L}{I_{OUTB}} \left( -(V_{DD} - V_{TRIPB}) \frac{\Delta I}{I_{OUT}} + \Delta_{TRIP} \right),$$

$$I_{OUT} = I_{OUTB} + \Delta I,$$

$$V_{TRIP} = V_{TRIPB} + \Delta_{TRIP},$$

$$\Delta_{TRIP} = \frac{\Delta V t_P + \Delta V t_N r}{1 + r},$$

$$r = \sqrt{\frac{\beta_N}{\beta_P}},$$
(5)

where  $\Delta I$  is the current difference between the two paths. Since we are interested in the latch offset here, we assume that the driver offset is zero. In other words, Vt mismatch in latch FETs is considered and no mismatch is considered for driver FETs (i.e.,  $\Delta I > 0$  as  $V_{IN} > V_{INB}$  and  $Vt_{NDR} = Vt_{NDRB}$ ). Therefore, for correct sensing,

$$T_{OUT} < T_{OUTB} \Longrightarrow (V_{DD} - V_{TRIPB}) \frac{\Delta I}{I_{OUTB}} > \Delta_{TRIP} \Longrightarrow \Delta I > \frac{\Delta_{TRIP} I_{OUTB}}{V_{DD} - V_{TRIPB}}.$$
(6)

The current difference between the two branches can be obtained by solving the differential stage formed by  $N_{\rm DR}$ ,  $N_{\rm DRB}$ , and  $N_{\rm CLK}$  and is given by [9]  $\Delta I = \sqrt{2\beta_{\rm DR}I_0}\Delta V_{\rm IN}$ , where  $I_0$  is the current through the clock transistor. Hence, we get

$$\begin{split} \Delta V_{\rm IN} &> \frac{\Delta_{\rm TRIP} I_{\rm OUTB}}{(V_{\rm DD} - V_{\rm TRIPB})\sqrt{2\beta_{\rm DR}I_0}} = \frac{\Delta_{\rm TRIP}\sqrt{I_0}}{2(V_{\rm DD} - V_{\rm TRIPB})\sqrt{2\beta_{\rm DR}}}, \quad I_{\rm OUTB} \approx \frac{I_0}{2} \\ \Rightarrow V_{\rm os-latch} &\approx \frac{\sqrt{I_0}}{2(V_{\rm DD} - V_{\rm TRIPB})\sqrt{2\beta_{\rm DR}}} \left(\frac{\Delta V t_P + \Delta V t_N \sqrt{\beta_N/\beta_P}}{1 + \sqrt{\beta_N/\beta_P}}\right). \end{split}$$
(7)

The above analysis shows that  $V_{os-driver}$  is a direct measure of the local Vt mismatch while  $V_{os-latch}$  introduces estimation error. Moreover,  $V_{os-driver}$  does not depend on the size of other transistors. Hence, we propose to use the driver transistors as the device under test (DUT) and the offset voltage of CLSA is measured to obtain Vt mismatch. Therefore, the statistics of the offset voltage directly measure the statistics of local random Vt variations. To improve the accuracy of this method, the latch offset needs to be minimized. This can be achieved by reducing the size of the clock transistor (i.e., reducing  $I_0$ ) and increasing the NFET-to-PFET



Fig. 2. Design and optimization of CLSA circuit for random variability measurement. (a) Effect of driver and latch offset. (b) Effect of width of clock Tx ( $W_{CLK}$ ) on latch offset. (c) Effect of NFET to PFET width ratio ( $W_N/W_P$ ) on latch offset.

beta ratio as demonstrated in Fig. 2(b) and (c) [9]. Increase in sensing delay due to a smaller clock transistor (which makes it unsuitable for SRAM application) is not a major concern for this application. Further, a slower rise of the SAE also helps to reduce  $V_{\rm os-latch}$ .
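The dependence of the latch offset on the clock-transistor current can be illustrated by evaluating (5) and (7) directly. This is a rough sketch, assuming the first-order expressions above hold and that  $V_{TRIPB}$  stays fixed (in a real latch it also moves with the beta ratio); all numerical values and function names are hypothetical.

```python
import math


def trip_point_shift(dvt_p, dvt_n, beta_n, beta_p):
    """Delta_TRIP from (5): weighted mix of PFET and NFET Vt mismatch."""
    r = math.sqrt(beta_n / beta_p)
    return (dvt_p + dvt_n * r) / (1.0 + r)


def latch_offset(i0, beta_dr, vdd, v_tripb, delta_trip):
    """V_os-latch from (7), which already uses I_OUTB ~ I0 / 2."""
    return math.sqrt(i0) * delta_trip / (
        2.0 * (vdd - v_tripb) * math.sqrt(2.0 * beta_dr)
    )


# Hypothetical operating point: currents in A, beta in A/V^2, voltages in V.
d_trip = trip_point_shift(dvt_p=0.02, dvt_n=0.02, beta_n=4.0, beta_p=1.0)
for i0 in (20e-6, 5e-6):
    v_os = latch_offset(i0, beta_dr=200e-6, vdd=1.0, v_tripb=0.4, delta_trip=d_trip)
    print(f"I0 = {i0 * 1e6:.0f} uA -> V_os-latch = {v_os * 1e3:.2f} mV")
```

Under these assumptions the latch offset scales as  $\sqrt{I_0}$ , so cutting the clock-transistor current by a factor of four halves the latch contribution.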



Fig. 3. Implementation of the test structure. (a) Optimized CLSA circuit. (b) Complete test structure.

#### **III. TEST STRUCTURE AND TESTING METHOD**

## A. Organization of the Test Structure

The basic element of the test structure is the CLSA circuit optimized to reduce the latch offset [Fig. 3(a)]. To minimize the latch offset, the latch FETs are made large (since random variation decreases with device size) with a large NFET-to-PFET width ratio ( $\sim 8$ ), and a small clock transistor  $(N_{\text{CLK}})$ , of the same size as the driver FETs, is used. Driver FETs (i.e., DUTs) are placed in the closest possible proximity, and the inputs (V<sub>IN</sub> and V<sub>INB</sub>) are connected to the gates of the DUTs. One of the outputs of the latch [Fig. 3(a)] is used for measurement. A sufficient number of the optimized CLSA structures are arranged in an array [Fig. 3(b)]. The inputs (IN, INB) of all CLSAs in a column are shared. Note that this sharing does not significantly increase the current load, as IN and INB are connected to MOSFET gates (a high-impedance path). The SAE signals (measurement clock) are gated using row-select signals. The array measurement is performed by measuring one CLSA at a time and determining its offset. While measuring a single CLSA, SAE is zero for the unselected rows. Using the column decoder, the inputs of CLSAs in the unselected columns are kept at "0" (i.e., the driver FETs are off). This prevents switching of all unselected CLSAs in a selected row. Prevention of switching in



Fig. 4. On-chip system for random variability measurement.

all unselected CLSAs reduces power dissipation and spurious transitions during testing/characterization.

## B. Characterization Method

First, a decoder circuit selects one CLSA at a time.  $V_{DD}$  is applied to IN and  $V_{DD} - \Delta$  to INB of the selected CLSA, where  $\Delta$  is initially set to zero. At this point SAE is low, which pre-charges OUT and OUTB to high. Next, SAE is raised high and kept high for a reasonably long period of time, since the clock transistor is small and the delay is expected to be large. If there is no Vt mismatch, OUTB is expected to be high and OUT low. Hence, when  $V_{DD}$  is applied to IN, OUT is compared to "0". If OUT is observed to be "1",  $\Delta$  is increased by a small step and the measurement is repeated. This process is repeated until the correct offset is reached (i.e., OUT changes to "0"). The final  $\Delta$  value for the sense amplifier is stored, and the measurement for the next CLSA is started with  $\Delta$  reset to zero.
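The stepping procedure above can be sketched in software. The following is a behavioural illustration only: each CLSA is replaced by a hypothetical model that returns OUT = 0 once the applied  $\Delta$  reaches its offset, voltages are kept in integer millivolts, and the step size, search range, and function names are assumptions rather than values from the paper. Only the input polarity described in the text is covered.

```python
def make_clsa(v_os_mv):
    """Return a behavioural CLSA: maps the applied delta (mV) to the OUT bit."""
    def sense(delta_mv):
        # Assumption: sensing is correct (OUT = 0) once delta reaches the offset.
        return 0 if delta_mv >= v_os_mv else 1
    return sense


def measure_offset(sense, step_mv=10, max_delta_mv=300):
    """Step delta up from zero until OUT reads 0; None if the range is exhausted."""
    for delta_mv in range(0, max_delta_mv + 1, step_mv):
        if sense(delta_mv) == 0:
            return delta_mv
    return None


def characterize_array(offsets_mv, step_mv=10):
    """Measure one CLSA at a time, restarting delta at zero for each."""
    return [measure_offset(make_clsa(v), step_mv) for v in offsets_mv]


# Hypothetical offsets (mV) for four CLSAs in the array.
print(characterize_array([12, 0, 27, 41]))  # -> [20, 0, 30, 50]
```

Each stored value is the offset rounded up to the next step, so the step size sets the measurement resolution.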

## C. On-Chip Variability Measurement System

The above discussion shows that, although the CLSA based test structure operates based on differential current between two devices, it does not require analog measurement of the current difference. It only needs application of a voltage difference and measurement of a digital output (digital signature of the local variation). Hence, this scheme can be used to design an in-line on-chip built-in-self-test (BIST) circuit for random variability measurement, which is described in Fig. 4. An on-chip voltage divider network can be used to generate different  $\Delta s$ . The on-chip test controller selects a sense-amplifier to apply the  $\Delta$  (starting from  $\Delta = 0$ ) and compares its output to determine if there is a failure. In case of a failure, the controller advances its state and selects the next  $\Delta$ . As soon as a success is detected, the digitized  $\Delta$  value is stored in an on-chip memory. A self-test option makes the characterization simpler and faster compared to the conventional methods.

# **IV. STATISTICAL SIMULATION RESULTS**

The effectiveness of the test structure is evaluated through Monte Carlo simulations. Random and correlated Vt and L shifts were applied to all the transistors in the circuit where area-dependent variations for Vt were assumed. The simulated distribution of the offset voltage was compared to the distribution of the applied Vt shift. Prediction errors for standard deviation and entire distributions are given by

$$\frac{\sigma(\text{applied Vt mismatch}) - \sigma(\text{estimated Vt mismatch})}{\sigma(\text{applied Vt mismatch})}$$
(8)

$$Y_{x\sigma} = \frac{\# \text{ of instances within } - x\sigma \text{ to } + x\sigma}{\text{total } \# \text{ of measurements}}$$
(9)

where the "applied Vt mismatch" refers to the mismatch applied to the different devices in the CLSA while performing the SPICE simulations, and the "estimated Vt mismatch" refers to the offset voltage values obtained from the simulations (which are expected to be the same as the applied Vt mismatch). The difference between the true (computed from the applied Vt variation values) and estimated (computed from the offset values obtained from the simulations) values of  $Y_{x\sigma}$  (for different *x*) is used to quantify the estimation error in the distribution.
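The two error metrics in (8) and (9) are straightforward to compute from sampled data. The sketch below is one possible reading, assuming sample standard deviations and a window of  $\pm x\sigma$  centred on a given mean, with  $\sigma$  supplied by the caller; the data values and function names are made up for illustration.

```python
import statistics


def sigma_error(applied, estimated):
    """Relative error in standard deviation, as in (8)."""
    s_applied = statistics.stdev(applied)
    s_estimated = statistics.stdev(estimated)
    return (s_applied - s_estimated) / s_applied


def yield_within(samples, x, sigma, mean=0.0):
    """Y_{x sigma} of (9): fraction of samples within +/- x*sigma of the mean."""
    inside = sum(1 for v in samples if abs(v - mean) <= x * sigma)
    return inside / len(samples)


# Hypothetical mismatch samples in mV; the estimate overshoots by 10%.
applied = [-30.0, -10.0, 0.0, 10.0, 30.0]
estimated = [1.1 * v for v in applied]
sigma_true = statistics.stdev(applied)

print(sigma_error(applied, estimated))
print(yield_within(applied, 1.0, sigma_true), yield_within(estimated, 1.0, sigma_true))
```

A negative value of (8) indicates that the offset distribution overestimates the applied spread.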

# A. Estimation of Vt Mismatch

First, we performed Monte Carlo simulations of the test circuit considering random Vt variation in each transistor in the test circuit. In the Monte Carlo simulation, a set of seven random Vt values (one for each transistor in the CLSA) represents one random instance of the test circuit. A large number ( $\sim 1000$ ) of such random instances of the test circuit were simulated, and the offset voltage for each case was estimated from the simulation. The offset voltage distribution thus obtained is referred to as the "estimated Vt mismatch" in the following analysis. As mentioned before, the Vt variation applied to the devices while performing the simulation is referred to as the "applied Vt mismatch". Simulation using predictive 70 nm devices shows that the estimated Vt mismatch (i.e., offset voltage) distribution closely follows the applied Vt shifts (Fig. 5). The estimation error in standard deviation was observed to be within 8% [Fig. 5(b)]. It can also be observed that simulated offset values tend to overestimate the Vt distribution, due to non-zero latch offset. Reducing the clock transistor size improves the estimation accuracy in both standard deviation and Vt distribution (Fig. 5). Increasing the size of the latch transistors also helps to reduce the estimation error, as the Vt mismatch decreases with device width. However, increasing the PFET size beyond a certain point has only a small impact on error, even assuming only area-dependent variation. Thus, the latch FETs were designed to have a large  $W_N/W_P$  ( $\geq 8$ ), which helps reduce the latch offset due to area-independent components of Vt mismatch. The test circuit can also obtain a good estimate of the mismatch distribution even if the distribution is non-normal in nature [Fig. 5(c)]. Estimation accuracy improves as the current through the clock transistor is reduced, because the latch offset becomes lower. This can be achieved by using  $V_{\rm IN}$  =  $V_{\rm TEST}$  and  $V_{\rm INB}$  =  $V_{\rm TEST}$  –  $\Delta$  with  $V_{TEST}$  lower than  $V_{DD}$  (Fig. 6). Fig. 7(a) shows that even for correlated Vt distribution, the test circuit can correctly estimate  $\sigma$  and  $Y_{x\sigma}$ . The error in the prediction of complete cumulative



Fig. 5. Random Vt mismatch. (a) Vt mismatch estimation. (b) Estimation of standard dev- $\sigma$ . (c) Vt mismatch estimation for non-normal mismatch distribution.

distribution is also small [Fig. 7(b)]. For on-chip measurement it is necessary that inter-die variation should have minimal impact on test circuit operation. To evaluate this, both inter-die Vt shift (same for all the transistors) and local (random and correlated) Vt variation (same at all inter-die corners) were applied



Fig. 6. Effect of input voltage level on estimation error.



Fig. 7. Estimation of Vt mismatch in the presence of correlation: (a) effect of correlation among the device Vt variation on estimation error, (b) estimation of cumulative distribution.

to the transistors. The test circuit can correctly predict the Vt mismatch at all inter-die corners (Fig. 8).



Fig. 8. Effect of die-to-die variation on estimation accuracy.

### B. Application of Test Circuit

Along with the intrinsic Vt fluctuations, neighboring devices are also expected to have geometric mismatches, e.g., channel length variation. Let us analyze the effectiveness of the test circuit in predicting the total random variation in process. This is useful to analyze whether estimated distribution can be used for process optimization and/or circuit simulation.

*Estimation of Device Variation:* We have studied the effectiveness of the proposed circuit in measuring total device variation when both local geometry and threshold mismatches are present in a technology. We performed Monte Carlo simulations of the test circuit considering random L and Vt variation in each transistor in the test circuit. In the Monte Carlo simulation, a set of seven random L and Vt values (one for each transistor in the CLSA) represents one random instance of the test circuit. A large number ( $\sim 1000$ ) of such random instances of the test circuit were simulated, and the offset voltage for each case was estimated from the simulation. Hence, the Monte Carlo simulation provides a distribution of the offset voltage considering mismatch in both Vt and L. Next, the estimated offset distribution was applied as Vt variation to two identical transistors to obtain their current mismatch. The current mismatch thus obtained is referred to as the "estimated mismatch" in Fig. 9. We also directly applied the random L and Vt variation (with the same standard deviation as applied in the Monte Carlo simulation of the test circuit) to these two identical transistors and obtained their current mismatch. The current mismatch thus obtained is referred to as the "true mismatch" in Fig. 9. The estimated current mismatch was observed to closely follow the true current mismatch (Fig. 9). This is because the offset voltage depends not only on the Vt mismatch but also on other local mismatches. Hence, the offset distribution can closely predict the total random mismatch in a process and is useful at the initial phase of technology development. Moreover, the obtained offset distribution can be used as "Vt mismatch" for circuit design and simulation.

*Estimation of SRAM Variability:* The application of the offset distribution is considered in predicting characteristics of SRAM cell under process variation. As in the previous case, Vt and L variations were applied to the transistors in test circuit to estimate offset distribution of DUTs of different widths and use



Fig. 9. Estimation of current mismatch in devices due to random variation in Vt and L. (a) Linear current. (b) Saturation current.

this distribution as Vt distribution to obtain cell characteristics. The distributions of the cell characteristics (namely, read current, read voltage, and trip-point) thus obtained are referred to as the "estimated distributions" in Fig. 10. Next, we applied the L and Vt variations directly to the SRAM transistors and re-obtained the distributions of these cell characteristics. The distributions thus obtained are referred to as the "true distributions" in Fig. 10. The estimated distribution of read current closely follows the true read current distribution obtained by applying both Vt and L variations directly to the cell transistors [Fig. 10(a)]. The variation in read voltage (i.e., the voltage to which the node storing "0" rises while reading) and trip voltage (the trip-point of the inverter associated with the node storing "1") can also be predicted with good accuracy [Fig. 10(b)]. This suggests that the measured offset voltage can be used for simulation and estimation of random variation effects in SRAM cell characteristics.

#### C. Verification Using Hardware Based Models

The functionality and effectiveness of the test circuit are also verified using industry-standard, well-characterized hardware-based models. First, the test circuit is optimized in 0.13  $\mu$ m bulk CMOS technology. Monte Carlo simulations of the test circuit are performed using the process variation parameters internal to the technology model, which are calibrated against hardware. Along with RDF, other sources of mismatch (e.g.,



Fig. 10. Estimation of SRAM variability with random and correlated  $\rm Vt$  and L variation. (a) Variation in read current. (b) Read and trip voltage variation.

geometric mismatch, orientation dependent mismatch, etc.) were also included in the simulation. The simulated offset voltage is then used as Vt distribution of the devices (as explained in Section IV-B) to estimate mismatch in saturation current between two minimum sized identical devices. The estimated mismatch closely follows its true value obtained by direct Monte Carlo simulation using process variations internal to the technology [Fig. 11(a)].

We also verified the test circuit in a sub-90 nm SOI process through simulations using hardware based models. In this case, intentional Gaussian variations in Vt were applied. The test circuit successfully estimated the applied variation [Fig. 11(b)]. The verification using hardware based models shows that the test circuit can be very useful in predicting local variation both in bulk CMOS and SOI technologies.

#### D. Analysis and Discussions

The effectiveness of the proposed design strongly depends on the following factors: 1) the number of test structures required; 2) the characterization time; and 3) the resolution of the offset voltage.

*Number of Test Structures:* Increasing the number of test structures reduces the estimation error at the expense of higher test cost (larger area) and test time. To estimate the number of sense amplifiers required, assume that the estimated



Fig. 11. Verification of the test circuit using hardware-based models. (a) Estimation of current mismatch in 0.13  $\mu$ m technology. (b) Estimation of applied Vt mismatch in a sub-90 nm SOI technology. The mismatch values are normalized with respect to the standard deviation of the true mismatch.

standard deviation of Vt is s and its true value is  $\sigma$ . Confidence interval for  $\sigma$  is given by [10]

$$\sqrt{(n-1)s^2/\chi^2_{\alpha/2,(n-1)}} \le \sigma \le \sqrt{(n-1)s^2/\chi^2_{1-\alpha/2,(n-1)}}$$
(10)

where  $(1 - \alpha)$  is the confidence level, *n* is the total number of test structures, and  $\chi^2$  is the inverse cumulative distribution function of the chi-square distribution with (n - 1) degrees of freedom. From (10), we obtain

$$1 - \sqrt{\frac{\chi_{\alpha/2,(n-1)}^2}{(n-1)}} \le \text{error} = \frac{\sigma - s}{\sigma} \le 1 - \sqrt{\frac{\chi_{1-\alpha/2,(n-1)}^2}{(n-1)}}.$$
(11)

Fig. 12(a) shows the maximum percentage error in estimated value of  $\sigma$  for different numbers n of test structures. From Fig. 12(a), it is estimated that ~ 200 test structures are sufficient to measure Vt mismatch within 10% error with a 95% confidence level.
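The bounds in (11) can be evaluated for a given number of test structures. The sketch below assumes that  $\chi^2_{\alpha/2,(n-1)}$  in (10) denotes the upper-tail critical value, and it swaps in the Wilson-Hilferty approximation for the chi-square quantile so that only the Python standard library is needed; an exact inverse CDF would give slightly different digits. Function names are illustrative.

```python
from statistics import NormalDist


def chi2_quantile(p, dof):
    """Approximate chi-square quantile (Wilson-Hilferty), not an exact inverse CDF."""
    z = NormalDist().inv_cdf(p)
    c = 2.0 / (9.0 * dof)
    return dof * (1.0 - c + z * c ** 0.5) ** 3


def sigma_error_bounds(n, alpha=0.05):
    """Bounds on (sigma - s) / sigma from (11) for n test structures."""
    dof = n - 1
    chi2_upper = chi2_quantile(1.0 - alpha / 2.0, dof)  # chi^2_{alpha/2,(n-1)}
    chi2_lower = chi2_quantile(alpha / 2.0, dof)        # chi^2_{1-alpha/2,(n-1)}
    low = 1.0 - (chi2_upper / dof) ** 0.5
    high = 1.0 - (chi2_lower / dof) ** 0.5
    return low, high


for n in (50, 100, 200, 400):
    low, high = sigma_error_bounds(n)
    print(f"n = {n:3d}: error between {100 * low:.1f}% and {100 * high:.1f}%")
```

Under this approximation, the bounds for n = 200 fall just inside ±10% at the 95% confidence level, in line with the estimate read from Fig. 12(a).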

*Characterization Time:* The time required to test n CLSAs is a major design/analysis parameter for the test structure. The expected value of the characterization time is given by

$$E(T_C) = T_0 \sum_{k=1}^{n} E(S_k) = n T_0 E(S_k)$$
 (12)



Fig. 12. Area and time requirements of the test circuit. (a) Number of test structure required for acceptable errors. (b) Expected value of the characterization time.

where  $T_0$  is the time required for the measurement of a single step,  $S_k$  is the number of steps required to measure a single CLSA, and n is the number of test structures. The total characterization time increases with the number of structures. It is interesting to note that the characterization time in the proposed circuit also depends on the variability in the process. To understand this property, let us evaluate the expected number of steps required to characterize a test structure  $[E(S_k)]$  using the proposed method. Since the offset voltages of all the CLSAs are independent and identically distributed random variables, their expected values are equal and can be obtained as

$$\begin{split} E(S_k) &= \sum_{j} j \times P_j \quad (P_j = \text{prob. that } j \text{ steps are required}) \\ &= 2\sum_{j} j \times [\Phi(j\Delta/\sigma) - \Phi((j-1)\Delta/\sigma)] \quad (\text{assuming Normal Vt dist.}) \end{split}$$
(13)

where  $\Phi$  is the cumulative distribution function of the Normal distribution. Fig. 12(b) shows the variation of characterization time for different process variation ( $\sigma$ ) and measurement resolution ( $\Delta$ ). A higher process variation and a smaller  $\Delta$  increase the characterization time. For reasonable values of process



Fig. 13. Effect of minimum offset resolution on Vt estimation. (a) Mismatch distribution. (b) Standard deviation and Y errors.

variations, the test time is calculated to be less than 100  $\mu$s (significantly smaller than conventional methods). The dependence of the characterization time on process variability is an important property of the proposed design. Since in this method we modify the input step  $\Delta$  until we observe a change of state at the output, a larger mismatch between driver devices requires a larger number of voltage steps. Higher process variability implies that a larger number of test structures will have driver devices with high mismatch and will require a higher number of voltage steps. Therefore, the total characterization time increases with process variability. This is in contrast with the conventional mismatch characterization technique using  $I_{\rm d}$  –  $V_{\rm g}$  measurements. The number of  $V_{\rm g}$  steps required for  $I_d$  –  $V_g$  characterization is independent of the process variation. Therefore, the characterization time for conventional  $I_d - V_g$  measurement is independent of the variability in the process.
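The trend described above can be reproduced by evaluating (12) and (13) numerically. This is a minimal sketch, assuming a zero-mean Normal mismatch and truncating the sum in (13) once the step index passes ten standard deviations; the parameter values and function names are illustrative and not taken from the paper.

```python
from statistics import NormalDist


def expected_steps(sigma, delta, tail=10.0):
    """E(S_k) from (13) for zero-mean Normal mismatch; sum truncated at tail*sigma."""
    phi = NormalDist().cdf
    total, j = 0.0, 1
    while (j - 1) * delta <= tail * sigma:
        total += j * (phi(j * delta / sigma) - phi((j - 1) * delta / sigma))
        j += 1
    return 2.0 * total


def expected_time(n, t0, sigma, delta):
    """E(T_C) from (12): n structures, t0 seconds per measurement step."""
    return n * t0 * expected_steps(sigma, delta)


# Hypothetical sweep: sigma and delta in mV.
for sigma in (15.0, 30.0):
    for delta in (5.0, 10.0):
        print(sigma, delta, round(expected_steps(sigma, delta), 2))
```

The expected step count grows with  $\sigma$  and shrinks with  $\Delta$ , and it approaches one step per structure when the mismatch is much smaller than the step size.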

Minimum Resolution of Input Voltage: It is expected that using a higher step size for increasing  $\Delta$  (i.e., higher minimum resolution) will increase the measurement error. By simulating the test circuit to estimate offset voltage using different minimum resolution, it was observed that a resolution of 10 mV can provide good estimation accuracy (error <10%) (Fig. 13).



Fig. 14. Partial die photo showing the test structure for local variability measurement.



Fig. 15. Measurement of local random Vt mismatch in a die. (a) Layout of the complete test structure implemented in the test chip. (b) Measured values (in mV) of the offset voltages in different locations of the die.

# V. TEST CHIP AND MEASUREMENT RESULTS

A test chip is fabricated in 130 nm triple-well bulk CMOS technology through MOSIS services and measured to demonstrate the operation of the test structure. Fig. 14 shows the partial die photo of the test chip with the test structure. Fig. 15(a) shows the layout of the local variability sensor, which contains two arrays, each with 256 ($16 \times 16$) CLSAs. Individual CLSAs in the structure are accessed using a 5-bit row decoder and a 4-bit column decoder. The 512 CLSAs are divided into eight groups, each with 64 CLSAs. The groups are designed to have DUTs of different widths (W<sub>min</sub>, 2W<sub>min</sub>, 4W<sub>min</sub>, 8W<sub>min</sub>), different channel lengths (2L<sub>min</sub>, 4L<sub>min</sub>), different Vts (regular Vt and high Vt), and with rotated layout. All NMOS devices, except DUTs, are designed in the isolated p-well of the triple-well process. The digital nature of the test structure allowed software-controlled automated measurement of local mismatch. Measurements are performed at $V_{DD} = 1.5$ V, $V_{TEST} = 1.0$ V, and a clock period of 300 $\mu$s.
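
Software-controlled access to individual CLSAs might be organized along the lines of the following sketch. It assumes that the 5-bit row address and the 4-bit column address are simply concatenated into one 9-bit word and that each group of 64 CLSAs occupies four consecutive rows. The text only states which rows hold the minimum-width and 8W groups, so the mapping of the remaining rows, the bit ordering, and all names here are hypothetical.

```python
ROW_BITS = 5          # 32 rows in total (assumed: two 16 x 16 arrays stacked)
COL_BITS = 4          # 16 columns
ROWS_PER_GROUP = 4    # 4 rows x 16 columns = 64 CLSAs per group (assumed layout)


def encode_address(row: int, col: int) -> int:
    """Pack (row, col) into a 9-bit address word, row in the upper bits."""
    if not (0 <= row < (1 << ROW_BITS) and 0 <= col < (1 << COL_BITS)):
        raise ValueError("row or column out of range")
    return (row << COL_BITS) | col


def decode_address(address: int) -> tuple[int, int]:
    """Split a 9-bit address word back into (row, col)."""
    return address >> COL_BITS, address & ((1 << COL_BITS) - 1)


def group_of(row: int) -> int:
    """Index of the 64-CLSA group that a row belongs to (assumed ordering)."""
    return row // ROWS_PER_GROUP


if __name__ == "__main__":
    # Visit every CLSA once, as an automated measurement loop would.
    addresses = [encode_address(r, c) for r in range(32) for c in range(16)]
    print(len(addresses), "CLSAs,", len({group_of(r) for r in range(32)}), "groups")
```

With this addressing, 9 bits cover exactly the 512 CLSAs, and a measurement script can loop over all addresses while tagging each reading with its group.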

Fig. 16. Measured values of the local mismatches from different dies at different spatial locations for minimum width and regular Vt devices.

Fig. 15(b) shows that the measured offset voltages for different CLSAs in a particular die are random and local in nature. As expected from the discussion in Sections II and IV, the spatial correlation was observed to be negligible. The randomness in the measured data is clearly larger for the groups with width W (rows 0–3) compared to the groups with higher widths (e.g., the group with 8W, rows 12–15). Fig. 16 shows the spatial variation of Vt mismatch values for minimum size devices from two dies. It can be observed that there is minimal die-to-die correlation between mismatch values at a given spatial location. Further, as expected from the discussions in Sections II and IV, the within-die spatial correlation was also observed to be negligible. Fig. 17(a) shows the Vt mismatch distribution for minimum size devices. The distribution was observed to be close to Normal. Moreover, the difference in the offset (i.e., Vt mismatch) distributions obtained from different dies was observed to be small. This suggests that the die-to-die variation has a weak impact on the measurement accuracy of the test structure, as predicted in Fig. 8. Fig. 17(b) shows the measured offset voltage values for a single die for DUTs with different widths. It can be clearly observed that the spread in the mismatch reduces for devices with larger width.

Fig. 18 shows the measured offset voltages obtained from three different dies for DUTs with higher Vt and longer channel lengths. The spread is larger for higher Vt devices and lower for longer channel devices. This is due to the fact that higher doping in the higher Vt devices tends to increase the random dopant fluctuation effect, resulting in higher mismatch [11]. On the other hand, a longer channel length increases the channel area and reduces the short channel effect [11]. Both of these effects reduce the random variability due to RDF. Further, a longer channel means that the highly doped "halo" regions near the junctions are shifted further away from each other, which is expected to reduce the channel doping. This could also result in a lower Vt variation. Due to all these effects, $\sigma_{Vt}$ tends to reduce with channel length at a faster than square-root rate (the rate predicted from first-order analysis of RDF in [11]).

Fig. 17. Measured values for Vt mismatch distribution and the effect of device width on mismatch distribution (130 nm CMOS technology). (a) Histogram of Vt mismatch distribution (different shades represent measured data from two different dies). (b) Measured Vt mismatch from different CLSAs in a die with DUTs of different width.

Fig. 18. Measurement data showing the impact of higher Vt and longer channel length on Vt mismatch. (Data obtained from three different dies in 130 nm CMOS technology.)

Fig. 19. Impact of device width and threshold voltage on standard deviation of Vt mismatch. (Measured results from five different dies in 130 nm CMOS technology.)

Fig. 20. Area-efficient implementation of the test structure using a shared latch and multiplexed DUT pairs.

Fig. 19 shows the measured standard deviation of mismatch for devices with different geometry and Vt. The measured standard deviation ($\sigma$) values from five different dies are close to each other, which re-emphasizes the fact that the impact of chip-to-chip variation on the effectiveness and accuracy of the test circuit is very low. As observed in Fig. 17, the standard deviation of the Vt mismatch is lower for larger widths. Moreover, the $\sigma$ values for different widths tend to follow the characteristic $W^{-(1/2)}$ dependence expected for Vt mismatch due to RDF [11]. However, the presence of area-independent mismatch is also observed; $\sigma$ reduces at a rate slower than $W^{-(1/2)}$. As expected from Fig. 18, $\sigma$ of the Vt mismatch is higher for the high-Vt devices compared to the regular-Vt devices, and lower for longer channel devices. The devices with a different orientation (i.e., with $90^{\circ}$ rotated layout) were observed to have Vt variation similar to that of devices with the original orientation.
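
One simple way to picture a trend slower than $W^{-(1/2)}$ is to add a width-independent term in quadrature to an RDF-like term. The model $\sigma(W)=\sqrt{a^{2}/W+\sigma_{0}^{2}}$ and the parameters `a` and `sigma0` below are illustrative assumptions, not a fit to the measured data; widths are expressed in units of the minimum width.

```python
import math


def sigma_model(width: float, a: float, sigma0: float) -> float:
    """Mismatch sigma: a/sqrt(W) term plus a width-independent term, in quadrature."""
    return math.sqrt(a * a / width + sigma0 * sigma0)


def loglog_slope(w1: float, w2: float, a: float, sigma0: float) -> float:
    """Slope of log(sigma) versus log(W) between two widths."""
    s1 = sigma_model(w1, a, sigma0)
    s2 = sigma_model(w2, a, sigma0)
    return (math.log(s2) - math.log(s1)) / (math.log(w2) - math.log(w1))


if __name__ == "__main__":
    # Hypothetical parameters: with sigma0 = 0 the slope is -1/2,
    # and any nonzero sigma0 makes the decrease with width shallower.
    print(loglog_slope(1.0, 8.0, a=30.0, sigma0=0.0))
    print(loglog_slope(1.0, 8.0, a=30.0, sigma0=10.0))
```

In this sketch a nonzero `sigma0` yields a log-log slope between $-1/2$ and 0, which matches the qualitative behavior described for the measured widths.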

Note that in the implementation of the test circuit, a separate set of latch transistors was used with each of the DUT pairs. This is a simple design, but it has a higher area overhead as the latch FETs [and associated NAND gates in Fig. 3(a)] are repeated for each DUT pair. This can be avoided by using only one set of latch FETs (and associated NAND gates) and multiplexing the DUT pairs as illustrated in Fig. 20. The clock FET can also be distributed with each DUT pair. The DUT selector will select only one pair and its SAE signal. For the unselected DUT pairs, SAE will be turned off along with the inputs IN and INB (both set to "0"). This results in a two-transistor stack in the unselected path, which substantially reduces leakage through these paths. The leakage can be further reduced by using a small negative voltage for SAE, IN, and INB for unselected DUTs. The number of DUTs that can be multiplexed will be determined by the leakage through the unselected paths. We think that as long as the current through the selected path is 100 times larger than the total leakage current through the unselected paths, the circuit will provide a good indication of the Vt mismatch between the selected DUTs. For a technology with $I_{ON}/I_{OFF}$ of ~1000, and using the fact that a two-transistor stack has ~$10\times$ lower leakage compared to a single off device, ~100 DUT pairs can be multiplexed. We believe multiplexing 64 DUT pairs is a good choice. Note that using a single set of latch FETs not only reduces the area but also completely eliminates the effect of latch offset in the measurement. The latch offset adds equally to each measured offset value and provides only a shift in the mean of the measured offset distribution. It has a negligible effect on the standard deviation of the measured offset distribution.
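
The multiplexing estimate above can be reproduced with a few lines of arithmetic. The sketch assumes that the selected path carries roughly $I_{ON}$, that each unselected path leaks roughly $I_{OFF}$ divided by the stack reduction factor, and that the number of unselected paths is approximately the number of multiplexed pairs; the function name is hypothetical.

```python
import math


def max_multiplexed_pairs(ion_over_ioff: float, stack_reduction: float,
                          required_ratio: float) -> int:
    """Largest number of unselected paths whose summed leakage stays at or
    below the selected-path current divided by required_ratio."""
    # One unselected path leaks I_ON / (ion_over_ioff * stack_reduction),
    # so N paths are allowed while N * that leakage <= I_ON / required_ratio.
    return math.floor(ion_over_ioff * stack_reduction / required_ratio)


if __name__ == "__main__":
    # Values quoted in the text: I_ON/I_OFF ~ 1000, ~10x stack effect, 100x margin.
    print(max_multiplexed_pairs(1000, 10, 100))
```

With the values quoted in the text this gives a budget of about 100 pairs, so the suggested 64 pairs leaves some margin.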

# VI. CONCLUSION

Measurement and characterization of local variation are very important for robust circuit design and better manufacturing yield. In this paper, a sense-amplifier-based test structure for fast and accurate characterization of local random variation has been demonstrated. The presented test circuit essentially measures a digital signature of local variation, thereby eliminating the need for the analog measurements and complex data analysis involved in conventional mismatch characterization methods. The digital measurement technique makes the design of an on-chip built-in self-characterization scheme feasible. The effectiveness and accuracy of the test circuit are demonstrated through statistical simulations and measurement of a test chip. The simulation and measurement results show that the proposed test structure can extract local random mismatch in a process with very low test time and cost. The digital nature of the testing scheme is very useful for fast and accurate characterization of a process, which will facilitate technology development and help make pre-silicon design decisions to improve circuit robustness, resulting in better manufacturing yield in nanometer technologies.

## ACKNOWLEDGMENT

Thanks to Dr. Keejong Kim, Purdue University, for helping with the integration of the design in a multi-project test chip and the preparation of the test board. Thanks also to MOSIS services for the fabrication of the test chip.

#### REFERENCES

- [1] J. Meindl *et al.*, "The impact of stochastic dopant and interconnect distributions on gigascale integration," in *IEEE ISSCC Dig. Tech. Papers*, 1997, pp. 232–233.
- [2] A. Asenov, "Random dopant induced threshold voltage lowering and fluctuations in sub-0.1 μm MOSFET's: A 3-D "atomistic" simulation study," *IEEE Trans. Electron. Devices*, vol. 45, no. 12, pp. 2505–2513, Dec. 1998.
- [3] A. Bhavnagarwala *et al.*, "The impact of intrinsic device fluctuations on CMOS SRAM cell stability," *IEEE J. Solid-State Circuits*, vol. 36, no. 4, pp. 658–665, Apr. 2001.
- [4] A. Keshavarzi *et al.*, "Measurements and modeling of intrinsic fluctuations in MOSFET threshold voltage," in *Proc. IEEE Int. Symp. Low Power Electronics and Design (ISLPED'05)*, San Diego, CA, Aug. 2005, pp. 26–29.
- [5] "Circuits and methods for characterizing random variations in device characteristics in semiconductor integrated circuits," U.S. 20050043908, Feb. 24, 2005.
- [6] H. Klimach et al., "Characterization of MOS transistor current mismatch," in Proc. 17th Symp. Integrated Circuits and Systems Design (SBCCI'04), Sep. 2004, pp. 33–38.
- [7] A. Bassi *et al.*, "Measuring the effects of process variations on circuit performance by means of digitally-controllable ring oscillators," in *Proc. Int. Conf. Microelectronic Test Structures (ICMTS'03)*, Mar. 2003, pp. 214–217.
- [8] Berkeley Predictive Technology Model (BPTM). Univ. California, Berkeley [Online]. Available: http://www-device.eecs.berkeley.edu/~ptm/interconnect.html
- [9] B. Wicht, T. Nirschl, and D. Schmitt-Landsiedel, "Yield and speed optimization of a latch-type voltage sense amplifier," *IEEE J. Solid-State Circuits*, vol. 39, no. 7, pp. 1148–1158, Jul. 2004.
- [10] A. Papoulis and S. Pillai, Probability, Random Variables and Stochastic Processes. New York: McGraw-Hill, 2002.
- [11] Y. Taur and T. K. Ning, *Fundamentals of Modern VLSI Devices*. Cambridge, U.K.: Cambridge Univ. Press, 1998.



Saibal Mukhopadhyay (S'99–M'07) received the B.E. degree in electronics and telecommunication engineering from Jadavpur University, Calcutta, India, in 2000, and the Ph.D. degree in electrical and computer engineering from Purdue University, West Lafayette, IN, in 2006.

He is with the School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA, as an Assistant Professor. Prior to joining Georgia Tech, he was with the IBM T. J. Watson Research Center, Yorktown Heights, NY, as a Research Staff Member and worked on high-performance circuit design and technology-circuit co-design focusing primarily on static random access memories (SRAMs). His research interests include analysis and design of low-power and robust circuits in nanometer technologies. He has authored or coauthored more than 50 papers in refereed journals and conferences.

Dr. Mukhopadhyay received the IBM Ph.D. Fellowship Award for 2004–2005. He also received the SRC Technical Excellence Award in 2005, the Best in Session Award at 2005 SRC TECHCON, and Best Paper Awards at 2003 IEEE Nano and the 2004 International Conference on Computer Design.



Keunwoo Kim (S'98–M'01–SM'06) was born in Daegu, Korea, in 1968. He received the B.S. degree in physics from Sung-Kyun-Kwan University, Seoul, Korea, in 1993, and the M.S. and Ph.D. degrees in electrical and computer engineering from the University of Florida, Gainesville, FL, in 1998 and 2001, respectively. His doctoral research was in the area of SOI and double-gate device design and modeling.

Since June 2001, he has been with the VLSI Design Department, IBM T. J. Watson Research Center, Yorktown Heights, NY, as a Research Staff Member. He has worked on the design of high-performance and low-power microprocessors, novel VLSI circuit techniques, scaled and exploratory CMOS technology performance/power evaluation, and physics/modeling for bulk-Si, SOI, strained-Si, SiGe, hybrid orientation/device, and double-gate technologies. His present work includes IBM's POWER7 processor design, the analysis and prediction of SRAM variability/yields, and system-level performance/power projections. He has published more than 70 papers in technical journals and conference proceedings, and holds five U.S. patents with another 12 U.S. patents pending.

Dr. Kim has received five invention achievement awards from IBM. He has been a reviewer of the journal publications for IEEE TRANSACTIONS ON ELECTRON DEVICES, IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, and *Solid-State Electronics*. He was listed in *Who's Who in America* (2007 edition).

**Keith A. Jenkins** (SM'98) received the Ph.D. degree in physics from Columbia University, New York, NY, for experimental work in high-energy physics.

He is a Research Staff Member at the IBM Thomas J. Watson Research Center, Yorktown Heights, NY, where he is a member of the Communications Technology department. At the IBM Research Division, he has done research in a variety of device and circuit subjects, including high-frequency measurement techniques, electron beam circuit testing, radiation-device interactions, low-temperature electronics, and SOI technology. His current activities include designing circuits for analog built-in self-test, investigations into substrate coupling in mixed-signal and RF circuits, evaluating the frequency response of nanoscale devices, and studying the impact and mechanisms of self-heating in advanced CMOS technologies.



**Ching-Te Chuang** (S'78–M'82–SM'91–F'94) received the B.S.E.E. degree from National Taiwan University, Taipei, Taiwan, R.O.C., in 1975, and the Ph.D. degree in electrical engineering from the University of California at Berkeley in 1982.

From 1977 to 1982 he was a research assistant in the Electronics Research Laboratory, University of California, Berkeley, working on bulk and surface acoustic wave devices. He joined the IBM T. J. Watson Research Center, Yorktown Heights, NY, in 1982. From 1982 to 1986, he worked on
scaled bipolar devices, technology, and circuits. He studied the scaling properties of epitaxial Schottky barrier diodes, did pioneering works on the perimeter effects of advanced double-poly self-aligned bipolar transistors, and designed the first sub-nanosecond 5-Kb bipolar ECL SRAM. From 1986 to 1988, he was Manager of the Bipolar VLSI Design Group, working on low-power bipolar circuits, high-speed high-density bipolar SRAMs, multi-Gb/s fiber-optic data-link circuits, and scaling issues for bipolar/BiCMOS devices and circuits. Since 1988, he has managed the High Performance Circuit Group, investigating high-performance logic and memory circuits. Since 1993, his group has been primarily responsible for the circuit design of IBM's high-performance CMOS microprocessors for enterprise servers, PowerPC workstations, and game/media processors. Since 1996, he has been leading the efforts in evaluating and exploring scaled/emerging technologies, such as PD/SOI, UT/SOI, strained-Si devices, hybrid orientation technology, and multi-gate/FinFET devices, for high-performance logic and SRAM applications. Since 1998, he has been responsible for the Research VLSI Technology Circuit Co-design strategy and execution. His group has also been very active and visible in leakage/variation/degradation tolerant circuit and SRAM design techniques. He holds 24 U.S. patents with another 17 pending, and has authored or coauthored over 260 papers. He took early retirement from IBM to join National Chiao-Tung University, Hsinchu, Taiwan, as a Chair Professor in the Department of Electronic Engineering in February 2008.

Dr. Chuang has received one Outstanding Technical Achievement Award, one Research Division Outstanding Contribution Award, five Research Division Awards, and 12 Invention Achievement Awards from IBM. He has received the Outstanding Scholar Award from Taiwan's Foundation for the Advancement of Outstanding Scholarship for 2008–2013. He served on the Device Technology Program Committee for IEDM in 1986 and 1987 and the Program Committee for Symposium on VLSI Circuits from 1992 to 2006. He was the Publication/Publicity Chairman for the Symposium on VLSI Technology and the Symposium on VLSI Circuits in 1993 and 1994, and was the Best Student Paper Award Sub-Committee Chairman for the Symposium on VLSI Circuits from 2004 to 2006. He was the co-recipient of the Best Paper Award at the 2000 IEEE International SOI Conference. He was elected an IEEE Fellow in 1994 "for contributions to high-performance bipolar devices, circuits, and technology". He has authored many invited papers in international journals such as International Journal of High Speed Electronics, Proceedings of IEEE, IEEE Circuits and Devices Magazine, and Microelectronics Journal. He has presented numerous plenary, invited or tutorial papers and talks at international conferences such as International SOI Conference, DAC, VLSI-TSA, ISSCC Microprocessor Design Workshop, VLSI Circuit Symposium Short Course, ISQED, ICCAD, APMC, VLSI-DAT, ISCAS, MTDT, and WSEAS.



**Kaushik Roy** (F'02) received the B.Tech. degree in electronics and electrical communications engineering from the Indian Institute of Technology, Kharagpur, India, and the Ph.D. degree from the Electrical and Computer Engineering Department at the University of Illinois at Urbana-Champaign in 1990.

He was with the Semiconductor Process and Design Center of Texas Instruments, Dallas, TX, where he worked on FPGA architecture development and low-power circuit design. He joined the electrical and computer engineering faculty at Purdue University, West Lafayette, IN, in 1993, where he is currently a Professor and holds the Roscoe H. George Chair of Electrical and Computer Engineering. His research interests include VLSI design/CAD for nanoscale silicon and non-silicon technologies, low-power electronics for portable computing and wireless communications, VLSI testing and verification, and reconfigurable computing. He has published more than 450 papers in refereed journals and conferences, holds eight patents, and is a co-author of two books on low power CMOS VLSI design. He is the Chief Technical Advisor of Zenasis Inc. and Research Visionary Board Member of Motorola Labs (2002).

Dr. Roy received the National Science Foundation Career Development Award in 1995, IBM faculty partnership award, ATT/Lucent Foundation Award, 2005 SRC Technical Excellence Award, SRC Inventors Award, Purdue College of Engineering Research Excellence Award, and Best Paper Awards at 1997 International Test Conference, IEEE 2000 International Symposium on Quality of IC Design, 2003 IEEE Latin American Test Workshop, 2003 IEEE Nano, 2004 IEEE International Conference on Computer Design, 2006 IEEE/ACM International Symposium on Low Power Electronics and Design, 2005 IEEE Circuits and System Society Outstanding Young Author Award (Chris Kim), and 2006 IEEE TRANSACTIONS ON VLSI Systems Best Paper Award. He is a Purdue University Faculty Scholar. He has been on the editorial board of *IEEE Design and Test*, IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS, and IEEE TRANSACTIONS ON VLSI SYSTEMS. He was Guest Editor for the Special Issue on Low-Power VLSI in *IEEE Design and Test* (1994), the IEEE TRANSACTIONS ON VLSI SYSTEMS (June 2000), and *IEE Proceedings—Computers and Digital Techniques* (July 2002).