# Analysis and Design of a New Race-Free Four-Phase CMOS Logic

Chung-Yu Wu, Member, IEEE, Kuo-Hsing Cheng, and Jinn-Shyan Wang, Member, IEEE

Abstract—In this paper, a new four-phase dynamic logic, called the high-speed precharge–discharge CMOS logic (HS-PDCMOS logic), is proposed and analyzed. The HS-PDCMOS logic uses two separate units, one to implement the logic function and one to drive the output load. Thus, a complex function can be implemented within a single gate, and the gates can also form a pipelined structure. The HS-PDCMOS logic needs four operation clocks and has three different versions. An experimental chip has been designed and measured to partly verify the results of circuit analysis and simulation. It is shown that the HS-PDCMOS logic has an operation speed about 2.5 to 3 times higher than that of the conventional four-phase dynamic logic. Moreover, the new logic has no clock skew, race, or charge redistribution problems. These advantages make the HS-PDCMOS logic very promising in high-speed complex VLSI design.

#### I. INTRODUCTION

**R**ECENTLY, CMOS dynamic logic has been widely applied to high-performance VLSI. So far, many dynamic logic circuits [1]–[5] have been proposed to improve packing density and operation speed. However, they still suffer from problems such as charge redistribution, dc power dissipation, clock skew, and race. Among the proposed dynamic logic circuits [1]–[5], the domino CMOS circuits [1] have the limitation that all of the gates are noninverting. The NORA circuits [2] have the charge redistribution problem, as does the domino CMOS. The zipper CMOS [3] consumes a little dc power and requires a delicately designed zipper driver. The four-phase logic has the clock skew problem [4], [5].

In this paper, a new four-phase dynamic logic called the high-speed precharge–discharge CMOS logic (HS-PDCMOS logic) is proposed and investigated. This new logic can be used in pipelined structures. The HS-PDCMOS logic has three mutually compatible versions that can be used in the same chip. The new logic can implement a complex combinational logic function within a single gate and achieves an operation speed more than 2.5 times higher than that of the conventional four-phase logic [4]. The greater the logic complexity, the greater the speed benefit of the new logic. Moreover, both theoretical and experimental results show that the pipelined structure of the HS-PDCMOS logic has no static power dissipation, clock skew, race, or charge redistribution problems.

Manuscript received January 31, 1992; revised June 18, 1992.

IEEE Log Number 9204137.

The circuit structure, clocking strategy, and operational principle of the three versions of the HS-PDCMOS logic are described in Section II. Section III presents the speed evaluation of the HS-PDCMOS logic. Section IV shows the speed comparisons of the HS-PDCMOS and the conventional four-phase logic. The measurement results of the experimental chip of the HS-PDCMOS logic are given in Section V. Finally a conclusion is given.

#### II. CIRCUIT STRUCTURE AND OPERATIONAL PRINCIPLE

The proposed new HS-PDCMOS logic has three circuit versions. They are described below.

## A. The First Version of the HS-PDCMOS Logic

The circuit structure and the corresponding clock timing of the first version of the HS-PDCMOS logic are shown in Fig. 1, where the logic gates are divided into two types, i.e., type 1 and type 3. The block Ni (i = 1, 2, 3) in the NMOS tree realizes the logic function. The circuit has three operation phases: charge/evaluate, evaluation, and discharge/hold. When  $\overline{\phi}_3$  and  $\phi_{12}$  are high in gate B, the circuit is in the discharge/hold phase. The path from  $V_{DD}$  to node A is off whereas the paths from nodes B, C, and D to GND are on. Thus, nodes B, C, and D are discharged to 0 V. Since the PMOS  $P_c$  and the NMOS  $N_{e1}$  are off, the output node X is isolated from the other nodes and holds the previous data. When  $\overline{\phi}_3$  and  $\phi_{12}$  go low, all the paths to GND in gate B are off and the output node X is precharged to  $V_{DD}$ . This is the charge/evaluate phase of gate B. At this time, gate A, which drives gate B, is in the discharge/hold phase, so stable signals are applied to the inputs of gate B from the charge/evaluate phase through the next phase, the evaluation phase. Therefore, the logic block N3 of gate B behaves as a switching network and performs the desired logic function in the charge/evaluate phase. The logic values at nodes B and C are established in the charge/evaluate phase, and the logic functions  $f_B$  and  $f_C$ at nodes B and C, respectively, can be expressed as

$$f_B = f_C = f_{N3} \tag{1}$$

where  $f_{N3}$  is the logic function performed by the switching block N3.

When  $\overline{\phi}_3$  goes high and  $\phi_{12}$  stays low, gate *B* is in the evaluation phase. In this time interval  $T_4$,  $P_c$  is turned off and  $N_{e2}$  is turned on, so the charge at output node *X* is either discharged or left unchanged, depending on the logic value at node *C*. In this phase, the preceding gate *A* is still in the

0018-9200/93\$03.00 © 1993 IEEE

C.-Y. Wu and K.-H. Cheng are with the Department of Electronics Engineering and Institute of Electronics, National Chiao Tung University, Hsinchu, Taiwan 300, Republic of China.

J.-S. Wang is with the Computer and Communication Research Laboratories, Industrial Technology Research Institute, Hsinchu, Taiwan 310, Republic of China.



Fig. 1. The circuit schematic and clock timing diagrams of the first version of the HS-PDCMOS logic.

discharge/hold phase with a stable output. Thus the circuit has no charge redistribution problem. The logic function obtained at output node X is

$$f_X = \overline{f}_C = \overline{f}_{N3}. \tag{2}$$

The output function of node X is the inverse of the function of N3. This verifies the correct logic function performed by this special circuit structure. Note that the evaluation network is separated from the network performing the logic function. Thus the evaluation path always consists of only two stacked NMOS transistors, i.e.,  $N_{e1}$  and  $N_{e2}$ , which enhances the operation speed. This is the difference between the proposed logic and the conventional four-phase logic. Due to this separation, the HS-PDCMOS logic has no race problem even when clock skew occurs. Moreover, a complex logic function can be implemented in a single gate to achieve a high operation speed. Meanwhile, wire routing complexity can be reduced efficiently.
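The three-phase operation described above can be summarized by a Boolean-level sketch. This is an illustrative model only, not the authors' simulation setup: it ignores voltages and timing, and the names `GateModel`, `Phase`, and `run_cycle` are hypothetical.

```python
from enum import Enum


class Phase(Enum):
    CHARGE_EVALUATE = "charge/evaluate"
    EVALUATION = "evaluation"
    DISCHARGE_HOLD = "discharge/hold"


class GateModel:
    """Boolean-level sketch of one first-version gate (assumed abstraction)."""

    def __init__(self, f_n3):
        self.f_n3 = f_n3      # logic function of the NMOS tree (True = conducts)
        self.node_c = False   # internal node C, discharged initially
        self.node_x = True    # output node X

    def step(self, phase, inputs):
        if phase is Phase.CHARGE_EVALUATE:
            self.node_x = True                       # X is precharged to VDD
            self.node_c = bool(self.f_n3(*inputs))   # f_C = f_N3, as in (1)
        elif phase is Phase.EVALUATION:
            if self.node_c:                          # inputs are not consulted here
                self.node_x = False                  # X discharges through Ne1, Ne2
        else:
            self.node_c = False                      # internal nodes discharge, X holds
        return self.node_x


def run_cycle(gate, inputs):
    """Run one full cycle; inputs go high during discharge/hold to mimic
    the preceding gate precharging its output."""
    gate.step(Phase.CHARGE_EVALUATE, inputs)
    result = gate.step(Phase.EVALUATION, inputs)
    all_high = tuple(True for _ in inputs)
    held = gate.step(Phase.DISCHARGE_HOLD, all_high)
    return result, held


nand2 = GateModel(lambda a, b: a and b)
```

In this sketch the output after evaluation is the inverse of the tree function, as in (2), and the value returned in the discharge/hold step equals the evaluated value even though the inputs have changed.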

Since  $\overline{\phi}_3$  is high in the evaluation phase, the PMOS  $P_c$  is off and the evaluated output value is not affected by the input signals. After evaluation, the output logic value is held constant throughout the next two phases while the clock  $\overline{\phi}_3$  stays high.

In the time interval  $T_1$,  $\overline{\phi}_1$  and  $\phi_{34}$  go low. The output node of gate A is precharged to  $V_{DD}$  in this interval. The precharged output node of gate A turns on the NMOS in the logic tree N3 of gate B. But the evaluated logic value at node X of gate B is not affected by the precharging in gate A because node X has been isolated from the inputs as described previously. Thus the new dynamic four-phase logic has no precharge-race problem, i.e., no clock skew problem. In this phase, gate B begins to discharge the internal node charges inside the logic tree while holding the data at the



Fig. 2. The circuit schematic and clock timing diagrams of the second version of the HS-PDCMOS logic.

output node X. This interval is called the discharge/hold phase of gate B. The three phases of operations, i.e., charge/evaluate, evaluation, and discharge/hold, proceed in sequence every four clock intervals.

Gate A (B) is defined as the type-1 (type-3) gate. The two types of gates should be connected alternately in a system, thus forming an extensively pipelined structure. Static gates can also be mixed with these new dynamic gates if necessary to implement a logic function, and this does not cause the precharge-race problem, i.e., the clock skew problem. In such a case, the basic operations described previously are not altered, but the discharge time may become longer due to the static gate delay.
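The alternating connection of type-1 and type-3 gates can be illustrated with a small phase table. The schedule below is inferred from the description above (gate A charges in $T_1$, gate B evaluates in $T_4$); placing the evaluation of gate A in $T_2$ and the charge/evaluate phase of gate B in $T_3$ is an assumption, and the function names are hypothetical.

```python
# Phase of each gate type in the four clock intervals T1..T4.
# The T2 entry for type-1 and the T3 entry for type-3 are inferred, not stated.
SCHEDULE = {
    "type-1": ("charge/evaluate", "evaluation", "discharge/hold", "discharge/hold"),
    "type-3": ("discharge/hold", "discharge/hold", "charge/evaluate", "evaluation"),
}


def phase(gate_type, interval_index):
    """Phase of a gate type in interval T(interval_index + 1), repeating every 4."""
    return SCHEDULE[gate_type][interval_index % 4]


def driver_is_stable(driver, receiver):
    """True if the driver holds its output in every interval in which the
    receiver is in its charge/evaluate or evaluation phase."""
    for i in range(4):
        if phase(receiver, i) in ("charge/evaluate", "evaluation"):
            if phase(driver, i) != "discharge/hold":
                return False
    return True
```

Under this assumed schedule, a type-1 gate driving a type-3 gate (and vice versa) always presents held data, whereas two directly cascaded gates of the same type would not.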

## B. The Second Version of the HS-PDCMOS Logic

The second version of the HS-PDCMOS logic is shown in Fig. 2. This version is derived from the first version of the HS-PDCMOS logic. As shown in Fig. 2, the HS-PDCMOS logic is separated into two units, the function unit and the driver unit. As shown in gate B, both units have their own precharge PMOS transistors  $P_{p1}$  and  $P_{p2}$ . Thus, the output capacitive load of the HS-PDCMOS logic is completely separated from the logic block Ni. The operational principle of the second version of the HS-PDCMOS logic is similar to that of the first version. Since the second version has fewer control MOS transistors and its output capacitive load is completely separated from the logic block Ni, the second version is expected to be faster than the first version.

## C. The Third Version of the HS-PDCMOS Logic

In applying the dynamic logic in the pipelined structure, the maximum system operation frequency of the pipelined structure is limited by some worst-delay gates in the system.



Fig. 3. The circuit schematic and clock timing diagrams of the third version of the HS-PDCMOS logic.

Thus, the overall system speed performance can be improved if the speeds of those gates are increased. Using this concept, it is helpful to develop logic gates with very high speed and low dc power dissipation and use them in those worst-delay gates. Thus, the overall operation frequency of the pipelined system can be raised with a little dc power dissipation.

The gate connection and the corresponding clock timing of the third version of the HS-PDCMOS logic are shown in Fig. 3. Its operation speed is higher than that of the second version, at the cost of a small dc power dissipation. The primary structural difference between the third version and the second version of the HS-PDCMOS logic is that the evaluation NMOS transistor  $N_{e3}$  in the output driver of Fig. 2 has been removed. This does not affect the logic function.

The circuit operation of the third version of the HS-PDCMOS logic is similar to that of the first version. As shown in Fig. 3, in the discharge/hold phase of gate B, the paths from  $V_{DD}$  to nodes A and X are off and the path from node B to GND is on. The voltage at node B is discharged to 0 V at this time. In the charge/evaluate phase, all the input signals that have just been evaluated and output from the preceding gates (type-1 gates) to gate B are constant from the charge/evaluate phase through the next phase, the evaluation phase. Thus, the logic value at node B is established in the charge/evaluate phase and held through the next phase. In the evaluation phase, the PMOS device  $P_{p1}$  is turned off and the output logic function at output node X is obtained as

$$f_X = \overline{f}_B = \overline{f}_{N3}.\tag{3}$$

If the logic value at node B is ZERO, there is no dc power dissipation. The output node X can be precharged to  $V_{DD}$ and keep the voltage level at  $V_{DD}$  correctly. If the logic value at node B is ONE, the precharge device  $P_{p2}$  of the output driver is turned on in the charge/evaluate phase. The output driver has dc power dissipation in this phase. However, consider the voltage at node B. At the beginning of the charge/evaluate

IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 28, NO. 1, JANUARY 1993

phase, the voltage at node B is 0 V. It will be verified later that at the end of the charge/evaluate phase, the node voltage at node B is charged to a voltage level below 2 V. Thus, in the charge/evaluate phase, the NMOS device  $N_{e1}$  is only slightly turned on, so that the dc power dissipation is very small. Due to the weakly turned on  $N_{e1}$ , the voltage at node X can only be precharged to about 2/3  $V_{DD}$  within a short time period.

The speed improvement is achieved because the removal of the evaluation NMOS transistor  $N_{e3}$  results in a smaller evaluation path resistance in the output driver. If the final output logic value at node X is ZERO, the output node X is only precharged to about 2/3  $V_{DD}$  in the charge/evaluate phase. In the evaluation phase, the output voltage at node X is discharged from 2/3  $V_{DD}$  to 0 V. Due to smaller voltage swing and evaluation path resistance in logic ZERO, the evaluation time can be reduced in the evaluation phase. Thus, using the third version of the HS-PDCMOS logic in the worst-delay gates of a pipelined system can improve the overall system speed performance. The cost paid for the high speed is the small dc power dissipation in the charge/evaluate phase.
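The two effects named above (fewer series transistors and a smaller voltage swing) can be illustrated with a first-order RC discharge estimate. This is only a sketch under loudly stated assumptions: a MOS pull-down path is not a linear resistor, and the supply voltage, on-resistance, load capacitance, and end voltage below are hypothetical values chosen for illustration, not data from the simulations in this paper.

```python
import math


def rc_discharge_time(r_ohm, c_farad, v_start, v_end):
    """Time for an RC node to decay from v_start to v_end: R*C*ln(v_start/v_end)."""
    return r_ohm * c_farad * math.log(v_start / v_end)


# All values below are hypothetical, chosen only to illustrate the trend.
VDD = 5.0          # assumed supply voltage, volts
R_PER_NMOS = 5e3   # assumed on-resistance of one evaluation NMOS, ohms
C_LOAD = 50e-15    # assumed output load, farads
V_END = 0.1        # assumed voltage taken as logic ZERO, volts

# Second version: three series evaluation NMOS's, full swing from VDD.
t_second = rc_discharge_time(3 * R_PER_NMOS, C_LOAD, VDD, V_END)
# Third version: Ne3 removed (two in series), swing starts at about 2/3 VDD.
t_third = rc_discharge_time(2 * R_PER_NMOS, C_LOAD, 2.0 * VDD / 3.0, V_END)

ratio = t_third / t_second
```

With these assumed numbers the ratio is about 0.6, and most of the reduction comes from the lower path resistance rather than from the reduced swing. The actual gain reported by the SPICE simulations is given in Table III.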

#### III. SPEED PERFORMANCE EVALUATION

The maximum operation frequency of the new dynamic circuits depends on the total charge/evaluate, evaluation, and discharge time. Although the HS-PDCMOS logic requires some extra transistors stacked below the logic network, it will be shown later that the new dynamic circuit still spends less time in charge/evaluate, evaluation, and discharge than the conventional four-phase logic circuit.

As shown in Fig. 1, the longest charge/evaluate time is determined by the voltage level to which node C must be charged. Because the evaluation path consists of only two stacked NMOS transistors ( $N_{e1}$  and  $N_{e2}$ ), the evaluation speed for the logic ZERO can still be very fast even when node C is charged only to 2.0 V or less at the end of the charge/evaluate phase. The circuit used to simulate the worst-case charge/evaluate time is shown in Fig. 4; it is an eight-input HS-PDCMOS NAND gate of the first version. The worst charge/evaluate case of gate A occurs when the charges stored at the output node X and the internal nodes 1-9 have been completely discharged in the previous discharge phase and the output node will be evaluated low in this phase. The charge/evaluate time  $t_{ce}$  is defined as the difference between  $t_2$ , when node 10 is charged to 2.0 V, and  $t_1$ , when the gate begins to charge. If node 10 is charged above 2.0 V, the output node X evaluates faster in the evaluation phase, but a longer charge/evaluate time  $t_{ce}$  is required. Thus, there is a trade-off between the speed in the charge/evaluate phase and that in the evaluation phase. In this case, 2.0 V is chosen as the charge/evaluate voltage at node 10.

As shown in Fig. 3, the evaluation path of the third version HS-PDCMOS logic consists of only two NMOS transistors  $N_{e1}$  and  $N_{e2}$ . Thus, the evaluation speed for the logic ZERO can be faster than the second version if the charge voltage is 2 V at the gate of the NMOS transistor  $N_{e1}$ . Keeping the same evaluation speed as in the second version, the charge voltages



Fig. 4. The circuit used to determine the worst-case charge/evaluate time  $t_{ce}$  of the HS-PDCMOS logic.

at node B in the third version can be lower than 2 V. Thus the charge/evaluate phase can be made shorter to enhance the speed.

After determining the charge/evaluate time, the worst-case discharge time and evaluation time are determined to find the overall speed performance. Consider the worst case for the discharge time shown in Fig. 5, where all the input signals to gate B come from the preceding dynamic gates (type-1 gates) through static inverters. When gate B enters the discharge/hold phase, gate A enters the charge/evaluate phase and the voltage at node  $X_8$  is pulled high. This pulls down the voltage at node  $Y_8$ , which retards the discharge of the internal node charge of gate B. Thus the discharge of gate B cannot be completed in this charge/evaluate phase of gate A. In the next phase, gate B is still in the discharge/hold phase whereas gate A enters its evaluation phase. The discharge of gate B can therefore be completed only after all the preceding gates A complete their evaluation and their outputs (i.e., the inputs  $X_1 - X_8$  to gate B) are stabilized. In the worst case, the residual charges stored at internal nodes 2 to 8 of gate B in Fig. 5 must be removed, because the input signals  $X_2 - X_8$  stabilize at logic ZERO whereas  $X_1$  is at logic ONE. Thus the required worst-case discharge time  $t_{dis}$  of gate B is related to the duration of the evaluation phase of gate A: it is longer than the required evaluation time of gate A by the static inverter delay plus the time needed to remove the charges stored at nodes 2 to 8 in gate B. Since the worst-case discharge time is always longer than the evaluation time, only the worst-case discharge time is considered. As shown in Fig. 5, the worst-case discharge time  $t_{dis}$ of gate B is defined as the interval from the time when the preceding stage begins to evaluate to the time when the voltage at node 2 is pulled down to 0.5 V. At this time, the node



Fig. 5. The circuit used to determine the worst-case discharge time  $t_{dis}$  of the HS-PDCMOS logic.

voltage at node 9 is as low as 0.05 V, which is small enough to turn off the NMOS transistor  $N_{e1}$ . Note that if a longer discharge time is given, the voltages at nodes 2 and 9 can be lower. The resulting improvement in the circuit performance is very small but the speed performance is degraded. Thus we choose the above reasonable node voltages for high-speed applications.

After determining the worst charge/evaluate time  $t_{ce}$  and the worst discharge time  $t_{dis}$, the maximum operation frequency  $f_{max}$  of the HS-PDCMOS logic is defined as

$$f_{\max} = \frac{1}{2(t_{ce} + t_{dis})}. \tag{4}$$
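As a minimal sketch, (4) can be evaluated directly from the simulated times. The helper name `f_max_mhz` is hypothetical; the sample inputs are the three-stacked-NMOS entries of Table I.

```python
def f_max_mhz(t_first_ns, t_second_ns):
    """Maximum clock frequency in MHz from two worst-case times in ns,
    following f_max = 1 / (2 * (t_first + t_second))."""
    return 1e3 / (2.0 * (t_first_ns + t_second_ns))


# Three stacked NMOS's, Table I: (charge/evaluate time, discharge time) in ns.
first_version = f_max_mhz(1.71, 0.72)
second_version = f_max_mhz(0.41, 0.68)
```

Rounded to the nearest megahertz, these evaluate to 206 MHz and 459 MHz, matching the tabulated values. The same helper applies to (5) when the precharge and evaluation times of the conventional logic are supplied.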

#### IV. PERFORMANCE COMPARISONS

To compare the speed performance of the HS-PDCMOS logic with that of the conventional four-phase dynamic logic, the longest precharge and evaluation times in the conventional four-phase dynamic logic should be determined. Fig. 6 shows the circuit structure and the corresponding clock timing of the conventional four-phase dynamic logic [4]. Fig. 7 illustrates the circuit condition that leads to the worst-case precharge time in the conventional four-phase logic. Since the input signals are available before the gate begins to precharge, the worst precharge case is that the lowest positioned input signal in a string of stacked NMOS transistors is held low while other input signals are held high during the precharge phase. Thus the PMOS transistor  $P_p$  has to precharge the output node X and those internal nodes connected to the output node. In the evaluation phase,  $P_p$  is turned off while  $N_h$  is on. Then charge redistribution occurs between the output node and the internal nodes if the internal nodes are not precharged to their maximum extent and the output node is evaluated high.



Fig. 6. The circuit schematic and the clock timing diagrams of the conventional four-phase dynamic logic.



Fig. 7. The circuit used to determine the worst-case precharge time  $t_{pr}$  of the conventional four-phase dynamic logic.

It is found from simulation results that node 8 in Fig. 7 should be at least precharged to 3.3 V at the end of the precharge phase in order not to cause serious corruption to the output node voltage due to the charge redistribution effect. Fig. 8(a) and (b) shows the simulation results. It is seen from Fig. 8(a) that the output voltage is seriously corrupted if node 8 in Fig. 7 is precharged only to 3.0 V at the end of the precharge phase. In Fig. 8(b), node 8 is precharged to 3.3 V at the end of the precharge phase. Then the output voltage at node X, although it falls below 5 V in the evaluation phase due to the charge redistribution effect, is still kept at a tolerable level.
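The dependence of the output level on the internal precharge voltage can be illustrated with an ideal charge-conservation estimate. This sketch treats the output node and the internal nodes as two lumped capacitors that equalize completely; it ignores threshold and body effects and does not reproduce the waveforms of Fig. 8. The capacitance values are hypothetical.

```python
def shared_voltage(c_out, v_out, c_int, v_int):
    """Final voltage after two capacitors share charge (ideal charge conservation)."""
    return (c_out * v_out + c_int * v_int) / (c_out + c_int)


# Hypothetical lumped capacitances; the output is assumed precharged to 5 V.
C_OUT = 40e-15   # output node, farads
C_INT = 20e-15   # total internal-node capacitance, farads

v_after_3v3 = shared_voltage(C_OUT, 5.0, C_INT, 3.3)
v_after_3v0 = shared_voltage(C_OUT, 5.0, C_INT, 3.0)
```

The estimate shows only the trend: the less the internal nodes are precharged, the further the output falls when it should stay high. The size of the drop depends on the actual capacitance ratio and device behavior, which is why the 3.3-V requirement above is taken from simulation.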

As shown in Fig. 7, the worst-case precharge time  $t_{pr}$  for




Fig. 8. SPICE simulation results showing the charge-redistribution effect in the conventional four-phase dynamic logic with node 8 shown in Fig. 7 precharged to (a) 3.0 V and (b) 3.3 V at the end of the precharge phase.

this test circuit is defined as the maximum value between the output rise delay time  $t_{r1}$  and the rise delay time  $t_{r2}$  at node 8, because dynamic circuits generally have a tighter requirement for noise immunity than the static circuits. The rise delay time  $t_{r1}$  of the output node is defined as the difference between the time when the clock  $\overline{\phi}_1$  begins to fall and the time when the output node reaches 4.9 V.

The worst evaluation case occurs when the output node is evaluated low in the evaluation phase. In such a case as shown in Fig. 9, the charges accumulated at the internal nodes 2–9 during the precharge phase should be removed through the very long signal path to ground. The evaluation time  $t_{eva}$ defined in Fig. 9 is from  $t_1$  when the gate begins to evaluate, to  $t_2$  when the voltage level of output node X is pulled down to 0.1 V. In the conventional four-phase logic, the maximum operation frequency can be formulated as

$$f_{\max} = \frac{1}{2(t_{pr} + t_{eva})}. \tag{5}$$
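As a worked check of (5), take the three-stacked-NMOS entry of Table I for the conventional four-phase logic, $t_{pr} = 1.11$ ns and $t_{eva} = 1.49$ ns. The arithmetic is only a consistency check against the tabulated value, not an additional result:

$$f_{\max} = \frac{1}{2(1.11\ \text{ns} + 1.49\ \text{ns})} = \frac{1}{5.20\ \text{ns}} \approx 192\ \text{MHz}.$$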

Based on the 1.2- $\mu$ m CMOS process, the speed comparisons on multi-input NAND gates (Fig. 4) between the HS-PDCMOS logic and the conventional four-phase dynamic logic are presented in Table I where the SPICE simulated  $t_{ce}$ ,  $t_{dis}$ ,  $t_{pr}$ ,  $t_{eva}$ , and  $f_{max}$  are listed. In Table I, all three circuit designs have no dc power dissipation. The simulated logic gate has a single fan-out load, and the SPICE MOS models of the 1.2- $\mu$ m CMOS process are shown in Table II. It is seen from Table I that the second version of the HS-PDCMOS logic is the fastest. The maximum frequency of the new dynamic circuit



Fig. 9. The circuit used to determine the worst-case evaluation time  $t_{eva}$  of the conventional four-phase dynamic logic.



TABLE I

| Number of stacked NMOS's | First version of HS-PDCMOS |           |           | Second version of HS-PDCMOS |           |           | Conventional four-phase CMOS logic |           |           |
|--------------------------|----------|-----------|-----------|----------|-----------|-----------|----------|-----------|-----------|
|                          | $t_{ce}$ | $t_{dis}$ | $f_{max}$ | $t_{ce}$ | $t_{dis}$ | $f_{max}$ | $t_{pr}$ | $t_{eva}$ | $f_{max}$ |
| 3                        | 1.71     | 0.72      | 206       | 0.41     | 0.68      | 459       | 1.11     | 1.49      | 192       |
| 6                        | 2.66     | 1.19      | 130       | 0.90     | 1.15      | 244       | 2.94     | 3.01      | 84.0      |
| 9                        | 3.85     | 1.93      | 86.5      | 1.68     | 1.89      | 140       | 5.99     | 5.06      | 45.2      |
| 12                       | 5.26     | 2.97      | 60.8      | 2.73     | 2.94      | 88.2      | 10.21    | 7.64      | 28.0      |
| 15                       | 6.90     | 4.31      | 44.6      | 4.06     | 4.28      | 60.0      | 15.61    | 10.74     | 19.0      |

($t_{ce}$, $t_{dis}$, $t_{pr}$, $t_{eva}$: ns; $f_{max}$: MHz)

TABLE II THE SPICE MOS MODELS OF THE 1.2-$\mu$m CMOS PROCESS

| Parameter | NMOS model (`.MODEL NMOS NMOS`) | PMOS model (`.MODEL PMOS PMOS`) |
|-----------|---------------------------------|---------------------------------|
| LEVEL     | 3                               | 3                               |
| VTO       | 0.840                           | -0.940                          |
| UO        | 561                             | 205                             |
| NFS       | 729.1G                          | 1E12                            |
| TOX       | 25N                             | 25N                             |
| NSUB      | 1.90E16                         | 3.57E16                         |
| VMAX      | 122.9K                          | 662.9K                          |
| RS        | 43.25                           | 10                              |
| RD        | 43.25                           | 10                              |
| RSH       | 46                              | 65                              |
| XJ        | 460N                            | 500N                            |
| LD        | 110.6N                          | 54N                             |
| DELTA     | 0.8                             | 1.006                           |
| THETA     | 69M                             | 165M                            |
| ETA       | 157.9M                          | 224.2M                          |
| KAPPA     | 0                               | 22.6                            |
| PB        | 724.4M                          | 1.26                            |
| WD        | -100N                           | 0                               |
| CJ        | 183.3U                          | 231.2U                          |
| MJ        | 617.8M                          | 336.4M                          |
| CJSW      | 303.1P                          | 1.444N                          |
| MJSW      | 240.8M                          | 662.0M                          |

is about 2.5 to 3 times higher than that of the conventional four-phase dynamic logic circuit. Moreover, the greater the logic complexity, the greater the speed benefit of the HS-PDCMOS logic.
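The speed ratio can be recomputed from the $f_{max}$ columns of Table I. The sketch below only restates the tabulated data; the variable names are hypothetical.

```python
# f_max values in MHz from Table I, indexed by the number of stacked NMOS's.
STACKED = (3, 6, 9, 12, 15)
F_SECOND_VERSION = (459, 244, 140, 88.2, 60.0)
F_CONVENTIONAL = (192, 84.0, 45.2, 28.0, 19.0)

# Ratio of second-version HS-PDCMOS speed to conventional four-phase speed.
ratios = [s / c for s, c in zip(F_SECOND_VERSION, F_CONVENTIONAL)]

# True if the speed advantage grows with every increase in stack height.
advantage_grows = all(a < b for a, b in zip(ratios, ratios[1:]))
```

The ratio rises from roughly 2.4 at three stacked NMOS's to roughly 3.2 at fifteen, in line with the approximate 2.5-to-3 figure quoted above, and it increases monotonically with the stack height.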

The remaining design is the third version of the HS-PDCMOS logic. This logic dissipates a small amount of dc power in the charge/evaluate phase, and in return its operation frequency is enhanced. The speed performances

TABLE III SPICE Simulation Results of  $t_{ce}$,  $t_{dis}$,  $f_{max}$, and Average Power Dissipation  $P_d$  for the Multi-Input NAND Gates of the Third Version

| Number of<br>stacked<br>NMOS's | Second version of<br>HS-PDCMOS |      |      |       | Third version of<br>HS-PDCMOS |      |      |       |
|--------------------------------|--------------------------------|------|------|-------|-------------------------------|------|------|-------|
|                                | tce                            | tdis | fmax | Pd    | tce                           | tdis | fmax | Pd    |
| 3                              | 0.41                           | 0.68 | 459  | 0.130 | 0.38                          | 0.52 | 556  | 0.210 |
| 6                              | 0.90                           | 1.15 | 244  | 0.139 | 0.80                          | 0.96 | 284  | 0.218 |
| 9                              | 1.68                           | 1.89 | 140  | 0.153 | 1.47                          | 1.69 | 158  | 0.223 |
| 12                             | 2.73                           | 2.94 | 88.2 | 0.164 | 2.37                          | 2.72 | 98.0 | 0.227 |
| 15                             | 4.06                           | 4.28 | 60.0 | 0.167 | 3.50                          | 4.03 | 66.4 | 0.234 |

($t_{ce}$, $t_{dis}$: ns; $f_{max}$: MHz)





Fig. 10. Chip photograph of the fabricated test circuits of the second-version HS-PDCMOS logic. They include (a) the cascaded 15-input NAND gates, and (b) the 4-b carry generator.

and average power dissipations of the second-version and third-version HS-PDCMOS logic are listed in Table III for comparison. Note that in obtaining the time  $t_{ce}$, the voltage at node B in Fig. 3 is precharged to 1.8 V in the charge/evaluate phase. Thus the charge/evaluate time of the third-version HS-PDCMOS is shorter than that of the second version.
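The trade-off recorded in Table III can be expressed as two ratios per row: the speed gain and the power cost of the third version relative to the second. The sketch below uses only the tabulated values; the unit of $P_d$ is not stated with the table, so only ratios are formed, and the variable names are hypothetical.

```python
# Data from Table III, rows for 3, 6, 9, 12, and 15 stacked NMOS's.
F_SECOND = (459, 244, 140, 88.2, 60.0)        # f_max, MHz
F_THIRD = (556, 284, 158, 98.0, 66.4)         # f_max, MHz
P_SECOND = (0.130, 0.139, 0.153, 0.164, 0.167)  # P_d, unit as in Table III
P_THIRD = (0.210, 0.218, 0.223, 0.227, 0.234)   # P_d, unit as in Table III

# Third version relative to second version, row by row.
speed_gain = [t / s for t, s in zip(F_THIRD, F_SECOND)]
power_cost = [t / s for t, s in zip(P_THIRD, P_SECOND)]
```

For these rows the speed gain lies between about 1.11 and 1.21, while the average power increases by a factor of about 1.4 to 1.6. This supports using the third version only in the worst-delay gates, as suggested in Section II-C.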

#### V. EXPERIMENTAL VERIFICATIONS

Several experimental circuits were designed and fabricated to verify part of the simulated results of the HS-PDCMOS logic circuits. This experimental chip was fabricated in a 1.2- $\mu$ m, double-metal single-poly, n-well CMOS process. The test circuits for the second version of the HS-PDCMOS logic are two cascaded pipelined 15-input NAND gates and a 4-b carry generator. Fig. 10(a) and (b) shows the chip photograph of the test circuits. The test logic gates have a single fan-out load.



Fig. 11. Measurement results of the fabricated 15-input HS-PDCMOS NAND gates. This circuit is operated at 60 MHz with a large clock skew between each clock signal.



Fig. 12. Shmoo plot of the fabricated 15-input HS-PDCMOS NAND gates.

From the measurement results in Fig. 11, the fabricated 15-input HS-PDCMOS NAND gates can work with a four-phase clock rate of 60 MHz. By suitably tuning the delays and pulse widths of the four clock signals, the skew among the clock lines can be set as shown in Fig. 11. Such skewed clock waveforms are used to test the immunity of the fabricated circuit to clock skew. The measurement results in Fig. 11 show that the HS-PDCMOS logic can tolerate considerable clock skew.

Since the charge/evaluate level and the speed of the HS-PDCMOS logic are affected by the power supply voltage, the charge/evaluate time  $t_{ce}$  of the HS-PDCMOS logic and the operation frequency are supply-dependent. To investigate the dependence, the shmoo plot of the 15-input second-version HS-PDCMOS NAND gates is measured and shown in Fig. 12, where the x-axis represents the pulse width of the clock signal  $\overline{\phi}_3$, i.e., the charge/evaluate time, and the y-axis represents the power supply  $V_{DD}$  changing from 3.37 to 6.0 V. As the power supply voltage decreases, the charge/evaluate time of the HS-PDCMOS logic increases, which leads to a decrease of the operation clock rate. When the power supply is below 3.5 V, this test circuit of the 15-input NAND gates cannot function correctly because of too many series NMOS devices.


The other test circuit is an HS-PDCMOS 4-b carry generator as shown in Fig. 13. The measured maximum clock rate is 147 MHz.

#### VI. CONCLUSIONS

In this paper, new four-phase dynamic logic circuits called HS-PDCMOS logic circuits are proposed and analyzed. The new logic circuits have three versions. Two of them have no dc power dissipation. The third version dissipates a small amount of dc power, but it has a higher operation speed than the other two. The three new HS-PDCMOS logic versions can be used in the same chip. Compared with the conventional four-phase dynamic logic, the operation speed of the HS-PDCMOS logic is 2.5 to 3 times higher. Moreover, the new logic has no clock skew, race, or charge redistribution problems. Static gates can also be mixed with these new dynamic circuits if necessary to implement a logic function without the precharge-race problem. This increases the design flexibility. The performance of the proposed HS-PDCMOS logic has been partly verified by an experimental chip.

#### REFERENCES

- [1] R. H. Krambeck, C. M. Lee, and H.-S. Law, "High-speed compact circuits with CMOS," *IEEE J. Solid-State Circuits*, vol. SC-17, pp. 614-619, June 1982.
- [2] N. F. Goncalves and H. De Man, "NORA: A race free dynamic CMOS technique for pipeline logic structures," *IEEE J. Solid-State Circuits*, vol. SC-18, pp. 261-266, June 1983.
- [3] C. M. Lee and E. W. Szeto, "Zipper CMOS," *IEEE Circuits Devices Mag.*, pp. 10-16, May 1986.
- [4] D. J. Myers and P. A. Ivey, "A design style for VLSI CMOS," *IEEE J. Solid-State Circuits*, vol. SC-20, no. 3, pp. 741–745, June 1985.
- [5] J.-S. Wang, C.-Y. Wu, and M.-K. Tsai, "A novel dynamic CMOS logic free from problems of charge sharing and clock skew," *Int. J. Electron.*, vol. 66, no. 5, pp. 679–695, May 1989.
- [6] I. S. Hwang and A. L. Fisher, "Ultrafast compact 32-bit CMOS adders in multiple-output domino logic," *IEEE J. Solid-State Circuits*, vol. 24, no. 2, pp. 358–369, Apr. 1989.



Chung-Yu Wu (S'76-M'76) was born in Chiayi, Taiwan, Republic of China, in 1950. He received the B.S. degree from the Department of Electrophysics, and the M.S. and Ph.D. degrees from the Institute of Electronics, National Chiao-Tung University, Hsinchu, Taiwan, Republic of China, in 1972, 1976 and 1980, respectively. During 1975-1976 he studied ferroelectric films on silicon and their device applications. During 1976-1979 he engaged in the development of integrated differential negative resistance devices and their circuit applications, with support from the National Electronics Mass Plan (Semiconductor Devices and Integrated Circuit Technologies) of the National Science Council. From 1980 to 1984 he was an Associate Professor at the Institute of Electronics, National Chiao-Tung University. During 1984-1986 he was an Associate Professor in the Department of Electrical Engineering, Portland State University, Portland, OR. Presently he is a Professor in the Department of Electronics Engineering and the Institute of Electronics, National Chiao-Tung University. His research interests have been in analog and digital integrated circuits and systems, special semiconductor devices, and neural networks.

Dr. Wu is a member of Eta Kappa Nu and Phi Tau Phi.



Kuo-Hsing Cheng was born in Taipei, Taiwan, Republic of China, in 1962. He received the B.S. degree from the Department of Electrical Engineering, National Central University, Chungli, Taiwan, Republic of China, and the M.S. degree from the Institute of Electronics, National Chiao-Tung University, Hsinchu, Taiwan, Republic of China, in 1985 and 1987, respectively. He is currently working towards the Ph.D. degree at the same Institute. During 1986-1987 he studied the design of a new microcontroller and multiplier. During 1987-1988 he studied the implementation of high-speed digital/analog converters. During 1989-1990 he worked on the development of programmable logic devices and MOS memory IC's. During 1987-1992 he engaged in the development of high-performance digital integrated circuits and systems. His main research interest is in the area of high-speed digital integrated circuits and systems.



Jinn-Shyan Wang (S'85-M'88) was born in Taiwan in 1959. He received the B.S. degree in electrical engineering from the National Cheng-Kung University in 1982 and the M.S. and Ph.D. degrees from the Institute of Electronics, National Chiao-Tung University, Taiwan, in 1984 and 1988, respectively.

During the years 1988-1990 he was with the Electronics Research and Service Organization, Industrial Technology Research Institute (ERSO, ITRI), Taiwan, engaged in the development of digital TV chip sets. Since 1990 he has been with the Computer and Communication Research Laboratories (CCL), ITRI, responsible for research and development on ASIC circuit and system design. He has published seven papers and holds several patents on VLSI circuits and architectures.

Dr. Wang was awarded the 1988 Long-Terng Thesis Award from Sertek International Inc., Taiwan.