# National Science Council, Executive Yuan, Subsidized Research Project ■ Final Report

Microwatt-Level On-Chip Bus Design with Dynamic Voltage and Frequency Scaling

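- Project type: ■ Individual project □ Integrated project
- Project number: NSC 98-2221-E-009-137-MY2
- Project period: August 1, 2009 to July 31, 2011 (ROC years 98 to 100)
- Executing institution and department: Department of Electrical and Control Engineering, National Chiao Tung University
- Principal investigator: Professor 蘇朝琴
- Project participants: 何盈杰, 曾煜輝, 徐仁乾, 盧泓瑋, 鄭鈞藝, 莊俢銘, 黃博祥, 林璟伊, 楊弘宇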

Report type (submitted as required by the approved budget list): □ Condensed report  Full report

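In addition to this final report, the project must also submit the following overseas travel reports:

- □ Report on an overseas business trip or study visit
- □ Report on a business trip or study visit to mainland China
- Report on attending an international academic conference
- □ Overseas research report for an international cooperative research project
- Disposition: except for controlled projects and the cases below, the report may be opened to public search immediately. ■ Involves patents or other intellectual property rights; open to public search after □ one year ■ two years
  - October 25, 2011 (ROC year 100)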

| Table of Contents                                                |
|------------------------------------------------------------------|
| Abstracts in Chinese and English                                 |
| 1. Introduction                                                  |
| 2. Background Review                                             |
| 3. Proposed On-chip Bus                                          |
| 4. DVFS scheme with a low-voltage multi-phase ADPLL              |
| 5. Test chips and the measured results                           |
| 6. Comparisons and Conclusion                                    |
| 7. References                                                    |
| NSC Subsidized Research Project Report: Self-Evaluation Form     |
| NSC Subsidized Project: Derived R&D Results Promotion Data Sheet |
| Report on Attending an International Academic Conference         |

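#### Chinese Abstract

This project investigates design techniques for a microwatt-level bus with DVFS. The main application domain is low-voltage systems, such as implantable and externally worn devices for biomedical use. The bus is built on a segmented-buffer architecture, and we use a bootstrap technique to raise the circuit speed and shorten the transition time. In this project we propose two bootstrapped circuits to realize the microwatt-level bus: the Active Leakage-current-reduced Bootstrapped Inverter (ALBI) and the ISI-Suppressed Bootstrapped Driver (ISBD). In the DVFS system, a low-voltage PLL supplies multi-phase clocks to a Timing Margin Measurement (TMM) module, which measures the timing margin between CLK and DATA; the data rate and/or the supply voltage is then adjusted accordingly. In this way, the circuit operates at its most power-efficient point regardless of where the process lands.

The project produced three chips in nanometer processes. They include test circuits for subthreshold transistors and circuits, used to build subthreshold transistor and circuit models; the two bootstrapped circuits, ALBI and ISBD; and a TMM circuit built on a 0.25-0.5 V PLL that supplies multi-phase clocks.

Keywords: DVFS, bus circuit, microwatt circuit, bootstrapped circuit, repeater.

#### English Abstract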

This project proposes a µW on-chip bus design based on DVFS techniques. The main application domain is low-voltage circuit systems, such as invasive and noninvasive biomedical applications. The body of the proposed on-chip bus has a segmented-buffer structure. The buffers are implemented with bootstrapped inverters to improve the speed and shorten the transition time. In this project, we propose two bootstrapped circuits for the on-chip bus design: the Active Leakage-current-reduced Bootstrapped Inverter (ALBI) and the ISI-Suppressed Bootstrapped Driver (ISBD). The DVFS on-chip bus operates as follows. The timing margin of the bus is measured by a TMM module, which is built around a low-voltage multi-phase PLL. According to the timing margin, the DVFS controller adjusts the data rate and/or the supply voltage. With the proposed structure, the circuit operates at its optimal condition regardless of its process corner and operating condition.

The project has taped out three chips in nanometer processes. They include subthreshold transistor and circuit test keys for building the transistor and circuit models under subthreshold operation, the two bootstrapped circuits, and a timing margin measurement circuit with a 0.25-0.5 V multi-phase PLL.

Keywords: DVFS, on-chip bus, µW circuit, bootstrap circuit, repeater.

#### 1. Introduction

In the past few years, low-voltage and low-power designs have attracted significant attention because of the popularity of portable devices. Emerging embedded biomedical applications have pushed low-power design to a further extreme. Scaling the supply voltage below the threshold voltage is the most favorable solution for low-power designs. A 180 mV, 1024-point FFT processor is a pioneering subthreshold-supply design. Subthreshold SRAM is another important category. Other designs include a 6-bit flash ADC for use at 0.2–0.9 V and a 14-tap, 8-bit finite impulse response (FIR) filter running at 20 MHz under 0.27 V.

As technology continues to be scaled down, the on-chip global interconnect in the SoC design becomes a bottleneck with regard to speed, power and cost. The repeater insertion represents a feasible technique for the trade-off among speed, power and cost requirements.

Subthreshold circuit design is challenging because the driving capability ($I_{on}$) and the $I_{on}/I_{off}$ ratio degrade significantly while process variations grow, which hurts circuit performance, power efficiency (leakage power), and fabrication yield. First, although circuits operated at a subthreshold supply can achieve ultra-low power consumption, the driving capability of CMOS devices in the subthreshold region remains poor. Second, a conventional CMOS tapered buffer requires a large area to compensate for the weak drive at a subthreshold supply, and it also incurs a severe $I_{off}$ problem in nanometer processes. In addition, subthreshold circuits suffer serious process variations, which can amount to a spread of several times.

Unfortunately, in the subthreshold region, conventional CMOS repeaters still suffer from the severe design problems mentioned above. Bootstrapping is an effective means of raising the driving efficiency and hence the speed. A previous work accordingly developed a bootstrapped CMOS driver for large capacitive loads. According to Lou (JSSC 1997), the bootstrapped driver consists of a pull-up and pull-down control pair that drives the PMOS and NMOS transistors, respectively. In Lou and Kuo's design, the gate voltages of the PMOS and NMOS driver transistors are kept at $V_{DD}$ and 0 in the cut-off phase. In the driving phase, the gates of the PMOS and NMOS transistors are fed $-V_{DD}$ and $2V_{DD}$ to increase the current density. Several researchers have proposed improvements based on this architecture (Fig. 2-1). Despite a previous effort to increase the boosting efficiency by rearranging the timing of the switching and boosting signals, reverse leakage current remains the main drawback of conventional bootstrapped drivers.

This project proposes two bootstrapped circuits, an active leakage-current-reduced bootstrapped inverter (ALBI) and an ISI-suppressed bootstrapped driver (ISBD). The ALBI both increases the driving ability, by boosting signals into the super-threshold region, and reduces the leakage current in the subthreshold region. The ISBD is designed to suppress ISI noise in data link applications and also improves the precharge and leakage-current reduction schemes. Both circuits additionally enhance the $I_{on}/I_{off}$ ratio.

Dynamic voltage and frequency scaling (DVFS) is an efficient power-saving method that exploits the relationship among power consumption, supply voltage, and operating frequency. DVFS can also compensate for PVT variation. From the designer's point of view, simulation results must hold at every PVT corner, so such designs often leave a large redundant margin in the practical chip. DVFS adjusts the supply voltage or the frequency appropriately and dynamically for the task at hand and thereby achieves the most efficient power saving. In the proposed DVFS system, the supply voltage or the data rate can be changed dynamically to meet the specification requirements of the on-chip bus; the power consumption can therefore be optimized for the computational task at hand.

In our DVFS on-chip bus design, a multi-phase low-voltage phase-locked loop (PLL) is a critical function block. PLLs are key building blocks in integrated circuits, and several 0.5 V PLLs have been presented in the last few years. In this project, we propose a 0.25-0.5 V 10-phase all-digital PLL (ADPLL) with an output frequency range of 37.6-480 MHz.

The rest of the report is organized as follows. Section 2 briefly reviews some reported bootstrapped drivers and low-voltage PLLs. Section 3 introduces the on-chip bus structure with the proposed bootstrapped repeaters and describes the operation and waveforms of our designs. Section 4 introduces a ring oscillator design for a 0.2-0.6 V supply voltage using a bootstrapped technique, together with a low-voltage multi-phase ADPLL. Section 5 shows the test chips and the measured results. Finally, Section 6 presents some performance comparisons and draws conclusions.

#### 2. Background Review

#### **Bootstrapped Drivers**

#### Bootstrapped Drivers (JSSC 1997)

Lou's bootstrapped driver (JSSC 1997) consists of a pull-up and pull-down control pair that drives the PMOS and NMOS transistors, respectively. In Lou and Kuo's design, shown in Fig. 2-1, the gate voltages of the PMOS and NMOS driver transistors are kept at $V_{DD}$ and 0 in the cut-off phase. In the driving phase, the gates of the PMOS and NMOS transistors are fed $-V_{DD}$ and $2V_{DD}$ to increase the current density. However, in an ultra-low-voltage design, this circuit has some drawbacks. First, because of the poor current capability, the precharge current through $M_{N1}$ and $M_{P1}$ limits the precharge speed. Next, $M_{N1}$ and $M_{P1}$ incur reverse current in the boosting mode. The reverse current discharges the bootstrap capacitors at the boosting nodes and returns charge to the supply. Although the reverse current does not consume power, more precharge time is needed. Moreover, the $I_{on}/I_{off}$ ratio is poor in a low-voltage design, and a low $I_{on}/I_{off}$ ratio induces more leakage power.



Fig. 2-1. Bootstrapped driver reported in JSSC 1997

#### Bootstrapped Drivers (T. VLSI 2008)

Kil *et al.* (2008) proposed a precharge enhancement scheme to accelerate bootstrapped circuit operation in a 6 MHz distributed clock network at 0.4 V, as shown in Fig. 2-2. That scheme also feeds back the boosted output signal to suppress the reverse leakage current effectively and maintain the boosted output voltage level. However, the extra capacitors double the hardware cost. Moreover, this approach is not suitable for a data link, because the kick-back disturbance through the boosting capacitors worsens rapidly at high frequency and causes large timing jitter. Additionally, static power is a serious problem in this design, since it accounts for most of the power consumption. Among other bootstrapped circuits, single-capacitor ones reduce the hardware overhead. However, their complex circuitry causes serious charge sharing at the capacitor node, and their leakage current is problematic as well.



Fig. 2-2. Bootstrapped driver reported in T. VLSI 2008.

#### Low-voltage ring oscillator

#### ■ 0.6~1.2V Current-Controlled Oscillator (VLSI Symposium 2004)

In order to achieve a wide range of power-supply operation without introducing a secondary higher power supply, clock generation architecture needs to be able to operate down to sub-1V regions. CMOS ring oscillator based on the self-biased architecture requires voltage headroom of  $V_{DD} > V_{TH} + 2V_{DSAT}$  for operation. Existing current-controlled oscillator also has similar voltage headroom requirements. Current-starved inverter ring oscillator has a  $V_{DD}$  requirement of  $V_{TH} + V_{DSAT}$ . The PLL presented is based on a single  $V_{TH}$  current-controlled oscillator, which still requires enough voltage headroom for low power supply applications.



Fig. 2-3. Differential delay element of current-controlled oscillator.

The basic delay cell is shown in Fig. 2-3. $M_{N1}/M_{N2}$ and $M_{N1'}/M_{N2'}$ form a complementary differential pair driven by the current sources $M_{P1}$ and $M_{P2}$. The swing of the differential output and input nodes depends on the bias current $I_{BIAS}$, the voltage drop across $M_{P1,2}$, and the size of the differential load devices. When the delay cells are connected as a ring oscillator, $V_{O,MAX} = V_{DD} - |V_{DSATP}|$ and $V_{O,MIN} = |V_{DSATN}|$, where $V_{DSATP}$ and $V_{DSATN}$ are the drain-source saturation voltages of $M_{P1}$ and $M_{N1,2}$, respectively, under a given bias current $I_{BIAS}$. This is the minimum requirement for maintaining the switching capability of the ring oscillator.

#### ■ VCO Using Bulk-Driven Technique (TCASII 2009)

One important solution to the threshold-voltage limitation is the bulk-driven technique, which is commonly used in low-voltage operational amplifier design. The bulk-driven MOSFET allows zero, negative, and even small positive bias voltages to achieve the desired DC current. It also extends the input common-mode range, which is difficult to achieve at a low supply voltage. Typically, the bulk terminals of MOSFETs are connected to the highest voltage in the circuit (for PMOS) or the lowest (for NMOS) to avoid latch-up. In a 0.5 V design, there is no risk of forward biasing this junction, so the bulk terminal can be used with a rail-to-rail input.



Fig. 2-4. Comparison between gate- and bulk- driven.

Fig. 2-4 shows the characteristics of the gate-driven and bulk-driven PMOS transistors. In the gate-driven PMOS without forward body biasing (FBB), the source-to-gate voltage must be greater than the absolute value of the threshold voltage, whereas the bulk-driven technique removes this limitation.



Fig. 2-5. Circuit configuration of the bulk-driven VCO.

A bulk-input technique is employed to implement a VCO with a higher operating frequency at a lower supply voltage. Fig. 2-5 shows the circuit configuration of the VCO. It consists of three stages of fully differential cells. The NMOS transistors ($M_1$, $M_2$) in the delay element are used in a positive feedback topology to reduce the transition time of the output and ensure the logic state. The bulk terminals of the PMOS transistors in the delay elements are directly controlled by the loop filter output voltage $V_{CTRL}$. The bulk-input technique can therefore extend the dynamic range of the control voltage to rail-to-rail and provide a highly linear gain without a V-I converter. In addition, the current source $I_{VCO}$ is digitally controlled by a 2-bit word from the calibration circuit to compensate for PVT variations and maintain the desired operating frequency.

#### ■ Ultra-Low-Power and Portable DCO (TCASII 2007)



Fig. 2-6. Architecture of proposed DCO



Fig. 2-7. Coarse tuning stage.



Fig. 2-8. Fine tuning stage.

Fig. 2-6 illustrates the architecture of an ultra-low-power DCO reported in 2007. The main idea is to change the oscillation frequency by modifying the delay of the signal path. To preserve both the control-code resolution and the operating range, the DCO employs a cascade structure consisting of coarse-tuning and fine-tuning stages, which makes good code-to-delay linearity and a wide operating range easy to achieve. Two low-power techniques are used. First, the segmental delay line (SDL) disables the transitions of redundant segmental delay cells in the coarse-tuning stage, as shown in Fig. 2-7. Second, a hysteresis delay cell (HDC) is proposed for the fine-tuning stage to reduce the number of short-delay cells. The fine-tuning stage, shown in Fig. 2-8, has three sub-stages: an HDC, a long-delay digitally-controlled varactor (DCV), and a short-delay DCV. Note that the controllable range of each stage must be larger than the delay step of the previous stage, so that the cascade DCO structure has no dead zone larger than the LSB resolution of the DCO.
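The no-dead-zone condition on the cascade can be checked mechanically. The following is a minimal sketch, assuming each stage is summarized by its delay step and its total controllable range; the function name and all delay values (in picoseconds) are hypothetical and are not taken from the cited DCO.

```python
def find_dead_zones(stage_steps_ps, stage_ranges_ps):
    """Return the indices of stages that violate the cascade rule.

    Stage k (k >= 1) must have a controllable range larger than the
    delay step of stage k-1; otherwise some delays cannot be reached.
    """
    return [k for k in range(1, len(stage_steps_ps))
            if stage_ranges_ps[k] <= stage_steps_ps[k - 1]]


# Hypothetical four-stage cascade: coarse stage, HDC, long-delay DCV, short-delay DCV.
steps_ps = [300.0, 40.0, 5.0, 0.6]
ranges_ps = [9600.0, 320.0, 45.0, 6.0]
print(find_dead_zones(steps_ps, ranges_ps))

# Shrinking the range of the third stage below the HDC step creates a dead zone.
bad_ranges_ps = [9600.0, 320.0, 30.0, 6.0]
print(find_dead_zones(steps_ps, bad_ranges_ps))
```

An empty list means every finer stage overlaps the step of the stage before it, which is the condition the paragraph above states.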

#### ■ Low Jitter ADPLL (2009)

In ADPLLs, digital loop filters replace analog loop filters to reduce the area overhead, but an extra adder is required to sum the proportional and integral parts of the digital loop filter. With a mixed-signal PFD, as shown in Fig. 2-9, the extra adder is no longer needed and the longest transport delay is reduced. In addition, the PFD has a shorter switching time and responds in proportion to the phase error, so it gives better jitter performance than a bang-bang PFD. As for the DCO, a higher frequency resolution decreases the output jitter. Thus, a DCO resolution enhancement circuit is applied to meet the low-jitter requirement.



Fig. 2-9. Low jitter ADPLL architecture.

Fig. 2-10 shows the architecture of the DCO. To improve the frequency resolution, an 8-bit sigma-delta modulator is applied. Besides, a DCO resolution enhancement circuit is added to the DCO. It controls one PMOS from the array with the high frequency output of the oscillator to improve the resolution. Without the enhancement circuit, the on- or off-time of the control signal from sigma-delta modulator is beyond one reference clock period. On the other hand, the equivalent on- or off-time is halved with the enhancement circuit, which means the equivalent frequency resolution is doubled.
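To illustrate how a sigma-delta modulator raises the effective frequency resolution, the sketch below dithers a single control bit so that its average equals a fractional code. It assumes a first-order, accumulator-based modulator; the order and structure of the modulator in the cited ADPLL are not stated in this report, so this is an illustration of the principle only, with a hypothetical function name.

```python
def sigma_delta_first_order(frac_code, n_cycles, bits=8):
    """Generate a 1-bit dither stream whose mean approximates frac_code / 2**bits.

    A first-order modulator is modeled as an accumulator; the carry-out
    of each addition is the output bit.
    """
    modulus = 1 << bits
    acc = 0
    stream = []
    for _ in range(n_cycles):
        acc += frac_code
        carry = acc >= modulus
        if carry:
            acc -= modulus
        stream.append(int(carry))
    return stream


# An 8-bit fractional code of 64 turns the dithered element on for 64 of 256 cycles.
stream = sigma_delta_first_order(64, 256)
duty = sum(stream) / len(stream)
print(duty)
```

Over a full 256-cycle window the number of ones equals the code, so the time-averaged DCO setting lands between two adjacent integer codes.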



Fig. 2-10. DCO architecture.

#### 3. Proposed On-chip Bus

#### Proposed on-chip bus

Fig. 3-1 shows the proposed 4-bit on-chip bus for data communication under the subthreshold power supply. A bus is divided into several segments, each of which is driven by a bootstrapped repeater. Ground shielding is used to eliminate the effective-loading uncertainty and decouple noise from adjacent channels. The staggered repeaters on adjacent channels are misaligned to reduce the coupling noise and simultaneous switching noise (SSN).



Fig. 3-1. Proposed on-chip bus architecture with new bootstrapped repeater insertion.

#### Proposed bootstrapped drivers

This project proposes two bootstrapped circuits, an active leakage-current-reduced bootstrapped inverter (ALBI) and an ISI-suppressed bootstrapped driver (ISBD). Both are introduced in the following subsections.

#### ■ <u>ALBI</u>

Figure 3-2 schematically depicts the proposed bootstrapped CMOS inverter, where $C_{BP}$ and $C_{BN}$ are the bootstrap capacitors; $M_{P1}$ and $M_{N1}$ are the transistors for $C_{BP}$ precharge and $C_{BN}$ pre-discharge; INV is the inverter that controls $M_{P2}$ and $M_{N2}$; $M_{PD}$ and $M_{ND}$ are the output drivers for $C_L$; and $N_P$ and $N_N$ are the boosted nodes. The node $N_B$ is boosted above $V_{DD}$ and below ground to enhance the driving capability.

Figure 3-3 shows the simulated transient waveforms with an output load of 0.5 pF under a power supply of 200 mV. According to this figure, before $V_{in}$ transitions from H to L, node $N_N$ has an initial voltage of 0 V. After the H-to-L transition, $N_N$ is boosted below ground to -188 mV. Meanwhile, $M_{P2}$ is turned off and $M_{N2}$ is turned on. The boosted signal at $N_N$ therefore passes through $M_{N1}$ to $N_B$ and drives $M_{PD}$ to pull up the capacitive load $C_L$. At this moment, $M_{P1}$ is turned on to precharge $N_P$ to $V_{DD}$ (0.2 V). However, $M_{N1}$ is turned on in the reverse direction, so a reverse current flows and charges $N_N$. At the end of the period in which $V_{in}$ is L, $N_N$ still holds -90 mV. When $V_{in}$ goes from L to H, the operation is similar: $N_P$ is boosted above $V_{DD}$ to 389 mV and discharges to 303 mV by the end of the period in which $V_{in}$ is H.



Fig. 3-2. Proposed bootstrapped inverter.



Fig. 3-3. Simulated timing waveforms at 5MHz at 200 mV  $V_{DD}$ .

Delay time is another important feature of bootstrapped circuits. Although the driving transistors operate in the triode region under a subthreshold supply, the other devices remain in the subthreshold region. The total delay time is thus the sum of the propagation delays of the INV and the driver:

$$t_{P,BI} = t_{P,INV} + t_{P,Driver}$$
(3-1)

where $t_{P,BI}$, $t_{P,INV}$, and $t_{P,Driver}$ are the delays of the bootstrapped inverter, the INV, and the driver, respectively.

Assume that the boosting efficiency is the same for all bootstrapped drivers, so that the delay time of the INV becomes the dominant factor. The subthreshold logic delay is derived as

$$t_{p} = \frac{k_{f} \cdot C_{L} \cdot V_{DD}}{\mu C_{dep} \frac{W}{L} V_{T}^{2} \exp(\frac{V_{DD} - V_{th}}{n V_{T}})}$$
(3-2)

where $k_f$ is a fitting parameter. The circuit delay also depends on the RC loading. The proposed bootstrapped inverter has the shortest delay among the compared bootstrapped circuits because the INV is loaded only by the gate capacitances of $M_{N2}$ and $M_{P2}$.
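The exponential sensitivity of the subthreshold delay to the supply voltage can be seen by evaluating (3-2) numerically. The sketch below is illustrative only: the prefactor `i0` stands for $\mu C_{dep} (W/L) V_T^2$, and the values of `i0`, `vth`, `n`, and `c_load` are hypothetical, chosen to show the trend rather than to model the 55 nm devices.

```python
import math


def subthreshold_delay(c_load, vdd, vth, i0, n=1.5, k_f=1.0, v_t=0.02585):
    """Evaluate the subthreshold logic delay of Eq. (3-2).

    i0 lumps mu * C_dep * (W/L) * V_T**2 into one current prefactor (amperes).
    """
    i_on = i0 * math.exp((vdd - vth) / (n * v_t))
    return k_f * c_load * vdd / i_on


# Hypothetical device and load values.
params = dict(c_load=10e-15, vth=0.35, i0=1e-6)
t_200 = subthreshold_delay(vdd=0.20, **params)
t_300 = subthreshold_delay(vdd=0.30, **params)
print(f"t(0.2 V) / t(0.3 V) = {t_200 / t_300:.2f}")
```

Under these assumptions, lowering the supply by 100 mV slows the gate by the factor $(0.2/0.3)\exp(0.1/(nV_T))$, which is why a lightly loaded INV matters so much for the total delay.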

Fig. 3-4 summarizes the comparison results for the delay time (from H to L) and the power consumption as a function of  $C_L$  at 10MHz with a supply of 200 mV. The proposed design is the lowest in power consumption and delay time.



Fig. 3-4. Delay time and power consumption versus capacitive loads at 10MHz.

#### ■ <u>ISBD</u>

The proposed bootstrapped repeater is composed of an inverter as the driver and a bootstrap control circuit. The bootstrap control circuit has many important features. First, a precharge enhancement scheme improves the precharge capability to achieve high-speed operation. Second, a leakage current elimination technique suppresses the ISI noise. Third, the bootstrap control circuit produces a boosted output swing from  $-V_{DD}$  to  $2V_{DD}$  to increase the driving current  $(2V_{DD})$  and turn off the transistor aggressively  $(-V_{DD})$ . As a result, the  $I_{on}/I_{off}$  ratio is improved substantially.

Fig. 3-5 depicts the proposed bootstrapped CMOS repeater.  $C_{BP}$  and  $C_{BN}$  are the bootstrap capacitors;  $M_{P1}$  and  $M_{N1}$  are the precharge transistors for  $C_{BP}$  and  $C_{BN}$ ;  $INV_P$  and  $INV_N$  are the pre-drivers to boost  $C_{BP}$  and  $C_{BN}$ ; and  $M_{PD}$  and  $M_{ND}$  are the output drivers.  $N_{BT}$  is boosted to  $2V_{DD}$  and  $-V_{DD}$  to enhance the driving capability of  $M_{PD}$  and  $M_{ND}$ .  $N_{BT}$  is also fed back to control  $M_{P1}$  and  $M_{N1}$  to enhance the precharge capacity and eliminate the reverse leakage current simultaneously.



Fig. 3-5. Circuit of proposed bootstrapped repeater.

Assume that the bootstrap capacitors $C_{BP}$ and $C_{BN}$ have stored a voltage of $V_{DD}$ before $V_{in}$ transitions from H to L; ideally, node $N_{BP}$ then has an initial voltage of $V_{DD}$ and node $N_{BT}$ has an initial voltage of $-V_{DD}$. After $V_{in}$ transitions from H to L, $N_{OP}$ transitions from L to H and $N_{BP}$ is boosted to $2V_{DD}$. At the same time, $M_{P2}$ is turned on and $M_{N2}$ is turned off. The $2V_{DD}$ level at $N_{BP}$ starts to charge $N_{BT}$ through $M_{P2}$ and pushes $N_{BT}$ to $2V_{DD}$. Once $N_{BT}$ is charged above the threshold voltage $V_{th}$, $M_{N1}$ is turned on to precharge $N_{BN}$ to GND. Now $C_{BN}$ has a potential of $-V_{DD}$. When $V_{in}$ transitions from L to H, a similar mechanism pushes $N_{BT}$ to $-V_{DD}$. Fig. 3-6 shows the simulated transient waveforms with a 1 mm wire load and a $V_{DD}$ of 200 mV. Here, $N_{BT}$ swings from 384 mV to -186 mV instead of the ideal 400 mV to -200 mV, owing to charge sharing between $N_{BP}$ and $N_{BT}$.



Fig. 3-6. Simulated timing waveforms under 200mV supply.

In a low-voltage design, the leakage current  $I_{off}$  accounts for a large portion of the total power consumption. The  $I_{off}$  current is mostly the sub-threshold leakage current, which is expressed as follows.

$$I_{off} = \mu C_{dep} \frac{W}{L} V_T^2 \exp(\frac{V_{GS} - V_{th}}{nV_T}) \left( 1 - \exp(\frac{-V_{DS}}{V_T}) \right).$$
(3-3)

where $\mu$ is the effective mobility; $C_{dep}$ is the depletion capacitance; W and L are the width and length of the device; $V_T$ is the thermal voltage; $V_{GS}$ is the gate-to-source voltage; $V_{th}$ is the threshold voltage; n is the subthreshold slope factor; and $V_{DS}$ is the drain-to-source voltage. Scaling down to a subthreshold supply voltage substantially reduces $I_{on}$, which in this region depends exponentially on $(V_{GS} - V_{th})$ and on $V_{DS}$ through the factor $1 - \exp(-V_{DS}/V_T)$. $I_{off}$ follows the same expression with the gate turned off and does not shrink at the same rate, so it becomes responsible for a significant fraction of the total power consumption. As a result, scaling the supply voltage into the subthreshold region directly lowers the $I_{on}/I_{off}$ ratio.
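The benefit of swinging the driver gate from $-V_{DD}$ to $2V_{DD}$ can be estimated from (3-3). The sketch below compares the on/off current ratio for a plain 0 to $V_{DD}$ gate swing with the boosted swing. It assumes the device stays in the subthreshold region even when boosted, which is why the hypothetical `vth` is set above $2V_{DD}$; in the actual drivers the boosted transistors enter the triode region, so the numbers indicate a trend only.

```python
import math

V_T = 0.02585  # thermal voltage near room temperature, in volts


def subthreshold_current(v_gs, v_ds, v_th, i0=1.0, n=1.5):
    """Eq. (3-3), with i0 standing for mu * C_dep * (W/L) * V_T**2."""
    gate_term = math.exp((v_gs - v_th) / (n * V_T))
    drain_term = 1.0 - math.exp(-v_ds / V_T)
    return i0 * gate_term * drain_term


def on_off_ratio(v_gs_on, v_gs_off, v_ds, v_th, n=1.5):
    """Ratio of the current at the on-state gate voltage to that at the off-state one."""
    i_on = subthreshold_current(v_gs_on, v_ds, v_th, n=n)
    i_off = subthreshold_current(v_gs_off, v_ds, v_th, n=n)
    return i_on / i_off


vdd = 0.2
vth = 0.45  # assumed value, kept above 2 * vdd so that Eq. (3-3) still applies
plain = on_off_ratio(vdd, 0.0, vdd, vth)         # gate swings 0 .. VDD
boosted = on_off_ratio(2 * vdd, -vdd, vdd, vth)  # gate swings -VDD .. 2*VDD
print(f"improvement factor = {boosted / plain:.3g}")
```

In this model the drain term cancels, so the improvement reduces to $\exp(2V_{DD}/(nV_T))$: each extra $V_{DD}$ of gate overdrive or underdrive multiplies the ratio by $\exp(V_{DD}/(nV_T))$.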

Although HSPICE can simulate steady-state leakage power, characterizing the leakage power under dynamic operations is difficult. The following approach is taken. The total energy per bit is represented as

$$E_T = E_{SW} + E_{SC} + E_{Leakage}.$$
(3-4)

where $E_T$ represents the total energy per bit; $E_{SW}$ is the switching energy; $E_{SC}$ is the short-circuit energy; and $E_{Leakage}$ is the leakage energy. Here, $E_{SW}$ and $E_{SC}$ are independent of the period T because they are consumed only when an H-to-L or an L-to-H transition occurs. However, $E_{Leakage}$ is proportional to T and can be represented as $E_{Leakage} = P_{Leakage} \cdot T$. Thus, (3-4) can be rewritten as

$$E_T = E_{SW} + E_{SC} + P_{Leakage} \cdot T$$
(3-5)

For two identical signals with different periods  $T_1$  and  $T_2$ ,

$$E_{T_1} = E_{SW} + E_{SC} + P_{Leakage} \cdot T_1 = P_{T_1} \cdot T_1.$$
(3-6)

$$E_{T_{2}} = E_{SW} + E_{SC} + P_{Leakage} \cdot T_{2} = P_{T_{2}} \cdot T_{2} .$$
(3-7)

From equations (3-6) and (3-7), $P_{Leakage}$ is derived as

$$P_{Leakage} = \frac{P_{T_1} \cdot T_1 - P_{T_2} \cdot T_2}{(T_1 - T_2)} .$$
(3-8)
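The two-period extraction above is easy to apply to simulation results. The following sketch implements the $P_{Leakage}$ expression and checks it on synthetic data generated from the energy model; the function names and the numeric values are hypothetical and serve only as a self-consistency check.

```python
def leakage_power(p_t1, t1, p_t2, t2):
    """Extract P_Leakage from the average powers measured at two bit periods."""
    if t1 == t2:
        raise ValueError("the two periods must differ")
    return (p_t1 * t1 - p_t2 * t2) / (t1 - t2)


def average_power(e_dynamic, p_leak, period):
    """Average power over one period: (E_SW + E_SC) / T + P_Leakage."""
    return e_dynamic / period + p_leak


# Synthetic example: 2 fJ of dynamic energy per bit and 5 nW of leakage.
e_dyn, p_leak_true = 2e-15, 5e-9
t1, t2 = 100e-9, 200e-9
p1 = average_power(e_dyn, p_leak_true, t1)
p2 = average_power(e_dyn, p_leak_true, t2)
p_leak_est = leakage_power(p1, t1, p2, t2)
print(p_leak_est)
```

Because the dynamic energy cancels in the difference, only two average-power simulations at different periods are needed, and no steady-state leakage run is required.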

To demonstrate the reduction of leakage current, the proposed design is compared with the conventional inverter and two reported works [16-17]. They are all designed to drive a 200fF C<sub>L</sub>. A 55nm SPRVT process is used. For all bootstrap drivers,  $C_B = 50$ fF and the widths of M<sub>PD</sub> and M<sub>ND</sub> are 288nm and 108nm, respectively, for a fair comparison. The conventional inverter was designed to be 50 times the size of the bootstrapped driver to obtain the same output t<sub>rise</sub> and t<sub>fall</sub> as the bootstrapped one.

Fig. 3-7 shows the $P_{Leakage}/P_T$ ratio as a function of the supply voltage. The proposed design has a $P_{Leakage}/P_T$ ratio of less than 1% even at $V_{DD} = 0.1$ V, roughly one order of magnitude lower than those of the other designs.



Fig. 3-7. Comparisons of  $P_{Leakage}/P_T$  ratio.

#### 4. **DVFS scheme with a low-voltage multi-phase ADPLL**

#### Proposed DVFS scheme

To achieve the highest power efficiency for the on-chip bus, a new DVFS scheme is proposed. In this scheme, the supply voltage or the data rate is changed dynamically according to the timing margin measured under the actual PVT corner and noise conditions; the power consumption can therefore be minimized while the specification requirements of the on-chip bus are still met. Timing margin is closely related to timing jitter, the phase variation of a periodic signal in data communications and telecommunications. For instance, the maximum jitter at the receiver end is specified as 0.7 UI in the PCI-Express standard; the receiver cannot recover the data if the timing jitter exceeds 0.7 UI.

Timing jitter may differ considerably across PVT conditions. The goal of the proposed DVFS scheme is to change the supply voltage or the data rate so that the timing margin just meets the timing specification, whatever corner the actual IC falls in. If there is too much redundant margin, we lower the supply voltage to save power, according to the DVFS measurement results. If there is not enough timing margin, we increase the supply voltage to speed up the circuits, or we lower the operating frequency. Dynamic tuning thus prevents malfunctions.
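The adjustment policy described above can be sketched as a simple control step. This is a minimal illustration, not the controller implemented on the test chip: the class and function names, the guard band, the voltage step, the voltage limits, and the choice to halve the frequency once the maximum voltage is reached are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class OperatingPoint:
    vdd_mv: int
    freq_mhz: float


def dvfs_step(op, margin_ui, target_ui, guard_ui=0.05,
              vdd_step_mv=25, vdd_min_mv=200, vdd_max_mv=500):
    """Return the next operating point given the measured timing margin (in UI)."""
    # Redundant margin: lower the supply voltage to save power.
    if margin_ui > target_ui + guard_ui and op.vdd_mv - vdd_step_mv >= vdd_min_mv:
        return OperatingPoint(op.vdd_mv - vdd_step_mv, op.freq_mhz)
    # Insufficient margin: raise the supply, or slow down if the supply is at its limit.
    if margin_ui < target_ui:
        if op.vdd_mv + vdd_step_mv <= vdd_max_mv:
            return OperatingPoint(op.vdd_mv + vdd_step_mv, op.freq_mhz)
        return OperatingPoint(op.vdd_mv, op.freq_mhz / 2)
    # Margin is within the guard band: keep the current setting.
    return op


op = OperatingPoint(vdd_mv=300, freq_mhz=10.0)
print(dvfs_step(op, margin_ui=0.50, target_ui=0.30))
print(dvfs_step(op, margin_ui=0.20, target_ui=0.30))
```

The guard band keeps the controller from toggling between two supply settings when the measured margin sits right at the target.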

To measure the timing jitter on chip, a transition monitoring scheme that detects data transition edges is used, as shown in Figure 4-1. $D_{IN}$ represents the input data and CK is the multi-phase clock. A data transition is detected when two adjacent sampling phases capture different values of the data. After many rounds of multi-phase sampling, the detected transitions span the peak-to-peak timing jitter, expressed in units of sampling phases. From these timing measurements, we can construct the jitter region of the whole eye diagram. The results can equally be expressed as the eye-diagram opening relative to the unit interval of the data rate, as shown in Figure 4-1(b).
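The edge-detection idea can be modeled in a few lines of software. The sketch below is a behavioral illustration only, not the TMM hardware: it assumes that each row holds one unit interval of $D_{IN}$ sampled by N clock phases plus the first phase of the next interval, and that the sampling window is placed so that the transition cluster does not wrap around the window edge. The function names and sample data are hypothetical.

```python
from collections import Counter


def transition_phases(samples):
    """Indices i where adjacent phases i and i+1 captured different values."""
    return [i for i in range(len(samples) - 1) if samples[i] != samples[i + 1]]


def eye_opening(sample_rows):
    """Return (eye opening as a fraction of the UI, histogram of transition phases)."""
    hist = Counter()
    for row in sample_rows:
        hist.update(transition_phases(row))
    n_intervals = len(sample_rows[0]) - 1
    if not hist:
        return 1.0, hist
    # Peak-to-peak jitter, counted in phase intervals.
    jitter_span = max(hist) - min(hist) + 1
    return 1.0 - jitter_span / n_intervals, hist


# Ten phases per UI, so each row has eleven samples.
rows = [
    [0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
]
opening, hist = eye_opening(rows)
print(opening, sorted(hist))
```

In hardware, the comparison of adjacent samples corresponds to an XOR of neighboring phase samplers, and the histogram corresponds to per-phase flags or counters accumulated over many data bits.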



Fig. 4-1. Eye-diagram measurement.

#### Bootstrapped ring oscillator (BTRO)

In recent years, low-voltage supply is essential in minimizing the energy consumption for battery and solar-powered electronics systems. However, the design of CMOS circuits under low-voltage supply in subthreshold region is challenging. Weak driving capability and large process variation are two major concerns.

Phase-locked loops (PLLs) are key building blocks in integrated circuits. Within them, voltage-controlled oscillators (VCOs) and digitally-controlled oscillators (DCOs) are the most power-hungry components. Since $P = fCV^2$, lowering the supply voltage is the most effective means of reducing the power consumption. Raha (VLSI Symposium 2004) proposed a current-controlled oscillator with increased voltage headroom that operates at 300 MHz under a 0.6 V $V_{DD}$. With analog cells, however, the supply cannot be reduced further. Sheng (T. CASII 2007) uses digital delay cells to construct a 15-bit DCO, but the large number of delay cells and delay paths makes it difficult to reduce the power further. A delay cell using the bulk-driven technique is reported in T. CASII 2009. It increases the driving current by lowering $V_{th}$, but the $V_{th}$-lowering effect is itself strongly affected by process variation.



Fig. 4-2. Architecture of the proposed bootstrapped ring oscillator.

In this project, we introduce a ring oscillator design for a 0.2-0.6 V supply voltage using a bootstrapped technique, as shown in Fig. 4-2. The proposed bootstrapped delay cell generates a large gate voltage swing from $-\beta \cdot V_{DD}$ to $\beta \cdot 2V_{DD}$, where $\beta$ denotes the boost efficiency. Unlike a conventional inverter, the cell uses a bootstrap control circuit to improve the driving capability significantly from a single low-voltage supply. The boosted output swing keeps the transistors operating in the linear region, which improves the linearity of the output frequency as a function of $V_{DD}$. Furthermore, the linear operation reduces the effect of process variation in the low-voltage region. Compared with reported bootstrap techniques, the proposed bootstrapped delay cell provides precharge enhancement and reverse-current elimination schemes. The delay cell can thus increase the boost efficiency and achieve a high operating frequency. Additionally, the proposed technique has a low hardware overhead.



Fig. 4-3. Schematic diagram of the proposed bootstrapped CMOS delay cell.

Fig. 4-3 depicts the proposed bootstrapped CMOS delay cell schematically. It consists of a bootstrap circuit and a driver. The bootstrap circuit includes  $INV_P$  and  $INV_N$  for the bootstrap control,  $C_{BP}$  and  $C_{BN}$  as the bootstrap capacitors, and  $M_{P1}$  and  $M_{N1}$  as the precharging transistors.  $V_{in}$  and  $V_{out}$  are the boosted input and output nodes with  $M_{P2}$  and  $M_{N2}$  as the driver. Besides,  $V_{out}$  is fed back to the gate of the  $M_{P1}$  and  $M_{N1}$  to enhance the precharge ability and also eliminate the reverse current to keep the charges on the bootstrap capacitors. The charges on the bootstrap capacitors dominate the boost efficiency which affects the driving enhancement directly.

The simulated transient waveforms of a five-stage bootstrapped ring oscillator under a power supply of 0.5 V are depicted in Fig. 4-4. When V<sub>in</sub> makes an H-to-L transition, N<sub>OP</sub> makes an L-to-H transition, which bootstraps N<sub>BP</sub> to $2V_{DD}$ through the boost capacitor C<sub>BP</sub>. At the same time, M<sub>P2</sub> is turned on to deliver the bootstrapped N<sub>BP</sub> level ($2V_{DD}$) to V<sub>out</sub>. The bootstrapped V<sub>out</sub> ($2V_{DD}$) not only drives M<sub>N2</sub> of the next stage harder but also turns off M<sub>P1</sub> more firmly to reduce the leakage current and turns on M<sub>N1</sub> harder to enhance the pre-discharge of N<sub>BN</sub> (C<sub>BN</sub>). The L-to-H transition operates in the same way.



Fig. 4-4. Simulated transient waveforms of a five-stage bootstrapped ring oscillator.

Although ring oscillators with digital delay cells reduce power consumption, conventional inverters used as delay cells have many drawbacks in the low-voltage region. The bootstrapping technique enhances the driving capability, reduces the effects of process variation, and improves the linearity of the transfer curve. These points are detailed in this section, together with a comparison to conventional inverter-based ring oscillators.

In low-voltage design, poor driving capability is the most critical issue. Improving the drive by increasing the transistor size, however, incurs a loading penalty. The proposed bootstrapped ring oscillator instead uses a bootstrap circuit to generate a boosted voltage that improves the driving capability. Ideally, the boosted voltage $V_{out}$ swings from $-V_{DD}$ to $2V_{DD}$. In practice, the parasitic capacitance at node $V_{out}$ shares charge with the boost capacitance. Let $C_{PT}$ be the total parasitic capacitance at the $N_{BT}$ node. We can then represent $V_{out}$ as

$$V_{out} \approx \frac{C_{BP}}{C_{BP} + C_{PT}} \cdot 2V_{DD} \triangleq \beta \cdot 2V_{DD} .$$
(4-1)

where $\beta$ is defined as the boost efficiency factor, or simply the boost efficiency. The boost efficiency clearly depends heavily on the total parasitic capacitance. To obtain a higher driving capability, the boost capacitance is designed to be significantly larger than the node parasitic capacitance. Similarly, for the output low level, we can write V<sub>out</sub> as

$$V_{out} \approx \frac{C_{BN}}{C_{BN} + C_{PT}} \cdot \left(-V_{DD}\right) \triangleq \beta \cdot \left(-V_{DD}\right).$$
(4-2)
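As a rough numerical illustration of (4-1) and (4-2), the following sketch evaluates the boost efficiency and the resulting output levels. The 50 fF boost capacitor matches the value reported for the BTRO test chip in Section 5; the parasitic capacitance values and the function names are assumptions chosen only for illustration.

```python
from typing import Tuple


def boost_efficiency(c_boost_ff: float, c_par_ff: float) -> float:
    """Boost efficiency beta = C_B / (C_B + C_PT), as in (4-1) and (4-2)."""
    return c_boost_ff / (c_boost_ff + c_par_ff)


def boosted_levels(vdd: float, beta: float) -> Tuple[float, float]:
    """Return the (low, high) output levels: -beta*VDD and beta*2*VDD."""
    return (-beta * vdd, beta * 2.0 * vdd)


if __name__ == "__main__":
    VDD = 0.5        # V
    C_BOOST = 50.0   # fF, boost capacitor
    for c_par in (5.0, 12.5, 50.0):  # fF, assumed parasitic capacitances
        beta = boost_efficiency(C_BOOST, c_par)
        low, high = boosted_levels(VDD, beta)
        print(f"C_PT={c_par:5.1f} fF  beta={beta:.3f}  "
              f"levels={low:+.3f} V / {high:+.3f} V")
```

With an assumed parasitic capacitance of 12.5 fF, the boost efficiency is 0.8, so a 0.5 V supply yields output levels of about -0.4 V and +0.8 V rather than the ideal -0.5 V and +1.0 V.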

Owing to the boosted voltage, the devices of the driver circuit are pushed into the linear region. According to (4-1), the discharge current in the proposed delay cell is

$$I_{D} = \mu C_{dep} \frac{W}{L} \bigg[ (\beta \cdot 2V_{DD} - V_{th}) V_{DD} - \frac{1}{2} (V_{DD}^{2}) \bigg].$$
(4-3)

where $\mu$ denotes the effective mobility, $C_{dep}$ is the depletion capacitance, W and L are the device width and length, and $V_{th}$ is the threshold voltage. Consequently, the boosted voltage increases the current driving capability.
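To make the effect of the boosted gate voltage in (4-3) concrete, the sketch below evaluates only the bracketed term of the equation, leaving out the device prefactor $\mu C_{dep} W/L$. The threshold voltage, the boost efficiency, and the function names are illustrative assumptions, not values taken from the test chips.

```python
def triode_term(v_gate: float, v_ds: float, v_th: float) -> float:
    """Bracketed term of (4-3): (V_G - V_th) * V_DS - V_DS**2 / 2, in V^2.

    The linear-region expression only holds for V_DS <= V_G - V_th,
    so anything else is rejected.
    """
    if v_ds > v_gate - v_th:
        raise ValueError("device is not in the linear (triode) region")
    return (v_gate - v_th) * v_ds - 0.5 * v_ds ** 2


def boosted_term(vdd: float, beta: float, v_th: float) -> float:
    """Same term with the gate driven to beta * 2 * VDD and V_DS = VDD."""
    return triode_term(beta * 2.0 * vdd, vdd, v_th)


if __name__ == "__main__":
    VDD, BETA, VTH = 0.5, 0.9, 0.3   # assumed example values
    print("boosted gate  :", boosted_term(VDD, BETA, VTH), "V^2")
    try:
        # Unboosted case: the gate only reaches VDD.
        print("unboosted gate:", triode_term(VDD, VDD, VTH), "V^2")
    except ValueError as err:
        print("unboosted gate:", err)
```

Under these assumed values, the boosted gate keeps the device in the linear region, whereas an unboosted gate at $V_{DD}$ does not satisfy the linear-region condition at all, so (4-3) no longer applies to it.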



Fig. 4-5. Supply-regulation VCO transfer curve comparison

Low-voltage operation degrades the yield because of severe process variation, especially in sub-threshold operation. Although a boosted control signal pushes the driver transistors into the triode region, the remaining devices in the circuit can still suffer from the same variation problem. The fewer devices that operate in the sub-threshold region, the less the design is affected by process variation. In our design, all devices are boosted into a higher voltage region. As a design example, the proposed bootstrapped ring oscillator is used as a VCO with the supply-regulation technique. Fig. 4-5 compares its transfer curve at different process corners with that of a conventional inverter-based VCO. The proposed design shows better linearity and is less affected by process variation.

In PLL design, the linearity of the VCO is an important requirement. As noted with (4-3), the proposed delay cell operates in the linear region. Fig. 4-5 also shows that the linearity of the conventional inverter delay cells degrades as the supply voltage goes down. At the SS corner, the conventional oscillator can hardly work below 0.3V. In addition, its gain of output frequency versus supply voltage ( $K_{VCO}$ ) is high, which may lead to large output jitter in the PLL system. In contrast, the output swing of the proposed delay cell keeps the transistors from operating in the sub-threshold region at low supply voltage. The bootstrapped ring oscillator therefore has better linearity, lower $K_{VCO}$, and better immunity against process variation.

#### Low-voltage multi-phase ADPLL

The proposed ADPLL, as shown in Fig.4-6, is composed of a *phase frequency detector* (PFD) to detect the phase error, a *phase selector* (PS) to reroute the signal path, a *time-to-digital converter* (TDC) to convert the phase error into digital code, a *digital loop filter* (DLF) to filter out the high frequency noise, a digitally controlled oscillator (DCO) to generate the required output frequency, and a divider (DIV) to divide and feed back the output frequency. To improve the resolution of the DCO, a 4-bit sigma-delta modulator (SDM) is used for the dithering.



Fig.4-6. Block diagram of the proposed ADPLL

The PFD produces UP and DN signals to indicate the phase error. The two signals are rerouted by the PS so that they reach the TDC in the correct phase order. The TDC is a Vernier delay line, which requires the proper phase order for the conversion. In the proposed ADPLL, a 4-bit TDC is designed with 20ps resolution at 0.5V $V_{DD}$. The DLF is a 2<sup>nd</sup>-order digital filter whose parameters, listed in Fig. 4-6, are obtained by a bilinear transformation from its analog counterpart. It produces 13 control bits for the DCO. Nine integer bits control the oscillation frequency, and four fractional bits go to the SDM, which performs 1-bit dithering to refine the effective resolution to 1/16 of an LSB. At 0.5V $V_{DD}$, devices suffer from poor driving capability, so the divide ratio should be chosen according to the speed of the sub-circuits. In the proposed ADPLL, the output frequency is divided by 16.
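The two quantization steps described above can be sketched behaviorally as follows. This is a minimal model under stated assumptions: the TDC is treated as an ideal saturating quantizer with the 4-bit, 20ps figures given in the text, and the SDM is modeled as a simple first-order accumulator whose carry-out is the 1-bit dither stream. The function names and the accumulator structure are illustrative and are not taken from the actual circuit.

```python
from typing import List


def tdc_code(phase_error_ps: float, resolution_ps: float = 20.0,
             bits: int = 4) -> int:
    """Quantize the magnitude of a phase error into an unsigned TDC code.

    The sign is assumed to be resolved beforehand (the role of the phase
    selector), so only the magnitude is converted. The code saturates at
    full scale.
    """
    full_scale = (1 << bits) - 1
    return min(int(abs(phase_error_ps) // resolution_ps), full_scale)


def sdm_dither(frac_code: int, cycles: int, bits: int = 4) -> List[int]:
    """First-order accumulator dithering of a fractional control code.

    Each cycle adds frac_code to a bits-wide accumulator; the carry-out
    is the 1-bit value added to the DCO LSB in that cycle.
    """
    modulus = 1 << bits
    if not 0 <= frac_code < modulus:
        raise ValueError("fractional code out of range")
    acc = 0
    stream = []
    for _ in range(cycles):
        acc += frac_code
        stream.append(acc // modulus)  # carry-out: 0 or 1
        acc %= modulus
    return stream


if __name__ == "__main__":
    print(tdc_code(130.0))            # 130 ps falls in code 6
    stream = sdm_dither(5, 16)
    print(stream, sum(stream) / 16)   # average dither = 5/16 LSB
```

Over 16 cycles, a fractional code of 5 produces exactly five ones in the dither stream, so the time-averaged DCO control value sits 5/16 of an LSB above the integer code.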

Generally, the oscillator consumes most of the power in a PLL. To operate in the low-voltage supply region and minimize the power, a bootstrapped ring oscillator (BTRO) is proposed. The bootstrapped inverter produces an output swing from $-\beta \cdot V_{DD}$ to $\beta \cdot 2V_{DD}$, where $\beta$ is the boost efficiency. Such a large output swing increases the driving capability of the driving transistors $M_{N2}$ and $M_{P2}$, so that high-frequency oscillation is possible even with a subthreshold supply. Take a $-V_{DD}$ input as an example, for which the output is $2V_{DD}$. The boosted input $V_{in}$ enhances the driving capability of INV<sub>P</sub> and INV<sub>N</sub>. $M_{P1}$ is turned off completely by $2V_{DD}$, which curtails the reverse current from $N_{BP}$ to $V_{DD}$, and the precharge of $C_{BN}$ through $M_{N1}$ is sped up. At the same time, the output pushes $M_{P2}$ of the next cell into the deep cutoff region with a negative $V_{SG}$ to limit the leakage. As a result, the proposed BTRO is able to operate at high frequency with better power efficiency. In addition, the BTRO has high linearity and high immunity against process variation because most of the devices operate in the triode region.



Fig. 4-7. Circuit schematic of the proposed DCO

According to our previous work, the proposed monotonic DCO is composed of a 5-stage BTRO whose supply voltage $V_C$ is set by a digitally controlled resistance network, as shown in Fig. 4-7. The resistance network consists of 9-bit PMOS transistor arrays, thermometer-code-to-binary-code (T2B) converters and an SDM. Full thermometer control occupies a large area and requires complicated wiring. A hybrid architecture of binary and thermometer control has been reported and costs less chip area. Because the linearity of the BTRO depends strongly on the supply voltage, the PMOS arrays are no longer binary weighted, in order to obtain better linearity. With dedicated transistor sizing, the PMOS arrays are arranged in a segmented thermometer-code manner, composed of a 2-bit T2B and a 3-bit T2B for the coarse codes and a 4-bit T2B for the fine codes. Additionally, a 4-bit 1<sup>st</sup>-order SDM is used to dither the least-significant bit (LSB) of the DCO. Furthermore, to improve the conductivity of the resistance network at sub-0.5V $V_{DD}$, only four PMOS transistors are stacked in each array. Fig. 4-8 shows the DCO output frequency versus the coarse control code for the proposed and the binary-weighted designs. The simulated gain of the proposed DCO is 563 kHz/code at the TT corner, whereas binary-weighted control results in uneven tuning steps.
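The segmented control can be illustrated with a small decoding sketch. It assumes that the 9-bit integer code is split, MSB first, into the 2-bit and 3-bit coarse fields and the 4-bit fine field, and that each field is expanded into its own thermometer code to drive one PMOS array. The field ordering, the function names, and the dictionary keys are hypothetical; the actual wiring of the converters in Fig. 4-7 may differ.

```python
from typing import Dict, List

SEGMENT_BITS = (2, 3, 4)  # coarse, coarse, fine (9 bits in total)


def to_thermometer(value: int, bits: int) -> List[int]:
    """Expand a binary field into 2**bits - 1 thermometer-coded lines."""
    lines = (1 << bits) - 1
    if not 0 <= value <= lines:
        raise ValueError("field value out of range")
    return [1 if i < value else 0 for i in range(lines)]


def decode_dco_code(code: int) -> Dict[str, List[int]]:
    """Split a 9-bit DCO code into three thermometer-coded segments."""
    if not 0 <= code < (1 << sum(SEGMENT_BITS)):
        raise ValueError("code must fit in 9 bits")
    fine = code & 0xF              # bits [3:0]
    coarse_lo = (code >> 4) & 0x7  # bits [6:4]
    coarse_hi = (code >> 7) & 0x3  # bits [8:7]
    return {
        "coarse_2b": to_thermometer(coarse_hi, 2),
        "coarse_3b": to_thermometer(coarse_lo, 3),
        "fine_4b": to_thermometer(fine, 4),
    }


if __name__ == "__main__":
    segments = decode_dco_code(0b10_101_0011)
    for name, lines in segments.items():
        print(name, lines)
    print("control lines:", sum(len(v) for v in segments.values()))
```

In this arrangement the three segments need 3 + 7 + 15 = 25 control lines, compared with 511 lines for a fully thermometer-coded 9-bit array, which is the area and wiring saving the segmentation is after.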



Fig.4-8. DCO output frequency versus coarse codes

#### 5. Test chips and the measured results

In this section, we present the chips implemented in this project. The project has taped out three chips in nanometer processes. The first chip is the bootstrapped driver ALBI, fabricated in a 90nm process; the second is the bootstrapped driver ISBD, fabricated in a 55nm process; the third contains the BTRO and the ADPLL, fabricated in a 90nm process. The test chips and their measured results are described below.

#### ALBI

A test chip of ALBI is implemented in 90nm 1P9M SPRVT process to demonstrate the effectiveness of the proposed design scheme. The test circuits include the reported bootstrapped circuits in JSSC (1997), in T. VLSI (2008), and the proposed design. The circuits also contain test keys to verify the interconnection model. Each bootstrapped circuit is implemented as a 10-stage cascade driver chain. In each stage, two 30fF MOM capacitors serve as bootstrap capacitors and a 200fF MOM capacitor as CL. Level shifters are used to boost the 200mV internal signal to 500mV chip I/O signal for the measurement. The total area is  $958\mu m \times 776\mu m$ , and the core area is  $566\mu m \times 102\mu m$ . Figure 5-1 shows the die photograph. The cell layout of the proposed bootstrapped inverter is  $25.8\mu m \times 4.1\mu m$ .



Fig. 5-1. Die photograph and cell layout.



Fig. 5-2. Measured waveform at 200mV core VDD (500mV I/O VDD).

Figure 5-2 shows the measured waveform. The cumulative clock peak-to-peak jitter and rms jitter are 3.6ns and 504ps, respectively. The measured average total power is 1.01uW. Since the average leakage power can be evaluated by (3), the derived leakage power is 107nW under periods of 100ns and 105ns. TABLE 5-1 summarizes the chip.

| Item                                    | Sub-item                            | Specification                 |
|-----------------------------------------|-------------------------------------|-------------------------------|
| Process                                 |                                     | 90nm SPRVT Low-K CMOS Process |
| Supply Voltage                          | Bootstrapped Circuits               | 0.2V                          |
|                                         | Level Shift Buffer                  | 0.2V, 0.5V                    |
|                                         | Digital Circuits                    | 0.5V                          |
| Power Dissipation @ 10 MHz (10 stages)  | Leakage Power, Post-sim (FF Corner) | 133nW                         |
|                                         | Leakage Power, Measured             | 107nW                         |
|                                         | Total Power, Post-sim (FF Corner)   | 1.13uW                        |
|                                         | Total Power, Measured               | 1.01uW                        |
| Layout Area                             | Interconnect Test Circuits          | 575µm×307µm                   |
|                                         | Bootstrapped Circuits               | 566µm×102µm                   |
|                                         | Whole Chip                          | 958µm×776µm                   |

TABLE 5-1. Chip Summary

#### ISBD

A test chip of the ISBD has been designed and fabricated in 55nm 1P10M SPRVT. The test chip includes two on-chip buses, one built with the proposed bootstrapped repeater and one with the conventional repeater. The block diagrams of both on-chip buses are shown in Fig. 5-3. Four-bit pseudo-random bit sequences (PRBS) are generated and passed through an H-to-L level shifter that adjusts the voltage swing to 0.1-0.3V. An extra input I/P allows the test equipment to provide a tunable clock signal or random data. Each on-chip bus has four channels. Each channel is 10mm long and divided into 10 segments, routed in Metal5 with a spacing of 90nm for ground shielding. In each bootstrapped repeater, two 50fF MOM capacitors serve as the bootstrap capacitors. Level shifters are used for the I/O. The total area is $821\mu$m×820$\mu$m and the core area is $637\mu$m×206$\mu$m. Fig. 5-4 shows the die photograph. The cell layout of the proposed bootstrapped repeater is $16.7\mu$m×11.8$\mu$m.



Fig. 5-3. Block diagram of test circuits.



Fig. 5-4. Die photo and cell layout.

Fig. 5-5 shows the measured clock waveforms (a), data eye diagrams (b), and I/O transient waveforms (c) under supply voltages of 0.11V, 0.2V, and 0.3V. The timing performance is listed in TABLE 5-2. Note that the random data is a $2^{10}-1$ bit PRBS sequence and that the level shifters contribute 174ps RMS and 982ps peak-to-peak jitter. The performance of the on-chip bus test chip is summarized in TABLE 5-3.



Fig. 5-5. Measured waveforms under 0.11V, 0.2V and 0.3V core  $V_{DD}$  (0.11–1.2V I/O  $V_{DD}$ ).

| Supply voltage     | 0.1V    | 0.11V    | 0.2V    | 0.3V    |
|--------------------|---------|----------|---------|---------|
| Clock rate         | 0.6MHz  | 1MHz     | 22.5MHz | 100MHz  |
| Clock jitter (RMS) | 22.4ns  | 12.0ns   | 0.58ns  | 132ps   |
| Clock jitter (p-p) | 206ns   | 87.3ns   | 5.15ns  | 954ps   |
| Data rate          | 0.8Mbps | 1.25Mbps | 40Mbps  | 100Mbps |
| Data jitter (RMS)  | 81.0ns  | 48.5ns   | 0.95ns  | 0.43ns  |
| Data jitter (p-p)  | 395ns   | 271ns    | 5.72ns  | 2.65ns  |
| Data latency       | 2.93us  | 1.99us   | 166ns   | 36.0ns  |

TABLE 5-2. Measured Timing Performance
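To put the numbers in TABLE 5-2 on a common scale, the following sketch expresses the RMS clock jitter as a percentage of the clock period and the data latency as a number of bit periods. The input values are copied from the table; the function names and the choice of these two derived metrics are ours, added only as an illustration.

```python
# Supply voltage (V) -> (clock rate in MHz, clock RMS jitter in ns,
#                        data rate in Mbps, data latency in ns)
TABLE_5_2 = {
    0.10: (0.6, 22.4, 0.8, 2930.0),
    0.11: (1.0, 12.0, 1.25, 1990.0),
    0.20: (22.5, 0.58, 40.0, 166.0),
    0.30: (100.0, 0.132, 100.0, 36.0),
}


def jitter_percent_of_period(clock_mhz: float, jitter_rms_ns: float) -> float:
    """RMS jitter relative to one clock period, in percent."""
    period_ns = 1e3 / clock_mhz
    return 100.0 * jitter_rms_ns / period_ns


def latency_in_bits(data_mbps: float, latency_ns: float) -> float:
    """Data latency expressed in bit periods (bits in flight on the link)."""
    return data_mbps * 1e6 * latency_ns * 1e-9


if __name__ == "__main__":
    for vdd, (clk, jit, rate, lat) in sorted(TABLE_5_2.items()):
        print(f"{vdd:.2f} V: jitter = {jitter_percent_of_period(clk, jit):.2f}% "
              f"of period, latency = {latency_in_bits(rate, lat):.2f} bits")
```

Evaluated this way, the RMS clock jitter stays between roughly 1.2% and 1.35% of the clock period at all four supply voltages, even though its absolute value changes by more than two orders of magnitude.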

TABLE 5-3. Chip Summary of ISBD

| Item                                  | Value                                | Value                                | Value                        |
|---------------------------------------|--------------------------------------|--------------------------------------|------------------------------|
| Process                               | 55nm 1P10M SPRVT Low-K CMOS          |                                      |                              |
| V<sub>th</sub>                        | NMOS: 300mV; PMOS: -310mV            |                                      |                              |
| Core Supply Voltage                   | 0.1–0.3V                             |                                      |                              |
| Supply Voltage of Level Shift Buffers | V<sub>IOL</sub>: 0.1–0.3V            | V<sub>IOM</sub>: 0.2–0.8V            | V<sub>IOH</sub>: 0.4–1.0V    |
| Supply Voltage of Digital Circuit     | 0.4–1.0V                             |                                      |                              |
| Max. Clock Link                       | 0.6MHz @ 0.1V                        | 22.5MHz @ 0.2V                       | 100MHz @ 0.3V                |
| Max. Data Link                        | 0.8Mbps @ 0.1V                       | 40Mbps @ 0.2V                        | 100Mbps @ 0.3V               |
| Energy per bit                        | 40fJ (0.1V @ 0.6MHz)                 | 59fJ (0.2V @ 22.5MHz)                | 123fJ (0.3V @ 100MHz)        |
| Leakage Power                         | 0.03uW @ 0.1V                        | 0.14uW @ 0.2V                        | 0.57uW @ 0.3V                |
| Layout Area                           | Conventional Repeater: 637um x 183um | Bootstrapped Repeater: 637um x 206um | Whole Chip: 821um x 820um    |

#### BTRO

A test chip of the BTRO is implemented in 90nm 1P9M SPRVT. It includes a supply-regulated ring oscillator circuit operating from a 0.2-0.6V supply. In each bootstrapped delay cell, two 50fF MOM capacitors serve as the bootstrap capacitors. Fig. 5-6 shows the die photograph overlaid with the proposed VCO layout. The VCO core area is $31.5 \mu m \times 61.5 \mu m$.



Fig. 5-6. Photograph of the test chip.

Fig. 5-7 shows the measured transfer curve under a 0.2-0.6 V supply. The operating frequency is a function of $V_{DD}$. Compared with the post-layout simulation, the measured result is close to the TT corner. The oscillation frequency is 48 MHz at 0.2 V and 771 MHz at 0.6 V. The measured power consumptions are 0.6 uW and 87.6 uW, respectively. At a 1-MHz offset, the resulting phase noise is -93 dBc/Hz at 0.2 V, -90 dBc/Hz at 0.4 V and -88.5 dBc/Hz at 0.6 V, as shown in Fig. 5-8.



Fig. 5-7. Measured transfer curve of VCO circuit.



Fig. 5-8. Measured phase noise (a) 48 MHz at 0.2 V (b) 399 MHz at 0.4 V (c) 771 MHz at 0.6 V.



Fig. 5-9. Measured 48 MHz waveform at 0.2V and the jitter histogram.

#### ADPLL

The ADPLL is implemented in a 90nm 1P9M SPRVT CMOS process. The output frequency ranges from 36.8MHz to 480MHz under a supply voltage of 0.25 to 0.5V. Fig. 5-10 shows the measured output spectrum and phase noise. With a reference of 30MHz (2.3MHz), the measured spur at 480MHz (36.8MHz) under a 0.5V (0.25V) $V_{DD}$ is 42.5dB (39.9dB) below the carrier. The phase noise is -96.2dBc/Hz (-91.6dBc/Hz) at 1MHz offset and -79.9dBc/Hz (-78.1dBc/Hz) at 10kHz offset when the output frequency is 480MHz (36.8MHz). Fig. 5-11 shows the testing environment and the testing PCB. Fig. 5-12 shows the chip micrograph and the chip summary. The overall active area is $326\mu$m×175$\mu$m.
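The reference and output frequencies quoted above are consistent with the divide-by-16 feedback described in Section 4. The short sketch below makes that relation explicit; the function names are illustrative, and the computed reference frequency for the 44.8MHz end of the lock range is derived from the divide ratio rather than reported in the measurements.

```python
DIVIDE_RATIO = 16  # feedback divider of the ADPLL


def locked_output_mhz(f_ref_mhz: float, n: int = DIVIDE_RATIO) -> float:
    """Output frequency of an integer-N loop in lock: f_out = N * f_ref."""
    return f_ref_mhz * n


def required_reference_mhz(f_out_mhz: float, n: int = DIVIDE_RATIO) -> float:
    """Reference frequency needed for a given locked output frequency."""
    return f_out_mhz / n


if __name__ == "__main__":
    for f_ref in (30.0, 2.3):
        print(f"f_ref = {f_ref} MHz -> f_out = {locked_output_mhz(f_ref):.1f} MHz")
    print(f"44.8 MHz output needs f_ref = {required_reference_mhz(44.8):.2f} MHz")
```

A 30MHz reference gives 480MHz and a 2.3MHz reference gives 36.8MHz, matching the two measured operating points.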



Fig. 5-10. Measured results of the proposed ADPLL: measured reference spur and phase noise versus frequency offset (Hz), at 480MHz @0.5V and at 36.8MHz @0.25V.





Fig. 5-11. Testing environment and testing PCB.



Fig. 5-12. Micrograph of the test chip and the chip summary.

#### 6. Comparisons and Conclusion

#### Comparisons

#### ■ ALBI

TABLE 6-1 compares the measured results of the ALBI with other works at a $V_{DD}$ of 0.2V. The proposed design is able to operate at 10MHz, and the delay time of the ten-stage driver chain is 30.1 us. Additionally, the energy per cycle is 0.1pJ, and the leakage power is 107nW. This indicates that our design is more power efficient than the others.

|                       | JSSC1997 | T.VLSI2008 | Proposed |
|-----------------------|----------|------------|----------|
| Supply voltage (V)    | 0.2      | 0.2        | 0.2      |
| Max frequency (MHz)   | 4        | 5          | 10       |
| Delay time (us)       | 47.3     | 48.2       | 30.1     |
| Total Power (uW)      | 0.74     | 1.71       | 1.01     |
| Leakage Power (nW)    | 276      | 833        | 107      |
| Energy per cycle (pJ) | 0.19     | 0.34       | 0.10     |

TABLE 6-1. Chip Comparisons
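The energy-per-cycle row of TABLE 6-1 follows directly from the total power and the maximum frequency, and the leakage advantage can be expressed as a percentage reduction. The sketch below recomputes both from the tabulated values; the function names and the dictionary layout are ours.

```python
# Values copied from TABLE 6-1 (all at a 0.2 V supply).
DESIGNS = {
    "JSSC1997":   {"f_mhz": 4.0,  "p_total_uw": 0.74, "p_leak_nw": 276.0},
    "T.VLSI2008": {"f_mhz": 5.0,  "p_total_uw": 1.71, "p_leak_nw": 833.0},
    "Proposed":   {"f_mhz": 10.0, "p_total_uw": 1.01, "p_leak_nw": 107.0},
}


def energy_per_cycle_pj(p_total_uw: float, f_mhz: float) -> float:
    """Energy per cycle in pJ: uW divided by MHz gives pJ."""
    return p_total_uw / f_mhz


def leakage_reduction_percent(p_ref_nw: float, p_new_nw: float) -> float:
    """Leakage power reduction of the new design relative to a reference."""
    return 100.0 * (1.0 - p_new_nw / p_ref_nw)


if __name__ == "__main__":
    proposed = DESIGNS["Proposed"]
    for name, d in DESIGNS.items():
        line = f"{name:10s} {energy_per_cycle_pj(d['p_total_uw'], d['f_mhz']):.3f} pJ/cycle"
        if name != "Proposed":
            cut = leakage_reduction_percent(d["p_leak_nw"], proposed["p_leak_nw"])
            line += f", proposed leakage is {cut:.1f}% lower"
        print(line)
```

The recomputed energies (0.185, 0.342 and 0.101 pJ) agree with the rounded table entries, and the measured leakage of the proposed design is about 61% and 87% below that of the two reference circuits.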

#### ■ ISBD

Fig. 6-1 shows the simulated and measured power and energy efficiencies. To match the measured results, FF process corner is used in the post-layout simulation. Overall, the measured results match the simulated ones very well except at the extreme case of  $0.1 V V_{DD}$ . This shows that the model is inaccurate in the deep subthreshold region.



Fig. 6-1. Comparisons with measured and post-simulation results.

Leakage current reduction is a distinguishing feature of the proposed design. Fig. 6-2 plots the measured and simulated leakage power. The measured results are 30nW, 140nW, 575nW and 2.75uW for $V_{DD}$ from 0.1V to 0.4V. Compared with the simulated TT and FF corners, the chip result is closer to the FF corner. TABLE 6-2 lists the comparison with some reported works. Most of them focus on low-power on-chip data communication in the Gbps range. The proposed design is able to operate in the subthreshold region under a 0.1-0.3V supply voltage. The energy per bit is 40fJ at 0.1V, 59fJ at 0.2V, and 123fJ at 0.3V. This indicates that the proposed design is more power efficient than the others.



Fig. 6-2. Leakage power as a function of the supply voltage for the measured and post-simulation results.

|                         | TVLSI08         | TCASI08         | JSSC08          | JSSC10          | Proposed        | Proposed        | Proposed        |
|-------------------------|-----------------|-----------------|-----------------|-----------------|-----------------|-----------------|-----------------|
| Technology              | 180nm           | 180nm           | 180nm           | 90nm            | 55nm            | 55nm            | 55nm            |
| Topology                | BT<br>repeaters | INV<br>repeater | Cap<br>coupling | Cap<br>coupling | BT<br>repeaters | BT<br>repeaters | BT<br>repeaters |
| Single/<br>Differential | Single          | Diff            | Diff            | Diff            | Single          | Single          | Single          |
| Supply<br>voltage (V)   | 0.4             | 1.0             | 1.8             | 1.2             | 0.1             | 0.2             | 0.3             |
| Spacing<br>(nm)         | N/A             | 1500            | 2 x 300         | 2 x 320         | 90              | 90              | 90              |
| Data rate<br>(Mbps)     | ★6 MHz          | 1500            | 1000            | 2000            | 0.8             | 40              | 100             |
| * FoM (pJ)              | N/A             | 1.74            | 2.24            | 0.28            | 0.04            | 0.059           | 0.123           |

TABLE 6-2. Comparisons

★ only shows clock rate.

\* Figure of merit (FoM) =  $\frac{\text{Power}(\mu W)}{\text{Data rate (Mbps)}}$  = Energy per bit (pJ).

TABLE 6-3 lists the comparison with some reported VCOs. The proposed design can operate from a supply voltage as low as 0.2 V, which is a subthreshold supply. Additionally, the energy per cycle is 13 fJ at 0.2 V, 53 fJ at 0.4 V, and 0.114 pJ at 0.6 V. This indicates that our design is more power efficient than the others.

TABLE 6-4 compares the proposed design with recent state-of-the-art PLLs operating from a 0.5V $V_{DD}$. Some previous works achieve very good phase noise with an LC-VCO. However, these designs occupy a large die area because of the passive resonant elements and provide only two output phases. In contrast, ring-VCO PLLs are area efficient and provide more output phases, but have inherently inferior phase noise. The proposed ADPLL has a 10-phase output and consumes $78\mu$W at 480MHz under a $V_{DD}$ of 0.5V, of which the DCO accounts for 53.8%. The proposed design can work even at $V_{DD} = 0.25$V with a lock range of 36.8 to 44.8MHz. In terms of the *figure of merit* (FoM) in pJ/cycle, the proposed design is almost an order of magnitude better than the others.

|                              | S. VLSI'04 [1]              | TCASII'07 [3]           | TCASII'07 [2]       | JSSC'09 [4]             | This work                | This work                 | This work                 |
|------------------------------|-----------------------------|-------------------------|---------------------|-------------------------|--------------------------|---------------------------|---------------------------|
| Process                      | 90 nm                       | 0.13 um                 | 90 nm               | 0.18 um                 | 90 nm                    | 90 nm                     | 90 nm                     |
| Supply Voltage<br>(V)        | 0.6-1.2                     | 0.5                     | 1                   | 1.8                     | 0.2                      | 0.4                       | 0.6                       |
| Tuning range                 | 0.3-6 GHz                   | 306-725 MHz             | 191-952 MHz         | 0.5-2.5 GHz             | 48 MHz                   | 399 MHz                   | 771 MHz                   |
| Jitter                       | 0.615 ps, rms<br>@ 5 GHz    | N/A                     | N/A                 | 15 ps, p-p<br>@ 1.5 GHz | 65.5 ps, p-p<br>@ 48 MHz | 61.8 ps, p-p<br>@ 399 MHz | 58.2 ps, p-p<br>@ 771 MHz |
| Phase noise<br>@1 MHz offset | -115 dBc/Hz<br>@ 5 GHz      | -95 dBc/Hz<br>@ 550 MHz | N/A                 | N/A                     | -93 dBc/Hz<br>@ 48 MHz   | -90 dBc/Hz<br>@ 399 MHz   | -88.5 dBc/Hz<br>@ 771 MHz |
| Power                        | 10 mW<br>@ 5 GHz            | 210 uW<br>@ 550 MHz     | 140 uW<br>@ 200 MHz | 1.2 mW<br>@ 1.5 GHz     | 0.63 uW<br>@ 48 MHz      | 21.0 uW<br>@ 399 MHz      | 87.6 uW<br>@ 771 MHz      |
| Area                         | 0.1 mm <sup>2</sup>         | 0.017 mm <sup>2</sup>   | N/A                 | 0.093 mm <sup>2</sup>   | 0.002 mm <sup>2</sup>    | 0.002 mm <sup>2</sup>     | 0.002 mm <sup>2</sup>     |
| Figure of merit              | 2.0 pJ                      | 0.382 pJ                | 0.636 pJ            | 0.8 pJ                  | 0.013 pJ                 | 0.053 pJ                  | 0.114 pJ                  |

**TABLE 6-3 Performance Comparisons** 

\* Figure of merit (FoM) =  $\frac{Power(\mu W)}{Freq. (MHz)}$  = Energy per cycle (pJ)

|                                              | ISSCC'07                          | VLSI'07            | T.CAS2'09        | T.CAS1'11          | This work          | This work         |
|----------------------------------------------|-----------------------------------|--------------------|------------------|--------------------|--------------------|-------------------|
| Process                                      | 90nm                              | 0.18um             | 0.13um           | 90nm               | 90nm               | 90nm              |
| Supply voltage<br>(V)                        | Analog: 0.5<br>Digital: 0.65      | 0.5                | 0.5              | 0.5                | 0.25               | 0.5               |
| Oscillator type                              | LC-VCO                            | LC-VCO             | Ring             | Ring               | Ring               | Ring              |
| Operating<br>frequency (GHz)                 | 2.4~2.6                           | 1.9~1.94           | 0.36~0.61        | 0.4~2.24           | 36.8~44.8<br>(MHz) | 0.176~0.48        |
| Output phase                                 | 2                                 | 2                  | 6                | 8                  | 10                 | 10                |
| Power (mW)                                   | 6                                 | 4.5                | 1.25<br>@0.55GHz | 2.08<br>@2.24GHz   | 0.0024<br>@36.8MHz | 0.078<br>@0.48GHz |
| RMS jitter (ps)                              | N.A.                              | N.A.               | 8.01<br>@0.55GHz | 2.22<br>@2.24GHz   | 7.8<br>@36.8MHz    | 10.8<br>@0.48GHz  |
| Reference spur<br>(dBc)                      | -52<br>@2.6GHz                    | -43.67<br>@1.92GHz | N.A.             | -40.28<br>@2.24GHz | -39.9<br>@36.8MHz  | -42.5<br>@0.48GHz |
| Area (mm <sup>2</sup> )                      | 0.14                              | 1.32<br>(w/i pads) | 0.04             | 0.074              | 0.057              | 0.057             |
| Phase noise<br>(dBc/Hz)<br>@1MHz offset      | -121<br>@2.6GHz<br>(@3MHz offset) | -120<br>@1.9GHz    | N.A.             | -87<br>@2.24GHz    | -91.6<br>@36.8MHz  | -96.2<br>@0.48GHz |
| * FoM (pJ)                                   | 2.4                               | 2.37               | 2.27             | 0.93               | 0.065              | 0.163             |

**TABLE 6-4 Performance Comparisons of ADPLLs**

\* Figure of merit (FoM) =  $\frac{\text{Power}(\mu W)}{\text{Frequency (MHz)}}$  = Energy (pJ).
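The figure-of-merit rows of TABLE 6-3 and TABLE 6-4 can be reproduced from the tabulated power and frequency values. The sketch below does so for the entries of this work and compares the ADPLL against the lowest FoM among the other ADPLLs in TABLE 6-4 (0.93 pJ, T.CAS1'11). The function and variable names are illustrative.

```python
def fom_pj(power_uw: float, freq_mhz: float) -> float:
    """Figure of merit: power (uW) / frequency (MHz) = energy per cycle (pJ)."""
    return power_uw / freq_mhz


# (supply voltage in V, power in uW, frequency in MHz), from the tables.
BTRO_POINTS = [(0.2, 0.63, 48.0), (0.4, 21.0, 399.0), (0.6, 87.6, 771.0)]
ADPLL_POINTS = [(0.25, 2.4, 36.8), (0.5, 78.0, 480.0)]
BEST_OTHER_ADPLL_FOM_PJ = 0.93  # T.CAS1'11 column of TABLE 6-4

if __name__ == "__main__":
    for vdd, p, f in BTRO_POINTS:
        print(f"BTRO  {vdd:.2f} V: {fom_pj(p, f):.3f} pJ")
    for vdd, p, f in ADPLL_POINTS:
        fom = fom_pj(p, f)
        ratio = BEST_OTHER_ADPLL_FOM_PJ / fom
        print(f"ADPLL {vdd:.2f} V: {fom:.3f} pJ, {ratio:.1f}x lower than 0.93 pJ")
```

For the ADPLL, the improvement over the 0.93 pJ entry is roughly 5.7 times at 0.5 V and about 14 times at 0.25 V.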

#### Conclusions

This project proposed several key components of an on-chip bus design based on the DVFS concept. The main application domain is low-voltage circuit systems, and the body of the proposed on-chip bus has a segmented buffer structure. Based on the test chips and the measured results, our contributions can be summarized as follows.

First, the ALBI is a sub-threshold-supply bootstrapped CMOS inverter with an active leakage current reduction technique. Based on 4500 Monte Carlo simulation runs, the average delay time of the proposed design with a 200fF $C_L$ is 6.9ns with a standard deviation of 6.3ns, which is a 76% reduction from the conventional inverter. Measured results verify that the test chip achieves a 10MHz clock rate at a 200mV $V_{DD}$. Owing to the negative-$V_{GS}$ leakage suppression, the measured leakage power is more than 50% lower than that of previously reported bootstrapped drivers. The power consumption is 1.01µW, and the leakage power is 107nW.

Second, we successfully explored on-chip bus design with the ISBD under a 0.1–0.3V supply voltage. With the proposed ISI-suppression technique, the inserted bootstrapped CMOS repeaters accumulate little ISI jitter and achieve high clock and data rates even under a subthreshold supply. In addition, the proposed bootstrapped repeater improves energy efficiency and has a $P_{Leakage}/P_T$ ratio of less than 1% even at $V_{DD} = 0.1$V, which is one order of magnitude better than the other designs. Measured results verify that our design achieves a 100MHz (0.6MHz) clock link and a 100Mbps (0.8Mbps) data link at 0.3V (0.1V) $V_{DD}$, with an energy of only 123fJ (40fJ) per bit.

Third, we described a 0.2-0.6 V BTRO and a low-voltage ADPLL. The proposed delay cell improves the linearity of the output frequency as a function of $V_{DD}$ by operating in the linear region. Measured results verify that the test chip achieves 48 MHz at 0.2 V and 771 MHz at 0.6 V. The resulting phase noise at a 1-MHz offset is -93 dBc/Hz at 0.2 V, -90 dBc/Hz at 0.4 V and -88.5 dBc/Hz at 0.6 V. The active core area of the BTRO is only 0.002 mm<sup>2</sup>. Compared with other reported work, our design has higher power efficiency. The proposed ADPLL has a 10-phase output and consumes 78µW at 480MHz under a $V_{DD}$ of 0.5V, of which the DCO accounts for 53.8%. The proposed ADPLL can work even at $V_{DD}$ = 0.25V with a lock range of 36.8 to 44.8MHz. In terms of the FoM in pJ/cycle, the proposed design is almost an order of magnitude better than the others. The overall active area is 326µm×175µm.

#### 7. References

- [1] A. Wang and A.P. Chandrakasan, "A 180-mV subthreshold FFT processor using a minimum energy design methodology," *IEEE Journal of Solid-State Circuits*, vol. 40, no. 1, pp. 310-319, Jan. 2005.
- [2] J. Wang, J. Chen, Y. Wang, and C. Yeh, "A 230 mV-to-500 mV 375 KHz-to-16 MHz 32b RISC core in 0.18 μm CMOS," in IEEE Int. Solid-State Circuits Conf. (ISSCC) Digest of Tech. Papers, Feb. 2007, pp. 294-604.
- [3] M. Seok, S. Hanson, Y. Lin, Z. Foo, D. Kim, Y. Lee, N. Liu, D. Sylvester, and D. Blaauw, "The phoenix processor: A 30 pW platform for sensor applications," in *Symp. VLSI Circuits Digest of Tech. Papers*, Jun. 2008, pp. 188-189.
- [4] Y. Pu, J. P. Gyvez, H. Corporaal, and Y. Ha, "An ultra-low-energy multi-standard JPEG co-processor in 65 nm CMOS with sub/near threshold supply voltage," *IEEE Journal of Solid-State Circuits*, vol. 45, no. 3, pp. 668-680, Jan. 2010.
- [5] N. Verma, and A. P. Chandrakasan, "A 256 kb 65 nm 8T subthreshold SRAM employing sense-amplifier redundancy," *IEEE Journal of Solid-State Circuits*, vol. 43, no. 1, pp. 141-149, Jan. 2008.
- [6] M. H. Tu, J. Y. Lin, M. C. Tsai, S. J. Jou, and C. T. Chuang, "Single-Ended Subthreshold SRAM With Asymmetrical Write/Read-Assist," *IEEE Trans. on Circuits and Systems I: Regular Papers*, vol. 57, no. 12, pp. 3039-3047, Dec. 2010.
- [7] M. F. Chang, S. W. Chang, P. W. Chou, and W. C. Wu, "A 130 mV SRAM with expanded write and read margins for subthreshold applications," *IEEE Journal of Solid-State Circuits*, vol. 46, no. 2, pp. 520-529, Feb. 2011.
- [8] D. C. Daly and A. P. Chandrakasan, "A 6-bit, 0.2 V to 0.9 V highly digital flash ADC with comparator redundancy," *IEEE Journal of Solid-State Circuits*, vol. 44, no. 11, pp. 3030-3038, Nov. 2009.
- [9] W. H. Ma, J. C. Kao, V. S. Sathe, and M. C. Papaefthymiou, "187 MHz sub-threshold-supply charge-recovery FIR," *IEEE Journal of Solid-State Circuits*, vol. 45, no. 4, pp. 793-803, Apr. 2010.
- [10] S. Hanson, M. Seok, D. Sylvester, and D. Blaauw, "Nanometer device scaling in sub-threshold logic and SRAM," *IEEE Trans. on Electron Devices*, vol. 55, no. 1, pp. 175-185, Jan. 2008.
- [11] D. Bol, R. Ambroise, D. Flandre, and J. D. Legat, "Interests and limitations of technology scaling for subthreshold logic," *IEEE Trans. on Very Large Scale Integration (VLSI) Systems*, vol. 17, no. 10, pp. 1508-1519, Oct. 2009.
- [12] M. Alioto "Understanding DC behavior of subthreshold CMOS logic through closed-form analysis," *IEEE Trans. on Circuits and Systems I: Regular Papers*, vol. 57, no. 7, pp. 1597-1607, Jul. 2010.
- [13] X. C Li, J. F. Mao, H. F. Huang, and Y. Liu, "A global interconnect width and spacing optimization for latency, bandwidth, and power dissipation," *IEEE Trans. on Electron Devices*, vol. 52, no. 10, pp. 2272-2279, Oct. 2005.

- [14] V. V. Deodhar and J. A. Davis, "Optimal voltage scaling, repeater insertion, and wire sizing for wave-pipelined global interconnects," *IEEE Trans. on Circuits and Systems I: Regular Papers*, vol. 55, no. 4, pp. 1023-1030, May 2008.
- [15] M. Ghoneima, Y. Ismail, M. M. Khellah, J. Tschanz, and V. De, "Serial-link bus: a low-power on-chip bus architecture," *IEEE Trans. on Circuits and Systems I: Regular Papers*, vol. 56, no. 9, pp. 2020-2032, Sep. 2009.
- [16] J. H. Lou and J. B. Kuo, "A 1.5-V full-swing bootstrapped CMOS large capacitive-load driver circuit suitable for low-voltage CMOS VLSI," *IEEE Journal of Solid-State Circuits*, vol. 32, no. 1, pp. 119-121, Jan. 1997.
- [17] J. Kil, J. Gu, and C. H. Kim, "A high-speed variation-tolerant interconnect technique for sub-threshold circuits using capacitive boosting," *IEEE Trans. on Very Large Scale Integration (VLSI) Systems*, vol. 16, no. 4, pp. 456-465, Apr. 2008.
- [18] X. Yuan, J. E. Park, J. Wang, E. Zhao, D. Ahlgren, T. Hook, J. Yuan, V. Chan, H. Shang, C. H. Liang, R. Lindsay, S. Park, and H. Choo, "Gate-induced-drain-leakage current in 45 nm CMOS technology," *IEEE Trans. on Device and Materials Reliability*, vol. 8, no. 3, pp. 501-508, Sep. 2008.
- [19] R. Ho, T. Ono, R. D. Hopkins, A. Chow, J. Schauer, F. Y. Liu, and R. Drost, "High speed and low energy capacitively driven on-chip wires," *IEEE Journal of Solid-State Circuits*, vol. 43, no. 1, pp. 52–60, Jan. 2008.
- [20] E. Mensink, D. Schinkel, E. A. M. Klumperink, E. van Tuijl, and B.Nauta, "Power Efficient Gigabit Communication Over Capacitively Driven RC-Limited On-Chip Interconnects," *IEEE Journal of Solid-State Circuits*, vol. 45, no. 2, pp. 447-457, Feb. 2010.
- [21] P. Raha, "A 0.6-1.2V Low-Power Configurable PLL Architecture for 6GHz-300MHz Applications in a 90nm CMOS Process," in Symp. VLSI Circuits Digest of Tech. Papers, pp. 232-235, Jun. 2004.
- [22] D. Sheng, C. C. Chung, and C. Y. Lee, "An Ultra-Low-Power and Portable Digitally Controlled Oscillator for SoC Applications," IEEE Trans. on Circuits System. II, vol. 54, no. 11, pp. 954-958, Nov. 2007.
- [23] Y. L. Lo, and W. B. Yang, T. S. Chao, and K. H. Cheng, "Designing an Ultralow-Voltage Phase-Locked Loop Using a Bulk-Driven Technique," IEEE Trans. on Circuits System. II, vol. 56, pp. no. 5, pp. 339-343, May 2009.
- [24] A. Arakali, S.Gondi, and P. K. Hanumolu, "Low-Power Supply-Regulation Techniques for Ring Oscillators in Phase-Locked Loops Using a Split-Tuned Architecture," *IEEE Journal of Solid-State Circuits*, vol. 44, no. 8, pp. 2169-2181, Nov. 2009.
- [25] S. A. Yu and P. Kinget, "A 0.65V 2.5 GHz fractional-N frequency synthesizer in 90 nm CMOS," in IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers, Feb. 2007, pp. 304–306.
- [26] H. H. Hsieh, C. T. Lu, and L. H. Lu, "A 0.5-V 1.9-GHz low-power phase-locked loop in 0.18-um CMOS," *in Symp. VLSI Circuits Dig. Tech. Papers*, Jun. 2007, pp. 164–165.

- [27] Y. L. Lo, and W. B. Yang, T. S. Chao, and K. H. Cheng, "Designing an Ultralow-Voltage Phase-Locked Loop Using a Bulk-Driven Technique," *IEEE Trans. on Circuits and Systems. II*, vol. 56, pp. no. 5, pp. 339-343, May 2009.
- [28] K. H. Cheng, Y. C. Tsai, Y. L. Lo, and J. S. Huang, "A 0.5-V 0.4-2.24-GHz Inductorless Phase-Locked Loop in a System-on-Chip," *IEEE Trans. on Circuits and Systems. I*, vol. 58, no. 5, pp.849-859, May. 2011.
- [29] Y. C. Ho, Y. S. Yang, and C. C. Su, "A 0.2-0.6 V Ring Oscillator Design Using Bootstrap Technique," *in Asian Solid-State Circuits Conf. (ASSCC)*, Nor. 2011 (to appear).
- [30] D. H. Oh, D. S. Kim, S. H. Kim, D. K. Jeong , and W. C. Kim, "A 2.8Gb/s All-Digital CDR with a 10b Monotonic DCO," *in IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2007, pp. 222–224.
- [31] S. Lin, and S. Liu, "A 1.5GHz All-Digital Spread-Spectrum Clock Generator," *IEEE Journal of Solid-State Circuits*, vol. 44, no. 11, pp.3111-3119, Nov. 2009.

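# Self-Evaluation Form for the NSC-Funded Research Project Final Report

Please give an overall assessment covering how closely the research matches the original proposal, how well the expected goals were met, the academic or applied value of the results (briefly describe their significance, value, impact, or potential for further development), whether the results are suitable for journal publication or patent application, and the main findings or other points of value.

1. Please give an overall assessment of how closely the research matches the original proposal and how well the expected goals were met.

Goals achieved

- □ Goals not achieved (please explain, within 100 characters)
  - □ Experiment failed
  - □ Experiment interrupted
  - □ Other reasons

Explanation: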

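This project is built on subthreshold supply voltages in order to reduce power consumption substantially. Devices in the subthreshold region, however, suffer from insufficient drive strength and severe process variation. The bootstrapped driver proposed in this project lets the transistors operate in the linear region even under a subthreshold supply voltage, which gives better performance. In summary, the project achieved three major goals: (1) verifying the device-level design accuracy of the bootstrapped driver at the supply voltage of the subthreshold microwatt bus circuit; (2) successfully building a 10 mm microwatt bus circuit composed of bootstrapped drivers; (3) proposing and verifying the dynamic adjustment mechanism, and in addition completing a low-voltage multi-phase phase-locked loop to support that mechanism. All of these are in line with what the project expected.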

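2. Status of journal publication, patent application, and related outcomes:

Papers: ■ Published □ Unpublished manuscript □ In preparation □ None

Patents: □ Granted ■ Pending □ None

Technology transfer: □ Transferred □ Under negotiation □ None

Other: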

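The research results of this project have also done well in terms of publications. For journals, the bus circuit design with a dynamic voltage and frequency scaling mechanism has been published in the International Journal of Electrical Engineering (IJEE), and the subthreshold microwatt bus circuit has been submitted to the IEEE Journal of Solid-State Circuits (JSSC), where it is currently under review. For conferences, there is one paper at the Asian Solid-State Circuits Conference (ASSCC) and three papers at the domestic VLSI/CAD symposium. Notably, all three of the latter were nominated as best paper candidates, which is a recognition of the project's results. Finally, regarding patents, the project filed three Republic of China (Taiwan) patent applications and one U.S. patent application; all four are still in the application process.

3. Please assess the academic or applied value of the research results in terms of academic achievement, technical innovation, and social impact (briefly describe their significance, value, impact, or potential for further development).

■ Academic achievement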

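In recent years, subthreshold operation has been a new area that academia keeps exploring. In analog amplifiers, subthreshold operation provides higher gain and is suitable for biomedical applications; on the digital side, it can be used in systems with low-frequency clocks. Subthreshold circuits, however, are criticized on three points: (1) they cannot be used in high-speed applications; (2) they cannot provide large drive strength; (3) they suffer from very severe process variation. The bootstrapped circuit technique newly proposed in this project addresses all three problems at once, which is a major breakthrough.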

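■ Technical innovation

The bootstrapped circuit technique newly proposed in this project has three features: (1) faster pre-charging; (2) suppression of static leakage current; (3) no additional capacitor devices. Taken together, the proposed bootstrapped circuit not only has the drive capability of a conventional bootstrapped circuit but can also operate in high-speed systems, with the advantage of low cost.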

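■ Social impact

The project's impact on society has two main aspects. First, the newly proposed dynamic voltage and frequency scaling mechanism is a popular research topic. It allows system performance and power consumption to be optimized, which is consistent with today's widely pursued goal of green living. Second, the circuits we designed with the bootstrapped technique not only have a performance advantage: according to our analysis, the proposed bootstrapped technique also shows smaller process variation than previous techniques and is therefore better suited to mass production.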

# Promotion Data Sheet for R&D Results Derived from the NSC-Funded Project

Date: October 25, 2011

| Item | Content |
|------|---------|
| NSC-funded project | Project title: Microwatt on-chip bus design with dynamic voltage and frequency scaling<br>Principal investigator: 蘇朝琴<br>Project number: NSC 98-2221-E-009-137-MY2<br>Field: |
| Name of R&D result | (Chinese) 低雜訊、低功率拔靴帶式驅動器電路<br>(English) Low-Noise Bootstrapped Driver Circuit |
| Institution owning the result | National Chiao Tung University |
| Inventors (creators) | Students: 何盈杰, 張家齊<br>Professor: 蘇朝琴 |
| Technical description (Chinese version) | This invention proposes a low-noise, low-power bootstrapped driver circuit made up of two parts. The first part is a front-end boost-up/boost-down circuit that amplifies the input digital signal to three times its swing in order to drive the following stage; the amplified output swing is also fed back to control the front-end boost circuit, which raises the pre-charge current and eliminates the noise caused by the reversion current. The second part is a back-end inverter circuit with a conventional CMOS inverter architecture.<br>The circuit has several features. First, amplifying the input digital signal to three times its swing strengthens the drive capability of the following inverter and at the same time suppresses the static power consumption of the inverting driver. The feedback mechanism shortens the pre-charge time and eliminates the reversion current, which effectively lowers noise and makes the patent suitable for random data transmission. In addition, the front-end boost circuit uses only a few transistors, which reduces the power consumed by parasitic loads. |
| Technical description (English version) | A low-noise, low-power bootstrapped driver is proposed in this patent. The driver is composed of two main parts. First, the bootstrapped circuit boosts the output swing from VDD to three times VDD compared with a circuit without boosting. With a swing of three times VDD, the driver stage (an inverter) not only provides much more driving capability but also suppresses the leakage current. Most prior works focus on driving capability; however, leakage current has become one of the major design issues, especially in nanometer-process circuits. Moreover, the proposed bootstrapped circuit uses only a few devices, so the power overhead caused by parasitic loads is small.<br>A feedback boosting-voltage mechanism is used in the proposed driver. The fed-back boosted voltage not only enhances the pre-charge current but also eliminates the reversion current. As a result, the proposed bootstrapped driver can be used not only for clock boosting but also for random data transmission. |
| Industry | IC design |
| Scope of technology/product application | Low-voltage chip systems, biomedical chip systems |
| Feasibility of technology transfer and expected benefits | The low-noise, low-power bootstrapped driver circuit designed with this technique merges the boost-up and boost-down circuits into one, so the back end is no longer controlled separately. It raises the conventional twofold swing amplification to a threefold swing, which strengthens the drive capability of the following inverter and at the same time suppresses the static power consumption of the inverting driver. The patent also feeds the amplified output swing back to the front-end boost circuit, which raises the pre-charge current and eliminates the noise caused by the reversion current. This new design may replace conventional bootstrapped circuits on the market for the following reasons:<br>1. The circuit architecture is simpler than the conventional one; it uses only a few transistors, so it is low-cost yet fully functional.<br>2. It can operate in higher-speed circuits.<br>3. It overcomes the conventional limitation of transmitting only periodic signals.<br>4. It can operate at very low voltage while maintaining a good I<sub>ON</sub>/I<sub>OFF</sub> ratio.<br>Such a design can be widely applied in high-voltage driver output stages as well as in low-power chips. If a U.S. patent can be obtained, we believe the invention will have considerable market depth. Taiwan is also one of the major chip-system design countries, so the invention has potential competitive advantage in both places, as well as potential value for technology transfer and licensing. |

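Note: If a patent has not yet been applied for on this R&D result, do not disclose the main patentable content.

# Promotion Data Sheet for R&D Results Derived from the NSC-Funded Project

Date: October 25, 2011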

| Item | Content |
|------|---------|
| NSC-funded project | Project title: Microwatt on-chip bus design with dynamic voltage and frequency scaling<br>Principal investigator: 蘇朝琴<br>Project number: NSC 98-2221-E-009-137-MY2<br>Field: |
| Name of R&D result | (Chinese) 拔靴帶式環型振盪器<br>(English) Bootstrapped Ring Oscillator |
| Institution owning the result | National Chiao Tung University |
| Inventors (creators) | Students: 何盈杰, 楊于昇<br>Professor: 蘇朝琴 |
| Technical description (Chinese version) | The bootstrapped ring oscillator proposed in this invention is built from bootstrapped delay cells and uses the bootstrapping technique to raise the output frequency of the circuit. Compared with conventional delay cells, the invention is better suited to low-voltage operation and achieves low power. Even under a subthreshold supply close to the transistor threshold voltage, it keeps the transistors from operating in the subthreshold region. Overall, an oscillator built from bootstrapped delay cells has the following advantages at low supply voltage: (1) stronger current drive, suitable for low-supply-voltage design; (2) static leakage suppression, which raises the I<sub>ON</sub>/I<sub>OFF</sub> ratio; (3) low sensitivity to process variation; (4) high linearity and low jitter noise. |
| Technical description (English version) | A bootstrapped ring oscillator that can operate under a low supply voltage, even a subthreshold supply, is proposed in this report. Lowering the supply voltage is a common technique in low-power design. However, dropping the supply voltage, especially to near the threshold voltage of the transistors, causes some problems: the current driving ability decreases severely and the circuit characteristics suffer from process variation. The proposed bootstrapped ring oscillator uses a bootstrapped delay cell, which strengthens the driving ability and pushes the devices into the linear region. Since every boosted transistor no longer works in the subthreshold region, the circuit does not suffer from process variation to the same extent. In addition, the bootstrapping technique suppresses the leakage current and enhances the I<sub>ON</sub>/I<sub>OFF</sub> ratio. Furthermore, the high linearity of the proposed oscillator provides lower jitter interference in timing applications. |
| Industry | IC design |
| Scope of technology/product application | Low-voltage chip systems, computers, mobile phones, and any other product that needs a clock circuit |
| Feasibility of technology transfer and expected benefits | Voltage-controlled and digitally controlled oscillators are the core of clock-related circuits. Motherboards, handheld electronic products, medical equipment, and similar products all depend on clock circuits.<br>Low voltage and low power are the recent trend. Many new designs no longer aim at multi-GHz operation and instead demand energy efficiency. Reaching clocks in the 100 MHz range in the most energy-saving way should be a goal pursued in every field, and the invention is competitive particularly in notebook computers, handheld electronic products, and medical electronics. This new design may replace conventional clock circuits on the market.<br>Taiwan is one of the major chip-system design countries and has many mature chip design companies developing better and more efficient clock circuits. The invention has potential competitive advantage as well as potential value for technology transfer and licensing. |

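Note: If a patent has not yet been applied for on this R&D result, do not disclose the main patentable content.

# Report on Attending an International Academic Conference under the NSC-Funded Research Project

Date: February 24, 2011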

| Project number | NSC 98-2221-E-009-137-MY2 | | |
|----------------|---------------------------|---|---|
| Project title | Microwatt on-chip bus design with dynamic voltage and frequency scaling | | |
| Traveler's name | 蘇朝琴 | Affiliation and title | Professor, Department of Electrical Engineering, National Chiao Tung University |
| Conference dates | February 16, 2011 to February 18, 2011 | Conference venue | Innsbruck, Austria |
| Conference name | The Eighth IASTED International Conference on Biomedical Engineering (Biomed 2011) | | |
| Paper title | A HEARING-AID FRONT-END CIRCUIT BASED ON LOW POWER AND LOW AREA MIX MODE AGC | | |

## 1. Conference Participation

The Eighth IASTED International Conference on Biomedical Engineering (Biomed 2011), February 16-18, 2011, Innsbruck, Austria.

Place: Congress Innsbruck, Innsbruck, Austria.

Wednesday, February 16, 2011

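The Eighth IASTED International Conference on Biomedical Engineering covered clinical medicine, neuromedicine, health care, medical devices, medical imaging, biomedical sensors, medical instrumentation, and other medicine-related fields. The conference comprised eleven sessions, three keynote speeches, and two tutorials. According to the conference records, 183 papers were submitted and 81 were accepted for presentation, an acceptance rate of 44.26%. Our laboratory's paper was scheduled for Session 5 (Medical Devices, Measurement, and Instrumentation 1) at 2:00 p.m. on February 17, in Hall Aalborg.

Because some sessions overlapped in time, I chose which sessions to attend according to their key topics, hoping to see what chip-system design could contribute to medicine in the future. The sessions I attended over the three days are listed below.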

Day 1: Wednesday, February 16, 2011

07:00 Registration

08:30-12:00 Attended Session 1: Biomedical Signal Processing. Place: Hall Grenoble.

14:00-15:30 Attended Keynote Speech 1, "Technologies for ageing-in-place: from implantable bionics to biomonitoring," by Prof. Nigel Lovell. Place: Hall Grenoble.

15:30-16:15 Attended Session 9: Biosensors and Transducers.

16:15-17:00 Attended Session 10: Health Care Technology and Telemedicine.

Day 2: Thursday, February 17, 2011

08:30-10:00 Attended Keynote Speech 2, "From tissue engineering to in situ sensors: Application of Nanotechnology in Biomedical Engineering," by Prof. Thomas Webster. Place: Hall Freiburg.

10:00-11:00 Attended Session 11: Rehabilitation Engineering, Prosthetics, and Orthotics. Place: Hall Igls.

11:00-12:00 Attended Session: Biomechanics. Place: Hall Freiburg.

14:00-15:00 Attended Session 5 and presented our research, "A HEARING-AID FRONT-END CIRCUIT BASED ON LOW POWER AND LOW AREA MIX MODE AGC." Place: Hall Aalborg.

15:00-17:00 Attended Session 4: Bioinformatics and Computational Biomedicine.

Day 3: Friday, February 18, 2011

08:30-10:00 Attended Tutorial Session 1, "Low Back Pain: the problem and methods of assessment," by Prof. Robert Allen. Place: Hall Grenoble.

10:00-12:00 Attended Tutorial Session 2, "Optical Sensors for Biomedical Applications," by Dr. Martin Brandl. Place: Hall Freiburg.

13:30-14:30 Attended Keynote Speech 3, "MRI: From Multiparametric Contrast to Biomarker Imaging," by Prof. Rudolf Stollberger.

15:00-16:00 Attended Session 6: Medical Devices, Measurement, and Instrumentation 2. Place: Hall Grenoble.

16:00-17:00 Attended Session 8: Medical Imaging and Image Processing 2. Place: Hall Freiburg.

## 2. Reflections on the Conference

Keynote Speech 1, "Technologies for ageing-in-place: from implantable bionics to biomonitoring," by Prof. Nigel Lovell.

Prof. Lovell discussed the work of Bionic Vision Australia toward developing a visual prosthesis, including the general principle of operation, design challenges, and potential benefits for implant recipients. Advanced materials and micro-technology research has led to a novel method of electrode array construction and to feedthrough designs for safely encapsulating the custom-designed electronics that act as the core of the device. He also introduced the surgical approaches and presented results from experimental and human psychophysics studies.

Keynote Speech 2, "From tissue engineering to in situ sensors: Application of Nanotechnology in Biomedical Engineering," by Prof. Thomas Webster.

Prof. Webster showed how nanotechnology is being used to develop sensors that can be placed on implant surfaces to determine and control cellular events and so ensure implant success. He also introduced some of the more significant advances in creating better sensors for vascular, cardiovascular, and orthopedic implants through nanotechnology.

In Keynote Speech 3, "MRI: From Multiparametric Contrast to Biomarker Imaging," Prof. Stollberger from Austria explained that MRI has developed into a method that can provide functional or even metabolic information. It is now also a multifunctional imaging technique that can detect biophysical or metabolic parameters, which can be used as biomarkers for assessing the underlying biological processes in a wide variety of tissues and diseases. Prof. Stollberger also addressed the hot topic of ultra-high-field systems for biomarker applications and discussed their opportunities and challenges.

In Tutorial Session 1, Prof. Allen gave us a perspective on the issues posed by low back pain, including epidemiology, muscle fatigue, methods of measurement for clinical investigation, and functional assessment.

The tutorial assumed only a basic level of background knowledge in areas such as anatomy, physiology, and measurement methods in clinical practice, taking account of the delegates' different backgrounds.

In Tutorial Session 2, Dr. Brandl introduced the physical principles of optical sensors, including light absorption, transmission, reflection, fluorescent dyes, surface plasmon resonance, and fiber-optic sensors.

In recent years, IC process technologies have improved so much in dimensional scaling that many new applications have become feasible. In audio applications, the rapid expansion of the biomedical-electronics and consumer-product markets has created a need for low-power, low-voltage, low-area systems. Because these devices run on batteries, extending battery lifetime through low power dissipation is crucial. However, the threshold voltage does not scale down linearly with device dimensions, which makes low-voltage design more difficult. Since switching power is proportional to the square of the supply voltage, low-voltage design is necessary for low-power circuits.
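The quadratic dependence on supply voltage mentioned above is the usual first-order model for switching power, $P \approx \alpha C V_{DD}^2 f$. The sketch below only illustrates that scaling. It assumes that switching power dominates and ignores leakage and short-circuit power; the function name and the example voltages are hypothetical and are not taken from the paper.

```python
def dynamic_power_ratio(vdd_new, vdd_ref, f_new=1.0, f_ref=1.0):
    """Ratio of switching power after scaling supply voltage and frequency.

    Activity factor and switched capacitance are assumed unchanged,
    so they cancel out of the ratio.
    """
    return (vdd_new / vdd_ref) ** 2 * (f_new / f_ref)

# Halving the supply at the same clock frequency.
ratio_half_vdd = dynamic_power_ratio(0.5, 1.0)

# Halving both the supply and the clock frequency.
ratio_half_vdd_half_f = dynamic_power_ratio(0.5, 1.0, f_new=0.5, f_ref=1.0)

print(ratio_half_vdd, ratio_half_vdd_half_f)
```

In this simple model, halving the supply alone cuts switching power to a quarter, and halving the frequency as well cuts it to an eighth.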

Integrated circuit (IC) design is moving toward lower power consumption, lower voltage, and smaller size, which expands its applications in biomedical equipment to the benefit of patients. Because hearing-impaired individuals vary greatly in their range of auditory loss, a hearing aid system is designed with a tunable function to satisfy specific needs. In particular, a programmable automatic gain control (AGC) system is used to implement this tunable function. Low power dissipation in the hearing aid system is also crucial to extending battery lifetime. With system-on-chip (SoC) technology, we can integrate biosensors and bio-devices, build smaller and lighter bio-instrumentation, and ultimately make patients more comfortable.

## 3. Materials Brought Back

One preliminary program and one CD-ROM.

## 4. Other Remarks

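- (1) Most attendees were clinicians and engineers in medicine-related fields. In our Session 5 presentation, our laboratory showed a low-power, low-voltage automatic gain control chip fabricated in a TSMC 0.18 µm CMOS process with an area of 347 × 413 µm<sup>2</sup>, giving attendees a view of the advantages of applying nanometer technology to medicine.
- (2) During the discussions on implantable medical instrumentation at this conference, we found that existing low-power, small-area nanometer chip design techniques, when applied to implantable medical treatment, can extend battery life to the point where the battery of an implantable device never has to be replaced. This makes treatment easier for clinicians and more comfortable for patients.
- (3) Our laboratory has made considerable progress in biomedical signal measurement, biomedical signal front-end circuit design, and digital signal processing. These techniques can now be combined with clinical practice and applied to medical engineering. The laboratory can also go on to integrate remote biomedical signal monitoring with existing communication systems and develop a real-time medical care and diagnosis system that benefits patients.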