National Chiao Tung University

# Department of Electronics Engineering & Institute of Electronics, Master's Program

# Master's Thesis

Dynamic Frequency Scaling Clock Generator and Power Efficiency Optimization Unit for Solar Cell Power Management System Application

Graduate student: Chih-Hao Kan

Advisor: Prof. Wei Hwang

June 2008


A Thesis Submitted to the Department of Electronics Engineering & Institute of Electronics, College of Electrical Engineering and Computer Engineering, National Chiao Tung University, in Partial Fulfillment of the Requirements for the Degree of Master in Electronics Engineering

June 2008, Hsinchu, Taiwan, Republic of China


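### Dynamic Frequency Scaling Clock Generator and Power Efficiency Optimization Unit for Solar Cell Power Management System Application

### Student: Chih-Hao Kan Advisor: Prof. Wei Hwang

#### Department of Electronics Engineering & Institute of Electronics, National Chiao Tung University

### Abstract (Chinese Version)

With the widespread use of handheld devices, low power has become one of the main considerations in circuit design. At the same time, with the adoption of advanced process technologies, replacing traditional analog circuits with digital ones has gradually become a trend. This thesis studies digital clock generators and proposes a dual-output digital clock generator in which the frequency and phase of each output can be adjusted freely, with six frequency multiplication factors to choose from. The digital clock generator uses a smooth charge phase blender to increase the phase information, a low power delay cell so that the delay line consumes less power, and a digital charge-detecting controller that quickly discriminates frequency and delay to achieve fast locking. This thesis also studies low voltage oscillators and uses a low voltage oscillator in the power efficiency optimization unit. The power efficiency optimization unit dynamically adjusts the clock frequency supplied to the 1V generator according to the loading condition in order to optimize power efficiency. The power efficiency optimization unit is applied in a solar cell power management system, which receives solar energy and outputs 500mV, -500mV and 1V to the computation circuit and the memory circuit. The system is powered by solar energy during the day and by the battery at night. All of the work is implemented in UMC 90nm CMOS technology.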

# Dynamic Frequency Scaling Clock Generator and Power Efficiency Optimization Unit for Solar Cell Power Management System Application

Student : Chih-Hao Kan Advisor : Prof. Wei Hwang

Department of Electronics Engineering & Institute of Electronics National Chiao-Tung University

# ABSTRACT

Portable devices are now widely used, and low power consumption has become a main concern of circuit design. Deep-submicron processes have also brought a trend of replacing analog-intensive architectures with more digital ones. This thesis proposes a digital dual-output clock generator with dynamic frequency/phase tuning ability. The frequency and phase of each output are tuned independently, and a total of six multiplication factors are available. The proposed clock generator uses a smooth charge phase blender to increase the phase information, a low power delay cell to reduce power consumption, and a digital charge-detecting controller to achieve fast lock. Low voltage oscillators are also studied and used in the power efficiency optimization unit, which supplies a variable-frequency (33MHz~300MHz) clock to the 1V generator according to the loading condition. The unit is applied in the solar cell power management system. The system accepts power from a photovoltaic cell and outputs 500mV, -500mV and 1V to the computation circuit and the memory circuit. In the daytime, the power management system is supplied by solar energy and the battery is charged. At night, the battery supplies energy to the power management system. All of the work is implemented in UMC 90nm CMOS technology.

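# Acknowledgments

I would like to thank my advisor, Prof. Wei Hwang, who guided the direction of my research, taught me a great deal, and broadened my view of the research field. He provided an excellent and comfortable research environment with ample research resources, which I was able to use fully to complete this thesis.

I would also especially like to thank my senior labmate 張銘宏, who introduced me to my research field and taught me a great deal, both technical and otherwise, so that I could complete the research for this master's thesis.

I also thank my senior labmates 黃柏蒼, 謝維致 and 楊皓義 for their help and encouragement in my research.

Finally, I thank my other lab partners, my friends, and my family for their care, help, and moral support, which allowed me to complete my master's thesis research smoothly.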

# Contents

| Section   | Title |
|-----------|-------|
| Chapter 1 | Introduction |
| 1.1       | Research Motivation |
| 1.2       | Thesis Organization |
| Chapter 2 | PLL/DLL Design Concepts |
| 2.1       | Introduction |
| 2.2       | The Architecture of PLL |
| 2.2.1     | Analog PLL |
| 2.2.2     | All Digital PLL |
| 2.3       | The Architecture of DLL |
| 2.3.1     | Analog DLL |
| 2.3.2     | All Digital DLL |
| 2.4       | The Common Block Circuits |
| 2.4.1     | Phase Detector |
| 2.4.2     | Charge Pump |
| 2.4.3     | Loop Filter |
| 2.4.4     | Time to Digit Converter |
| 2.4.5     | Oscillator/Delay Line |
| 2.4.5     | Frequency Divider |
| 2.5       | PLL/DLL System Noise Analysis and Design Technique |
| 2.5.1     | PLL/DLL rms Jitter Analysis |
| 2.5.2     | Impedance Level Scaling Technique |
| 2.5.3     | Analysis of PLL Jitter Caused by Digital Switching Noise |
| Chapter 3 | Overview of Dynamic Frequency Scaling Technique and Proposed Dual Output Clock Generator with Dynamic Frequency/Phase Tuning Ability |
| 3.1       | Dynamic Frequency Scaling System |
| 3.1.1     | The Dynamic Frequency Scaling Technique |
| 3.2       | Proposed Dual Output Clock Generator with Dynamic Frequency/Phase Tuning Ability |
| 3.2.1     | The Architecture of Proposed Clock Generator |
| 3.2.2     | Dynamic Frequency/Phase Tuning Synthesizer |
| 3.2.3     | Low power six-phase delay-locked loop |
| 3.2.4     | Simulation Result of the Proposed Clock Generator |
| 3.3       | Conclusion of Proposed Clock Generator and Application in Advance Power Management System |
| Chapter 4 | Low Voltage Oscillators with Wide Tuning Range |
| 4.1       | Introduction to Low Voltage Circuit Design |
| 4.2       | The Type I Low Voltage Differential Oscillator with Wide Tuning Range |
| 4.3       | The Net-Bias Circuit |
| 4.4       | The Type II Low Voltage Oscillator with Wide Tuning Range |
| Chapter 5 | The Application of Power Efficiency Optimization Unit in Solar Cell Power Management System |
| 5.1       | The Solar Cell Power Management System |
| 5.2       | The Power Efficiency Optimization Unit |
| 5.3       | The Simulation Results of Power Efficiency Optimization Unit and the Solar Cell Power Management System |
| Chapter 6 | Conclusion and Future Work |
| 6.1       | Conclusion |
| 6.2       | Future Work |
|           | REFERENCE |

# **List of Figures**

- Fig 2.1 The architecture of analog PLL
- Fig 2.2 The architecture of all digital PLL
- Fig 2.3 The architecture of analog DLL
- Fig 2.4 (a) Conventional phase frequency detector (b) Timing diagram
- Fig 2.5 NAND gates style conventional phase frequency detector
- Fig 2.6 the state diagram
- Fig 2.7 charge pump
- Fig 2.8 (a) Single-end charge pump (b) NMOS charge pump
- Fig 2.9 (a) 1st order loop filter (b) 2nd order loop filter (c) 3rd order loop filter
- Fig 2.10 Time-to-digital converter (TDC): (a) Structure (b) Quantization of the timing difference between the DCO and FREF edges
- Fig 2.11 2 level time-to-digital converter (TDC) structure
- Fig 2.12 Typical voltage controlled LC oscillator
- Fig 2.13 Differential CML type gain block oscillator
- Fig 2.14 Differential gain block
- Fig 2.15 Binary weighted digital controlled differential delay cell
- Fig 2.16 Varactor style delay cell using NAND gate
- Fig 2.17 The basic /4/5 two mode divider
- Fig 2.18 Concept of impedance level scaling
- Fig 2.19 Effect of impedance level scaling
- Fig 2.20 Noise generated by digital logic couples through the substrate to an analog circuit
- Fig 2.21 Three power supply schemes under investigation. (a) PLL0: common Vdd and Vss, (b) PLL1: separate analog Vdd, (c) PLL2: separate analog Vdd and Vss
- Fig 2.22 Jitter measurement with PLLs having a bandwidth of 4 MHz
- Fig 2.23 Triple-well processing provides a buried well that breaks the resistive noise coupling path
- Fig 2.24 Jitter induced by NG31 into PLLs having a bandwidth of 4 MHz. Prefix 3W indicates that the block resides in a triple-well
- Fig 3.1 Conventional EPIC architecture and multiple clock domain EPIC
- Fig 3.2 The power consumption comparison
- Fig 3.3 The performance comparison
- Fig 3.4 The EDP comparison
- Fig 3.5 The architecture of proposed dual output clock generator with dynamic frequency/phase tuning ability
- Fig 3.6 The conventional phase blender
- Fig 3.7 The smooth charge phase blender (SCPB)
- Fig 3.8 The voltage curve of SCPB when phase difference is 400 ps
- Fig 3.9 The performance comparison of SCPB and conventional phase blender
- Fig 3.10 The voltage curve of SCPB when phase difference is 500 ps
- Fig 3.11 The modified dynamic controlled SCPB
- Fig 3.12 The performance of the modified SCPB
- Fig 3.13 The architecture of the DDPS
- Fig 3.14 The clock multiplier and duty cycle circuit proposed in [34]
- Fig 3.15 The edge combiner
- Fig 3.16 The toggle pulsed latch
- Fig 3.17 The two examples of frequency synthesis
- Fig 3.18 The rising edge pulse scheme to handle the 50% duty cycle
- Fig 3.19 The low power delay cell
- Fig 3.20 The control signal to select coarse stage
- Fig 3.21 The digital charge-detecting controller (DCD controller)
- Fig 3.22 The charge-detecting line (CDL)
- Fig 3.23 The detecting condition of coarse tune signals
- Fig 3.24 The locking procedure
- Fig 3.25 The output frequency of six multiplied factors
- Fig 3.26 The dual clock output and dynamic frequency scaling example
- Fig 3.27 The advance power management concept
- Fig 4.1 The proposed type I low voltage delay cell
- Fig 4.2 General voltage controlled delay cell
- Fig 4.3 Type I oscillator with 7 stages
- Fig 4.4 The delay time of the proposed type I low oscillator using different control step
- Fig 4.5 The output frequency of the proposed type I low oscillator using different control step
- Fig 4.6 The range of frequency and delay time of proposed type I low voltage oscillator with different power supply voltage
- Fig 4.7 The net-bias circuit
- Fig 4.8 The proposed type II low voltage oscillator
- Fig 4.9 (a) The delay time of the proposed type I low oscillator using different control step (b) The output frequency of the proposed type I low oscillator using different control step
- Fig 4.10 The range of frequency and delay time of proposed type II low voltage oscillator with different power supply voltage
- Fig 4.11 The power consumption of proposed type II low voltage oscillator with small inverter sizes
- Fig 4.12 The non-full swing condition when oscillator operates at low frequency
- Fig 4.13 The power consumption of proposed type II low voltage oscillator with big inverter sizes
- Fig 4.14 The power consumption comparison with different oscillators
- Fig 5.1 The solar cell power management system
- Fig 5.2 The control unit
- Fig 5.3 The voltage level of output of regulator and output of 1V generator in the condition of loading increase and PV cell power reduce gradually
- Fig 5.4 The architecture of power efficiency optimization unit
- Fig 5.5 The oscillating voltage detector
- Fig 5.6 The detecting point of oscillating voltage detector versus different temperature conditions
- Fig 5.7 The bias voltage detector
- Fig 5.8 The power efficiency measurement of the 1V generator
- Fig 5.9 The power efficiency of the 1V generator (oscillating voltage detector)
- Fig 5.10 The power efficiency of the 1V generator (bias voltage detector)
- Fig 5.11 The three different output voltage with variation of current from PV cell
- Fig 5.12 Comparison of power management system with CU and without CU
- Fig 5.13 The layout view of solar cell power management system
- Fig 6.1 The advance power management concept

# **List of Tables**

| Table     | Title                                               |
|-----------|-----------------------------------------------------|
| Table I   | Frequency/Phase combination and program signals     |
| Table II  | The performance of proposed clock generator         |
| Table III | Power management system for solar energy harvesting |



# Chapter 1 Introduction

The need for low-cost, low-power communication systems has motivated the use of deep-submicron CMOS processes. Technology scaling improves digital blocks but complicates the design of RF and analog circuits, so the replacement of analog-intensive architectures with more digital ones is becoming unavoidable. Analog frequency synthesizers are used in both wireless transceivers and wireline digital links. Recently, All Digital Phase-Locked Loops (ADPLL) and All Digital Delay-Locked Loops (ADDLL) have appeared, featuring good scalability, programmability and robustness, but their performance is still inadequate for high-end applications.

The PLL and DLL are very important clocking IPs for many digital systems, such as digital communication systems and microprocessors. The PLL has been widely used in digital systems, communication systems and interconnection systems. It can be used to eliminate the delay between external and internal clock signals caused by the on-chip clock delay. Among the main applications of PLLs are noise and jitter suppression in communications, skew suppression in digital systems, data synchronization between chips, and frequency synthesis in RF transceivers. DLLs are also widely used as de-skew buffers and clock generators in microprocessors, DSPs, multi-core SoCs, DRAM interfaces and application-specific integrated circuits. In recent years, the DLL has become an important component for safe clocking of SoCs with block-based power-down mechanisms, and thus low power, small jitter, and fast lock-in have become three equally important design goals.

Compared to most phase-locked-loop-based clock generators and local oscillators, their delay-locked-loop-based counterparts exhibit less jitter and phase noise because jitter does not accumulate. This holds even under severe supply noise, which is becoming common and critical in many SoCs. Furthermore, they operate stably under process, voltage and temperature (PVT) variations, are easier to design, and occupy a smaller area because of a simpler loop filter.

### 1.1 <u>Research Motivation</u>

In this thesis, a dual output clock generator with dynamic frequency/phase tuning ability is proposed first. It generates two independent clock signals, and the architecture can be extended to provide more outputs. Two low voltage oscillators with wide tuning range are then studied for use in the solar cell power management system. Finally, a power efficiency optimization unit based on the low voltage oscillator is developed and applied to the solar cell power management system.

The fast-growing IC industry is driving a trend toward more complicated system integration and finer-grained operation. The dynamic frequency scaling technique is widely used in system power management. Multi-core systems also create a large demand for multiple clock domains, which requires a clock generator that can supply multiple frequencies and phases. Under this demand, a clock generator with multiple clock outputs and dynamic frequency/phase tuning ability is essential for both system design and power management.

The motivation for the proposed dual output clock generator with dynamic frequency/phase tuning ability is to make the clock generator better suited to the trend of multiple clock domains and dynamic scaling. The advance power management concept is also proposed as an application of the multi-output clock generator.

In recent years, the market for portable devices such as notebooks, cell phones, PDAs and smart phones has grown rapidly, and more new portable products will be developed in the near future. As portable devices develop, more and more functions are integrated into a single product. At the same time, users care about whether the product can be used for a long time without recharging the battery from a wall socket. Recently, the price of oil has kept rising, which affects electricity bills and operating cost. To extend operating time and lower cost, low power techniques are urgently needed. As an alternative approach, people are actively looking for new energy sources. Environmental energy such as solar power, heat and wind power is used to generate electric power. Because of the energy crisis and growing eco-awareness, research on energy harvesting applications is becoming popular.

The energy harvesting system, the solar cell power management system, is proposed by the author and Tung-Hau Tsai as an energy platform powered by a solar cell. The power management system harvests energy from the solar cell and transfers it efficiently to the computation circuit and the battery. The power efficiency optimization unit is also proposed to optimize the power efficiency of the 1V generator according to the loading condition. The solar cell power management system follows the trend toward environmental energy harvesting.


### 1.2 Thesis Organization

The thesis is organized as follows:

Chapter 2 gives the PLL/DLL design concepts, including the architectures of analog and digital PLLs and DLLs and an introduction to their common block circuits.

Chapter 3 introduces the dynamic frequency scaling system and the proposed dual output clock generator with dynamic frequency/phase tuning ability. The advance power management concept is also presented.

Chapter 4 introduces two low voltage oscillators with wide tuning range, which are studied for use in the power efficiency optimization unit that is applied to the solar cell power management system.

Chapter 5 presents the power efficiency optimization unit and the solar cell power management system, together with the simulation results and the layout.

Chapter 6 gives the conclusion and future work.



# Chapter 2 PLL/DLL Design Concepts

### 2.1 Introduction

The clock source is an essential circuit block in digital circuits, since every digital circuit needs a clock to trigger it. The performance and stability of the clock source have a great impact on circuit operation. Although clock sources such as the PLL have been researched for a long time, increasing operating frequencies and more complicated systems bring new challenges, including a wider frequency range, low jitter, low power consumption, dynamic scaling, and fast locking. Advanced deep-submicron processes also make mixed-signal and analog-intensive circuit design more challenging, which has made all-digital clock generators popular.


In this chapter, the conventional PLL/DLL design concepts are discussed, including the architectures of both structures and their circuit design. The common issues of PLLs and DLLs are also discussed.

Section 2.2 shows the architecture of the PLL, studies the analog and digital types, and presents the differences between the two. Section 2.3 shows the architecture of the DLL.

Several common block circuits are studied in section 2.4. The phase detector, which captures the phase difference between two signals, is studied in section 2.4.1. The charge pump is an essential circuit in analog voltage-controlled PLLs and DLLs, and the time-to-digit converter is an equally important circuit in digital PLLs and DLLs; they are studied in section 2.4.2 and section 2.4.4, respectively.

In both the PLL and the DLL, the oscillator or delay cell is the key to the entire circuit. Most of the jitter is contributed by this block, the frequency range is bounded by it, and over 50% of the power consumption comes from it. The oscillator and delay cell are studied in section 2.4.5, followed by the divider that is used in some architectures.

Finally, the common issues of all PLLs and DLLs, including power and jitter, are studied in section 2.5.

### 2.2 <u>The Architecture of PLL</u>

#### **2.2.1 Analog PLL**

The phase-locked loop (PLL) is a very important clocking IP for many digital systems, such as digital communication systems and microprocessors. It can be used for frequency synthesis, clock de-skew, and duty-cycle enhancement. The typical PLL uses a negative feedback loop to synchronize the output clock with the reference clock. Fig 2.1 shows the architecture of the analog PLL. It consists of a phase detector (PD) or phase frequency detector (PFD), a charge pump (CP), a loop filter (LPF), a voltage controlled oscillator (VCO), and a frequency divider (/N). When the PLL is locked, the two periodic input signals of the PD ideally have the same phase and frequency, and the VCO output frequency (fout) is N times the reference clock frequency.



Fig 2.1 The architecture of analog PLL
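The lock condition described above (the divided VCO output matches the reference, so fout settles at N times the reference frequency) can be illustrated with a minimal frequency-only sketch. This is not the thesis circuit: the function name `simulate_pll_lock`, the `loop_gain` parameter, and the 50 MHz / divide-by-6 numbers are assumptions chosen for illustration, and the phase detector, charge pump and loop filter are collapsed into a single proportional correction.

```python
def simulate_pll_lock(f_ref_hz, n_div, f_start_hz, loop_gain=0.5, steps=60):
    """Iterate a simplified frequency-domain feedback loop.

    Hypothetical model: the detector, charge pump and loop filter are
    lumped into one proportional correction applied to the VCO frequency.
    """
    f_out = f_start_hz
    history = [f_out]
    for _ in range(steps):
        f_fb = f_out / n_div                # divider output fed back to the detector
        error = f_ref_hz - f_fb             # detector compares reference and feedback
        f_out += loop_gain * n_div * error  # correction steers the VCO toward N * f_ref
        history.append(f_out)
    return history


# Assumed example values: 50 MHz reference, divide-by-6, VCO starting at 200 MHz.
trace = simulate_pll_lock(50e6, 6, 200e6)
print(f"settled output: {trace[-1] / 1e6:.3f} MHz")
```

With these assumed values the frequency error halves on every iteration, so the output approaches 6 x 50 MHz. A real PLL locks phase as well as frequency, which this sketch does not model.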

Like most analog circuits, the analog PLL shows superior performance over the digital-style PLL, especially in output frequency range. However, designing the analog loop takes more effort because the loop has to converge. The analog circuit is also sensitive to noise, which makes for hard working conditions in deep-submicron processes. Because of its analog nature, the analog PLL cannot be ported between processes, and any change to the specification easily leads to a redesign of the whole loop. These drawbacks have led to the rise of the digital-style PLL and its wide use in many applications.

#### **2.2.2 All Digital PLL**

Advances in technology make system design more integrated. The system-on-chip (SoC) turns traditional circuit design into system design, and mostly into digital system design. Integrating an analog block into a digital system takes more design effort, and the deep-submicron process environment is unfriendly to analog designs. Together these factors have given rise to the all-digital PLL.

In a deep-submicron CMOS process, the time-domain resolution of a digital signal edge transition is superior to the voltage resolution of analog signals. Thus the ADPLL has higher immunity to switching noise and to process, voltage and temperature (PVT) variations. The ADPLL can be ported to different processes as a soft intellectual property (IP), which makes it easy to integrate into a system. It also shows better testability, programmability, and stability.

The architecture of a standard all digital PLL is shown in Fig 2.2 [1]. As shown in Fig 2.2(a), it consists of a phase detector (PD) or phase frequency detector (PFD), a direction circuit, an up/down counter, a digitally controlled oscillator (DCO), and a frequency divider (/N). The direction circuit and up/down counter can be replaced by a time-to-digit converter (TDC), as shown in Fig 2.2(b).

The direction circuit provides a signal that indicates the "lead" or "lag" condition between the two input signals of the PD, and the up/down counter then tracks the direction signal until the output of the DCO is synchronized to the reference clock. The time-to-digit converter converts the analog timing condition directly into digital oscillator control signals. The loop functions just like the analog PLL, but each block is replaced by its digital counterpart.



Fig 2.2 The architecture of all digital PLL
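The direction-circuit and up/down-counter loop described above can be sketched behaviourally as follows. This is a simplified illustration under stated assumptions, not the architecture of [1]: the linear DCO model `dco_frequency`, its 100 MHz base and 1 MHz step, and the use of a frequency comparison in place of a true lead/lag phase decision are all hypothetical.

```python
def dco_frequency(code, f_min_hz=100e6, step_hz=1e6):
    """Hypothetical linear DCO: each control code step adds step_hz."""
    return f_min_hz + code * step_hz


def lock_adpll(f_ref_hz, n_div, code=0, max_code=255, max_cycles=1000):
    """Move the up/down counter one LSB per reference cycle until lock.

    The direction decision is modelled as a frequency comparison, which is
    an assumption standing in for the lead/lag output of the PD.
    """
    for cycle in range(max_cycles):
        f_fb = dco_frequency(code) / n_div
        if f_fb < f_ref_hz:
            direction = 1       # feedback lags: count up to speed up the DCO
        elif f_fb > f_ref_hz:
            direction = -1      # feedback leads: count down to slow the DCO
        else:
            return code, cycle  # feedback matches the reference
        code = min(max(code + direction, 0), max_code)
    return code, max_cycles


# Assumed example: 40 MHz reference with divide-by-4, so the DCO must reach 160 MHz.
print(lock_adpll(40e6, 4))
```

In this model the lock time grows linearly with the distance between the start code and the final code, which is one reason a TDC or a faster search scheme is attractive. If the target frequency falls between two DCO codes, the counter in this sketch simply toggles between them until `max_cycles` is reached.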

### 2.3 The Architecture of DLL

#### **2.3.1 Analog DLL**



Fig 2.3 The architecture of analog DLL

The delay-locked loop (DLL) is another widely used clocking source for clock de-skew, clock synchronization, and clock synthesis. Unlike the PLL, the DLL uses a delay line to delay the reference clock. When the total delay of the delay line equals one reference cycle, the DLL is locked, and the reference clock and the output of the delay line are synchronized.


Fig 2.3 shows the architecture of the analog DLL. It consists of a phase detector (PD) or phase frequency detector (PFD), a charge pump, a loop filter, and a voltage controlled delay line (VCDL). The main architectural difference between the DLL and the PLL is that the delay line in the DLL does not form a loop as the oscillator of the PLL does; it simply delays the input reference clock. The basic function of the remaining components is the same as in the PLL.

The most attractive feature of the DLL is jitter reduction: the DLL shows better jitter performance than the PLL because jitter does not accumulate.
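The DLL lock condition (total delay-line delay equal to one reference period) can be illustrated with a small behavioural sketch. The names `lock_dll` and `dll_tap_phases`, the 1 ps adjustment step, and the 3000 ps reference period are assumptions for illustration only; six stages are used to echo the six-phase DLL listed in the contents.

```python
def dll_tap_phases(n_stages):
    """Phase of each delay-line tap in degrees once the loop is locked."""
    return [360.0 * k / n_stages for k in range(1, n_stages + 1)]


def lock_dll(t_ref_ps, n_stages, stage_delay_ps, step_ps=1, max_cycles=10_000):
    """Adjust the per-stage delay until the line spans one reference period.

    Hypothetical model: all stages share one control and move by step_ps
    per reference cycle; integer picoseconds keep the comparison exact.
    """
    for cycle in range(max_cycles):
        total_delay_ps = n_stages * stage_delay_ps
        if total_delay_ps < t_ref_ps:
            stage_delay_ps += step_ps   # line too short: slow every stage down
        elif total_delay_ps > t_ref_ps:
            stage_delay_ps -= step_ps   # line too long: speed every stage up
        else:
            return stage_delay_ps, cycle
    return stage_delay_ps, max_cycles


# Assumed example: 3000 ps reference period, six stages starting at 400 ps each.
print(lock_dll(3000, 6, 400))
print(dll_tap_phases(6))
```

Because each output edge is derived from a fresh reference edge rather than from the previous output edge, timing errors in this structure are not fed back around a ring, which is the intuition behind the absence of jitter accumulation.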

#### **2.3.2 All Digital DLL**

The all digital delay-locked loop (ADDLL), like the all digital phase-locked loop (ADPLL), is a DLL built from digital components. It consists of a phase detector (PD) or phase frequency detector (PFD), a time-to-digit converter (TDC), and a digitally controlled delay line (DCDL). Besides the TDC scheme, there are the counter-controlled scheme, the D flip-flop scheme, the binary search scheme, and the successive approximation register (SAR) controlled scheme.
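To show why a SAR-controlled scheme locks faster than a counter-controlled one, the following sketch resolves the delay-line control code one bit per comparison. It is a generic successive-approximation search, not a circuit from the thesis; the linear delay model `delay_line_ps` and its 1000 ps base and 10 ps LSB are hypothetical.

```python
def delay_line_ps(code, base_ps=1000, lsb_ps=10):
    """Hypothetical digitally controlled delay line with a linear code-to-delay map."""
    return base_ps + code * lsb_ps


def sar_lock(t_ref_ps, n_bits=8):
    """Find the largest code whose delay does not exceed the reference period.

    One bit is decided per comparison, from MSB to LSB, so the search takes
    n_bits steps instead of up to 2**n_bits counter steps.
    """
    code = 0
    for bit in reversed(range(n_bits)):
        trial = code | (1 << bit)             # tentatively set the current bit
        if delay_line_ps(trial) <= t_ref_ps:  # phase detector: delay still not too long
            code = trial                      # keep the bit
    return code


# Assumed example: lock to a 2500 ps reference period with an 8-bit control word.
code = sar_lock(2500)
print(code, delay_line_ps(code))
```

A plain SAR search like this one fixes the code after the last bit, so it cannot follow later drift; practical designs often hand over to a counter or similar tracking mode after the search, which this sketch leaves out.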

### 2.4 The Common Block Circuits

#### **2.4.1 Phase Detector**

There are several considerations when designing a phase detector or a phase frequency detector: the minimum detectable phase error, the maximum operating frequency, and the linearity of the characteristic. Fig 2.4(a) shows the conventional phase frequency detector, which consists of two D flip-flops and an AND gate. The timing diagram is shown in Fig 2.4(b): the "lead" and "lag" conditions are converted into DOWN and UP signals. The UP and DOWN signals thus provide the direction and magnitude of the phase difference.



Fig 2.4 (a) Conventional phase frequency detector (b) Timing diagram
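
The behavior of the two-flip-flop PFD can be sketched with a small event-driven model. This is an idealized illustration under stated assumptions: the reset through the AND gate is instantaneous (so no minimum-width pulse appears), a REF edge sets UP and an FBAK edge sets DOWN, and the function name `pfd_pulses` is hypothetical.

```python
def pfd_pulses(ref_edges, fbak_edges):
    """Return (up_time, down_time): total time UP and DOWN are high.

    ref_edges / fbak_edges are lists of rising-edge times of REF and FBAK.
    """
    events = sorted([(t, "ref") for t in ref_edges] +
                    [(t, "fbak") for t in fbak_edges])
    up = down = False
    up_start = down_start = 0.0
    up_time = down_time = 0.0
    for t, src in events:
        if src == "ref" and not up:        # REF flip-flop sets UP
            up, up_start = True, t
        elif src == "fbak" and not down:   # FBAK flip-flop sets DOWN
            down, down_start = True, t
        if up and down:                    # AND gate resets both flip-flops
            up_time += t - up_start
            down_time += t - down_start
            up = down = False
    return up_time, down_time


# REF leads FBAK by 2 time units in each of three cycles.
print(pfd_pulses([0.0, 10.0, 20.0], [2.0, 12.0, 22.0]))
```

The accumulated pulse width is proportional to the phase error, and which of the two outputs carries it gives the direction.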

Fig 2.5 shows another conventional phase frequency detector built from nine NAND gates, and its state diagram is shown in Fig 2.6. Several issues of this PFD should be considered: glitches, the dead zone, and the maximum operating frequency limit. Regarding glitches, there are two paths that reset the UP signal, from point X to UP and from point Y to UP. To avoid a glitch, more delay can be inserted at point X to match the delay of point Y. Even when REF and FBAK are in phase, a minimum-width pulse still appears on the UP and DOWN signals. Inserting more delay at the outputs of the NAND gates increases the width of this minimum pulse, but it also limits the maximum operating frequency of the PFD.



Fig 2.5 NAND gates style conventional phase frequency detector



Fig 2.6 State diagram

#### 2.4.2 Charge Pump

The charge pump converts the PD output signals, usually the UP and DOWN signals, into the control voltage. The basic concept of the charge pump is shown in Fig 2.7. The UP and DOWN signals act as switches through which the current Ip charges or discharges the control voltage (Vctrl), which in turn controls the oscillator in a PLL or the delay line in a DLL.



Fig 2.7 Charge pump

Analog CMOS switches and dynamic digital circuits suffer from many undesirable effects, such as charge sharing and current leakage. In the charge pump, the switch current can exceed the UP/DOWN-controlled main current because of charge sharing. When the PLL or DLL is locked and stable, the PFD still generates minimum-width UP/DOWN pulses. Suppose the width of this pulse is $t_s$. A current mismatch ($\Delta I$) can arise from channel length modulation, and the resulting offset charge is $\Delta Q = (I_{up} - I_{dn}) \cdot t_s$. $\Delta Q$ introduces a static phase offset error; to minimize this error, the current mismatch between the UP and DOWN switches should be minimized.
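
A short numeric sketch of the offset charge follows. The first function is the relation stated above. The second converts the charge into a phase offset using a common first-order assumption that is not stated in the text: the loop settles where an extra pulse of width $\Delta Q / I_p$ cancels the offset charge in every reference period. All current and timing values are illustrative.

```python
import math


def offset_charge(i_up, i_dn, t_s):
    """Offset charge per cycle: dQ = (Iup - Idn) * ts."""
    return (i_up - i_dn) * t_s


def static_phase_offset(i_up, i_dn, t_s, i_p, t_ref):
    """Assumed first-order estimate of the static phase offset in radians.

    The loop is assumed to settle where a pulse of width dQ / Ip
    cancels the offset charge once per reference period t_ref.
    """
    dt = offset_charge(i_up, i_dn, t_s) / i_p
    return 2.0 * math.pi * dt / t_ref


# Illustrative values: 2 uA mismatch, 100 ps minimum pulse, 100 MHz reference.
dq = offset_charge(102e-6, 100e-6, 100e-12)
phi = static_phase_offset(102e-6, 100e-6, 100e-12, 100e-6, 10e-9)
print(dq, phi)
```

Both quantities scale linearly with the mismatch and with the minimum pulse width, which is why widening the pulse to remove the dead zone makes current matching more important.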

There are several conventional charge pump circuits. Fig 2.8(a) shows a single-ended charge pump in which the switch is at the drain of the current mirror. When the DN switch is open, the drain voltage of M1 is low; when the DN switch is closed, the drain voltage of M1 rises to the output control voltage. M1 operates in the linear region until its drain voltage is high enough to put it into the saturation region. Note that charge sharing can occur at the drain nodes of M1 and M2. Fig 2.8(b) shows another conventional charge pump. It uses a current-mode circuit to achieve high operating speed. Because of the constant bias current, the voltage noise is reduced. It uses only NMOS switches to prevent current mismatch between NMOS and PMOS devices.



Fig 2.8 (a) Single-end charge pump (b) NMOS charge pump

#### 2.4.3 Loop Filter

Analog PLLs and DLLs use a loop filter to convert the current at the charge pump output into a voltage and to filter high-frequency noise. Fig 2.9 shows first-order, second-order, and third-order loop filters. The design of the loop filter should consider loop stability and the operating frequency. Passive components, especially capacitors, occupy a large area, so area should be taken into account when designing the loop filter.



Fig 2.9(a) 1st order loop filter (b) 2nd order loop filter (c) 3rd order loop filter

#### 2.4.4 Time-to-Digital Converter

The time-to-digital converter (TDC) is used in an ADPLL or ADDLL to convert timing information directly into a digital code. The TDC usually consists of delay elements whose delay is identical to, a multiple of, or a fraction of the delay of a single delay cell in the delay line or oscillator. The timing signal passes through these delay elements, and the timing information is then extracted as a digital code.

Fig 2.10(a) shows the structure of a TDC [2]. It consists of several inverters that form the delay line. A flip-flop operating at high frequency is connected to each inverter output. The timing signal passes through the inverter delay line, the flip-flops sample its state, and the result is converted to a digital code. Since the conventional phase/frequency detector and charge pump are replaced by the TDC, the phase-domain operation does not fundamentally generate any reference spurs, thus allowing the digital loop filter to be set at an optimal performance point between the reference phase noise and the oscillator phase noise. Fig 2.10(b) shows the quantization of the timing difference between the DCO and reference clock edges.



Fig 2.10 Time-to-digital converter (TDC): (a) Structure (b) Quantization of the timing difference between the DCO and FREF edges
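
The quantization performed by the delay-chain TDC can be sketched as follows. This is a simplified behavioral model, assuming identical cell delays, ideal flip-flops, and non-inverting taps (the alternating polarity of a real inverter chain is ignored); the function names are hypothetical.

```python
def flash_tdc(time_diff_ps, cell_delay_ps, n_cells):
    """Thermometer code sampled from the delay chain.

    Tap i reads 1 if the edge has already passed i + 1 delay cells
    when the sampling clock arrives, else 0.
    """
    if time_diff_ps < 0:
        raise ValueError("time difference must be non-negative")
    return [1 if (i + 1) * cell_delay_ps <= time_diff_ps else 0
            for i in range(n_cells)]


def decode(code, cell_delay_ps):
    """Convert the thermometer code back to a quantized time in ps."""
    return sum(code) * cell_delay_ps


code = flash_tdc(330, 100, 8)
print(code, decode(code, 100))
```

The quantization error is bounded by one cell delay, so the resolution of this structure is set by the inverter delay of the process.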

A multi-level structure can reduce circuit complexity and area. Fig 2.11 shows the two-level TDC [3]. It has several functional blocks, namely one long delay chain, one short delay chain, a 1<sup>st</sup> level flash TDC, a 2<sup>nd</sup> level flash TDC, a path selection multiplexer, and a cycle time calculator. The long delay chain consists of 32 delay cells, which are partitioned into four sections (Secs. 0-3). In contrast, the short delay chain has only 8 delay cells. All delay cells used in the long and short delay chains are the same as those in the DCO coarse-tuning stage. When the TDC is enabled, Ref N is sent to the long delay chain, and all outputs (DL[3:0]) are sent to the 1<sup>st</sup> level flash TDC. When the first falling edge of Ref N arrives, the 1<sup>st</sup> level flash TDC generates the section selection signal (L1\_SEL) to select one of the section outputs for the short delay chain. Then the 2<sup>nd</sup> level flash TDC generates the delay selection signal (L2\_SEL) based on the delay outputs (BL[7:0]). The section and delay outputs are thermometer codes, from which the selection signals are easily generated. When both L1\_SEL and L2\_SEL have been generated, the cycle time calculator can estimate the period of Ref N. The conversion equation is

$$T_r = (\text{L1\_SEL} \times 8 + \text{L2\_SEL}) \times 2 \qquad (1)$$

where $T_r$ is the period of Ref N.
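
The conversion equation can be exercised with a small sketch. It assumes that L1\_SEL counts the complete 8-cell sections and L2\_SEL the remaining cells traversed before the first falling edge, so that the result is expressed in units of one delay-cell delay and the factor 2 converts the measured half period into a full period. These interpretations, and the function names, are assumptions for illustration.

```python
CELLS_PER_SECTION = 8   # long chain: 32 cells in 4 sections
LONG_CHAIN_CELLS = 32


def two_level_select(half_period_cells):
    """Split a half period, measured in delay cells, into (L1_SEL, L2_SEL)."""
    if not 0 <= half_period_cells < LONG_CHAIN_CELLS:
        raise ValueError("half period outside the range of the long delay chain")
    return divmod(half_period_cells, CELLS_PER_SECTION)


def cycle_time(l1_sel, l2_sel):
    """Equation (1): Tr = (L1_SEL * 8 + L2_SEL) * 2, in delay-cell units."""
    return (l1_sel * CELLS_PER_SECTION + l2_sel) * 2


l1, l2 = two_level_select(19)
print(l1, l2, cycle_time(l1, l2))
```

With this split, the second level needs only 8 taps to resolve any position inside a section, instead of one flash stage with 32 taps.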



Fig 2.11 Two-level time-to-digital converter (TDC) structure

#### 2.4.5 Oscillator/Delay Line

An oscillator is built either as an LC oscillator or from several delay cells connected in a loop. Traditional analog PLLs use an LC-tank oscillator; Fig 2.12 shows a typical voltage-controlled LC oscillator. Under a suitable voltage condition, the LC oscillator generates a sine wave. The LC oscillator has superior performance, but its analog nature makes it hard to design in a deep-submicron process, and the passive components occupy a large area.



Fig 2.12 Typical voltage controlled LC oscillator

Fig 2.13 shows a differential CML type gain block oscillator [4]. Fig 2.14 shows the differential gain block. The frequency of oscillation is current-controlled by the PMOS load transistors. The transistor pair (*P1a*, *P1b*) operates in the saturation region, and its current determines the lowest frequency of oscillation. *P2a* and *P2b* operate in the triode region, and their current controls the tuning range of the PLL. The transistor pair (*P3a*, *P3b*) clamps the oscillation signal towards the supply voltage.



Fig 2.13 Differential CML type gain block oscillator



Fig 2.14 Differential gain block

Some digitally controlled delay cells use a single transistor to realize the tuning capacitance, but this limits the native resolution to the smallest transistor achievable in the process. The native resolution can be refined by increasing the driving strength, at the cost of higher power consumption.

Fig 2.15 shows the binary weighted digital controlled differential delay cell [5]. One path comprises a fixed capacitance realized with a minimum-sized transistor, and the other path comprises a tuning capacitance realized by adjusting the size of the transistor. The difference between the two capacitances determines the finest delay resolution, which can be made sufficiently small. The BWDC also has two distinct features that contribute to low power. First, there is no need for large driving strength, so the logic gates can be minimally sized. Second, the de-multiplexing gates are placed at the input side so that only the components in one path are activated.



Fig 2.15 Binary weighted digital controlled differential delay cell

Besides MOS-capacitor-style digitally controlled delay cells, there are also NAND-gate-style digitally controlled varactors (DCV). Fig 2.16 illustrates a varactor cell using a two-input NAND gate. The gate-to-channel capacitance contributes to the total gate capacitance. This method controls the capacitance between gate and source or between gate and drain. The NAND gate capacitance at CL depends on the value of Bctr.



Fig 2.16 Varactor style delay cell using NAND gate

#### 2.4.6 Frequency Divider

Frequency dividers are used in PLLs to divide the VCO or DCO output frequency. A synchronous divider consumes very high power and requires a high-frequency response. In practice, PLLs use a two-mode divider to lower the system speed.

Fig 2.17 shows the basic divide-by-4/5 two-mode divider. It consists of three D flip-flops and two NAND gates. If MOD=0, Q1 and Q2 each run at Fin/4 with different phases; if MOD=1, Q1, Q2, and Q3 each run at Fin/5 with different phases. The feedback delay of Q2 must be considered: the NAND gate delay plus the Q1-to-Q2 delay must be less than one Fin period to avoid incorrect operation.



Fig 2.17 The basic /4/5 two mode divider
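
The two division ratios can be checked with a cycle-accurate sketch. The gate wiring below is one hypothetical arrangement of three D flip-flops and two NAND gates that reproduces the stated behavior (D1 = NAND(Q2, Q3), D2 = Q1, D3 = NAND(Q2', MOD)); the actual connections in Fig 2.17 may differ.

```python
def nand(a, b):
    return 0 if (a and b) else 1


def divider_period(mod, n_clocks=40):
    """Input clocks per output cycle of Q2 for the assumed wiring."""
    q1 = q2 = q3 = 0
    q2_trace = []
    for _ in range(n_clocks):
        d1 = nand(q2, q3)        # first NAND gate closes the loop
        d2 = q1
        d3 = nand(1 - q2, mod)   # second NAND gate; holds Q3 high when MOD = 0
        q1, q2, q3 = d1, d2, d3  # all flip-flops clock on the same edge
        q2_trace.append(q2)
    rises = [i for i in range(1, n_clocks)
             if q2_trace[i - 1] == 0 and q2_trace[i] == 1]
    return rises[-1] - rises[-2]


print(divider_period(0), divider_period(1))
```

With MOD=0 the third flip-flop output stays high and the first two form a four-state ring; with MOD=1 the third flip-flop adds one extra state, giving five input clocks per output cycle.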



### 2.5 PLL/DLL System Noise Analysis and Design Technique

Higher clock rates are needed in many applications, such as video, audio, and data processors, as well as in clock recovery applications such as data communications and disk drive read channels; these higher speeds require better performance from PLLs or DLLs. In both types of applications, clocks are generated to drive mixers or sampling circuits in which the random variation of the sampling instant, or jitter, is a critical performance parameter.

### 2.5.1 PLL/DLL rms Jitter Analysis

Timing jitter in a ring-oscillator PLL depends on the interaction of noise in the oscillator with the dynamics of the phase-locked loop. It has been shown in [6] that the timing jitter variance at the end of a chain of inverters is given by the sum of the contributions of each stage. If each stage contributes a timing error with variance $\overline{\Delta t_n^2}$, then the total jitter variance at the end of $N$ stages is $N \times \overline{\Delta t_n^2}$.

In a ring oscillator this timing error determines the starting point of the next cycle and therefore creates a permanent phase shift in the output signal. If the ring oscillator is configured in a phase-locked loop, however, the phase difference between the reference clock and the oscillator output is detected and compensated for by the dynamics of the loop. The phase detector senses the shift and creates an error signal that changes the frequency of the ring-oscillator VCO in a way that moves the phase of the output in the right direction.

Since the amount of phase adjustment is usually small, the phase error is not corrected in one clock cycle but is reduced gradually over the course of several cycles. The phase error may remain for up to several hundred cycles, depending on the bandwidth of the loop filter in the PLL. Analysis of the accumulated phase jitter and its relation to the loop bandwidth is important for both clock synthesis and clock recovery applications. In most PLL clock synthesizer designs, the reference clock comes from a very low jitter source such as a crystal oscillator. Therefore the jitter in the ring oscillator is the main source of phase error in the synthesized clock. In this case the bandwidth of the loop filter determines how large the accumulated timing jitter gets.

To find the accumulated rms jitter, a PLL which uses a sequential phase detector and a charge-pumping circuit is represented by a simple discrete-time model as shown in Fig 1.18. The transfer function for jitter in the PLL due to the internal jitter sources is represented by eq1 in z-transform domain.

$$\Theta_{on}(z) = \frac{\Theta_n(z)}{1 + K_d K_w Z_F(z) z^{-1}} \qquad (\text{eq1})$$

Here the phase detector gain is $K_d = \frac{I_s}{2\pi}$ and the VCO gain is $K_w = \frac{dw}{dv}$, where $I_s$ is the charge-pump current. $Z_F(z)$ is the z-transform of $H(s)/s$, where $H(s)$ is the transfer function of the PLL loop filter in the s domain. In most PLL designs, eq1 can be rewritten as eq2.

$$\Theta_{on}(z) = \frac{(1-z^{-1})}{1-(1-\varepsilon)z^{-1}}\Theta_n(z) \qquad (eq2)$$

where $K = K_d K_w a T$ is replaced with the term $\varepsilon$ since $K \ll 1$, and $a$ is the DC filter gain.

The phase jitter from the ring oscillator can be modeled as a sequence of unit step phase jumps with random magnitude. A single phase jump at time nT can be represented by eq3 in the z-domain.

$$\Theta_n(z) = \frac{2\pi\Delta t_n}{T(1-z^{-1})} \qquad (eq3)$$

Here the magnitude of the error step is  $\Delta t_n$ . The variance of this error is shown in [6] to be proportional to the number of stages in the ring-oscillator, and the timing jitter variance contributed by each stage. Hence the output jitter in z-domain is,

$$\Theta_{on}(z) = \frac{2\pi\Delta t_n}{T(1 - (1 - \varepsilon)z^{-1})} \qquad (eq4)$$

For all events up to time nT, the sum of output phase shifts is represented by eq5.

$$\Theta_{tot}(nT) = \sum_{k=-\infty}^{n} \frac{2\pi\Delta t_k}{T} (1-\varepsilon)^{n-k} \quad (eq5)$$

To find the rms output jitter, the expectation of the square of the sum is calculated and given by eq6. Since $\Delta t_k$ and $\Delta t_l$ are uncorrelated, $E[\Delta t_k \Delta t_l] = 0$ when $k \neq l$. When $k = l$, $E[\Delta t_k \Delta t_l]$ can be replaced by $\Delta \tau_N^2$.

$$E[\Theta_{tot}^2(nT)] = (\frac{2\pi}{T})^2 \frac{\Delta \tau_N^2}{\varepsilon(2-\varepsilon)} \cong (\frac{2\pi}{T})^2 (\frac{\Delta \tau_N^2}{2\varepsilon}) \quad (eq6)$$
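
The step from eq5 to eq6 can be made explicit. As a sketch, assuming the timing errors are zero-mean, mutually uncorrelated, and share the variance $\Delta \tau_N^2$, the cross terms vanish and the remaining sum is a geometric series in $(1-\varepsilon)^2$ with summation index $m = n - k$:

$$E[\Theta_{tot}^2(nT)] = \left(\frac{2\pi}{T}\right)^2 \sum_{k=-\infty}^{n} \sum_{l=-\infty}^{n} E[\Delta t_k \Delta t_l] (1-\varepsilon)^{2n-k-l} = \left(\frac{2\pi}{T}\right)^2 \Delta \tau_N^2 \sum_{m=0}^{\infty} (1-\varepsilon)^{2m} = \left(\frac{2\pi}{T}\right)^2 \frac{\Delta \tau_N^2}{1-(1-\varepsilon)^2}$$

Expanding $1-(1-\varepsilon)^2 = \varepsilon(2-\varepsilon)$ gives the exact form in eq6, and $\varepsilon \ll 1$ gives the approximation $2\varepsilon$ in the denominator.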

Note that the expectation of the phase jitter is independent of the time instant nT. Hence the rms phase jitter is

$$\sqrt{E[\Theta_{tot}^2(nT)]} \approx \sqrt{\frac{1}{2\varepsilon}} \frac{2\pi\Delta\tau_{rms}}{T} = \alpha \frac{2\pi\Delta\tau_{rms}}{T} \qquad (eq7)$$

where $\Delta \tau_{rms}$ is $\sqrt{\Delta \tau_N^2}$, and $\alpha = \sqrt{\frac{1}{2K_d K_w aT}}$ is defined as the accumulation factor. The result in eq7 is the rms phase jitter for a ring-oscillator PLL. From this result, the rms timing jitter in a phase-locked loop is seen to be $\alpha$ times larger than the intrinsic jitter in the delay chain. The accumulation factor $\alpha$ is inversely proportional to the square root of $K_d K_w aT$ and in this case shows little dependency on Cl and Cp. Therefore, as long as the stability requirements in [7] are met, the jitter accumulation factor can be lowered by increasing the bandwidth of the loop filter.
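
The accumulation factor can be checked numerically. The sketch below sums the squared impulse response $(1-\varepsilon)^m$ of eq4 and compares it with the closed form of eq6 and with the approximation $\alpha = \sqrt{1/(2\varepsilon)}$ of eq7. The value $\varepsilon = 0.01$ and the function names are illustrative assumptions.

```python
import math


def pll_jitter_gain(eps, n_terms=5000):
    """Variance gain of the PLL: sum over m of (1 - eps)^(2m)."""
    return sum((1.0 - eps) ** (2 * m) for m in range(n_terms))


def accumulation_factor(eps):
    """Approximate rms accumulation factor alpha = sqrt(1 / (2 eps))."""
    return math.sqrt(1.0 / (2.0 * eps))


eps = 0.01
exact = 1.0 / (eps * (2.0 - eps))          # closed form from eq6
print(pll_jitter_gain(eps), exact)
print(math.sqrt(exact), accumulation_factor(eps))
```

For small $\varepsilon$ the exact and approximate rms factors differ only slightly, and both grow as the loop gain, and hence the loop bandwidth, is reduced.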

An alternative scheme for clock synthesis is to use a delay-locked loop [8]. In this case, the reference clock is fed to the input of the delay line, and the rising edge of the output of the delay line is compared to that of the reference clock. Since the rising edge of the reference clock reaches the output of the delay line after passing through all delay cells, the total delay is driven to be the same as one period of the reference clock. Also, since the output of the loop filter just changes the phase of the output of the delay line, the loop does not have any extra poles as a PLL does.

Therefore, the stability problem is relaxed and a simple capacitor loop filter can be used without any stability consideration. In a DLL, phase jitter is not passed on from one period of the clock to the next since the output of the delay-line is not fed back to the input. Therefore we expect the jitter in a DLL to be much smaller than in a ring-oscillator based PLL. To show this quantitatively we proceed with an analysis similar to that in the previous section but with the simplified discrete time DLL model. In this case, the transfer function for output phase noise in terms of the internal jitter from the delay line is represented by eq8.

$$\Theta_{on}(z) = \frac{\Theta_n(z)}{1 + K_d K_p T Z_F(z) z^{-1}} \qquad (eq8)$$

Here the phase detector gain is $K_d = \frac{I_s}{2\pi}$ and the phase gain is $K_p = \frac{d\theta}{dv}$, assuming a voltage-controlled delay line. If the loop filter in the DLL is a single capacitor with transfer function $\frac{1}{sC} = \frac{a}{s}$, the transfer function becomes eq9.

$$\Theta_{on}(z) = \frac{(1-z^{-1})}{1+(\varepsilon-1)z^{-1}}\Theta_n(z) \qquad (eq9)$$

where $K = K_d K_p a T$ is replaced with the term $\varepsilon$. The jitter introduced by the delay line is represented by eq10 in the z-domain, since in the time domain the effect of one pass down the chain is just an error impulse.

$$\Theta_n(z) = \frac{2\pi\Delta t_n}{T} \tag{eq10}$$

Therefore, the variance of the total output jitter can be shown to be

$$E[\Theta_{tot}^2(nT)] = (\frac{2\pi}{T})^2 \overline{\Delta \tau_N^2} (1 + \frac{\varepsilon}{(2 - \varepsilon)}) \cong (\frac{2\pi}{T})^2 \overline{\Delta \tau_N^2} \quad (\text{eq11})$$

and the rms output jitter is therefore given by eq12.

$$\sqrt{E[\Theta_{tot}^2(nT)]} \approx \frac{2\pi\Delta\tau_{rms}}{T} \qquad (eq12)$$

This expression is very similar to the result for the PLL, given in eq7, except now there is no noise enhancement factor  $\alpha$ . Therefore a DLL provides superior timing jitter performance. How much better depends on the size of  $\alpha$ .
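
The DLL result can be checked in the same numerical way. The sketch below sums the squared impulse response of the transfer function in eq9, which is $h[0] = 1$ and $h[m] = -\varepsilon(1-\varepsilon)^{m-1}$ for $m \geq 1$, and compares it with the factor $1 + \varepsilon/(2-\varepsilon)$ in eq11. The value of $\varepsilon$ and the function name are illustrative assumptions.

```python
def dll_jitter_gain(eps, n_terms=5000):
    """Variance gain of the DLL from the impulse response of eq9."""
    total = 1.0                              # h[0] = 1: the error impulse itself
    for m in range(1, n_terms):
        h = -eps * (1.0 - eps) ** (m - 1)    # loop correction tail
        total += h * h
    return total


eps = 0.01
closed_form = 1.0 + eps / (2.0 - eps)        # factor in eq11
print(dll_jitter_gain(eps), closed_form)
```

For $\varepsilon = 0.01$ the variance gain stays within about half a percent of 1, whereas the PLL expression in eq6 gives a variance gain of roughly $1/(2\varepsilon) = 50$ for the same $\varepsilon$.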

This analysis has shown that, including the results of [6], the jitter in a ring-oscillator is proportional to three factors; the number of stages, the jitter contribution per stage, and a PLL accumulation factor  $\alpha$ , which is inversely proportional to the square-root of the bandwidth of the PLL. For a DLL the result is the same, except the noise enhancement factor is 1. Therefore in applications such as clock synthesis, where a DLL can be used, it is the better choice for jitter performance. To reduce the jitter enhancement in a PLL a larger loop bandwidth should be used. For applications such as clock-recovery, however, this bandwidth cannot be increased too much or it will enhance the jitter seen in the input signal.

### 2.5.2 Impedance Level Scaling Technique

It is a well-known fact that increasing the area of on-chip MOS-transistors improves the matching properties of those transistors [9]. The same also goes for the matching of resistors and capacitors on an IC [10]. This leads us to investigate the effect of increasing the area of a complete circuit in a systematic manner that we call impedance level scaling. The concept of impedance level scaling is fairly simple, yet leads to very useful design considerations. This technique enables a decoupled optimization of the noise and mismatch properties of a circuit independent of other properties such as speed and linearity, thus, simplifying the task of the designer. Starting from a circuit that has been optimized with respect to specifications other than noise and mismatch, one can scale the width of every component of that circuit by a certain factor  $\alpha$ . This is shown conceptually in Fig. 2.18, where the effect on the component values is also shown.



Fig 2.18 Concept of impedance level scaling

Using the analogy that scaling is similar to putting identical circuits in parallel, as illustrated in Fig. 2.18 for $\alpha = 2$, it is easy to deduce that the node voltages of the scaled circuit are equal to those of the original circuit, provided the circuit is not heavily loaded externally. From this analogy it is also clear that the scaling will not change the linearity and speed of the circuit.

A fact that is familiar to many designers is that impedance level scaling will improve the signal-to-noise ratio of the circuit at the cost of increased power usage. More precisely, scaling the circuit by a factor $\alpha$ will decrease the rms value of the noise voltages by a factor $\sqrt{\alpha}$ while increasing the power usage by a factor $\alpha$, meaning there is a direct tradeoff between power usage and noise.
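
The tradeoff can be expressed as a two-line calculation. This is a minimal sketch of the stated scaling rule only; the function names and the numeric values are illustrative assumptions.

```python
import math


def scale_circuit(noise_rms, power, alpha):
    """Apply impedance level scaling by alpha: noise / sqrt(alpha), power * alpha."""
    return noise_rms / math.sqrt(alpha), power * alpha


def alpha_for_noise_reduction(factor):
    """Scaling factor needed to reduce the rms noise by the given factor."""
    return factor ** 2


# Halving the rms noise requires alpha = 4, i.e. four times the power.
alpha = alpha_for_noise_reduction(2.0)
print(alpha, scale_circuit(2.0, 1.0, alpha))
```

Because the noise improves only with the square root of $\alpha$, each further halving of the noise costs four times the power of the previous design point.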

A less familiar but important property of impedance level scaling is its effect on the mismatch errors of a circuit. Assume that a relative change in the value of a certain component changes some circuit parameter (for example, the offset voltage or the delay of a delay cell) linearly. This is reasonable as long as mismatch changes the value of a component only slightly. The same relative change of the corresponding component in the scaled circuit will result in the same change of the output parameter, which can again be understood by the scaling analogy. As shown in Fig. 2.19, the mismatch of the component values of the scaled circuit is reduced by a factor $\sqrt{\alpha}$, which means that circuit parameters such as offset and delay errors will be $\sqrt{\alpha}$ times smaller in the scaled circuit than in the starting circuit, at the cost of increased power usage. For a delay cell, the implication of impedance level scaling is that increasing the power by a factor $\alpha$ yields a stochastic jitter reduction of $\sqrt{\alpha}$ (which also follows from the jitter analysis in [11]). The mismatch of the delay between different cells also improves by a factor $\sqrt{\alpha}$.

| Component | Starting circuit | After impedance level scaling by $\alpha$ |
|-----------|------------------|-------------------------------------------|
| MOS transistor ($W/L$) | $\sigma_{v_n} = \sqrt{\frac{4kT\gamma}{g_m}\Delta f}$ | $g'_m = g_m \cdot \alpha$, $\sigma'_{v_n} = \sigma_{v_n} \cdot \frac{1}{\sqrt{\alpha}}$, $\sigma'_{\Delta\beta/\beta} = \sigma_{\Delta\beta/\beta} \cdot \frac{1}{\sqrt{\alpha}}$, $\sigma'_{\Delta V_T} = \sigma_{\Delta V_T} \cdot \frac{1}{\sqrt{\alpha}}$ |
| Resistor ($R$) | $\sigma_{v_n} = \sqrt{4kTR\Delta f}$, $\sigma_{\Delta R/R} = \frac{A_R}{\sqrt{\text{Area}}}$ | $\sigma'_{v_n} = \sigma_{v_n} \cdot \frac{1}{\sqrt{\alpha}}$, $\sigma'_{\Delta R/R} = \sigma_{\Delta R/R} \cdot \frac{1}{\sqrt{\alpha}}$ |
| Capacitor ($C$) | $\sigma_{\Delta C/C} = \frac{A_C}{\sqrt{\text{Area}}}$ | $\sigma'_{\Delta C/C} = \sigma_{\Delta C/C} \cdot \frac{1}{\sqrt{\alpha}}$ |

Fig 2.19 Effect of impedance level scaling

### 2.5.3 Analysis of PLL Jitter Caused by Digital Switching Noise

When an analog chip and a digital chip are combined into one mixed-mode design, a particular concern is on-chip noise coupling from the digital to the analog circuitry, which does not exist in either of the two original chips. In a PLL, noise generated by digital logic that couples through the substrate into an analog circuit results in large jitter.

The principle of substrate noise coupling is shown in Fig. 2.20. An MOS transistor in the digital section of the chip turns on and discharges a capacitive load generating a brief current pulse in the Vss network. The current pulse is forced through the inductive bonding wire in the Vss path and generates a voltage bounce on Vss. This noise couples through the resistive substrate or through a shared supply network into the analog section of the chip. The amount of noise reaching the analog circuitry is proportional to the inductance and the amplitude of the current spike, but inversely proportional to the impedance of the connecting path. Here it is assumed that noise coupling due to resistance/inductance/capacitance of the on-chip power distribution network can be avoided by proper layout.



Fig 2.20 Noise generated by digital logic couples through the substrate to an analog circuit

Several techniques attempt to reduce the noise injected into the analog circuitry ([12]–[14]). Each technique focuses on one of the three components above (inductance of the power supply path, amplitude of the current pulse, impedance of the connecting path), but often several techniques are combined to achieve better noise reduction.

To demonstrate the effect of digital switching noise coupling, the phase-locked loop (PLL) is analyzed for three different power supply schemes. The main mechanisms for noise coupling are identified by comparing different PLLs and varying their bandwidths. The three power supply distribution schemes in Fig. 2.21 are studied in the following. In the first case, the digital circuitry and the analog PLL share both Vdd and Vss. Since the switching noise appears on both Vdd and Vss with opposite phase [14], we can expect large noise coupling resulting in large PLL jitter. In the second case, a separate Vdd (Avdd) is used for the analog PLL. Intuitively, we would expect a noise reduction by a factor of two, since half of the noise (the noise coupled through Vdd) is eliminated. With a large amount of decoupling capacitance between Avdd and Vss, the noise would turn into common-mode noise that cannot disturb the PLL, ideally giving no jitter. However, even with ideal capacitive Avdd-Vss coupling, resonance effects in which Avdd and Vss resonate at opposite phase will turn the common-mode noise into differential-mode noise [14]. In the third case, the PLL is supplied with both a separate Vdd and a separate Vss. With an infinite-impedance substrate, this would completely eliminate any noise coupling, but in the case of a standard low-resistivity substrate, there is still some coupling. In all three cases, the local Vss is used for substrate contacts. The three power supply schemes are compared in a single chip containing several PLLs and noise generators.



Fig 2.21 Three power supply schemes under investigation. (a) PLL0: common Vdd and Vss , (b) PLL1: separate analog Vdd , (c) PLL2: separate analog Vdd and Vss

The measured jitter at 4-MHz PLL bandwidth is plotted in Fig. 2.22. It shows the RMS jitter as function of the NCk frequency which is the noise generator frequency. In this case, the reference clock was 100 MHz and the division ratio in the PLLs was set to 8, such that the VCOs were running at 800 MHz. For a PLL that shares Vdd and Vss with the digital circuitry, the main jitter source is supply coupling into the VCO. Using separate Vdd and Vss for a PLL causes substrate noise to couple into the loop filter node. The reason for this coupling is parasitic resistances in the epi layer below the MOS transistor used as filter capacitor. A PLL with separate Vdd but sharing Vss with the digital circuitry exhibits far less jitter than the other PLLs. The main cause of jitter in this case is delay variations in the feedback divider that mixes the PLL reference frequency into a low-frequency beat note. If this beat note frequency is lower than the PLL bandwidth, the PLL tracks the beat note. This occurs only when a harmonic of the clock of the noise generating digital circuitry is close to the reference clock driving the PLL.
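
The beat-note condition described above can be sketched as a simple search over harmonics of the noise clock. The 100 MHz reference and 4 MHz bandwidth follow the measurement setup in the text; the noise clock frequencies, the harmonic limit, and the function name are illustrative assumptions.

```python
def tracked_beat_notes(f_noise_hz, f_ref_hz, pll_bandwidth_hz, max_harmonic=50):
    """List (harmonic, beat frequency) pairs that fall inside the PLL bandwidth.

    A harmonic k of the noise clock close to the reference frequency mixes
    down to a beat note |k * f_noise - f_ref| that the PLL can track.
    """
    hits = []
    for k in range(1, max_harmonic + 1):
        f_beat = abs(k * f_noise_hz - f_ref_hz)
        if f_beat < pll_bandwidth_hz:
            hits.append((k, f_beat))
    return hits


# 33 MHz noise clock: third harmonic is 1 MHz away from the reference.
print(tracked_beat_notes(33_000_000, 100_000_000, 4_000_000))
# 30 MHz noise clock: no harmonic within 4 MHz of the reference.
print(tracked_beat_notes(30_000_000, 100_000_000, 4_000_000))
```

This is consistent with the jitter of such a PLL showing peaks only at particular noise clock frequencies rather than across the whole sweep.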



Fig 2.22 Jitter measurement with PLLs having a bandwidth of 4 MHz

The same PLL was also fabricated in a triple-well process. Triple-well technology [15]–[17] is a relatively cheap process enhancement which can be made compatible with standard CMOS processing. It requires an additional implant layer, which is the buried well that extends the standard PMOS tub underneath the NMOS devices. This well breaks the resistive path from the digital noise source into the analog circuits, indicating that it can work as a noise blocking feature when using separate Vdd and Vss networks for the digital and analog circuits. However, there is still finite impedance between the digital and analog sections, since the triple-well has capacitive coupling to the substrate. The cross section in Fig. 2.23 shows a triple-well beneath the analog circuit, but it can also be located under the digital circuitry or both.



Fig 2.23 Triple-well processing provides a buried well that breaks the resistive noise coupling path

The test chip was processed in a triple-well technology, enabling measurements of three different triple-well configurations. For a PLL with separate supplies residing in a triple-well, the jitter in Fig. 2.24(a) was obtained when the NG31 noise generator was active. The jitter of PLL2 in Fig. 2.22 caused by substrate noise coupling into the filter is not present for the triple-well PLL. Furthermore, no noise peaks can be observed at harmonics of the noise clock in Fig. 2.24(a), indicating that the PD/divider noise coupling is also eliminated. A buried well was then added to the noise generator NG31s. The jitter of the PLL with a triple-well is shown in Fig. 2.24(b), indicating that the additional benefit of using a triple-well beneath both the analog and digital circuits is minor. However, this cannot be a general statement, since the jitter when using a triple-well only under PLL2 is already very close to the limit of 5–6 ps that was observed without any noise source. When NG31s, which has a triple-well, is activated and the jitter of a PLL without a triple-well is observed, there is still significant coupling causing large jitter at harmonics of the noise clock frequency, as shown in Fig. 2.24(c). From this, we conclude that it is more effective to place the PLL in a triple-well than the digital circuitry. The main difference between these two schemes is the effective area of the noise-blocking triple-well. The area of the PLL was about 200 × 250 µm, while the area of the noise generator was 1100 × 700 µm. The area is a direct measure of the capacitive coupling from the substrate to the triple-well. With a larger area, the impedance is lower, and therefore placing the triple-well under the noise generator is less efficient in blocking the noise.



Fig 2.24 Jitter induced by NG31 into PLLs having a bandwidth of 4 MHz. Prefix 3W indicates that the block resides in a triple-well



The measurement results in [18] compare the jitter performance of PLLs under the three power supply schemes. They show that with a triple-well process both the noise coupling through the filter and the divider noise at harmonics are eliminated. The importance of keeping the triple-well area small to reduce capacitive coupling was also demonstrated.

## **Chapter 3**

# **Overview of Dynamic Frequency Scaling Technique and Proposed Dual Output Clock Generator with Dynamic Frequency/Phase Tuning Ability**

## 3.1 Dynamic Frequency Scaling System

In recent years, power and energy consumption has become a critical design issue in embedded systems, which have spread rapidly and widely, especially mobile and portable systems. Most embedded devices are battery-operated, so their working duration is limited. Maintaining high performance while extending battery life is a challenge for system designers. Dynamic Voltage Scaling (DVS) and Dynamic Frequency Scaling (DFS) have established themselves as important techniques for saving energy on mobile embedded systems. They allow the processor voltage and frequency to be adjusted at runtime to adapt to the workload demand for better energy management. Usually, a higher processor voltage and frequency lead to higher system throughput, while energy reduction can be obtained by using a lower voltage and frequency. The dynamic power consumed by a microprocessor is proportional to $V^2 f$, where V is the voltage supplied to the processor and f is the frequency at which it is clocked. The expression indicates that the power saving achieved through voltage reduction is quadratic in the voltage change. Similarly, a linear power saving is obtained by reducing the frequency.
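The $V^2 f$ relation can be illustrated with a small numerical sketch. The function name and the scaling factors below are illustrative assumptions, not values from this thesis; only the proportionality itself comes from the text.

```python
def relative_dynamic_power(v_scale: float, f_scale: float) -> float:
    """Dynamic power relative to nominal, assuming P is proportional to V^2 * f.

    v_scale and f_scale are the ratios of the scaled voltage and frequency
    to their nominal values (hypothetical inputs for illustration).
    """
    return v_scale ** 2 * f_scale


if __name__ == "__main__":
    # Halving the frequency alone gives a linear saving.
    print(relative_dynamic_power(1.0, 0.5))
    # Lowering the voltage to 80% alone gives a quadratic saving.
    print(relative_dynamic_power(0.8, 1.0))
    # Scaling both compounds the two effects.
    print(relative_dynamic_power(0.8, 0.8))
```

Under this model, scaling the voltage and frequency together to 80% of nominal leaves roughly half of the original dynamic power, which is why the two knobs are usually considered jointly.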

However, reducing the processor voltage and frequency increases the circuit delay, slowing down execution. These two factors therefore cannot be reduced arbitrarily, primarily for two reasons. First, the applications concerned impose performance constraints. Second, the maximum frequency at which the processor can be clocked is limited by the voltage level [19]. Hence, the dynamic selection of the processor voltage and of the frequency are essentially parts of the same problem. Furthermore, the time taken to switch frequencies is typically much less than that taken to change the voltage level of the processor. Hence, for very short intervals, only the processor frequency may be changed, without changing the processor voltage level.

The amount of power or energy that can be saved through dynamic voltage and frequency scaling depends essentially on the computational requirements of the applications. The DFS and DVS mechanism is critical to achieving both power saving and performance. Depending on the system concerns, the Dynamic Voltage Frequency Scaling (DVFS) mechanism may differ. In the following sections we review several dynamic frequency scaling systems and discuss their DFS mechanisms and performance.

# **3.1.1 The Dynamic Frequency Scaling Technique**

Most conventional microprocessors are designed using a synchronous clock distribution. The clock distribution network is designed very carefully to meet the clock skew constraints, which adds to the complexity of the clock interconnection and significantly increases microprocessor power. Designers have therefore proposed asynchronous systems, which need no clock. However, there are many difficulties in the design of completely asynchronous systems. Attention has therefore focused on Globally Asynchronous Locally Synchronous (GALS) design [20], which is a compromise between asynchronous and synchronous systems. Most current research on GALS is based on superscalar processors [21-26] and has not been applied to the EPIC architecture, because its implementation is difficult. In [27], a multiple clock domain (MCD) EPIC built in the GALS style is presented. The dynamic frequency scaling technique is applied, and simulation results for power saving and performance are presented.

Fig 3.1 shows the conventional EPIC architecture and the MCD-partitioned EPIC architecture. The EPIC microprocessor is partitioned into six clock domains according to two rules: 1) the partition must not change the organization of the microprocessor pipelines too much; 2) the domain boundaries are set between components that are loosely coupled with each other. Following these rules, the six domains are: an instruction fetch domain (Domain1), a dispatch domain (Domain2), an L2 Cache domain (Domain3), a load/store domain (Domain4), an integer domain (Domain5), and a floating-point domain (Domain6). Domain1 performs branch prediction, instruction address generation, and I-Cache reads. Domain2 dispatches instructions. Domain3 handles L2 Cache read/write operations. L1D-Cache reads and writes are completed in Domain4. Domain5 performs the load/store operations of integer operands and the execution of arithmetic logic. Domain6 performs the load/store operations of floating-point operands and the execution of floating-point computations. Each domain has its own local clock. The units within the same domain operate synchronously. Queue structures based on a hybrid clock FIFO are used for asynchronous communication between different domains.





Fig 3.1 The conventional EPIC architecture and the multiple clock domain EPIC



By analyzing processor resource utilization, a correlation is revealed, over an interval of instructions, between the number of valid entries in the input queue and the desired frequency for the domain. Queue utilization is thus an appropriate metric for dynamically determining the desired domain frequency. The dynamic, adaptive control of each clock domain's frequency is based on this idea and aims to reduce the power consumption of the clock network. The attack/decay algorithm [24] is used independently in each back-end domain. Over each 10,000-instruction interval, the hardware counts the entries in the domain issue queue. Using this number and the corresponding number from the previous interval, the algorithm determines whether there is a significant change, with the threshold set to 1.7 percent. If so, the algorithm uses the attack mode: the frequency changes by 7 percent. If no significant change occurs, the algorithm uses the decay mode: it reduces the domain frequency slightly, by 0.17 percent.
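The attack/decay decision described above can be sketched as a single update step per interval. This is a minimal sketch under stated assumptions: the function name is hypothetical, and the direction of the attack step (raise the frequency when queue occupancy rises, lower it when occupancy falls) is an assumption, since the text only states that the frequency changes by 7 percent. The three percentages are the ones quoted in the text.

```python
def attack_decay_step(freq_mhz: float, count: int, prev_count: int,
                      threshold: float = 0.017,
                      attack: float = 0.07,
                      decay: float = 0.0017) -> float:
    """Return the domain frequency for the next 10,000-instruction interval.

    count / prev_count are the issue-queue entry counts of the current and
    previous intervals. The sign convention of the attack step is assumed.
    """
    if prev_count > 0:
        rel_change = (count - prev_count) / prev_count
    else:
        # No history to compare against: treat any activity as a large rise.
        rel_change = float("inf") if count > 0 else 0.0

    if rel_change > threshold:       # attack mode, occupancy rose
        return freq_mhz * (1.0 + attack)
    if rel_change < -threshold:      # attack mode, occupancy fell
        return freq_mhz * (1.0 - attack)
    return freq_mhz * (1.0 - decay)  # decay mode, no significant change


if __name__ == "__main__":
    f = 1000.0
    for prev, cur in [(100, 100), (100, 120), (120, 90)]:
        f = attack_decay_step(f, cur, prev)
        print(round(f, 2))
```

Because the decay step is much smaller than the attack step, the frequency drifts down slowly during steady phases and reacts quickly when the workload changes.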


To evaluate this approach, the basic technology parameters of CACTI 0.8µm are used [28], together with a scaling-down method, to build the power consumption evaluation model [29]. Since IMPACT [30] provides a comprehensive infrastructure for modeling and simulating EPIC microarchitecture features, the Lsim simulator [30, 31] of the IMPACT compiler framework is adopted as the simulation engine. Three frequency/voltage control strategies have been simulated to evaluate the dynamic scaling mechanisms:

**SVF Strategy:** All clock domains operate at the same voltage and the same frequency. This strategy is called SVF for short, which stands for same voltage and frequency.

**DVF Strategy:** The clock domains may operate at different voltages and different frequencies, but the frequency and voltage are not adjusted while the system is running. This strategy is called DVF for short, which stands for different voltage and frequency. The voltage and frequency of each domain are set as in Table 3.1, according to the architectural characteristics of Itanium 2.

**DVF+DAA Strategy:** The clock domains may operate at different voltages and frequencies, which are also set as in Table 3.1. Furthermore, while the system is running, the frequency of each back-end domain is adjusted dynamically and adaptively by the attack/decay algorithm.

Fig 3.2 shows the power consumption of the MCD EPIC relative to the basic EPIC processor under the three strategies described above. All three MCD EPIC strategies decrease the microprocessor's power consumption. Compared with SVF and DVF, the DVF+DAA strategy decreases the power consumption more effectively, by 40 percent, as a result of the fine-grained dynamic adaptive frequency adjustment.



Fig 3.2 The power consumption comparison

Fig 3.3 shows the impact of the different MCD EPIC strategies on performance. The SVF strategy results in a slight degradation, within 1 percent. Compared with SVF, DVF and DVF+DAA cause more performance degradation because the clock frequency is decreased. For DVF, the average performance degradation is approximately 6.5 percent. For DVF+DAA, the average performance degradation is about 7.3 percent.



Fig 3.3 The performance comparison


Energy-delay product (EDP) is a popular metric for evaluating performance and power consumption together. Fig 3.4 shows the EDP of the MCD EPIC relative to the basic EPIC for the three strategies. The experimental results indicate that, for DVF+DAA, although the performance degradation is somewhat larger, a significant overall EDP improvement of about 17 percent is achieved, because this strategy fully exploits the potential of the MCD EPIC to reduce the power consumption of the microprocessor. Compared with the basic EPIC, all three MCD EPIC strategies clearly improve the EDP. Thus, the MCD EPIC offers a significant advantage in decreasing power consumption, and the dynamic control strategy achieves the greatest power saving.



Fig 3.4 The EDP comparison

The dynamic scaling technique can greatly reduce the power consumption of the system. From the system point of view, however, the frequency cannot be scaled down indefinitely; the system performance has to be considered. Achieving both low power consumption and high performance is the challenge for next-generation power management.

## 3.2 Proposed Dual Output Clock Generator with Dynamic Frequency/Phase Tuning Ability

Dynamic voltage scaling (DVS) and dynamic frequency scaling (DFS) have been explored and shown to provide significant energy savings while meeting performance requirements, especially in portable devices, which have widely varying workloads. To support dynamic frequency scaling, the clock generator should be able to switch dynamically, and the switching time should be short. The selectable output clock range should be as wide as possible, and fractional multiples of the reference clock can also be considered, since they give the system more choices.

The proposed clock generator is designed to support wide-range dynamic frequency scaling and advanced power management applications. Multiple clock outputs with dynamic frequency/phase tuning ability are essential from the system and power management points of view. The proposed DLL-based clock generator is capable of low-power, wide-frequency-range operation from 142.5MHz to 2.7GHz (with TSMC 0.13um technology). It generates two independent clock signal sources and can be extended to provide more sources by inserting additional edge combiners. Six frequency multipliers can be selected for each output, combined with the phase switching ability. The dynamic frequency/phase scaling operates instantly, without a relocking process.

## 3.2.1 The Architecture of Proposed Clock Generator

The architecture of the proposed clock generator is DLL-based, since a DLL provides more precise phase signals and less jitter because there is no jitter accumulation. As Fig. 3.5 shows, the digital delay line (DDL) provides six phase outputs which are equally spaced within one period of REFCLK when locked. The input capacitance on every node of the DDL has been designed to ensure accurate phase differences. The dual output dynamic frequency/phase scaling synthesizer contains the smooth charge phase blender, which combines every two adjacent phase signals of the six outputs and generates 12 phase signals, each separated by 1/12 of the REFCLK period. The smooth charge phase blender also works as a buffer to provide sufficient driving strength for the edge combiners, whose number may be extended beyond two. Each edge combiner follows its control vector (Si[11:0]) to choose the proper combination of the twelve input phases (Pi), and is independently programmable to choose the desired frequency/phase. The digital charge-detecting controller uses charge-based detection to measure the delay condition and adjusts the delay of the digital delay line to track REFCLK.




Fig 3.5 The architecture of proposed dual output clock generator with dynamic frequency/phase tuning ability

### 3.2.2 Dynamic Frequency/Phase Tuning Synthesizer

The dynamic frequency/phase tuning synthesizer receives timing information from the delay line and then synthesizes the desired frequency and phase dynamically according to the control vector (Si[11:0]). The synthesizer consists of two parts: the smooth charge phase blender and two edge combiners.

#### A. Smooth Charge Phase Blender

A direct way to increase the multiphase density is to increase the number of delay cell stages (each delay cell provides an equal delay). However, the delay line resolution then becomes coarser, since the delay line resolution is (number of stages) x (minimum fine resolution): the more stages are adopted, the coarser the total delay line resolution. With the help of the smooth charge phase blender, the phase resolution is doubled without adding any delay cell stage. The delay line adopts six delay cell stages to generate six phases, and an additional six phases are generated by the phase blender, providing a total of twelve phases, each separated by 1/12 of the REFCLK period.
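As a worked illustration of these relations, the sketch below computes the phase spacing of the six delay-line phases and of the twelve blended phases for a given REFCLK, together with the delay line resolution for a given number of stages. The function names are hypothetical, and the 5ps fine resolution and the 360MHz example frequency are taken from other parts of this chapter.

```python
def phase_spacing_ps(refclk_mhz: float, n_phases: int) -> float:
    """Spacing between adjacent phases when n_phases equally divide one REFCLK period."""
    period_ps = 1e6 / refclk_mhz   # 1 MHz corresponds to a 1e6 ps period
    return period_ps / n_phases


def delay_line_resolution_ps(n_stages: int, fine_resolution_ps: float) -> float:
    """Total delay line resolution = (number of stages) x (minimum fine resolution)."""
    return n_stages * fine_resolution_ps


if __name__ == "__main__":
    refclk_mhz = 360.0
    print(phase_spacing_ps(refclk_mhz, 6))    # spacing of the six delay-line phases
    print(phase_spacing_ps(refclk_mhz, 12))   # spacing after phase blending
    print(delay_line_resolution_ps(6, 5.0))   # six stages, 5 ps fine step
    print(delay_line_resolution_ps(12, 5.0))  # what twelve stages would cost
```

Doubling the stage count would halve the phase spacing but also double the smallest adjustable step of the whole line, whereas blending halves the spacing while keeping the six-stage resolution.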

Conventional phase blenders combine two different phases to generate a new phase signal between them. As shown in Fig. 3.6 [32], the input signals are interpolated by carefully choosing the inverter sizes, which is suitable for combining signals with a specified phase difference. However, when the phase difference of the two input signals varies within a range rather than being fixed, errors occur. This is mainly because of the short-circuit current at node X, which makes the charge/discharge current nonlinear.



Fig 3.6 The conventional phase blender

The smooth charge phase blender (SCPB) has been proposed to reduce the blended phase error when the phase difference of the two input signals changes. As shown in Fig. 3.7, the SCPB is similar to the basic conventional phase blender, but transmission gate stages are added. The leading phase signal (OUTA) is connected to IN.



Fig 3.7 The smooth charge phase blender (SCPB)


Fig. 3.8 shows the voltage curve of each node of the proposed SCPB when the phase difference is 400ps. The signal at node X charges and discharges abruptly because of the short-circuit current. When the blending process begins, the voltage curve of X can be divided into two regions. In the first region, OUTA (IN) is "HIGH" and B is "LOW"; in the second region, OUTA (IN) and B are both "HIGH". When the phase difference of A and B changes, the width of the first region also changes, so it is hard to control the trigger point of X. However, after signal X passes through the transmission gates (TG), the delay and capacitance they provide make the TG output signal, Y, smooth and linear. The almost linear charge/discharge process thus begins when the OUTA signal arrives and nearly ends at the transition of the lagging phase (OUTB). The trigger point of OUTAB, which is triggered by Y, is nearly in the middle of the two phases (OUTA, OUTB), thus providing the blended signal of OUTA and OUTB.



Fig 3.8 The voltage curve of SCPB when phase difference is 400 ps



Fig 3.9 The performance comparison of SCPB and conventional phase blender



It should be mentioned that the delay time from A to OUTA cannot exceed the phase difference time, and the number of stages (m) should be chosen according to the operating range. In this design, two transmission gate stages are used. The performance comparison with the conventional phase blender is shown in Fig. 3.9. The error (%) is computed as $|P - P_x| / (0.5 \cdot P_{diff})$, where $P$ is the correct phase in the middle of the two input signals, $P_x$ is the OUTAB phase time, and $P_{diff}$ is the phase difference of the two input signals. As can be seen, the SCPB reduces the error relative to the conventional phase blender for phase differences in the range from 0.3ns to 0.5ns. The proposed SCPB reduces the phase error to an acceptable level, but in the 500ps phase difference case a considerable phase error remains. Fig 3.10 shows the voltage curve of each node of the proposed SCPB when the phase difference is 500ps. Region 1 becomes wider since the phase difference is wider: phase B lags by an additional 100ps compared with the 400ps phase difference case. The voltage curve of X becomes more distinct, and two transmission gate stages are not enough to convert it to a linear curve. This slight difference results in the phase error.
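The error metric used in this comparison can be written out as a short function. This is a sketch for illustration only: the function and argument names are hypothetical, and it assumes that all three edge times are measured against the same reference with any fixed propagation delay already removed. The numbers in the usage example are made up and are not simulation results.

```python
def blend_error_percent(t_a_ps: float, t_b_ps: float, t_out_ps: float) -> float:
    """Blended-phase error in percent: |P - Px| / (0.5 * Pdiff) * 100.

    t_a_ps, t_b_ps: edge times of the leading and lagging input phases.
    t_out_ps: edge time of the blended output (Px).
    """
    p_diff = t_b_ps - t_a_ps
    if p_diff <= 0:
        raise ValueError("the lagging phase must arrive after the leading phase")
    p_ideal = t_a_ps + 0.5 * p_diff          # P, the midpoint of the two inputs
    return abs(p_ideal - t_out_ps) / (0.5 * p_diff) * 100.0


if __name__ == "__main__":
    # Hypothetical example: inputs 400 ps apart, output edge 16 ps late.
    print(blend_error_percent(0.0, 400.0, 216.0))
```

Normalising by half the phase difference means that an error of 100% corresponds to the blended edge landing on one of the two input edges instead of between them.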



Fig 3.10 The voltage curve of SCPB when phase difference is 500 ps

To achieve a more accurate blended phase, the modified dynamic controlled SCPB has been developed, as shown in Fig 3.11. The number of transmission gate stages connected to X is increased to four, and the outputs of stages two and four are connected to a MUX. The coarse-tune control signals of the digital charge-detecting controller (discussed in Section 3.2.3 B) select the desired trigger signal to trigger OUTAB. When the phase difference is below 460ps, which is the condition that REFCLK is above 360MHz, the coarse-tune signals [C1,C2] of the digital charge-detecting controller will be [1,1] or [0,1], and they are also sent to the SCPB to select T1 as the trigger signal. Similarly, when the phase difference is above 460ps, which is the condition that REFCLK is below 360MHz, the coarse-tune signals of the digital charge-detecting controller select the first coarse stage of the delay cell, and they are also sent to the SCPB to select T2 as the trigger signal. A modified dynamic controlled SCPB would normally need additional control signals to select the trigger signal dynamically according to the phase difference of A and B. In the proposed clock generator, however, no extra control circuit is needed to generate these signals, since the coarse-tune signals of the digital charge-detecting controller already correspond to the phase difference condition: the phase difference corresponds to the delay time of the delay cell. Fig 3.12 shows the performance of the modified SCPB. The relatively large error of the original SCPB at a phase difference of 500ps is reduced. The phase error is kept below 8% within the 300~500ps phase difference range.



Fig 3.11 The modified dynamic controlled SCPB




Fig 3.12 The performance of the modified SCPB



#### **B.** Edge Combiner

The purpose of the edge combiner, or frequency synthesis circuit, is to use multiphase clock information to synthesize the output clock frequency. To support dynamic frequency scaling, the edge combiner should be programmable and should dynamically synthesize different output frequencies according to the control words. There are many edge combiner and frequency synthesis schemes. Direct digital period synthesis (DDPS), proposed in [33], is a technique that allows a low-jitter clock to be synthesized. A very interesting feature of DDPS is its ability to multiply a reference clock frequency by any fractional number. As shown in Fig 3.13, the DDPS generates a number of different phases from an incoming clock and selects them in a given order to produce the desired clock period.

This selection process is handled by an accumulator and a phase selection module according to a user input (period multiplier). The drawbacks of this circuit are that the absolute maximum frequency is limited by the critical path in the finite state machine (FSM)/shifter and that a 50% duty cycle of the output clock is not guaranteed, so it may not be sufficient for high performance portable devices. In the clock multiplier proposed in [34], shown in Fig 3.14, the data inputs of the flip-flops are hardwired to logic '1' instead of being driven by a variable input. This removes the critical path problem. The near-50% duty cycle circuit brings the duty cycle of the output clock to nearly 50%. However, this circuit has only a fixed clock multiplier, and even with variable inputs to the flip-flops, the selection of frequencies is narrow.



Fig 3.13 The architecture of the DDPS



Fig 3.14 The clock multiplier and duty cycle circuit proposed in [34]

The dynamic frequency/phase scaling synthesizer uses the edge combiner of [35], shown in Fig 3.15. Each edge combiner provides one individual clock output by making use of the twelve phase output signals. The pulse generator generates a short pulse at the rising edge of phase Pi; the AND logic tree puts the short pulses together and toggles the output at every negative edge of the short pulses. Thus, the multiplier output clock signal toggles at every rising edge of the selected phase signals. The toggle pulsed latch (TPL) shown in Fig 3.16 receives the pulses and changes the output every time a pulse arrives. The select signal Si enables the corresponding pulse generator; by choosing the order and number of enabled pulse generators, the edge combiner achieves frequency/phase switching. The twelve select signals Si are grouped to form the control vector Si[11:0], which controls the output frequency and phase. Frequency synthesis examples are shown in Fig 3.17: two groups of program signals that both synthesize 2X REFCLK, but with different phases.



Fig 3.15 The edge combiner



Fig 3.16 The toggle pulsed latch



Fig 3.17 The two examples of frequency synthesis

All frequency/phase combinations and the corresponding program signals are listed in Table I. Six frequency multiplied factors are provided: 0.5X, 1X, 1.5X, 2X, 3X and 6X. The twelve selectable phases give a rich set of frequency synthesis multiplied factors, including non-integer ones. Except for the maximum output frequency (multiplied factor 6X), every multiplied factor has several phases to select from, which is achieved by shifting the "1"s of the control vector Si[11:0].

The number of available phases for each multiplied factor is also shown in the table. The phase shifting supports the demand for phase variety among the clock domains of a multi-core system. Every frequency scaling or phase shifting operation takes effect instantly, without any relock process.

| Multiplied factor | Si[0] | Si[1] | Si[2] | Si[3] | Si[4] | Si[5] | Si[6] | Si[7] | Si[8] | Si[9] | Si[10] | Si[11] | Number of phases |
|-------------------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|--------|--------|------------------|
| 0.5X              | 1     | 0     | 0     | 0     | 0     | 0     | 0     | 0     | 0     | 0     | 0      | 0      | 12               |
|                   | 0     | 1     | 0     | 0     | 0     | 0     | 0     | 0     | 0     | 0     | 0      | 0      |                  |
|                   | ⋮     |       |       |       |       |       |       |       |       |       |        |        |                  |
|                   | 0     | 0     | 0     | 0     | 0     | 0     | 0     | 0     | 0     | 0     | 0      | 1      |                  |
| 1X                | 1     | 0     | 0     | 0     | 0     | 0     | 1     | 0     | 0     | 0     | 0      | 0      | 6                |
|                   | 0     | 1     | 0     | 0     | 0     | 0     | 0     | 1     | 0     | 0     | 0      | 0      |                  |
|                   | ⋮     |       |       |       |       |       |       |       |       |       |        |        |                  |
|                   | 0     | 0     | 0     | 0     | 0     | 1     | 0     | 0     | 0     | 0     | 0      | 1      |                  |
| 1.5X              | 1     | 0     | 0     | 0     | 1     | 0     | 0     | 0     | 1     | 0     | 0      | 0      | 4                |
|                   | 0     | 1     | 0     | 0     | 0     | 1     | 0     | 0     | 0     | 1     | 0      | 0      |                  |
|                   | ⋮     |       |       |       |       |       |       |       |       |       |        |        |                  |
|                   | 0     | 0     | 0     | 1     | 0     | 0     | 0     | 1     | 0     | 0     | 0      | 1      |                  |
| 2X                | 1     | 0     | 0     | 1     | 0     | 0     | 1     | 0     | 0     | 1     | 0      | 0      | 3                |
|                   | 0     | 1     | 0     | 0     | 1     | 0     | 0     | 1     | 0     | 0     | 1      | 0      |                  |
|                   | 0     | 0     | 1     | 0     | 0     | 1     | 0     | 0     | 1     | 0     | 0      | 1      |                  |
| 3X                | 1     | 0     | 1     | 0     | 1     | 0     | 1     | 0     | 1     | 0     | 1      | 0      | 2                |
|                   | 0     | 1     | 0     | 1     | 0     | 1     | 0     | 1     | 0     | 1     | 0      | 1      |                  |
| 6X                | 1     | 1     | 1     | 1     | 1     | 1     | 1     | 1     | 1     | 1     | 1      | 1      | 1                |

 TABLE I

 FREQUENCY/PHASE COMBINATION AND PROGRAM SIGNALS
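The relation between the control vector and the synthesized clock in Table I can be reproduced with a short behavioural sketch. It models only the counting argument: every enabled phase toggles the output once per REFCLK period, and two toggles make one output cycle. The function names are hypothetical and the model ignores all circuit-level effects.

```python
from fractions import Fraction

N_PHASES = 12  # phases P0..P11 delivered by the smooth charge phase blender


def control_vector(ones: int, shift: int = 0) -> list:
    """Build Si[0..11] with `ones` evenly spaced enabled phases, rotated by `shift`."""
    if ones <= 0 or N_PHASES % ones != 0:
        raise ValueError("the number of enabled phases must divide 12")
    step = N_PHASES // ones
    return [1 if (i - shift) % step == 0 else 0 for i in range(N_PHASES)]


def multiplied_factor(si: list) -> Fraction:
    """Output frequency as a multiple of REFCLK: toggles per period divided by two."""
    return Fraction(sum(si), 2)


def toggle_times_ps(si: list, refclk_period_ps: float) -> list:
    """Times within one REFCLK period at which the output toggles."""
    return [i * refclk_period_ps / N_PHASES for i, bit in enumerate(si) if bit]


def available_phases(ones: int) -> int:
    """Number of distinct shifts of an evenly spaced vector with `ones` ones."""
    return N_PHASES // ones


if __name__ == "__main__":
    for ones in (1, 2, 3, 4, 6, 12):
        si = control_vector(ones)
        print(si, multiplied_factor(si), available_phases(ones))
```

For example, `control_vector(4, shift=1)` gives the second 2X row of Table I, and the evenly spaced toggle times it produces are what keep the output duty cycle at 50%.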

The rising-edge pulse scheme of the frequency/phase scaling synthesizer provides a 50% duty cycle, and maintains it even when the reference clock does not have a 50% duty cycle. Since a pulse is generated only at the rising edge of each phase clock, the duty cycle of the reference clock does not influence the timing information of the phases. The duty cycle of the synthesized output clock depends on the timing distance between the selected phases: equal phase differences guarantee a 50% duty cycle. Fig 3.18 shows the waveform for a frequency multiplied factor of two when the reference duty cycle is not 50%, to illustrate the rising-edge pulse scheme.



Fig 3.18 The rising edge pulse scheme to handle the 50% duty cycle

### 3.2.3 Low power six-phase delay-locked loop

The proposed low power six-phase delay-locked loop locks to the reference clock and provides six phases to the dynamic frequency/phase scaling synthesizer. The major concerns for the DLL are power consumption and fast locking. The DLL contains a digital delay line and a digital charge-detecting controller. The digital delay line consists of six low-power delay cell stages. In general, the delay line accounts for more than 50% of the total DLL power consumption, so the low-power design of the delay line is an important issue. The digital charge-detecting controller uses charge-based detection to measure the frequency of the reference clock and track it until lock.

#### A. The Low Power Delay Cell

Fig 3.19 shows the proposed low power delay cell, which consists of two parts: coarse-tune and fine-tune. The coarse-tune part is built from transmission gates and has three stages, controlled by the signals t2, t3, d1, d2 and d3 (t2n is the inverse of t2, and so on). The control selection is shown in Fig 3.20. Three different coarse delay times are provided, each separated by one transmission gate delay time.




Fig 3.19 The low power delay cell

| Control signals  | Delay Stage 1 | Delay Stage 2 | Delay Stage 3 |
|------------------|---------------|---------------|---------------|
| t2               | High          | Low           | Low           |
| t2n              | Low           | High          | High          |
| t3               | High          | High          | Low           |
| t3n              | Low           | Low           | High          |
| d1               | Low           | High          | High          |
| d1n              | High          | Low           | Low           |
| d2               | High          | Low           | High          |
| d2n              | Low           | High          | Low           |
| d3               | High          | High          | Low           |
| d3n              | Low           | Low           | High          |
| Total delay time | Short         | Medium        | Long          |

Fig 3.20 The control signal to select coarse stage

The delay time provided by a transmission gate is relatively large, which makes it suitable as a coarse-tune delay element. The transmission gate consumes very little power; compared with a simple inverter delay cell, the power saving is more than two orders of magnitude.


The fine-tune part uses MOS capacitors as the fine-tune delay elements. By controlling the select signals c1, c2, ..., the total capacitance seen by the delay cell is changed, which changes the delay time. The larger the capacitance seen by the delay cell, the longer the delay time. The resolution of the fine delay cell is 5ps. It should be mentioned that the fine-tune delay range must exceed the delay time of a single coarse-tune stage; otherwise, the total range over which the delay line can lock is not continuous. Inverters should be placed at the start and end points of both the coarse-tune and fine-tune stages to hold the voltage level and provide driving strength to the next stage.
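The continuity requirement on the fine-tune range can be checked with a few lines of arithmetic. In the sketch below, the 5ps fine step and the 43 fine steps per delay cell are the figures quoted in this chapter, while the transmission gate delay values passed in are purely hypothetical, since the text does not state the coarse step size. The function names are likewise illustrative.

```python
FINE_STEP_PS = 5    # fine-tune resolution stated in the text
FINE_STEPS = 43     # fine-delay steps per delay cell stated in the text


def fine_range_ps() -> int:
    """Total delay span that the fine-tune part can add."""
    return FINE_STEP_PS * FINE_STEPS


def covered_intervals(tg_delay_ps: float, base_delay_ps: float = 0.0,
                      coarse_stages: int = 3) -> list:
    """Delay interval reachable in each coarse setting (base delay is an assumed offset)."""
    return [(base_delay_ps + k * tg_delay_ps,
             base_delay_ps + k * tg_delay_ps + fine_range_ps())
            for k in range(coarse_stages)]


def is_continuous(tg_delay_ps: float) -> bool:
    """True if adjacent coarse settings overlap or touch, leaving no unreachable delay."""
    return fine_range_ps() >= tg_delay_ps


if __name__ == "__main__":
    for tg in (200.0, 250.0):   # hypothetical transmission gate delays
        print(tg, covered_intervals(tg), is_continuous(tg))
```

With these figures the fine-tune part spans 215ps, so any coarse step larger than that would leave a gap of unreachable delays between two coarse settings.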

#### **B.** Digital Charge-Detecting Controller

Fig 3.21 shows the digital charge-detecting controller (DCD controller), which is composed of coarse-tune and fine-tune control. The DCD controller uses the charge-detecting line (CDL) as its measuring element. Fig 3.22 shows the basic CDL structure. The enable signal controls an AND gate that passes the measured-signal. The passed measured-signal then drives the capacitor node. An inverter is attached to the capacitor node, and its output is connected to the input of a D flip-flop. The inverted measured-signal is connected to the clock input of the D flip-flop. So if the measured-signal is periodic, the capacitor node begins to charge when the signal is "HIGH" (and the enable signal is also "HIGH"). The D flip-flop captures the inverter's output at the falling edge of the measured-signal. If the "HIGH" pulse of the measured-signal is wide enough, the D flip-flop captures a "LOW"; otherwise it captures a "HIGH". The CDL can thus measure the timing information of the measured-signal. By adjusting the capacitor size, the CDL can be targeted at different "HIGH" pulse widths of the measured-signal.



Fig 3.21 The digital charge-detecting controller (DCD controller)





Fig 3.22 The charge-detecting line (CDL)
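The decision made by a single CDL can be captured in a very simple behavioural model. The sketch below is an idealisation under stated assumptions: the capacitor is charged by a constant current, the inverter switches exactly when the node reaches a threshold voltage, and a disabled CDL leaves the node uncharged. All names and component values are hypothetical and do not come from the thesis.

```python
def charge_time_ps(cap_ff: float, v_threshold: float, i_charge_ua: float) -> float:
    """Time for a constant current to charge the node to the inverter threshold.

    t = C * V / I; with C in fF, V in volts and I in uA the result is in ns,
    so it is scaled by 1000 to return picoseconds.
    """
    return cap_ff * v_threshold / i_charge_ua * 1000.0


def cdl_output(high_width_ps: float, charge_ps: float, enable: bool = True) -> int:
    """Value captured by the D flip-flop at the falling edge of the measured-signal.

    Returns 0 ("LOW") if the HIGH pulse was wide enough to charge the node
    past the threshold, otherwise 1 ("HIGH").
    """
    if not enable:
        return 1   # node never charges, so the inverter output stays HIGH
    return 0 if high_width_ps >= charge_ps else 1


if __name__ == "__main__":
    small_cap = charge_time_ps(50.0, 0.5, 25.0)    # hypothetical small capacitor
    large_cap = charge_time_ps(100.0, 0.5, 25.0)   # hypothetical larger capacitor
    for width in (800.0, 1500.0, 2500.0):
        print(width, cdl_output(width, small_cap), cdl_output(width, large_cap))
```

Running the example shows how a larger capacitor moves the pulse width at which the captured value flips, which is the property used to target a CDL at a particular "HIGH" pulse width.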

The coarse-tune control consists of two CDLs, both of which use REFCLK as the measured-signal. The two CDLs are opened at the same time and classify the frequency of REFCLK into three distinct ranges. Fig 3.23 shows the detecting conditions for the three REFCLK frequency ranges. C1 and C2 are the outputs of the D flip-flops of the two CDLs. The C1 CDL is designed with a smaller capacitor (easy to charge to the threshold voltage), while the C2 CDL is designed with a relatively larger capacitor (hard to charge to the threshold voltage). The three frequency ranges of REFCLK are thus detected. When C1 and C2 are both "1", the frequency of REFCLK is in the lowest range; when C1 is "0" and C2 is "1", the frequency of REFCLK is in the relatively larger range, since it charges the C1 CDL; when C1 and C2 are both "0", the frequency of REFCLK is in the largest range. The three frequency ranges correspond to the three stages of the delay cell coarse-tune part, so the measured condition can be converted to the coarse-tune control directly. The frequency of REFCLK can thus be measured directly.



Fig 3.23 The detecting condition of coarse tune signals

When coarse locking is done, fine locking starts. The fine-tune control uses a hierarchical structure of CDLs to reduce peak power consumption. A type-IV phase detector measures the lead condition between the delay line output O6 and REFCLK. Since the width of the lead signal is proportional to the phase difference between O6 and REFCLK, the lead signal is used as the measured signal for the CDLs. The CDLs are enabled in order of capacitor size, and each CDL result corresponds to a different fine-delay time control. The CDL with the largest capacitor, which measures the largest fine-delay time, is enabled first to decide whether to activate the large fine-delay time control, and the corresponding fine-tune delay control is sent to the delay line. The second CDL, with a smaller capacitor than the largest one, is then enabled in the next cycle, and so on. The delay cell has 43 fine-delay steps in total. The total number of fine-tune delay steps is decided by this one-way tracking method. The locking time of the DCD controller is fixed at 10 cycles.
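The one-way tracking order described above can be sketched as a greedy loop that visits the CDLs from the largest delay to the smallest. This is a simplified behavioral model, not the thesis circuit: it assumes that activating a fine-delay control removes exactly that amount from the remaining lead width, and the function name and all delay values in the usage example are hypothetical.

```python
# Behavioral sketch of one-way fine-tune tracking (hypothetical model).
def fine_tune(lead_width_ps, cdl_delays_ps):
    """Decide, largest CDL first, which fine-delay controls to activate.

    lead_width_ps : initial width of the lead signal (assumed unit: ps)
    cdl_delays_ps : fine-delay amount associated with each CDL (assumed values)
    Returns (decisions, remaining lead width).
    """
    remaining = lead_width_ps
    decisions = []
    for delay in sorted(cdl_delays_ps, reverse=True):
        # A CDL "fires" when the lead pulse is at least as wide as its target.
        activate = remaining >= delay
        if activate:
            remaining -= delay  # assumed effect of the activated delay control
        decisions.append((delay, activate))
    return decisions, remaining


if __name__ == "__main__":
    decisions, left = fine_tune(130, [100, 50, 25])
    print(decisions, left)
```

Each decision is made once and never revisited, which is why the number of cycles needed is fixed by the number of CDLs rather than by the initial phase error.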
### 3.2.4 Simulation Result of the Proposed Clock Generator

The proposed clock generator was implemented in both TSMC 130nm technology and UMC 90nm technology. In TSMC 130nm technology, the locking range of the DLL is from 285MHz to 450MHz, and the output frequency range is from 142.5MHz (0.5X multiplication factor) to 2.7GHz (6X multiplication factor). When both clock outputs operate at 2.7GHz, the power consumption is 3.3mW. If only one output is activated and operates at 2.7GHz, the power consumption is 2.4mW. In UMC 90nm technology, the locking range of the DLL is from 270MHz to 500MHz, and the output frequency range is from 135MHz (0.5X multiplication factor) to 3GHz (6X multiplication factor). When both clock outputs operate at 3GHz, the power consumption is 2.3mW. If only one output is activated and operates at 3GHz, the power consumption is 1.8mW. The performance of the proposed low power dual programmable clock generator is summarized in Table II.



| Parameter                   | TSMC 130nm implementation                            |
|-----------------------------|------------------------------------------------------|
| Minimum output frequency    | 142.5MHz                                             |
| Maximum output frequency    | 2.7GHz                                               |
| Peak-to-peak jitter         | 20ps@225MHz; 37ps@2.7GHz                             |
| Number of frequency outputs | 2 (with independent frequency/phase tuning ability)  |
| Multiplication factor       | 0.5X, 1X, 1.5X, 2X, 3X, 6X                           |
| Power consumption           | 3.3mW with both outputs operating at 2.7GHz; 2.4mW with only one output activated and operating at 2.7GHz |
| Supply voltage              | 1.2V                                                 |
| Process                     | 130nm TSMC CMOS technology                           |

#### TABLE II

THE PERFORMANCE OF PROPOSED CLOCK GENERATOR

| Parameter                   | UMC 90nm implementation                              |
|-----------------------------|------------------------------------------------------|
| Minimum output frequency    | 135MHz                                               |
| Maximum output frequency    | 3GHz                                                 |
| Peak-to-peak jitter         | 20ps@250MHz; 33ps@3GHz                               |
| Number of frequency outputs | 2 (with independent frequency/phase tuning ability)  |
| Multiplication factor       | 0.5X, 1X, 1.5X, 2X, 3X, 6X                           |
| Power consumption           | 2.3mW with both outputs operating at 3GHz; 1.8mW with only one output activated and operating at 3GHz |
| Supply voltage              | 1V                                                   |
| Process                     | 90nm UMC CMOS technology                             |

Fig 3.24(a) shows the locking procedure when REFCLK is 333MHz, and Fig 3.24(b) shows the locking procedure when REFCLK is 475MHz. The locking procedure finishes in a fixed ten cycles. Fig 3.25 shows the output frequencies of the six multiplication factors when REFCLK is 500MHz: the 0.5X factor generates a 250MHz output; 1X generates 500MHz; 1.5X generates 750MHz; 2X generates 1GHz; 3X generates 1.5GHz; and 6X generates 3GHz.
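The relation between REFCLK and the six selectable outputs is a plain multiplication, which the short Python sketch below reproduces. The function name is an assumption made for illustration; the factor list and the 500MHz and 450MHz reference values come from the text.

```python
# Output frequency for each multiplication factor of the clock generator.
FACTORS = [0.5, 1, 1.5, 2, 3, 6]  # factors listed in Table II


def output_frequencies_mhz(refclk_mhz):
    """Map each multiplication factor to its output frequency in MHz."""
    return {factor: factor * refclk_mhz for factor in FACTORS}


if __name__ == "__main__":
    for refclk in (500, 450):
        print(f"REFCLK = {refclk}MHz")
        for factor, f_out in output_frequencies_mhz(refclk).items():
            print(f"  {factor}X -> {f_out:g}MHz")
```

With a 450MHz reference the same table gives 225MHz for 0.5X and 2.7GHz for 6X, which are the two outputs used in the dual-output example of Fig 3.26(a).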

Fig 3.26(a) shows the operation of the dual clock outputs when REFCLK is 450MHz: CLKOUT1 operates at 225MHz and CLKOUT2 operates at 2.7GHz, corresponding to multiplication factors 0.5X and 6X. Fig 3.26(b) shows a dynamic scaling example in which CLKOUT1 changes from 450MHz to 1.35GHz and CLKOUT2 changes from 225MHz to 1.5GHz.







Fig 3.25 The output frequency of six multiplied factors



Fig 3.26 The dual clock output and dynamic frequency scaling example



## 3.3 Conclusion of Proposed Clock Generator and Application in Advanced Power Management System

This thesis proposed a novel low power dual clock generator with frequency/phase tuning ability for dynamic scaling, SoC, and power management applications. The frequency/phase synthesizer has a wide operating range in both frequency and phase, and the dynamic frequency/phase scaling process switches instantly without any relock time. Each clock output is independent in any dynamic scaling operation. Six multiplication factors can be chosen for each clock output, and the phase can also be adjusted. The digital charge-detecting controller, which uses CDLs to detect timing information, has been implemented and finishes the lock procedure in ten reference clock cycles. Adding an output that operates at 3GHz increases the total power by only 0.5mW (in UMC 90nm technology), which makes the architecture extensible. This demonstrates that the DLL and frequency synthesizer structure is suitable for a multi-output clock architecture.

The proposed clock generator can be used in many applications, such as data recovery, high speed serial links, SoC applications and advanced power management. The idea is illustrated in Fig 3.27. Basic power management with dynamic frequency scaling switches the frequency of a circuit group dynamically, as shown in Fig 3.27(a). Generally the whole circuit is divided into several groups, but sometimes a group is huge, and the dynamic frequency control of the whole group can only follow the critical path of the whole group. Since a general clock generator has only one clock output, the whole circuit group has to be attached to the same clock. The advanced power management concept is illustrated in Fig 3.27(b). The general clock generator is replaced by the multi-output clock generator, and the different sub-block circuits in the group can be attached to different output clocks. This implies that the original circuit group can be re-divided into sub-circuits without adding another clock generator. With a detector that has only a small power overhead and is easy to apply, each sub-block circuit can set its desired frequency individually and dynamically according to its performance or power concerns, achieving advanced power management.





Fig 3.27 The advanced power management concept

# Chapter 4 Low Voltage Oscillators with Wide Tuning Range

#### 4.1 Introduction to Low Voltage Circuit Design

The continuous growth of personal wireless communications demands low-cost, low-power solutions in the design of wireless and portable systems, which require low-power design techniques to extend their battery lifetime and improve their portability. Aggressively scaled supply voltage has been touted as one of the most powerful mechanisms for improving reliability and reducing power consumption in nanometer technologies. Low-voltage operation may save power in analog circuits as long as the total bias current does not need to be increased to maintain the same performance. In energy harvesting systems such as solar systems, the power supply is also limited, so the power circuits have to be able to operate at low voltage. Low voltage, however, limits the signal amplitude and the circuit operation speed. A low voltage circuit not only has to operate over a range of low supply voltages, but its performance also has to meet the system requirement. This chapter proposes two types of low voltage oscillator, both with a wide tuning range. The oscillators are designed for application in the solar cell power management system.

### 4.2 The Type I Low Voltage Differential Oscillator with Wide Tuning Range

The type I low voltage oscillator is a differential voltage controlled oscillator. Its design goal is to operate with a supply voltage of 0.45V~0.55V and, at each supply voltage, to generate a clock range of at least 50MHz~200MHz. The design targets the solar cell power management system application, which requires the oscillator to switch the clock frequency dynamically; even when the supply voltage decreases, the oscillator has to be able to increase the output frequency up to 200MHz.

Fig 4.1 shows the structure of proposed type I low voltage delay cell.



Fig 4.1 The proposed type I low voltage delay cell

The NMOS transistors whose gates are connected to inn and inp are the inputs of the delay cell. When the oscillator operates, inn and inp are assumed to be differential signals. The two PMOS transistors m1 and m2 operate in a keeper style. The gates of m1 and m2 are connected to the output nodes outn and outp respectively. When inn is "LOW", inp should be "HIGH", and the NMOS whose gate is connected to inp is turned on strongly compared with the NMOS whose gate is connected to inn. The voltage of outn decreases, while outp decreases only slightly, since the NMOS whose gate is connected to inn is only slightly turned on. The low voltage of outn turns on PMOS m1 and raises the voltage of outp; similarly, the high voltage of outp turns off PMOS m2, which makes it easier for the NMOS whose gate is connected to inp to pull down outn. PMOS m1 and m2 thus act as keepers and strengthen the differential characteristic of the delay cell. When the delay cells are connected to form the oscillator, this ensures that inn and inp are out of phase with each other.


The bias node Vbias is connected to the gates of an NMOS/PMOS pair on each side. The drains of the NMOS and PMOS of each pair are connected to outp or outn, one pair per side. When Vbias increases, the NMOS turns on and the PMOS turns off, which decreases the supply current from Vregu and increases the NMOS current to ground; together these slow down the circuit. When Vbias decreases, the NMOS turns off and the PMOS turns on, which increases the supply current from Vregu and decreases the NMOS current to ground; this speeds up the circuit. To summarize, increasing Vbias slows down the circuit and increases the delay time of the delay cell, which decreases the output frequency of the oscillator; decreasing Vbias speeds up the circuit and decreases the delay time of the delay cell, which increases the output frequency of the oscillator.

PMOS m3 and m4 have their gates connected to ground, so these two PMOS transistors are always on. They provide bias current, and minimum sizes are used to reduce power consumption. They ensure that when Vbias increases there is still enough current from Vregu to sustain circuit operation.

In the other voltage controlled delay cell, shown in Fig 4.2, Vbias is connected to only a single PMOS (or NMOS), so the impact of Vbias on the frequency is smaller than in the proposed delay cell. When Vbias increases or decreases, the delay cell in Fig 4.2 only decreases or increases the Vregu current. In the proposed delay cell, a change in Vbias affects the current from Vregu and the current to ground simultaneously. The control range of Vbias in the proposed delay cell is therefore larger than in the delay cell shown in Fig 4.2.



Fig 4.2 General voltage controlled delay cell





Fig 4.3 Type I oscillator with 7 stages

Fig 4.3 shows the proposed type I oscillator, which is formed from seven delay stages.

Fig 4.4 shows the delay time of the proposed type I low voltage oscillator versus supply voltage with different control steps. All simulation results use UMC 90nm technology. When the supply voltage decreases, Vbias is set to decrease by the control step: each time the power supply voltage decreases by 0.025V, Vbias decreases by the chosen control step. A decrease in supply voltage normally slows down operation and increases the delay time, but with the controlled Vbias, the proposed type I delay cell can decrease the delay time even as the supply voltage decreases.

With a 0.05V control step, the delay time of the proposed type I oscillator still decreases slightly as the supply voltage drops. With a 0.075V or 0.1V control step, the delay time decreases more dramatically as the supply voltage decreases.



Fig 4.4 The delay time of the proposed type I low voltage oscillator using different control steps

Fig 4.5 shows the output frequency of the proposed type I low voltage oscillator versus supply voltage with different control steps. When the supply voltage decreases, Vbias is set to decrease by the control step. The Vbias setting is the same as in Fig 4.4. Even as the supply voltage decreases, the output frequency of the proposed type I low voltage oscillator is increased by the control voltage Vbias.

With a 0.05V control step, the output frequency of the proposed type I oscillator increases slightly as the supply voltage drops, but the frequency is only about 90MHz at a power supply voltage of 0.425V, which does not meet our design spec. With a 0.075V or 0.1V control step, the output frequency increases more dramatically as the supply voltage decreases. With a 0.1V control step, the proposed type I oscillator can output 250MHz when the supply voltage is 0.425V.



Fig 4.5 The output frequency of the proposed type I low voltage oscillator using different control steps

Fig 4.6 shows the range of frequency and delay time of the proposed type I low voltage oscillator at different power supply voltages. Fig 4.6(a) shows the maximum and minimum delay time. Decreasing the power supply voltage increases the delay time. The maximum delay of the proposed type I low voltage oscillator is 16.5ns at a power supply voltage of 0.55V, which corresponds to a minimum frequency slightly over the 50MHz in our design spec. The minimum delays are all below 3.6ns. Fig 4.6(b) shows the frequency range at different power supply voltages. When the supply voltage is above 0.475V, the minimum frequency is slightly over 50MHz; the maximum frequencies are over 275MHz in all cases.

To summarize, the proposed type I low voltage oscillator has a frequency range of 46MHz~275MHz at a power supply voltage of 0.425V; at a power supply voltage of 0.55V, its frequency range is 60MHz~465MHz.
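The delay and frequency figures above can be cross-checked with a one-line conversion. The sketch below assumes that the reported "delay time" is the full oscillation period; this reading is an assumption, suggested by the fact that 16.5ns corresponds to roughly 60MHz. The function name is hypothetical.

```python
# Convert an oscillation period in ns to a frequency in MHz.
# Assumption: the "delay time" quoted in the text equals one full period.
def period_ns_to_mhz(period_ns):
    return 1e3 / period_ns  # 1 / (period_ns * 1e-9) Hz, expressed in MHz


if __name__ == "__main__":
    # Type I values quoted in the text.
    for label, period in [("max delay at 0.55V", 16.5),
                          ("min delay bound, all supplies", 3.6)]:
        print(f"{label}: {period}ns -> {period_ns_to_mhz(period):.1f}MHz")
```

Under this assumption, 16.5ns maps to about 60.6MHz and 3.6ns to about 277.8MHz, in line with the 60MHz and 275MHz values quoted for the frequency range.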



Fig 4.6 The range of frequency and delay time of proposed type I low voltage oscillator with different power supply voltage

The power consumption of the proposed type I low voltage oscillator is 78uW when it operates at 465MHz with a power supply voltage of 0.55V. When the power supply voltage is 0.425V and the oscillator operates at 275MHz, the power consumption is 29uW. The proposed type I low voltage oscillator can operate at a low power supply voltage and provides a wide frequency tuning range. Because of its differential structure, it has a good 50% duty cycle characteristic, and it can be used in power management systems and low voltage digital systems.

#### 4.3 The Net-Bias Circuit

Since the proposed type I low voltage oscillator needs Vbias to control the output frequency, a net-bias circuit is needed to generate Vbias. Fig 4.7 shows the net-bias circuit. It consists of a series of PMOS transistors whose gates are connected to the control signals b0~b5. The driving ability of each PMOS is sized according to the weight of its control signal bi. The signals [b5, b4, b3, b2, b1, b0] form a binary word. When the word counts up, the total driving ability of the PMOS transistors decreases and Vbias decreases, and vice versa. The NMOS transistors, whose gates and drains are connected to Vbias, work in the saturation region. They are designed with small sizes to reduce leakage current.



Fig 4.7 The net-bias circuit

The net-bias circuit provides 490mV~54mV when Vregu is 550mV. With a six-bit binary control word there are 64 steps in total, each with a resolution of about 7mV. The maximum power consumption of the net-bias circuit is 11uW when Vregu is 550mV, and the average power consumption is 5uW. The architecture of the net-bias circuit could easily be extended to 7 bits or reduced to 5 bits.
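The quoted resolution follows from dividing the output span by the number of code transitions. The sketch below uses an idealized, perfectly linear code-to-voltage model, which is an assumption: the real circuit's transfer curve is set by transistor sizing and is not given in the text. The function and constant names are hypothetical.

```python
# Idealized linear model of the 6-bit net-bias output at Vregu = 550mV.
V_CODE_MIN_MV = 490.0  # output at code 0 (from the text)
V_CODE_MAX_MV = 54.0   # output at code 63 (from the text)
NUM_CODES = 64         # six control bits b5..b0

STEP_MV = (V_CODE_MIN_MV - V_CODE_MAX_MV) / (NUM_CODES - 1)


def vbias_mv(code):
    """Return the modeled Vbias in mV; counting up lowers Vbias."""
    if not 0 <= code < NUM_CODES:
        raise ValueError("code must be in 0..63")
    return V_CODE_MIN_MV - code * STEP_MV


if __name__ == "__main__":
    print(f"step = {STEP_MV:.2f}mV")
    for code in (0, 1, 32, 63):
        print(f"code {code:2d} -> {vbias_mv(code):.1f}mV")
```

The 436mV span over 63 transitions gives about 6.9mV per step, consistent with the 7mV resolution stated above.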

### 4.4 The Type II Low Voltage Oscillator with Wide Tuning Range

The proposed type I low voltage oscillator has a wide tuning range and a low operating voltage, but its power consumption should be reduced further because the power provided by the solar cell is very low. The power management system itself has to consume little power to provide efficient energy conversion. The type II low voltage oscillator has been developed to achieve a low operating voltage, a wide tuning range and ultra-low power.

Fig 4.8 shows the proposed type II low voltage oscillator. Its architecture is very simple: it consists of a basic ring oscillator with two transmission gates inserted between the inverters.




Fig 4.8 The proposed type II low voltage oscillator

The proposed type II oscillator operates as follows. When Vp decreases and Vn increases, the conductivity of the transmission gates increases, which makes the signal pass more easily; the delay time of the transmission gates decreases and the frequency of the oscillator increases. Similarly, when Vp increases and Vn decreases, the conductivity of the transmission gates decreases, which makes it harder for the signal to pass; the delay time of the transmission gates increases and the frequency of the oscillator decreases. The control voltages Vp and Vn are generated and controlled by the double net-bias circuits, which are described later.

Fig 4.9(a) shows the delay time of the proposed type II low voltage oscillator versus power supply voltage, and Fig 4.9(b) shows its output frequency versus power supply voltage. Different control steps are simulated: binary words "4", "7" and "12" are used as the control step. Using binary word "4" as the control step means that the counter counts up "4", "8", ... The binary word of the counter is sent to the double net-bias circuit to generate Vp and Vn. When the binary word counts up, Vp decreases and Vn increases, so the frequency of the proposed type II low voltage oscillator increases, and vice versa. Each time the power supply voltage decreases by 0.025V, the control binary word increases by one step.



**(a)** 



(b)

Fig 4.9 (a) The delay time of the proposed type II low voltage oscillator using different control steps (b) The output frequency of the proposed type II low voltage oscillator using different control steps

As can be seen, with the count-up control, the proposed type II low voltage oscillator can decrease the delay time dramatically even as the power supply voltage decreases. With binary word "4" as the control step, the output frequency reaches 75MHz at a power supply voltage of 0.425V. With binary word "12" as the control step, the output frequency reaches 260MHz at a power supply voltage of 0.425V.

Fig 4.10 shows the frequency and delay time range of the proposed type II low voltage oscillator at different power supply voltages. Fig 4.10(a) shows the maximum and minimum delay time. The delay range is much larger than that of the proposed type I low voltage oscillator. The maximum delay time of the proposed type II oscillator is 43ns at a power supply voltage of 0.55V, compared with 16.5ns for the type I oscillator in the same case, an increase of 26.5ns. The minimum delay time of the proposed type II oscillator is 0.872ns at a power supply voltage of 0.55V, compared with 2.16ns for the type I oscillator in the same case. The proposed type II oscillator thus extends the delay time range at both the maximum and the minimum ends compared with the proposed type I oscillator.



Fig 4.10 The range of frequency and delay time of proposed type II low voltage oscillator with different power supply voltage

Fig 4.10(b) shows the frequency range of the proposed type II low voltage oscillator at different power supply voltages. At a power supply voltage of 0.55V, the maximum output frequency is 1.15GHz; at 0.425V, it is 558MHz. The minimum frequencies are below 23MHz at all power supply voltages. Compared with the proposed type I oscillator, a wider tuning range has been achieved.

Fig 4.11 shows the power consumption of the proposed type II low voltage oscillator at different output frequencies with small inverter sizes (Wp=0.48u, Wn=0.24u). All results are simulated in UMC 90nm technology with a power supply voltage of 550mV. When the proposed type II oscillator operates at 1.13GHz, the power consumption is 6.3uW; at 565MHz, it is 2.9uW. Compared with the type I oscillator, the type II oscillator greatly reduces power and extends the frequency range.
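One way to compare these operating points is the energy spent per clock cycle, which is power divided by frequency. The sketch below applies that arithmetic to the figures quoted in this chapter; the function name is an assumption, and the comparison is a back-of-the-envelope check rather than a result reported in the thesis.

```python
# Energy per clock cycle, E = P / f, from quoted power and frequency values.
def energy_per_cycle_fj(power_uw, freq_mhz):
    """Return energy per cycle in femtojoules (uW / MHz = pJ, times 1e3)."""
    return power_uw / freq_mhz * 1e3


if __name__ == "__main__":
    points = [
        ("type II, 1.13GHz, 550mV", 6.3, 1130.0),
        ("type II, 565MHz, 550mV", 2.9, 565.0),
        ("type I, 465MHz, 0.55V", 78.0, 465.0),
    ]
    for label, p_uw, f_mhz in points:
        print(f"{label}: {energy_per_cycle_fj(p_uw, f_mhz):.1f}fJ per cycle")
```

The two type II points come out at roughly 5fJ to 6fJ per cycle, which matches the near-proportional power-frequency behavior at high frequency, while the type I point is well over 100fJ per cycle.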







In Fig 4.11, the power consumption is not proportional to the frequency when the frequency is below 100MHz. The reason is the short-circuit current of the last-stage inverter. As Fig 4.12 shows, when the oscillator operates at a low frequency, the conductivity of the transmission gate is low and node P does not reach full swing; at 45MHz, the voltage swing of node P is only about 100mV. The PMOS of the last inverter therefore conducts even at the maximum voltage of node P, and the short-circuit current of the last inverter increases the power consumption. When the frequency of the proposed type II oscillator is below 50MHz, this short-circuit current occurs and increases the power consumption. This results in the non-proportional frequency-power relationship at low frequencies. Although the non-full-swing condition affects the power consumption at low frequencies, the output clock CK2 is still sharp. Increasing the number of stages easily removes this non-full-swing phenomenon, since with more stages the transmission gates can be biased to a higher conductivity to achieve the same delay time.



Fig 4.12 The non-full swing condition when oscillator operates at low frequency

Fig 4.13 shows the power consumption of the proposed type II low voltage oscillator at different output frequencies with big inverter sizes (Wp=4.8u, Wn=2.4u). All results are simulated in UMC 90nm technology with a power supply voltage of 550mV. The power consumption increases because the transistor sizes increase. When the frequency is roughly below 100MHz, the short-circuit current described above occurs, and the resulting increase in power consumption leads to the non-proportional frequency-power relationship.



Fig 4.13 The power consumption of proposed type II low voltage oscillator with big inverter sizes

Fig 4.14 compares the power consumption when all oscillators operate at 50MHz with a 550mV power supply voltage. The simulation is based on UMC 90nm technology, and every oscillator drives a load of a 2x unit-size inverter. The type II oscillator shows very low power consumption, since the delay provided by the type II delay cell consumes almost no power.

| Delay time 20ns (50MHz) | Type II oscillator | Inverter style [36] | Current style [37] | Capacitor style [5] |
|-------------------------|--------------------|---------------------|--------------------|---------------------|
| Power                   | 0.612uW            | 124uW               | 3.97uW             | 13.3uW              |

Fig 4.14 The power consumption comparison with different oscillators

The net-bias circuit of the proposed type II oscillator duplicates the net-bias circuit of Section 4.3. One copy uses control signals b5~b0 to generate Vp, and the other uses control signals b5n~b0n to generate Vn. When the binary word b5~b0 counts up, the bias voltage controlled by b5~b0 decreases, while the bias voltage controlled by b5n~b0n increases.
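The opposite movement of the two bias voltages can be illustrated with a pair of control words. The sketch below assumes that b5n~b0n is the bitwise complement of b5~b0; the naming suggests this, but the text does not state it explicitly, so treat it as an assumption. The function name is hypothetical.

```python
# Control words for the double net-bias circuit (assumed complementary).
MASK_6BIT = 0x3F  # six control bits


def double_net_bias_words(code):
    """Return (b5..b0, b5n..b0n) for a 6-bit counter value.

    Assumption: the second word is the bitwise complement of the first,
    so it counts down while the first counts up.
    """
    if not 0 <= code <= MASK_6BIT:
        raise ValueError("code must be in 0..63")
    return code, code ^ MASK_6BIT


if __name__ == "__main__":
    for code in (0, 4, 8, 63):
        b, bn = double_net_bias_words(code)
        print(f"b = {b:06b}, bn = {bn:06b}")
```

Since each net-bias output falls as its own control word counts up, a complemented word would make the second output rise at the same time, which is the Vp-down, Vn-up behavior described above.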

The proposed type II oscillator has very low power consumption and a wide tuning range. At low frequencies (below 100MHz), however, the duty cycle is distorted if the control voltages are not matched. Ensuring a 50% duty cycle requires a more complicated bias circuit and control. With the simple net-bias circuit described above, the proposed type II oscillator generates a duty cycle of at most 65% in the low frequency range. The distorted duty cycle is acceptable in the solar cell power management system, but may not be acceptable for a high-performance system that needs a precise 50% duty cycle. To apply the type II oscillator to a high-performance system, the overhead of a more complicated net-bias circuit and control should be taken into account.

# Chapter 5 The Application of Power Efficiency Optimization Unit in Solar Cell Power Management System

In recent years, the market for portable devices such as notebooks, cell phones, PDAs and smart phones has grown rapidly, and more new portable products will be developed in the near future. As portable devices develop, more and more functions are integrated into a single product. At the same time, people care whether a product can be used for a long time without recharging the battery from a wall socket. Recently, the price of oil has kept rising, which affects electricity bills and overall expenses. To extend usage time and lower cost, low power techniques are urgently needed. As an alternative, people are actively looking for new energy sources. Environmental energy such as solar, thermal and wind power is used to generate electric power.


Due to the energy crisis and growing eco-awareness, research on energy harvesting applications is becoming popular. Previously, a system for tracking the maximum output power of a photovoltaic cell was implemented [38]. An ultra-low voltage power management circuit for energy harvesting applications was developed and works with an FIR filter [39]. For the low output voltage of a solar cell, a micro power management system was proposed [40]; it decides the working frequency of the charge pump according to the room lighting environment and outputs the maximum power to the load circuitry. An energy harvesting application with a micro battery was also implemented [41]; this power management circuit accepts energy from RF power and thermogenerator power and outputs it to a micro battery for energy storage. A battery management system for solar energy applications was developed [42] to increase the service life of the battery. In this chapter, the power efficiency optimization unit is designed and applied to an efficient power management system that is powered by solar energy and outputs three different voltage levels for computation circuitry and memory circuitry. The circuits of this system are designed with low power techniques. The control unit reduces power consumption when the system is powered by the rechargeable battery. A power efficiency optimization unit is designed for the charge pump. In the light loading case, the power efficiency optimization unit outputs a low frequency clock to the charge pump, which reduces the power consumption. In the heavy loading case, the clock frequency is increased to maintain the output voltage level of the charge pump.

#### 5.1 The Solar Cell Power Management System

The solar cell power management system is shown in Fig 5.1. It contains a PV cell, a control unit, a voltage regulator, a clock generator, voltage generators, a battery charger and a rechargeable battery. The control unit decides which source supplies energy to the voltage regulator. The voltage regulator outputs 500mV to the clock generator, the three charge pumps and the power efficiency optimization unit. The voltage generators and the battery charger are built from charge pumps. The voltage regulator, the 1V generator and the -0.5V generator supply three voltage levels to the computation circuitry and memory circuitry. The power efficiency optimization unit supplies a variable frequency clock to the 1V generator. The PV cell and the battery are also implemented as circuit models and simulated together with the power management system.



Fig 5.1 The solar cell power management system



Because the power management system has two supply sources, we design a control unit to increase the power efficiency of the overall system. The schematic of the control unit is shown in Fig 5.2. When the PV cell supplies energy to the voltage regulator, current flows from node 1 to node 2, so the voltage of node 1 is higher than that of node 2. The OP outputs "1" to the inverter, and the PMOS between the PV cell and the voltage regulator is turned on. In this case, the battery charger is active.

When the battery supplies energy to the voltage regulator, current flows from node 2 to node 1, so the voltage of node 2 is higher than that of node 1. The OP outputs "0" to the inverter, and the PMOS between the PV cell and the voltage regulator is turned off. In this case, the control unit disables the battery charger. By reducing the current flowing back to the PV cell and disabling the battery charger, the energy of the battery is used efficiently.
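The two operating cases of the control unit reduce to a single comparison of the node voltages. The sketch below is a behavioral model only: the function name and the dictionary keys are hypothetical, and the case of exactly equal voltages, which the text does not describe, is treated here as the battery case.

```python
# Behavioral model of the control unit decision (names are hypothetical).
def control_unit(v_node1, v_node2):
    """Return the control state for the given node voltages (in volts).

    Node 1 higher than node 2 means the PV cell is supplying the regulator.
    Equal voltages are not covered by the text; they are assumed to fall
    into the battery-supplied case.
    """
    pv_supplies = v_node1 > v_node2
    return {
        "op_output": 1 if pv_supplies else 0,  # comparator (OP) output
        "pmos_on": pv_supplies,                # PMOS between PV cell and regulator
        "charger_enabled": pv_supplies,        # battery charger activity
    }


if __name__ == "__main__":
    print("daytime:", control_unit(0.60, 0.58))
    print("night:  ", control_unit(0.10, 0.55))
```

In the night case, turning the PMOS off models the blocking of current back into the PV cell, and the disabled charger avoids spending battery energy on charging the battery itself.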







Fig 5.3 The output voltage of the regulator and of the 1V generator as the loading increases and the PV cell power gradually decreases

Fig 5.3 shows the simulated output voltage of the regulator (Vregu) and of the 1V generator (Vpump) as the loading increases and the PV cell power gradually decreases. Vregu is influenced by the power supplied by the PV cell and by the loading attached to it; in this simulation, Vregu ranges over 450mV~580mV. The 1V generator in this simulation is driven by a constant-frequency clock. As the loading increases and the PV cell power decreases, Vpump drops dramatically, ranging from 1.1V down to 570mV. With a constant clock, the output of the 1V generator therefore breaks down when the loading increases and the PV cell power decreases.

A power efficiency optimization unit is designed to optimize the power efficiency of the 1V generator. The unit supplies a variable clock frequency to the 1V generator according to the output condition. In the light loading case, the power efficiency optimization unit outputs a low frequency clock to the charge pump and reduces the power consumption. When the loading is heavy, the clock frequency is increased to maintain the output voltage level of the charge pump and prevent it from breaking down.

The power efficiency optimization unit and the clock system are supplied by the regulator (Vregu), so they have to work with a supply that varies as in the simulation above. The 1V generator is sensitive to the input clock frequency in the range 50MHz~200MHz. Increasing the clock frequency increases the charge pumping ability of the 1V generator, but above 200MHz the effect no longer increases. Similarly, reducing the clock frequency reduces the charge pumping ability and increases the power efficiency when the loading is light, since a light load does not need a high frequency clock to boost the charge pumping ability. Although a low frequency is desirable when the loading is light, in our simulation a frequency below 50MHz does not further increase the efficiency even with light loading. Based on these simulations, the clock supplied to the 1V generator ranges from 50MHz to 200MHz, and this is the minimum frequency range of the clock generator. The clock generator has to generate this minimum range under all possible power supply conditions, that is, with the regulator output (Vregu) varying over 450mV~580mV. The low voltage wide tuning range oscillators in Chapter 4 are designed according to this constraint.

Assume the output of the 1V generator is very lightly loaded. The clock frequency that achieves maximum power efficiency is then 50MHz, but the working condition of the voltage regulator can vary because the power supplied by the PV cell can vary, so the output voltage Vregu can be at any level within 450mV~580mV. The low frequency target is associated with the highest power supply, 580mV: the clock generator has to be able to generate 50MHz when the power supply is 580mV to guarantee the low frequency target in all power supply cases. The high frequency target is associated with the lowest power supply, 450mV: the clock generator has to be able to generate 200MHz when the power supply is 450mV to guarantee the high frequency target in all power supply cases.
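The corner conditions above can be written as a small check. The following is a minimal sketch, assuming an oscillator's tuning range is characterized as a (minimum, maximum) frequency pair at each supply corner. The function name `covers_targets` and the example tuning ranges are hypothetical and are not simulation results from this thesis.

```python
# Frequency targets and regulator output range taken from the text.
F_LOW_MHZ = 50.0     # low frequency target (light loading)
F_HIGH_MHZ = 200.0   # high frequency target (heavy loading)
VREGU_CORNERS_MV = (450, 580)  # lowest and highest regulator output


def covers_targets(tuning_range_mhz):
    """Return True if both targets are reachable at every supply corner.

    tuning_range_mhz maps a supply voltage in mV to a hypothetical
    (f_min, f_max) pair in MHz for the oscillator at that supply.
    """
    for supply_mv in VREGU_CORNERS_MV:
        f_min, f_max = tuning_range_mhz[supply_mv]
        if f_min > F_LOW_MHZ or f_max < F_HIGH_MHZ:
            return False
    return True


# Hypothetical example values, for illustration only.
wide_oscillator = {450: (20.0, 260.0), 580: (45.0, 600.0)}
narrow_oscillator = {450: (30.0, 210.0), 580: (70.0, 500.0)}

print(covers_targets(wide_oscillator))    # True
print(covers_targets(narrow_oscillator))  # False
```

The second example fails only at the 580mV corner, where its minimum frequency is above 50MHz, which is the kind of miss the low frequency target is meant to catch.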

The proposed type I low voltage oscillator in chapter 4 meets the high frequency target of 200MHz over the power supply range, but slightly misses the low frequency target of 50MHz when the power supply is roughly above 500mV. This performance variation, however, is still acceptable in our system. The proposed type II low voltage oscillator in chapter 4 has a very wide tuning range and meets both the low and high frequency targets over the whole power supply range. Its range is much larger than the minimum range required by the design constraint, which gives the design margin to handle process variation. The type II low voltage oscillator was simulated with the power management system; the simulation results are shown in the next section.

#### 5.2 The Power Efficiency Optimization Unit

The power efficiency optimization unit controls the clock frequency of the variable clock generator and optimizes the power efficiency of the 1V generator according to the loading condition. When the loading of the 1V generator changes, the transfer power efficiency changes if the supply clock frequency is fixed. Generally, a heavy loading condition needs a high frequency clock to achieve high transfer power efficiency, and a light loading condition needs a low frequency clock. To achieve high transfer power efficiency across different loading conditions, a detecting mechanism is needed to decide the loading condition. In the power efficiency optimization unit, a voltage detector is used to decide the loading condition.

Fig 5.4 shows the architecture of the power efficiency optimization unit. The voltage detector detects the output voltage of the 1V generator to decide the loading condition. When the loading becomes heavy, the output voltage of the 1V generator decreases, and vice versa. In the solar cell power management system, power is the main concern. Complicated detection and operation are not suitable in this system since the power overhead is too large; a simple and efficient system is desired. The voltage detector in the power optimization unit is therefore a one-bit detector. It detects whether the output voltage of the 1V generator is above or below the designed voltage level, and a detecting flag is generated to represent these two conditions.



Fig 5.4 The architecture of power efficiency optimization unit

Two voltage detectors are proposed. The first is the oscillating voltage detector and the second is the bias voltage detector. Both are discussed below, and both have been simulated with the system to verify their performance.

The oscillating voltage detector was designed first; its main feature is that no bias voltage is needed. Fig 5.5 shows the oscillating voltage detector.



Fig 5.5 The oscillating voltage detector

The oscillating voltage detector consists of an oscillator whose power supply is the output of the 1V generator. When the output of the 1V generator decreases, the output frequency of the oscillator also decreases, and vice versa.

The output of the oscillator is connected to the charge-detecting line (CDL) presented in chapter 3. The CDL captures the charge condition and generates the detecting flag. The CDL is designed to detect a specific frequency condition. When the output voltage of the 1V generator decreases, the output frequency of the oscillator decreases and its period increases. In the positive phase of the output clock, the CDL starts to charge, and the D flip-flop captures the charge condition. The lower the output voltage of the 1V generator is, the longer the charge time will be. If the output voltage of the 1V generator is lower than the chosen threshold voltage, the long charge time drives the capacitor node over the inverter threshold voltage and the CDL captures the "LOW" detecting flag. Similarly, a high output voltage of the 1V generator increases the frequency and decreases the period of the oscillator; this shortens the charge time, and the CDL captures the "HIGH" detecting flag to represent a high output voltage of the 1V generator.

The detecting point is defined as follows. When the output voltage of the 1V generator is above the detecting point, the voltage detector captures "HIGH". When the output voltage of the 1V generator is below the detecting point, the voltage detector captures "LOW". The detecting point is the detecting threshold of the voltage detector, and it can be set arbitrarily by changing the capacitor sizes. Fig 5.6 shows the detecting point of the oscillating voltage detector under different temperature conditions.

| Temperature (°C)     | 45  | 40  | 35  | 30  | 25  | 20  | 15  | 10  | 5   |
|----------------------|-----|-----|-----|-----|-----|-----|-----|-----|-----|
| Detecting point (mV) | 890 | 900 | 915 | 920 | 928 | 935 | 955 | 976 | 990 |

Fig 5.6 The detecting point of oscillating voltage detector versus different temperature conditions.
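The temperature dependence in Fig 5.6 can be summarized numerically. The sketch below copies the tabulated values and computes the end-to-end drift of the detecting point; the linear interpolation between table entries is an assumption made here for illustration, not a model taken from the thesis, and the function names are hypothetical.

```python
# Detecting point versus temperature, copied from Fig 5.6.
TEMPERATURE_C = [5, 10, 15, 20, 25, 30, 35, 40, 45]
DETECTING_POINT_MV = [990, 976, 955, 935, 928, 920, 915, 900, 890]


def average_drift_mv_per_c():
    """End-to-end slope of the detecting point across the table."""
    dv = DETECTING_POINT_MV[-1] - DETECTING_POINT_MV[0]
    dt = TEMPERATURE_C[-1] - TEMPERATURE_C[0]
    return dv / dt


def detecting_point_mv(temp_c):
    """Linearly interpolate the detecting point between table entries."""
    if not TEMPERATURE_C[0] <= temp_c <= TEMPERATURE_C[-1]:
        raise ValueError("temperature outside the tabulated range")
    for i in range(len(TEMPERATURE_C) - 1):
        t0, t1 = TEMPERATURE_C[i], TEMPERATURE_C[i + 1]
        if t0 <= temp_c <= t1:
            v0, v1 = DETECTING_POINT_MV[i], DETECTING_POINT_MV[i + 1]
            return v0 + (v1 - v0) * (temp_c - t0) / (t1 - t0)


print(average_drift_mv_per_c())   # -2.5
print(detecting_point_mv(25))     # 928.0
```

Over the tabulated 5°C to 45°C range the detecting point moves by 100mV, an average of about -2.5mV per °C.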

Although the oscillating voltage detector can detect the voltage, its power consumption is large. The oscillator is a power consuming component, and it draws its power from the output of the 1V generator, which reduces the power efficiency. To reduce the power consumption and make the voltage detector draw less power from the output of the 1V generator, the bias voltage detector is designed.


Fig 5.7 shows the bias voltage detector. The operation is described as follows. When the output voltage of the 1V generator decreases, vd increases and is compared with vref, which is generated by the reference voltage generator circuit. The comparison result generates the detecting flag. The detecting point of the bias voltage detector can be adjusted through vref and through the NMOS and PMOS transistors in the lower part of Fig 5.7. The main improvement of the bias voltage detector is its reduced power consumption: it draws very little power from the output of the 1V generator, since that output is connected to the gate of an NMOS transistor.



Fig 5.7 The bias voltage detector

The power efficiency optimization unit also contains a counter, which counts up or down according to the voltage detecting flag. A 5-bit counter is used in this design. The binary word of the counter is sent to the bias circuit of the clock system to control the output clock frequency. The power efficiency optimization unit can thus detect the loading condition through the output voltage of the 1V generator and dynamically change the clock frequency supplied to the 1V generator.
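The detect-and-count loop described above can be illustrated with a short behavioural sketch. Several details are assumptions made here and are not stated in the text: that a HIGH flag counts down and a LOW flag counts up, that the counter saturates at its end codes instead of wrapping, and that the counter code maps linearly onto the 50MHz to 200MHz range. All function names are hypothetical.

```python
COUNTER_BITS = 5                        # 5-bit counter, as stated in the text
COUNTER_MAX = (1 << COUNTER_BITS) - 1   # largest code, 31
F_MIN_MHZ, F_MAX_MHZ = 50.0, 200.0      # minimum required clock range


def next_count(count, flag_high):
    """Advance the up/down counter by one step, saturating at the ends.

    flag_high is True when the 1V generator output is above the
    detecting point. Assumption: a HIGH flag lowers the code (slower
    clock) and a LOW flag raises it (faster clock).
    """
    step = -1 if flag_high else 1
    return max(0, min(COUNTER_MAX, count + step))


def code_to_frequency_mhz(count):
    """Hypothetical linear mapping from counter code to clock frequency."""
    return F_MIN_MHZ + (F_MAX_MHZ - F_MIN_MHZ) * count / COUNTER_MAX


def run(flags, count=0):
    """Apply a sequence of detecting flags and return the final code."""
    for flag_high in flags:
        count = next_count(count, flag_high)
    return count


# Heavy load for 40 cycles drives the code to its maximum; light load
# for 40 cycles afterwards drives it back to zero.
code = run([False] * 40)
print(code, code_to_frequency_mhz(code))   # 31 200.0
code = run([True] * 40, count=code)
print(code, code_to_frequency_mhz(code))   # 0 50.0
```

With a one-bit flag the loop never settles on a single code; near the detecting point it alternates between adjacent codes, which is the price of keeping the detector simple.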

## 5.3 The Simulation Results of Power Efficiency Optimization Unit and the Solar Cell Power Management System

To verify the solar cell power management system, the design is implemented with the UMC 90nm CMOS technology model. The power efficiency optimization unit has been simulated and compared with a constant clock frequency supply. Fig 5.8 shows the power efficiency measurement of the 1V generator.



Fig 5.8 The power efficiency measurement of the 1V generator

The power efficiency of the 1V generator is measured as

$$\text{Power efficiency} = \frac{\text{Power}_{\text{pumpout}}}{\text{Clk Power} + \text{Power}_{\text{pumpin}}} \times 100\%$$
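As a minimal sketch of this efficiency measure, the function below divides the pump output power by the sum of the clock power and the pump input power. The function name and the example operating point are hypothetical and are not results from this thesis.

```python
def power_efficiency_percent(p_pumpout, p_clk, p_pumpin):
    """Efficiency of the 1V generator in percent.

    All three powers must be in the same unit (for example mW).
    """
    total_in = p_clk + p_pumpin
    if total_in <= 0:
        raise ValueError("total input power must be positive")
    return 100.0 * p_pumpout / total_in


# Hypothetical operating point, for illustration only.
print(power_efficiency_percent(p_pumpout=3.0, p_clk=1.0, p_pumpin=4.0))  # 60.0
```

Because the clock power appears in the denominator, a faster clock than the load requires lowers the measured efficiency even when the pump itself is unchanged.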

In the constant clock cases, the voltage detector feedback control is simply cut off and the clock frequency supplied to the 1V generator is fixed.

Fig 5.9 compares the power efficiency of the dynamic detection of the power efficiency optimization unit with that of a constant clock frequency supply, using the oscillating voltage detector. The detecting point is set to 900mV. As can be seen, the dynamic detection has better power efficiency than the two constant clock frequencies, 66MHz and 180MHz. When the load current is low, the voltage detector sends the flag indicating that the output voltage of the 1V generator is high enough, so the clock frequency is decreased to the lowest clock frequency. The dynamic detection achieves better power efficiency than the 66MHz case in this region because it operates at a frequency below 66MHz. When the loading increases, the flag is sent to the clock system to increase the clock frequency so that the system can operate with heavy loading.



Fig 5.9 The power efficiency of the 1V generator (oscillating voltage detector)



Fig 5.10 compares the power efficiency of the dynamic detection of the power efficiency optimization unit with that of a constant clock frequency supply, using the bias voltage detector. The detecting point is set to 900mV. The dynamic detection again shows better power efficiency, and compared with Fig 5.9 the power efficiency is improved and the maximum allowable load current is also increased. This is because the bias voltage detector consumes less power from the output of the 1V generator.



Fig 5.10 The power efficiency of the 1V generator (bias voltage detector)



The impact of PV variation on the solar cell power management system was also verified by simulation. In this simulation the supply current of the PV cell varies from 4mA to 0mA, reaching zero current in 10us. The result is shown in Fig 5.11. The first row is the output voltage of the PV cell, which varies from 840mV to 177mV; the PV cell drains a little current from the battery. The second row is the output voltage of the voltage regulator, which varies from 592mV to 482mV. The third row is the output of the 1V generator, which varies from 1.11V to 0.9V. The fourth row is the output of the -0.5V generator, which varies from -547mV to -440mV. The simulation also shows that when the PV cell voltage is too low, the solar cell power management system switches the power supply from the PV cell to the battery and keeps the system working.


Fig 5.11 The three different output voltage with variation of current from PV cell

The difference between the power management system without the control unit (CU) and with the CU is shown in Fig 5.12. In this simulation, the PV cell is set to output zero current and the power management system is supplied by the battery.

Without the control unit, that is, with the PV cell connected directly to the voltage regulator, the output current of the battery is 961uA in 30ns. With the control unit, the output current of the battery is reduced to 200uA in 30ns.

The simulation result shows that with the CU, the power consumption of the battery is reduced by about 80% compared with the power management system without the CU. This extends the usable time of the battery.
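As a quick check of the stated saving, the two battery currents quoted above can be compared directly. This assumes the battery voltage is the same in both cases, so that battery power scales with battery current:

$$\frac{961\,\mu\text{A} - 200\,\mu\text{A}}{961\,\mu\text{A}} = \frac{761}{961} \approx 0.79$$

which is consistent with a reduction of roughly 80%.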



Fig 5.12 Comparison of power management system with CU and without CU

The layout view of the solar cell power management system is shown in Fig 5.13. The specification of power management system is summarized in TABLE III.



Fig 5.13 The layout view of solar cell power management system

TABLE III

POWER MANAGEMENT SYSTEM FOR SOLAR ENERGY HARVESTING

| Parameter                      | Value                    |
|--------------------------------|--------------------------|
| Technology                     | UMC 90nm CMOS technology |
| Output current of PV cell      | 4mA~0mA                  |
| Output voltage of PV cell      | 840mV~177mV              |
| Output power of PV cell        | 2.3mW~0mW                |
| 0.5V output                    | 592mV~482mV              |
| 1V output                      | 1.11V~0.9V               |
| -0.5V output                   | -547mV~-440mV            |
| Maximum total power efficiency | 69%                      |

An efficient power management system for solar energy harvesting applications is implemented in UMC 90nm CMOS technology. The output current of the PV cell varies from 4mA to 0mA and its output voltage varies from 840mV to 177mV. The power management system outputs voltages of 0.5V, 1V and -0.5V to the loading circuitry. The 0.5V output varies from 592mV to 482mV, the 1V output varies from 1.11V to 0.9V, and the -0.5V output varies from -547mV to -440mV. The power efficiency optimization unit is designed and applied to the 1V generator to make the power transfer more efficient when the loading varies. With the control unit and the power efficiency optimization unit, the battery supplies energy efficiently and its power consumption is reduced to about one fifth (20%). The maximum total power efficiency is 69%.



# Chapter 6 Conclusion and Future Work

### 6.1 Conclusion

In this thesis, a dual output clock generator with dynamic frequency/phase tuning ability is proposed first. The smooth charge-based phase blender (SCPB) has been developed to enhance the total phase resolution. The SCPB can operate with an input phase difference from 0.3ns to 0.5ns with less error than a conventional phase blender. The frequency/phase synthesizer has a wide operation range in both frequency and phase, and the dynamic frequency/phase scaling process switches instantly without any relock time. Each clock output is independent in any dynamic scaling operation. The low power delay cell consumes little power, and the digital charge-detecting controller can finish the lock procedure in 10 reference clock cycles. The proposed clock generator is implemented in both TSMC 130nm CMOS technology and UMC 90nm CMOS technology. Using UMC 90nm CMOS technology, the output frequency range is from 135MHz to 3GHz, and the generator consumes 2.3mW when both outputs operate at 3GHz. Adding an additional output operating at 3GHz increases the total power by only 0.5mW, which makes this architecture extensible. The proposed clock generator can be used in data recovery, high speed serial link, and SoC applications. The research motivation is to apply it to an advanced power management concept with dynamic frequency/phase scaling demands.

Two low voltage wide tuning range oscillators are studied for the solar cell power management application. The design target is a tuning range wide enough to meet the high and low frequency targets over a range of power supply voltages. The proposed type I differential oscillator has a wide tuning range, but its power consumption is relatively high. The proposed type II oscillator has an even wider tuning range and greatly reduced power consumption.

Finally, the power efficiency optimization unit and the solar cell power management system are proposed. In the solar cell power management system, the PV cell outputs a voltage of 177mV~840mV and the system outputs 500mV, 1V and -500mV for the computation circuitry and the memory circuitry. The power management system also contains a rechargeable battery. In the daytime, the energy of the power management system is supplied by the photovoltaic (PV) cell and the battery is charged. At night, the battery is discharged and supplies energy to the power management system. The power efficiency optimization unit dynamically tunes the clock frequency supplied to the 1V generator according to the loading strength to achieve high power efficiency. With the control unit, the battery power consumption is decreased by about 80% compared with the system without the control unit. The maximum total power efficiency is 69%. All results are simulated with the UMC 90nm CMOS technology model.

## 6.2 Future Work



Because of the quadratic relationship between power and voltage, supply voltage reduction has become an important method for reducing active power in VLSI systems, improving reliability of highly scaled MOSFETs, and minimizing the effects of heat dissipation in high-performance systems. In fact, several low power sub-volt consumer products, including microprocessors, have already emerged [43], [44], [45]. Device scaling can lead to high lateral and vertical fields, which can cause breakdown and increase tunneling currents, both of which are mitigated by reducing supply voltage. The International Technology Roadmap for Semiconductors (ITRS) [46] expects sub-1 V nominal supply voltages for low operating power at the 90 nm technology node, decreasing to only 0.5 V at the 22 nm node. The trend toward low voltage operation is rising because of these power issues.
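The quadratic dependence mentioned above is the standard first-order model of CMOS switching power, restated here as a reminder. The symbols $\alpha$ (switching activity), $C_L$ (switched capacitance) and $f$ (clock frequency) are introduced only for this illustration and are not defined elsewhere in this chapter:

$$P_{\text{dyn}} = \alpha \, C_L \, V_{DD}^{2} \, f$$

Under this model, with $\alpha$, $C_L$ and $f$ held fixed, lowering $V_{DD}$ from 1 V to 0.5 V scales the dynamic power by $(0.5/1)^2 = 1/4$.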

The proposed dual output clock generator can be adjusted to operate at a low supply voltage. Low voltage operation can further reduce the power consumption and makes it easy to integrate with a low voltage system. The advanced power management concept can also be implemented and connected to the proposed multi-output clock generator. This helps to verify the advanced power management concept and to measure the gain and overhead in order to evaluate the trade-off. Fig 6.1 shows the conceptual advanced power management system.



Fig 6.1 The advanced power management concept

The future work on the solar cell power management system is to further extend the functions of the management system. The proposed solar cell power management system charges the battery whenever the PV cell is powered, and the -0.5V and 1V output nodes are always supplied. A more flexible control scheme can be designed to turn the output nodes on or off and to control the decision to charge the battery. Also, the solar cell power management system can be designed to work with different kinds of solar cells, since different solar cells have different specifications and I-V characteristics. A wider operation range will make the solar cell power management system easier to apply to different solar cell products and will extend its adaptability and flexibility. The computation circuit attached to the solar cell power management system can be designed and integrated into the system. This forms a whole system covering energy harvesting, DC/DC conversion and the main computation circuit. This can be the prototype of a complete portable system for a specific application, and various portable systems can be designed by attaching different computation circuits. The solar cell power management system can be the platform that connects to different portable systems.



# REFERENCE

[1] N. T. Hieu, T. W. Lee, H. H. Park, "All-Digital Phase-Locked Loop for Optical Interconnect Applications," *Advanced Communication Technology*, Vol. 3, pp.1829-1832, Feb. 2007.

[2] R.B. Staszewski, K. Muhammad, D. Leipold, "Digital RF Processor Techniques for Single-Chip Radios," *IEEE CICC*, pp.789–796, Sept. 2006.

[3] D. Sheng, C. C. Chung, C. Y. Lee, "A Fast-Lock-In ADPLL with High-Resolution and Low-Power DCO for SoC Applications," *IEEE APCCAS*, pp.105-108, Dec. 2006.

[4] M. Kossel, T. Morf, W. Baumberger, A. Biber, C. Menolfi, T. Toifl, M. Schmatz, "A Multiphase PLL for 10 Gb/s Links in SOI CMOS Technology," *IEEE RFIC Symposium*, pp. 207–210, June. 2004.

[5] J. S. Wang, Y. M. Wang, C. H. Chen, Y. C. Liu, "An ultra-low-power fast-lock-in small-jitter all-digital DLL," *IEEE ISSCC Dig.* Tech. Papers, vol. 1, pp. 422-607, Feb. 2005.

[6] T. C. Weigandt, B. Kim, P. R. Gray, "Timing Jitter Analysis for High-Frequency, Low-Power CMOS Ring-Oscillator Design", *IEEE ISCAS*, June. 1994.

[7] F. M. Gardner, "Charge-Pump Phase-Locked Loops," *IEEE Trans. on Communications*, vol. COM-28, no. 11, Nov. 1980.

[8] J. Sonntag, R. Leonowich, "A Monolithic CMOS 10 MHz DPLL for Burst-Mode Data Retiming". *IEEE ISSCC*. vol. 33, pp. 104-105, Feb. 1990.

[9] M. J. M. Pelgrom, A. C. J. Duinmaijer, and A. P. G.Welbers, "Matching properties of MOS transistors," *IEEE JSSC*, vol. 24, pp. 1433–1440, Oct. 1989.

[10] A. Hastings, *The Art of Analog Layout*. Englewood Cliffs, NJ: Prentice-Hall, 2001.

[11] T. C.Weigandt, B. Kim, and P. R. Gray, "Timing jitter analysis for high frequency,low-power CMOS ring-oscillator design," presented at the Proc. Int. Symp. Circuits and Systems, London, U.K., June. 1994.

[12] H. B. Bakoglu, *Circuits, Interconnections, and Packaging for VLSI*. Reading, MA: Addison-Wesley, 1990.

[13] P. Larsson, "Power supply noise in future ICs: A crystal ball reading," *IEEE CICC*, 1999, pp. 467–474.

[14] P. Larsson, "di/dt noise in CMOS integrated circuits," *Analog Integrated Circuits and Signal Processing*, no. 1/2, pp. 113–130, Sept. 1997.

[15] Y. Okazaki et al., "Characteristics of a new isolated p-well structure using thin epitaxy over the buried layer and trench isolation," *IEEE Trans. Electron Devices*, vol. 39, pp. 2758–2764, Dec. 1992.

[16] R. B. Merrill, W. M. Young, and K. Brehmer, "Effect of substrate material on crosstalk in mixed analog/digital integrated circuits," in *Proc. IEEE IEDM*, 1994, pp. 433–436.

[17] K. Joardar, "Substrate crosstalk in BiCMOS mixed mode integrated circuits," *Solid-State Electron.*, vol. 39, no. 4, pp. 511–516, Apr. 1996.

[18] P. Larsson, "Measurements and analysis of PLL jitter caused by digital switching noise," *IEEE JSSC*, vol. 36, pp. 1113–1119, July. 2001.

[19] A. P. Chandrakasan , R. W. Brodersen, editors. "Low- Power CMOS Design," Wiley-IEEE Press, 1997.

[20] D.M. Chapiro, "Globally Asynchronous Locally Synchronous Systems," PhD thesis, Stanford University, 1984.

[21] A. Iyer, D. Marculescu, "Power and Performance Evaluation of Globally Asynchronous Locally Synchronous Processors," *International Symposium on Computer Architecture*, pp. 158-170, May. 2002.

[22] G. Magklis et al., "Profile-based Dynamic Voltage and Frequency Scaling for a Multiple Clock Domain Microprocessor," *International Symposium on Computer Architecture*, pp. 14-27, June. 2003.

[23] G. Semeraro et al., "Energy-Efficient Processor Design Using Multiple Clock Domains with Dynamic Voltage and Frequency Scaling", Proc. 8th *Int'l Symp. High-Performance Computer Architecture*, pp. 29-40, 2002.

[24] G. Semeraro et al., "Dynamic Frequency and Voltage Control for a Multiple Clock Domain Microarchitecture", Proc. 35th *Ann. IEEE/ACM Int'l Symp. Microarchitecture*, pp. 356-370, 2002.

[25] Iyer A and Marculescu D., "Power efficiency of voltage scaling in multiple clock, multiple voltage cores", *International Conference on Computer-Aided Design*, pp. 379-386, 2002.

[26] Eswaran A and Chen S, "All-domain fine grain dynamic speed/voltage scaling for GALS processors", 2003, URL http://www.ece.cmu.edu/~schen1/ece743/proposal\_up.pdf.

[27] J. Rong, Z. Xianjun, C. Liang, Z. Junfeng, "The Implementation and Design of a Low-Power Clock Distribution Microarchitecture," *International Conference on Networking, Architecture,* pp. 21-30, 29-31 July 2007.

[28] Reinman G. and Jouppi N., "An Integrated Cache Timing and Power Model". Technical report 2000/7, Western Research Laboratory, USA, 2000.

[29] Wang Yongwen and Zhang Minxuan, "Microarchitecture-Level Power Modeling and Analyzing for High-Performance Microprocessors", *Chinese Journal of Computers*, Vol.27, pp.1320-1327, Nov. 2004.

[30] Wang Yongwen and Zhang Minxuan, "IMPACT XP: An Integrated Performance / Power Analysis Framework for Compiler and Architecture Research", *Chinese Journal of Electronics*, Vol.13, pp. 250-253, Nov. 2004.

[31] Chang P., Mahlke S., Chen W. et al., "An Architectural Framework for Multiple-instruction-issue Processors", *International Symposium on Computer Architecture*, pp. 266-275, 1991.

[32] B.W. Garlepp, K.S. Donnelly, J. Kim, P.S. Chau, J.L. Zerbe, C. Huang, C.V. Tran, C.L. Portmann, D. Stark, Y. F. Chan, T.H Lee, M.A Horowitz, "A portable digital DLL for high-speed CMOS interface circuits," *IEEE JSSC*, Vol. 34, No.5, pp.632-644, May. 1999.

[33] D.E. Calbaza, Y. Savaria, "A Direct Digital Period Synthesis Circuit", *IEEE JSSC*, Vol. 37, No.8, pp. 1039-1045, Aug. 2002.

[34] B. Pontikakis, H. T. Bui, F.R Boyer, Y. Savaria, "A Low-Complexity High-Speed Clock Generator for Dynamic Frequency Scaling of FPGA and Standard-Cell Based Designs," *IEEE International Symposium on Circuits and Systems*, pp:633–636, May. 2007.

[35] J. H. Kim, Y. H. Kwak, M. Kim, S. W. Kim, C. Kim, "A 120-MHz–1.8-GHz CMOS DLL-Based Clock Generator for Dynamic Frequency Scaling," *IEEE JSSC*, Vol 41, No.9, pp.2077-2082, Sept. 2006.

[36] R.E.Best, "Phase-locked loops: Theory, Design and Applications," New York: McGraw-Hill, 1984.

[37] M. Maymandi-Nejad et al., "A Monotonic Digitally Controlled Delay Element," *IEEE JSSC*, vol. 40, no. 11, pp. 2212-2218, Nov. 2000.

[38] C. C. Hua, J. R. Lin, and C. Shen, "Implementation of a DSP-Controlled Photovoltaic System with Peak Power Tracking," *IEEE Transactions on Industrial Electronics*, vol. 45, No. 1, pp. 99-107, Feb. 1998.

[39] C.Y. Tsui, H. Shao, W. H. Ki and F. Su, "Ultra-Low Voltage Power Management and Computation Methodology for Energy Harvesting Applications," *Symposium on VLSI Circuits* Digest of Technical Papers, pp. 316-319, 2005.

[40] H. Shao, C. Y. Tsui and W. H. Ki, "A Micro Power Management System and Maximum Output Power Control for Solar Energy Harvesting Applications," *IEEE ISLPED*, pp. 298-303, Aug. 2007.

[41] H. Lhermet, C. Condemine, M. Plissonnier, R. Salot, P. Audebert, and M. Rosset, "Efficient Power management Circuit: From Thermal Energy Harvesting to Above-IC Microbattery Energy Storage," *IEEE JSSC*, vol. 43, No. 1, pp. 246-254, Jan. 2008.

[42] M. Glavin and W.G. Hurley, "Battery Management System for Solar Energy Applications," *International Universities Power Engineering Conference*, Vol.1, pp.79-83, Sept. 2006.

[43] (2005) [Online]. http://www.transmeta.com/crusoe.

[44] (2005) [Online]. http://www.amd.com/us-en/0..3715\_12353.00.html



[45] (2005) ITRS. [Online]. http://public.itrs.net

# Vita

#### PERSONAL INFORMATION

- Birth Date: November 04, 1983
- Birth Place: Taipei, Taiwan, R.O.C.
- Address: Department of Electronics Engineering National Chiao Tung University 1001 Ta-Hsueh Road Hsin-chu, Taiwan 30010, R.O.C.
- E-Mail Address: maxto777@gmail.com

### **EDUCATION**

B.S. [2006] Department of Electronics Engineering, National Chiao-Tung University.

M.A. [2008] Institute of Electronics, National Chiao-Tung University.

