

# Department of Electronics Engineering and Institute of Electronics

### Master's Thesis

Ultra-low Dynamic Voltage Scaling Frequency-Ratio-Based PVT Sensor Design and Applications

Student: Shang-Yaun Lin

Advisor: Prof. Wei Hwang

### September 2011

### Ultra-low Dynamic Voltage Scaling Frequency-Ratio-Based PVT Sensor Design and Applications

Student: Shang-Yaun Lin

Advisor: Prof. Wei Hwang

National Chiao Tung University



Submitted to Department of Electronics Engineering & Institute of Electronics

College of Electrical Engineering and Computer Engineering

National Chiao Tung University

in partial Fulfillment of the Requirements

for the Degree of

Master

in

**Electronics Engineering** 

September 2011

Hsinchu, Taiwan, Republic of China


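## Ultra-low Dynamic Voltage Scaling Frequency-Ratio-Based PVT Sensor Design and Applications

Student: Shang-Yaun Lin

#### Advisor: Prof. Wei Hwang

#### Department of Electronics Engineering and Institute of Electronics, National Chiao Tung University

#### Abstract

As process technology continues to scale, high-density integrated circuits suffer from self-heating. For effective thermal management, this thesis proposes process, voltage, and temperature variation sensors with adaptive voltage selection. They operate over a supply range of 0.25 V to 0.5 V, consume 2.3 µW, and reach a conversion rate of 50k samples/s. Next, a 0.4 V fully on-chip process invariant temperature sensor is proposed, in which the effect of process variation is significantly reduced. The circuit has been implemented and operates at a 0.4 V supply over a temperature range of 0°C to 100°C. The core area of the temperature sensor (excluding I/O pads) is only 990 µm<sup>2</sup>, and the energy per conversion is 11.6 pJ/sample. These features make the temperature sensor suitable for energy-limited and energy-harvesting miniature portable platforms. Finally, a three-dimensional integrated circuit (3D-IC) architecture is presented. To prevent hot spots within each layer and to reduce DRAM refresh power, we propose a DRAM refresh controller that uses the process invariant temperature sensor. Because the temperature sensor consumes very little power, the refresh controller adds little power overhead and reduces standby power by 67.67%.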


# Ultra-low Dynamic Voltage Scaling Frequency-Ratio-Based PVT Sensor Design and Applications

Student: Shang-Yaun Lin Advisor: Prof. Wei Hwang

Department of Electronics Engineering & Institute of Electronics National Chiao-Tung University

### **ABSTRACT**

As process technology continues to scale, the high level of integration introduces the problem of self-heating. To support thermal management, this thesis proposes process, voltage, and temperature (PVT) sensors with adaptive voltage selection that operate over an ultra-low supply voltage range of 0.25 V to 0.5 V, with 2.3 µW power consumption and a conversion rate of 50k samples/s. Next, a 0.4 V fully integrated process invariant temperature sensor is proposed, in which the effect of process variation is significantly reduced. The implementation operates at a 0.4 V supply over a temperature range of 0°C to 100°C. The area of the sensor core (without I/O pads) is only 990 µm<sup>2</sup>, and the energy per conversion is 11.6 pJ/sample. This high area and energy efficiency makes the proposed sensor suitable for energy-limited miniature portable platforms. Finally, a heterogeneous three-dimensional integrated circuit (3D-IC) architecture is presented. To prevent hot spots within a layer and to reduce refresh power on the DRAM layer, we propose a DRAM refresh controller that uses the process invariant temperature sensor. Because the temperature sensors consume very little power, the refresh controller reduces standby power by 67.67% without much power overhead.

### Acknowledgements

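Many people helped me complete this thesis, and I am grateful to all of them.

First, I would like to thank my advisor, Prof. Wei Hwang, for the guidance and encouragement of the past two years. Prof. Hwang offered many directions and suggestions during the research. Under this guidance I gained a deeper understanding of research and developed a real interest in it. The research resources and the free research environment provided by Prof. Hwang also allowed me to make full use of my abilities in completing this thesis.

I thank the senior members and classmates of the laboratory for their help over these two years. I thank three senior labmates, 張銘宏, 黃柏蒼, and 楊皓義, for their assistance with my research, and especially 張銘宏 for his great help with both the thesis and the research. I also thank my other lab partners for covering for one another day to day, which let me come through a heavy course load unscathed.

Finally, I thank my family and friends for their constant support, which helped me overcome the pressure of the thesis and complete my master's research.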

### Content

- Chapter 1 Introduction
  - 1.1 Motivation
  - 1.2 Research Goal and Major Contributions
  - 1.3 Thesis Organization
- Chapter 2 Previous Works of Temperature Sensors
  - 2.1 Introduction
  - 2.2 BJT Based Temperature Sensor
  - 2.3 Analog CMOS Based Temperature Sensor
  - 2.4 Delay Based Temperature Sensors
    - 2.4.1 Time-to-Digital Converter Based Temperature Sensors
    - 2.4.2 Dual-DLL-Based All-Digital Temperature Sensor
    - 2.4.3 Sub-µW Embedded CMOS Temperature Sensor
  - 2.5 Leakage Based Temperature Sensor
  - 2.6 Frequency-to-Digital Converter Based Temperature Sensor
  - 2.7 Summary
- Chapter 3 0.5V~0.25V Process, Voltage and Temperature Sensors with Adaptive Voltage Selection
  - 3.1 Introduction
  - 3.2 Design Principles in Ultra-Low Voltage
    - 3.2.1 Challenges of Temperature Sensor in Ultra-Low Voltage
    - 3.2.2 Ultra-Low Voltage Frequency-Based Temperature Sensor
  - 3.3 PVT Sensors with Adaptive Voltage Selection
    - 3.3.1 Finite State Machine
    - 3.3.2 Process and Voltage Sensor
    - 3.3.3 Temperature Sensor
    - 3.3.4 PV-Compensation
  - 3.4 Simulation Results
  - 3.5 Summary
- Chapter 4 0.4V Fully Integrated Process Invariant Temperature Sensor
  - 4.1 Introduction
  - 4.2 Design Concepts of Process Invariant Temperature Sensor
  - 4.3 Specific Architecture of Process Invariant Temperature Sensor
  - 4.4 Simulation and Experimental Results
  - 4.5 Summary
- Chapter 5 Temperature-Aware DRAM Refresh Controller in TSV 3D-IC
  - 5.1 Introduction
  - 5.2 Thermal Issues and Solutions in 3D-IC and DRAM Refresh
    - 5.2.1 Thermal Issues in 3D-IC
    - 5.2.2 The 3D-IC with Interlayer Cooling
    - 5.2.3 Previous Works of DRAM Refresh Control
  - 5.3 Heterogeneous 3D Integration
  - 5.4 Temperature-Aware Refresh Controller of DRAM Layer
    - 5.4.1 Data Retention Time Analysis
    - 5.4.2 Proposed Refresh Control Scheme
    - 5.4.3 Simulation Results
  - 5.5 Summary
- Chapter 6 Conclusions and Future Work
  - 6.1 Conclusions
  - 6.2 Future Work
- Reference



### **List of Tables**

- Table 2.1 Comparison of each type temperature sensors.
- Table 3.1 Temperature sensor comparisons.
- Table 4.1 Performance Comparison of Recent Temperature Sensors.
- Table 5.1 The relation between control signal and refresh frequency.



### **List of Figures**

- Figure 1.1 Sensor network of WBAN [1.10].
- Figure 2.1 Schematic of BJT based temperature sensor [2.13].
- Figure 2.2 Dual-slope integrating ADC for (a) generic design, (b) timing diagram [2.13].
- Figure 2.3 Schematic of the thermal sensing system [2.13].
- Figure 2.4 Schematic of the front end sense stage with active cascode impedance enhancement, DEM and chopped currents in BJT pairs. Output $V_{BE1}$ and $V_{BE2}$ are routed to the ADC input MUX [2.13].
- Figure 2.5 Illustration of the current chopping to up-convert the temperature signal to $f_{chop}$ [2.13].
- Figure 2.6 Conventional three-transistor temperature sensor [2.17].
- Figure 2.7 Conventional three-transistor temperature sensor [2.17].
- Figure 2.8 Four-transistor, voltage output, temperature sensor [2.18].
- Figure 2.9 Operating points of four-transistor temperature sensor [2.18].
- Figure 2.10 Block diagram of the time-to-digital temperature sensor [2.19].
- Figure 2.11 Temperature-to-pulse generator [2.19].
- Figure 2.12 Width offset reduction accomplished by delay line 2 [2.19].
- Figure 2.13 Operating points of four-transistor temperature sensor [2.19].
- Figure 2.14 Implemented architecture of the proposed smart temperature sensor [2.21].
- Figure 2.15 Sensor output linearity for (a) temperature-insensitive and (b) curvature-compensating ARDL cell delay [2.21].
- Figure 2.16 Modified temperature compensation circuit for the ARDL delay cell [2.21].
- Figure 2.17 Basic architecture of DLL-based CMOS digital temperature sensor [2.22].
- Figure 2.18 Calibration mode (top) and measurement mode (bottom) [2.22].
- Figure 2.19 Block diagram of the proposed temperature sensor with interfacing in the RFID tag system [2.23].
- Figure 2.20 Block diagram of the proposed temperature sensor with interfacing in the RFID tag system [2.23].
- Figure 2.21 Simulated temperature modulated pulse width from -10 °C to 30 °C [2.23].
- Figure 2.22 Sub-threshold current thermal sensor [2.24].
- Figure 2.23 Leakage current mechanisms in the thermal sensor [2.24].
- Figure 2.24 Implementation of the sensor along with the logarithmic counter [2.24].
- Figure 2.25 Block diagram of the temperature sensor proposed [2.25].
- Figure 2.26 Enhanced performance of temperature measurement with the TIO [2.25].
- Figure 2.27 Ring type oscillator with current starved delay cells [2.25].
- Figure 2.28 (a) Block diagram of a FDC and (b) timing diagram of the control signal [2.25].
- Figure 2.29 The fishbone diagram of temperature sensors.
- Figure 2.30 Comparison of temperature sensors.
- Figure 3.1 The DVFS system of energy harvesting.
- Figure 3.2 (a) Temperature-to-delay-difference generator. (b) Temperature-to-frequency-difference generator.
- Figure 3.3 The linearity of temperature sensitive delay line (TSDL) in super-threshold, near-threshold and sub-threshold region.
- Figure 3.4 (a) Ultra-low voltage frequency-based temperature sensor. (b) Inverter used in SB-TSRO.
- Figure 3.5 (a) The relationship between temperature and threshold voltage. (b) The relationship of SB-TSRO output frequency versus temperature.
- Figure 3.6 Architecture of proposed sensor.
- Figure 3.7 State diagram of FSM.
- Figure 3.8 Signal waveform diagram of FSM.
- Figure 3.9 Implement circuit of FSM.
- Figure 3.10 ZTC point simulation of NMOS and PMOS.
- Figure 3.11 Implement circuit of PV sensor.
- Figure 3.12 (a) The relationship between digital output and process variation. (b) The Monte Carlo simulation.
- Figure 3.13 (a) Compensation circuits and mapping table of voltage sensor. (b) Voltage sensor digital output under variation. (c) Digital output after process compensation.
- Figure 3.14 Implement circuit of temperature sensor.
- Figure 3.15 Simulation results of temperature sensor.
- Figure 3.16 PV-compensation circuits and compensation value tables.
- Figure 3.17 Simulation results of compensated digital output.
- Figure 3.18 Simulation error of proposed temperature sensor.
- Figure 4.1 Block diagram of the proposed ultra-low voltage frequency-based temperature sensor with process variation immunity enhancement.
- Figure 4.2 The effect of process variation on the proposed process invariant temperature sensor.
- Figure 4.3 The implementation of the proposed process invariant temperature sensor.
- Figure 4.4 The timing diagram of the proposed process invariant temperature sensor.
- Figure 4.5 (a) Digital output of sensor in post-layout simulation. (b) Simulated output error for 0°C~100°C.
- Figure 4.6 Microphotograph of proposed process invariant temperature sensor.
- Figure 4.7 PCB board design.
- Figure 4.8 Measurement environment for the test chips.
- Figure 4.9 Bare die of the test chip on PCB board.
- Figure 4.10 Measured error curves for 12 test chips.
- Figure 4.11 Measured result curves for 12 test chips.
- Figure 4.12 Measurement error curves for voltage variations.
- Figure 5.1 3D circuit architecture connected to a conventional heat removal device [5.16].
- Figure 5.2 Temperature increase on the top die in a 3D chip-stack caused by a $100 \times 100\ \mu m^2$ hot spot is approximately three times higher (red curve) than the temperature increase in a 2D SoC chip (blue curve) [5.10].
- Figure 5.3 Graphical interface of the thermal compact model for 3D stacked structures [5.10].
- Figure 5.4 Scheme of 3D-IC stack with microchannel [5.15].
- Figure 5.5 3D-IC with TSVs and inter-layer cooling channels that is enclosed in a sealed manifold [5.15].
- Figure 5.8 Self-refresh and thermometer control scheme [5.17].
- Figure 5.9 Block diagram of a self-refresh scheme [5.17].
- Figure 5.10 Block diagram and timing diagram for the self-refresh period control with temperature sensor [5.18].
- Figure 5.11 Heterogeneous integration of multi-core, SRAM, DRAM, front-end circuits stacking.
- Figure 5.12 Detail floor planning of each layer.
- Figure 5.13 Two steps of signal amplification. (a) Selected WL turning on and charge shared between Ccell and CBL. (b) Sense amplifier amplify small voltage swing of BL.
- Figure 5.14 The data retention time of DRAM cell in different sensitivity of sense amplifier.
- Figure 5.15 Monte-Carlo simulation of data retention time when $\Delta V_{BL} = 120\ \text{mV}$.
- Figure 5.16 (a) The DRAM layer in 3D-IC. (b) The refresh controller of DRAM sub-block.
- Figure 5.17 (a) Refresh CLK generator. (b) The sense amplifier control circuit.
- Figure 5.18 The operation waveform of sub-block system.
- Figure 5.19 Standby power analysis of 2Mb DRAM.
- Figure 5.20 Standby power reduction of 2Mb DRAM.
- Figure 6.1 Sub/near-threshold DVFS system.
- Figure 6.2 Conceptual image of proposed 3D stacked architecture.

# Chapter 1 Introduction

### **1.1 Motivation**

With the evolution of CMOS process technology, the number of transistors in a digital core doubles about every two years. The resulting increases in transistor density and operating frequency shorten battery life. For some applications, such as wireless body area network (WBAN) sensors, the critical consideration is lifetime rather than operating speed. An important application of WBAN is the vital sensor network shown in Figure 1.1. Sensor nodes that measure biomedical signals, such as electrocardiogram and blood pressure, are small devices either attached to or implanted in the human body. They use batteries that are as thin and light as possible, and most of them cannot last for a long time. Thus, how to achieve a low-power design while still meeting the speed and reliability requirements is an important issue.



Figure 1.1 Sensor network of WBAN [1.10].

Ultralow power dissipation can be achieved by operating digital circuits at scaled supply voltages, albeit with degraded speed and increased susceptibility to parameter variations. The operating voltage is scaled down to the sub-threshold or near-threshold region depending on the power and speed requirements of the system. Sub/near-threshold operation has been studied extensively. [1.1] demonstrates optimizations of sub-threshold design from the device, circuit, and architecture perspectives, which differ from those of conventional super-threshold design. [1.2] gives examples showing that designing flexibility into ultralow-power (ULP) systems across the architecture and circuit levels can meet both the ULP requirements and the performance demands. It also presents a method that expands on ultra-dynamic voltage scaling (UDVS) by combining multiple supply voltages with component-level power switches, providing more efficient operation at any energy-delay point and low-overhead switching between points. The UDVS technique is described in [1.3], which presents voltage-scalable circuits such as logic cells, SRAMs, ADCs, and dc-dc converters, and highlights some applications that use these circuits as building blocks.

Furthermore, dynamic voltage and frequency scaling (DVFS) achieves highly efficient energy saving by adjusting the system supply voltage and frequency according to a workload monitor [1.4]. For this reason, DVFS power management has been studied extensively for digital systems such as RISC processors, DSPs, and video codecs. However, as the supply voltage is reduced until the transistors enter the near/sub-threshold region, circuits become more sensitive to process, voltage, and temperature (PVT) variations than they are in the super-threshold region. This thesis therefore focuses on temperature sensors that can operate under ultra-low dynamic voltage scaling (DVS).

In addition, the three-dimensional integrated circuit (3D-IC) [1.5], [1.6], a new class of packaging technology for multi-function integration, makes on-die hot spots even worse because of increased power density and unbalanced thermal stress distribution. The temperature variations over time induced by the stacked structures in a 3D-IC call for a fast and area-efficient temperature sensor that enables real-time hot-spot detection at multiple locations. Also, to achieve a small self-refresh current, dynamic random access memory (DRAM) requires an on-chip thermometer with a self-refresh scheme [1.7]-[1.9]. When a thermometer is implemented in a memory chip, many factors must be considered, including the number of sensors and their location, accuracy, area, and power consumption.

#### **1.2 Research Goal and Major Contributions**

The goal of this research is to design and implement ultra-low dynamic voltage scaling frequency-ratio-based PVT sensors for a variation-aware DRAM refresh controller in 3D-IC. The work comprises 0.5V~0.25V PVT sensors with adaptive voltage selection, a 0.4V fully integrated process invariant temperature sensor, and a temperature-aware DRAM refresh controller in 3D-IC.

The major contributions of this thesis are listed as follows:

- The 0.5V~0.25V process, voltage and temperature sensors with adaptive voltage selection are proposed for temperature measurement in energy harvesting DVFS systems. The design is composed of process, voltage and temperature sensors, which provide process, voltage and temperature information. The process and voltage (PV) sensors monitor the process variation and voltage variation continuously and supply the variation information for temperature compensation.
- A 0.4V fully integrated process invariant frequency-based temperature sensor is proposed. The effect of process variation is significantly reduced. Moreover, the sensor consumes very little power and occupies a small area, so high area and energy efficiency is achieved.
- The heterogeneous 3D-IC architecture is presented. To prevent hot spots within a layer and reduce DRAM refresh power, we propose a refresh controller that uses the process invariant temperature sensor. The controller reduces standby power significantly without much power overhead.

#### **1.3 Thesis Organization**

This thesis consists of six chapters, which cover different temperature sensor designs and their applications, such as 3D-IC and DRAM refresh. The content of each chapter is briefly introduced below.

Chapter 2 gives an overview of previous temperature sensors. It covers conventional BJT based, analog CMOS, delay-based and frequency-based temperature sensors. Some of the newest techniques are also introduced and compared.

Chapter 3 proposes 0.5V~0.25V process, voltage and temperature sensors for temperature measurement in energy harvesting DVFS systems. The design is composed of process, voltage and temperature sensors. The process and voltage (PV) sensors monitor the process variation and voltage variation continuously and supply the variation information for temperature compensation. Simulation results and a performance summary are given at the end of the chapter.

Chapter 4 presents a 0.4V fully integrated process invariant frequency-based temperature sensor. The effect of process variation is significantly reduced. Experimental results, a chip photo and a performance summary are given at the end of the chapter.


Chapter 5 demonstrates the heterogeneous 3D-IC architecture, which contains CPU, SRAM, DRAM and analog circuits. To prevent hot spots within a layer and reduce DRAM refresh power, we propose a refresh controller that uses the process invariant temperature sensor.

Chapter 6 gives the conclusion of this thesis and future work.

# Chapter 2 Previous Works of Temperature Sensors

### **2.1 Introduction**

Traditionally, temperature sensors were built from proportional to absolute temperature (PTAT) and complementary to absolute temperature (CTAT) sensors, which were usually fabricated in bipolar processes. For better compatibility with standard CMOS technologies, the substrate bipolar transistor was used instead for thermal sensing [2.1]-[2.3]. To enhance accuracy, these sensors needed extra analog-to-digital converters (ADCs), which took up more chip area and consumed more power. Most high-accuracy and high-resolution temperature sensors are based on the temperature characteristics of parasitic bipolar transistors. The inaccuracy of state-of-the-art smart voltage-domain temperature sensors is only $\pm 0.1^{\circ}C$ ($3\sigma$) [2.4], [2.5], and their digital output resolution can be as fine as 0.025°C. These figures were achieved by using dynamic element matching, a combination of correlated double-sampling and system-level chopping for offset cancellation, precision mismatch-elimination layout, and individual trimming at room temperature after packaging. However, all these analog techniques led to complex architectures, slow conversion rates, and large area and power overhead. It is hard to fit these voltage-domain temperature sensors within the form factor and power budget of a miniature microwatt system.

A time-to-digital converter (TDC) is often used to replace analog circuits. Its applications are gradually expanding and include the phase comparator of an all-digital PLL [2.6], [2.7], temperature sensor circuits [2.8]-[2.10], jitter measurement [2.11], modulation and demodulation circuits, and TDC-based ADCs [2.12]. However, a TDC requires hundreds of inverters to obtain enough pulse delay for sufficient temperature resolution, so it occupies a large area and consumes high power. Moreover, a TDC is not suitable for near-threshold and sub-threshold operation. We therefore use frequency-to-digital converter (FDC) technology to build temperature sensors that achieve small area and low power and that operate at low voltage.

### 2.2 BJT Based Temperature Sensor

Most temperature sensors compare the base–emitter voltages of two bipolar junction transistors (BJTs) operated at different current densities. The main objective of the temperature sensor is to generate the currents $I_{ref}$ and $I_{ptat}$, which are the inputs of the ADC. A simplified schematic of [2.13], which generates three currents, $I_{ptat}$, $I_{ctat}$, and $I_{ref}$, is presented in Figure 2.1. $I_{ptat}$ increases in proportion to temperature and is generated by two n-p-n vertical BJTs with a 20:1 ratio. $I_{ctat}$ decreases linearly with temperature and is generated from the base–emitter voltage $V_{be}$ of the BJT. The voltages across $R_{ptat}$ and $R_{ctat}$ have temperature coefficients of about 0.3 mV/°C and -2 mV/°C, respectively. A current $I_{ref}$ that is constant over temperature can be generated by a properly weighted summation of $I_{ptat}$ and $I_{ctat}$.
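The weighted summation that yields a temperature-independent $I_{ref}$ can be illustrated with a short first-order calculation. The sketch below is not the circuit of [2.13]. It assumes ideal resistors with no temperature coefficient and voltages across $R_{ptat}$ and $R_{ctat}$ that are exactly linear in temperature with the slopes quoted above. The 25 °C voltages (`v_ptat_25`, `v_ctat_25`) and the 10 kΩ value for `r_ptat` are hypothetical numbers chosen only for illustration.

```python
TC_V_PTAT = 0.3e-3    # V/°C, slope of the voltage across R_ptat (from the text)
TC_V_CTAT = -2.0e-3   # V/°C, slope of the voltage across R_ctat (from the text)


def r_ctat_for_flat_iref(r_ptat):
    """Return the R_ctat that cancels the first-order slope of I_ptat + I_ctat."""
    # d(I_ref)/dT = TC_V_PTAT / R_ptat + TC_V_CTAT / R_ctat = 0
    return r_ptat * (-TC_V_CTAT / TC_V_PTAT)


def i_ref(temp_c, r_ptat, r_ctat, v_ptat_25=0.09, v_ctat_25=0.65):
    """I_ref = I_ptat + I_ctat under the linear-voltage assumption.

    v_ptat_25 and v_ctat_25 are hypothetical resistor voltages at 25 °C.
    """
    v_ptat = v_ptat_25 + TC_V_PTAT * (temp_c - 25.0)
    v_ctat = v_ctat_25 + TC_V_CTAT * (temp_c - 25.0)
    return v_ptat / r_ptat + v_ctat / r_ctat


r_ptat = 10e3                              # hypothetical resistor value, ohms
r_ctat = r_ctat_for_flat_iref(r_ptat)      # about 6.67 times R_ptat
for t in (0.0, 50.0, 100.0):
    print(f"{t:5.1f} °C -> I_ref = {i_ref(t, r_ptat, r_ctat) * 1e6:.3f} uA")
```

Under these assumptions the required resistor ratio depends only on the two temperature coefficients, $R_{ctat}/R_{ptat} = 2/0.3 \approx 6.67$. The absolute resistor values set only the magnitude of $I_{ref}$.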



Figure 2.1 Schematic of BJT based temperature sensor [2.13].

Figure 2.2(a) shows a simplified circuit diagram of an ideal dual-slope integrating ADC. It consists of an integrator, a comparator, and switches for $I_{ptat}$ and $I_{ref}$. The integrator consists of an integrating capacitor and a two-stage opamp. The comparator uses the same two-stage opamp topology, but without the phase compensation network, for faster speed. The resolution of the ADC is 9 bits, and the bandwidth is 32K samples per second with a 32-µs thermometer cycle. One internal clock with an 8-ns period corresponds to 1° in this design.



Figure 2.2 Dual-slope integrating ADC for (a) generic design, (b) timing diagram [2.13].

During the initialization period, $I_{ptat}$ and $I_{ref}$ are disconnected from the integrator and $V_X$ is precharged to $V_{ref}$, assuming that $V_{OS1}$ and $V_{OS2}$ are zero. $V_{ref}$ is 1 V in this design but is taken to be $V_{SS}$ in Figure 2.2 to simplify the analysis and description. The dual-slope ADC performs its conversion in two phases: an integration phase and a deintegration phase. During the integration phase, $I_{ptat}$ is connected to the integrator for a fixed time $T_1$ (4 µs), shown in Figure 2.2(b), and $V_X$ is charged up to a peak value that is proportional to $I_{ptat}$ and, hence, to temperature. During the deintegration phase, the input is switched from $I_{ptat}$ to $I_{ref}$, which discharges $V_X$ with a constant slope set by $I_{ref}$ until $V_X$ falls below $V_{ref}$. The deintegration interval $T_2$, shown in Figure 2.2(b), depends on the ratio of $I_{ptat}$ to $I_{ref}$.

[2.14], [2.15] present a temperature sensor in a 32 nm high-k metal gate digital CMOS process for integration in a microprocessor core. The sensor uses a ratio of currents driven into a BJT pair, with current chopping to up-convert the temperature signal. A second-order sigma-delta 1-bit ADC digitizes the chopped signal, which is then down-converted and filtered in the digital domain to obtain a temperature measurement. The sensor operates from 10°C to 110°C, achieving a $3\sigma$ resolution of 0.45°C and an inaccuracy below 5°C without calibration or trimming.
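The two phases of the dual-slope conversion described earlier in this section can be checked with a small charge-balance calculation. The following is a minimal sketch of an ideal converter with zero offsets and $V_{ref}$ taken as the zero level, not a model of the circuit in [2.13]. $T_1$ = 4 µs and the 8-ns clock period come from the text. The capacitor value and the two current values are hypothetical and chosen only for illustration.

```python
def dual_slope_convert(i_ptat, i_ref, t1=4e-6, t_clk=8e-9, cap=20e-12):
    """Ideal dual-slope conversion; returns (v_peak, t2, count).

    cap is a hypothetical integrating capacitance; it cancels out of t2.
    """
    # Integration phase: V_X ramps up at slope I_ptat / C for the fixed time T1.
    v_peak = i_ptat * t1 / cap
    # Deintegration phase: V_X ramps back down at slope I_ref / C until it
    # returns to its starting level, which takes T2 = T1 * I_ptat / I_ref.
    t2 = v_peak * cap / i_ref
    # Digital output: number of clock periods in T2, rounded to the nearest one.
    count = int(t2 / t_clk + 0.5)
    return v_peak, t2, count


# Hypothetical currents: I_ptat is half of I_ref at the measured temperature.
v_peak, t2, count = dual_slope_convert(i_ptat=2e-6, i_ref=4e-6)
print(f"V_peak = {v_peak:.3f} V, T2 = {t2 * 1e6:.3f} us, count = {count}")
```

Because the same capacitor is used in both phases, its value drops out of $T_2 = T_1 \cdot I_{ptat}/I_{ref}$. This is why the digital count depends only on the current ratio and the clock period.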

Pertijs et al. [2.5] proposed such a scheme by using a switched capacitor integrator that balances the charge ratio between and voltage generated by charge summing. Figure 2.3 shows the block diagram of the temperature measurement system with a time-multiplexed BJT sense stage, a $\Sigma\Delta$ ADC, and a digital backend. The ratiometric measurement of the $\Sigma\Delta$ ADC, $(V_{BE1}-V_{BE2}) / (V_{BE1}+V_{BE2})$, is non-linear with temperature. The raw ADC output is linearized to yield

$$D_{OUT} = \frac{\frac{\Delta V_{BE}}{V_{BE}}}{1 + \alpha_m \frac{\Delta V_{BE}}{V_{BE}}} = \frac{\Delta V_{BE}}{V_{BE} + \alpha_m \Delta V_{BE}} = \frac{\Delta V_{BE}}{V_{BG}}$$
(2.1)

where $V_{BE}=(V_{BE1}+V_{BE2})$, $\Delta V_{BE}=(V_{BE1}-V_{BE2})$, $V_{BG}$ is the bandgap voltage, *m* is the current ratio in the BJT pair, and $\alpha_m$ is the bandgap coefficient. The digital computation of the reference, $V_{BG}=V_{BE}+\alpha_m \Delta V_{BE}$, allows easy tracking of process changes by setting $\alpha_m$. In [2.5], the gain factor is implemented using a capacitor ratio and multiple clock cycles. This increases the area of the first integrator.
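The linearization in (2.1) can be checked numerically. The sketch below assumes only the algebra of (2.1): the raw ratiometric result $x = \Delta V_{BE}/V_{BE}$ is mapped to $x/(1+\alpha_m x)$, which should equal $\Delta V_{BE}/(V_{BE}+\alpha_m \Delta V_{BE})$. The function names and the voltage and coefficient values are hypothetical.

```python
import math


def linearized_output(dvbe, vbe, alpha_m):
    """Apply the digital linearization of (2.1) to the raw ADC ratio."""
    x = dvbe / vbe  # raw ratiometric ADC result
    return x / (1.0 + alpha_m * x)


def bandgap_referenced_ratio(dvbe, vbe, alpha_m):
    """Same quantity written as Delta V_BE over V_BG = V_BE + alpha_m * Delta V_BE."""
    return dvbe / (vbe + alpha_m * dvbe)


if __name__ == "__main__":
    # Hypothetical values in volts; alpha_m is a dimensionless gain factor.
    dvbe, vbe, alpha_m = 0.05, 1.30, 10.0
    a = linearized_output(dvbe, vbe, alpha_m)
    b = bandgap_referenced_ratio(dvbe, vbe, alpha_m)
    print(a, b, math.isclose(a, b))
```

Both forms agree, which shows that the nonlinearity correction can be done entirely in the digital backend once $\alpha_m$ is chosen.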



Figure 2.3 Schematic of the thermal sensing system [2.13].

Figure 2.4 shows the simplified schematic of the BJT sense stage. The top-row current sources feed currents into a pair of matched BJTs, and the current ratio, *m*, between the BJTs develops a differential voltage. In practice, the BJT performance changes with current density. At low current density, generation-recombination dominates the BJT behavior, while at high current density, high injection dominates. A region of optimal current density (flat-$\beta$) is selected to ensure that the BJTs have comparable $\beta$ at both current ratios. In this work, the bias current of a single current source is nominally 80 µA. The current is generated using a constant-gm bias circuit that uses an nMOS transistor in the linear region as a resistive reference. Different current densities can be realized by scaling the current, the BJT emitter area, or both. In this work, current scaling was selected because different emitter areas cannot be matched perfectly: the mismatch is dominated by the interconnect. The effective emitter resistance in each BJT branch will be different due to differences in the routing resistance connecting the constituent BJT elements. Additionally, the variable resistance to each unit BJT produces a non-uniform current in that BJT, leading to variations and errors. The use of equal-area BJTs allows for a layout with symmetrical metal connections to ensure matched contact resistance.

The input current in each branch is swapped between the BJTs every 128 cycles so that the differential output voltage is up-converted to the chopping frequency. The selection of the current sources to chop the currents to the BJTs is illustrated in Figure 2.5. The current ratio in the BJTs must be accurate for a measurement with low variations. Each individual current source must be matched so that an exact ratio can be obtained. Careful attention to layout is necessary to eliminate systematic process-induced mismatches; a common-centroid layout of the unit transistors is used. A discussion of the random mismatch in a high-k metal gate CMOS process is presented. One way to improve the random mismatch is to use long-channel-length devices, but this is area inefficient. Another way to improve matching is to use dynamic element matching (DEM) between the current sources. A number of redundant current sources are used to generate the current ratio. In the current scheme, $N_{CS}$ discrete current sources are used to generate the current ratio. A digital controller combines the DEM selection with the chopping logic to connect the current sources to the BJTs.



Figure 2.5 Illustration of the current chopping to up-convert the temperature signal to $f_{chop}$ [2.13].

### 2.3 Analog CMOS Based Temperature Sensor

With the use of bipolar transistors for temperature sensing, and advanced techniques including chopping, dynamic element matching, and a sigma-delta ADC for noise suppression and cancellation, Pertijs et al. [2.5] developed an on-chip temperature sensor with a $3\sigma$ inaccuracy of $\pm 1^{\circ}$C at the expense of increased circuit complexity. A temperature sensor that uses only three CMOS transistors was presented in [2.17]. The three-transistor temperature sensor shown in Figure 2.6, which utilizes the temperature characteristic of the threshold voltage, exhibits highly linear characteristics at a power supply voltage of 1.8 V. The operating conditions of this temperature sensor are defined as follows.



Figure 2.6 Conventional three-transistor temperature sensor [2.17].

1) All transistors operate in the saturation region.
2) The output voltages of each node are equal.
3) The sinking currents at each node are equal.

The temperature is obtained by measuring $V_{OUT}$ at the point where the two currents, $I_{OUT1}$ and $I_{OUT2}$, have the same value. When the substrate bias effect of transistor M2 is neglected to simplify the calculation, the $I_{DS}$-$V_{GS}$ characteristics and the operating conditions are

$$I_{DS1} = \frac{\beta_1}{2} (V_{GS1} - V_{T1})^2$$
(2.2)

$$\beta_{1} = \mu_{eff1} C_{OX} \frac{W_{1}}{L_{1}}$$
(2.3)

$$I_{DS2} = \frac{\beta_2}{2} (V_{GS2} - V_{T2})^2$$
(2.4)

$$\beta_2 = \mu_{eff\,2} C_{OX} \, \frac{W_2}{L_2} \tag{2.5}$$

$$I_{DS3} = \frac{\beta_3}{2} (V_{GS3} - V_{T3})^2$$
(2.6)

$$\beta_3 = \mu_{eff3} C_{OX} \frac{W_3}{L_3}$$
(2.7)

$$I_{DS1} = I_{DS2} = I_{DS3}$$
(2.8)

$$V_{OUT1} = V_{GS1} , \ V_{OUT2} = V_{GS2} + V_{GS3}$$
(2.9)

$$V_{OUT1} = V_{OUT2} \tag{2.10}$$

After solving (2.2), (2.4), and (2.6) for each transistor's respective $V_{GS}$, the results are substituted for $V_{GS2}$ and $V_{GS3}$ in (2.9). Then, $I_{DS2}$ and $I_{DS3}$ are replaced by $I_{DS1}$ using (2.8). Finally, (2.10) is solved for $V_{GS1}$ and we get

$$V_{OUT1} = V_{GS1} = \frac{V_{T2} + V_{T3} - (\sqrt{\beta_1 / \beta_2} + \sqrt{\beta_1 / \beta_3})V_{T1}}{1 - (\sqrt{\beta_1 / \beta_2} + \sqrt{\beta_1 / \beta_3})} \propto T$$
(2.11)

$$\frac{dV_{OUT1}}{dT} = a(\frac{dV_{T2}}{dT} + \frac{dV_{T3}}{dT}) - b\frac{dV_{T1}}{dT}$$
(2.12)


where

$$a = \frac{1}{1 - (\sqrt{\beta_1 / \beta_2} + \sqrt{\beta_1 / \beta_3})}, \quad b = \frac{\sqrt{\beta_1 / \beta_2} + \sqrt{\beta_1 / \beta_3}}{1 - (\sqrt{\beta_1 / \beta_2} + \sqrt{\beta_1 / \beta_3})}$$

Since $\sqrt{\beta_1/\beta_2} + \sqrt{\beta_1/\beta_3}$ can be assumed constant, the variables $a$ and $b$ in (2.12) also become constant. Therefore, the temperature dependence of the output voltage is set by the temperature coefficients of the transistor threshold voltages.
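The closed-form result (2.11) and its slope (2.12) can be evaluated with a short sketch. It assumes the square-law model of (2.2)-(2.7), identical threshold voltages with a linear temperature coefficient, and hypothetical device values ($V_T = 0.4$ V at 25°C, $-1$ mV/°C, $\beta_1/\beta_2 = \beta_1/\beta_3 = 1/16$); none of these numbers come from [2.17].

```python
import math


def vout_three_transistor(vt1, vt2, vt3, beta1, beta2, beta3):
    """Output voltage V_OUT1 = V_GS1 of the three-transistor sensor, eq. (2.11)."""
    k = math.sqrt(beta1 / beta2) + math.sqrt(beta1 / beta3)
    return (vt2 + vt3 - k * vt1) / (1.0 - k)


def threshold(temp_c, vt0=0.4, tc=-1.0e-3, t0_c=25.0):
    """Hypothetical linear threshold-voltage model (volts)."""
    return vt0 + tc * (temp_c - t0_c)


def vout_at(temp_c, beta1=1.0, beta2=16.0, beta3=16.0):
    """V_OUT1 at a given temperature when all three thresholds track each other."""
    vt = threshold(temp_c)
    return vout_three_transistor(vt, vt, vt, beta1, beta2, beta3)


if __name__ == "__main__":
    for temp_c in (0.0, 25.0, 50.0, 100.0):
        print(temp_c, round(vout_at(temp_c), 4))
    # Slope predicted by (2.12): a*(2*tc) - b*tc with a = 2, b = 1 for k = 0.5.
    print((vout_at(100.0) - vout_at(0.0)) / 100.0)
```

With these values $k = 0.5$, so $a = 2$ and $b = 1$, and the output falls linearly at three times the threshold-voltage temperature coefficient, as (2.12) predicts when $a$ and $b$ are constant.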



Figure 2.7 Conventional three-transistor temperature sensor [2.17].

Figure 2.7 shows the characteristics at 1.8V and 1V supply voltages, where the intersections of $I_{OUT1}$ and $I_{OUT2}$ correspond to the operating points of this sensor. This method shows highly linear characteristics at a power supply voltage of 1.8V or more, which allows the operating conditions to be defined well above twice the threshold voltage. However, the linearity diminishes when the supply voltage is scaled down to 1V in a 90-nm CMOS process. Because the temperature coefficient of the operating point's current at a 1V supply voltage is steeper than that at a 1.8V supply voltage, the operating point's current at high temperature becomes quite small and the output voltage goes into the sub-threshold region or the cutoff region.

To improve linearity at a 1V supply voltage, an accurate four-transistor temperature sensor was designed in [2.18] for thermal testing and monitoring circuits in deep submicron technologies, as shown in Figure 2.8. Note that to operate the additional transistor in the saturation region, an extra bias voltage $V_{GS0}$' is required. The bias voltage generation circuit must not possess temperature dependency, and, in some cases, this circuit becomes larger than the temperature sensor itself.



Figure 2.8 Four-transistor, voltage output, temperature sensor [2.18].

In addition, the W/L ratio of the transistors M0' and M1' should be as small as possible so that the current  $I_{OUT1}$ ' remains small. However, the smaller W/L ratio requires a longer channel, so it occupies larger chip area. Consequently, there is a tradeoff between the current consumption and the chip area.

The $I'_{DS} - V'_{GS}$ characteristics and the operating conditions of the four-transistor sensor give the following:

$$V'_{OUT1} = \sqrt{\frac{2I'_{DS2}}{\beta'_2}} + V'_{T2} + \sqrt{\frac{2I'_{DS3}}{\beta'_3}} + V'_{T3} \propto T$$
(2.13)

The term $\sqrt{2I'_{DS2}/\beta'_2} + \sqrt{2I'_{DS3}/\beta'_3}$ can be assumed to be constant. Thus, (2.13) shows that the output voltage is mainly proportional to the temperature characteristics of the threshold voltages of M2' and M3' ($V'_{T2}$ and $V'_{T3}$).

As shown in Figure 2.9, the output current of the four-transistor temperature sensor is more linear at high temperature than that of the conventional three-transistor circuit.



Figure 2.9 Operating points of four-transistor temperature sensor [2.18].

### 2.4 Delay Based Temperature Sensors

#### 2.4.1 Time-to-Digital Converter Based Temperature Sensors

The temperature sensor is composed of a temperature-to-pulse generator and a cyclic time-to-digital converter, as shown in Figure 2.10. The temperature-to-pulse generator produces a pulse whose width varies linearly with temperature. A simple circuit utilizing gate delays to generate the thermally sensitive pulse is shown in Figure 2.11. The START signal is delayed by a delay line composed of an even number of inverters. The high-to-low and low-to-high propagation delay times of an inverter can be expressed as [2.19]

$$t_{PHL} = \frac{2C_L V_{TN}}{K_N (V_{DD} - V_{TN})^2} + \frac{C_L}{K_N (V_{DD} - V_{TN})} \cdot \ln(\frac{1.5V_{DD} - 2V_{TN}}{0.5V_{DD}})$$
(2.14)

$$t_{PLH} = \frac{-2C_L V_{TP}}{K_P (V_{DD} + V_{TP})^2} + \frac{C_L}{K_P (V_{DD} + V_{TP})} \cdot \ln(\frac{1.5V_{DD} + 2V_{TP}}{0.5V_{DD}})$$
(2.15)



Figure 2.10 Block diagram of the time-to-digital temperature sensor [2.19].



Figure 2.11 Temperature-to-pulse generator [2.19].

where $K_N = \mu_N C_{OX} (W/L)_N$ and $K_P = \mu_P C_{OX} (W/L)_P$ are the transconductance parameters and $C_L$ is the effective load capacitance of the inverter. Note that we assume square-law behavior for the CMOS devices and thereby ignore the effects of velocity saturation. For an inverter with equivalent NMOS and PMOS, the propagation delay can be derived as

$$t_{P} = \frac{t_{PLH} + t_{PHL}}{2} = \frac{(L/W)C_{L}}{\mu C_{OX}(V_{DD} - V_{T})} \ln(\frac{1.5V_{DD} - 2V_{T}}{0.5V_{DD}})$$
(2.16)

Where

$$\mu = \mu_0 \left(\frac{T}{T_0}\right)^{km} , \ km = -1.2 \sim -2.0 \tag{2.17}$$

$$V_T(T) = V_T(T_0) + \alpha(T - T_0) , \ \alpha = -0.5 \sim -3.0\ \mathrm{mV/K} \tag{2.18}$$

As the temperature increases, the mobility ($\mu$) and the threshold voltage ($V_T$) both decrease. When $V_{DD}$ is much larger than $V_T$, the thermal behavior of the propagation delay is dominated by the mobility; that is, the thermal coefficient of the propagation delay becomes positive. The major problem of the simple temperature-to-pulse generator is that the width of the output pulse at the lower bound of the measurement range is usually much larger than zero. This causes a large DC offset at the smart temperature sensor output. A second delay line, with thermal compensation to reduce its temperature sensitivity, is therefore inserted in the lower transmission path of the START signal to reduce the width offset of the output pulse, as shown in Figure 2.12. The width offset of the output can be easily reduced by adjusting the number of delay cells in delay line 2.
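The positive thermal coefficient of the delay can be seen by evaluating (2.16) with the temperature models (2.17) and (2.18). The sketch below is a minimal illustration in normalized units: the constant factor $(L/W)C_L/(\mu_0 C_{OX})$ is set to 1, and the parameter values ($V_{DD} = 1.8$ V, $V_T(T_0) = 0.4$ V, $km = -1.5$, $\alpha = -1$ mV/K) are assumptions picked from within the ranges quoted above, not values from [2.19].

```python
import math


def mobility_ratio(temp_k, t0_k=300.0, km=-1.5):
    """mu / mu_0 from (2.17)."""
    return (temp_k / t0_k) ** km


def threshold(temp_k, vt0=0.4, t0_k=300.0, alpha=-1.0e-3):
    """V_T(T) from (2.18), in volts."""
    return vt0 + alpha * (temp_k - t0_k)


def inverter_delay(temp_k, vdd=1.8):
    """Propagation delay from (2.16) in normalized units.

    The temperature-independent factor (L/W) * C_L / (mu_0 * C_OX)
    is taken as 1, so only the temperature trend is meaningful.
    """
    vt = threshold(temp_k)
    log_term = math.log((1.5 * vdd - 2.0 * vt) / (0.5 * vdd))
    return log_term / (mobility_ratio(temp_k) * (vdd - vt))


if __name__ == "__main__":
    for temp_k in (250.0, 300.0, 350.0, 400.0):
        print(temp_k, round(inverter_delay(temp_k), 4))
```

With these assumed values the delay grows monotonically with temperature, because the mobility reduction outweighs the effect of the falling threshold voltage.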



Figure 2.12 Width offset reduction accomplished by delay line 2 [2.19].

As shown in Figure 2.13, a simple thermal compensation circuit is used to reduce the thermal sensitivity of the inverters in delay line 2. The diode-connected transistors P1, N1, and P3 serve as the core of the thermal compensation circuit. Since P1, P3, and N1 are all diode connected, they operate in saturation whenever bias current is flowing. Thus, we have



Figure 2.13 Thermal compensation circuit [2.19].

$$I_{DP3} = \frac{1}{2} \mu C_{OX} \left(\frac{W}{L}\right) (V_{GSP3} - V_T)^2 (1 + \lambda V_{GSP3})$$
(2.19)

By substituting (2.17) and (2.18) into (2.19), the equation becomes

$$I_{DP3} = \frac{1}{2} \mu_0 C_{OX} \left(\frac{W}{L}\right) \left(\frac{T}{T_0}\right)^{km} \left[V_{GSP3} - V_T(T_0) - \alpha(T - T_0)\right]^2 \left(1 + \lambda V_{GSP3}\right)$$
(2.20)

When the temperature is higher than 200K, a significant plateau effect can be observed for the difference between mask channel length and effective channel length. The thermal sensitivity of the channel length modulation term $(1 + \lambda V_{GSP3})$ is neglected in the following derivations since it is much smaller than those of the mobility and the threshold voltage over the temperature range of interest.

To get the minimum thermal sensitivity, let $\frac{\partial I_{DP3}}{\partial T} = 0$, which gives

$$\begin{aligned}
&\frac{\mu_{0} C_{OX} \, km}{2T_{0}} \left(\frac{W}{L}\right) \left(\frac{T}{T_{0}}\right)^{km-1} \left[V_{GSP3} - V_{T}(T_{0}) - \alpha(T - T_{0})\right]^{2} (1 + \lambda V_{GSP3}) \\
&\quad = \alpha \mu_{0} C_{OX} \left(\frac{W}{L}\right) \left(\frac{T}{T_{0}}\right)^{km} \left[V_{GSP3} - V_{T}(T_{0}) - \alpha(T - T_{0})\right] (1 + \lambda V_{GSP3})
\end{aligned}$$

After simplification, we have

$$V_{GSP3} = V_{T}(T_{0}) + \alpha(T - T_{0}) + \frac{2 \alpha T}{km}$$
(2.21)

The sizes of transistors P1 and N1 are adjusted to make the gate-to-source voltage of P3 fit the requirement stated in (2.21) as closely as possible. The conduction current of transistor P3 can be found by substituting (2.21) back into (2.20) to yield

$$I_{DP3} = \frac{1}{2} \mu_0 C_{OX} \left(\frac{W}{L}\right) \left(\frac{T}{T_0}\right)^{km} \left[\frac{2\alpha T}{km}\right]^2 (1 + \lambda V_{GSP3})$$
(2.22)

When $km = -2$, the drain current becomes completely independent of temperature:

$$I_{DP3} = \frac{1}{2} \mu_0 C_{OX} \left(\frac{W}{L}\right) (\alpha T_0)^2 (1 + \lambda V_{GSP3})$$

With the help of the current mirrors (P1, P2) and (N1, N2), the drain current of the inverter will be kept thermally insensitive as well, as will the propagation delay of delay line 2. This greatly reduces the design difficulty and enhances the tolerance to process variation.
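The zero-temperature-coefficient bias condition (2.21) can be verified numerically. The sketch below applies (2.21) to the current expression (2.20), neglecting the channel length modulation term as in the text and working with voltage magnitudes. The lumped constant `k0` (standing for $\frac{1}{2}\mu_0 C_{OX} W/L$) and all parameter values are hypothetical.

```python
import math


def vgs_zero_tc(temp_k, vt0, alpha, km, t0_k):
    """Gate-to-source voltage required by (2.21)."""
    return vt0 + alpha * (temp_k - t0_k) + 2.0 * alpha * temp_k / km


def drain_current(temp_k, vgs, k0, vt0, alpha, km, t0_k):
    """Drain current from (2.20) with (1 + lambda * V_GS) taken as 1.

    k0 is a hypothetical lumped constant for 0.5 * mu_0 * C_OX * (W/L).
    """
    overdrive = vgs - vt0 - alpha * (temp_k - t0_k)
    return k0 * (temp_k / t0_k) ** km * overdrive ** 2


def compensated_current(temp_k, k0=1.0e-4, vt0=0.4, alpha=-1.0e-3, km=-2.0, t0_k=300.0):
    """Current obtained when V_GS follows (2.21) at every temperature."""
    vgs = vgs_zero_tc(temp_k, vt0, alpha, km, t0_k)
    return drain_current(temp_k, vgs, k0, vt0, alpha, km, t0_k)


if __name__ == "__main__":
    for temp_k in (250.0, 300.0, 350.0, 400.0):
        print(temp_k, compensated_current(temp_k), compensated_current(temp_k, km=-1.5))
```

For $km = -2$ the printed current equals `k0 * (alpha * t0_k) ** 2` at every temperature, matching the expression that follows (2.22); for other values of $km$ a residual temperature dependence remains.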

For accuracy enhancement, a novel time-domain SAR smart temperature sensor suitable for curvature compensation is proposed in [2.20]. The corresponding architecture is shown in Figure 2.14, which evolves from the former time-domain digital thermostat [2.21]. SAR control logic is added to speed up the set-point programming of the thermostat, adjusting the delay of the adjustable reference delay line (ARDL) to approximate the delay of the temperature-dependent delay line (TDDL). The final set-point value is defined as the output of the proposed sensor.



Figure 2.14 Implemented architecture of the proposed smart temperature sensor [2.21].

More specifically, the SAR control logic, ARDL and time comparator can be viewed as an equivalent time-to-digital converter to measure the TDDL delay for any temperature under test. The accuracy enhancement is accomplished mainly by a new technique called curvature compensation by which the curvature of ARDL temperature-to-time transfer curve is designed to compensate for TDDL curvature to substantially improve the sensor's linearity.

From (2.16), (2.17) and (2.18), (2.16) can be rewritten as

$$t_p(T) = \frac{2LC_L T_0^{km}}{\mu_0 W C_{\text{OX}}} \times \frac{\ln(3 - 4V_T/V_{\text{DD}})}{1 - V_T/V_{\text{DD}}} \times \frac{1}{T^{km}} \stackrel{\Delta}{=} \beta \times T^{-km}$$
(2.23)

where the constant $\beta$ is almost temperature-independent. Although the unit reference delay of the ARDL can be easily implemented as one reference clock period, or a part of it, to be theoretically temperature-insensitive, the curvature of the smart sensor output will then inevitably resemble that of the TDDL curve and the sensor accuracy will be seriously limited, as predicted in Figure 2.15(a). One feasible linearization technique for the smart sensor output is to compensate for the curvature of the TDDL curve by that of the ARDL curve, as revealed in Figure 2.15(b).



Figure 2.15 Sensor output linearity for (a) temperature-insensitive and (b) curvature-compensating ARDL cell delay [2.21].

To reduce the thermal sensitivity of the ARDL delay cell so that its delay can serve as a unit time reference, the temperature compensation circuit adopted in the former time-domain sensor [2.19] is utilized likewise. However, the conventional temperature compensation circuit consumes continuous power. An NMOS switch NS1 is added to shut down the quiescent current of the ARDL delay cell between measurements to reduce power consumption, as shown in Figure 2.16, where the Stop signal is activated at the end of conversion. The other switch, NS2, is inserted to match the source resistances of N1 and N2 to reduce current mirror error. To cut the delay line size and the conversion time in half, the ARDL delay cell can theoretically be implemented as a temperature-compensated NOT gate instead of a delay buffer [2.10]. In this case, however, the rise time and fall time of the NOT gate are not equal since the pull-up current is usually not the same as the pull-down current. This mismatch causes additional errors between even and odd stages. Therefore, a thermally compensated buffer is used instead as the unit ARDL delay cell.



Figure 2.16 Modified temperature compensation circuit for the ARDL delay cell [2.21].

#### 2.4.2 Dual-DLL-Based All-Digital Temperature Sensor

With continued process scaling, PVT variation becomes a major problem for time-to-digital based temperature sensors. A new DLL-based all-digital temperature sensor was presented in [2.22]. It has two improvements. First, it removes the effect of process variation on inverter delays via calibration at one temperature point, thus reducing high-volume production cost. Second, it uses two fine-precision DLLs: one to synthesize a set of temperature-independent delay references in a closed loop, the other as a TDC to compare temperature-dependent inverter delays to the references. The use of DLLs simplifies sensor operation and yields a high measurement bandwidth (5 kS/s) at 7-bit resolution, which could enable fast temperature tracking.

Calibration and delay normalization are executed using the circuit of Figure 2.17. It contains an open-loop delay line and a DLL that synthesizes temperature-independent delay references. This reference DLL (R-DLL) is locked to a crystal oscillator x(t), so each delay cell in the R-DLL has a constant delay $\Delta_0$. MUX-1 taps a node in the R-DLL delay line: if the N-th cell's output is tapped, the delay from input x(t) to output d(t) of the R-DLL is $D_{DLL} = N\Delta_0$. This is the delay reference, independent of temperature and process. N can be altered to produce different reference delays. In the open-loop line, if the M-th cell's output is tapped by MUX-2, the delay $D_{OL}$ between input x(t) and output c(t) varies with temperature and process.



Figure 2.17 Basic architecture of DLL-based CMOS digital temperature sensor



Once 1-point calibration is complete, the sensor enters measurement mode. Temperature T is unknown; thus, $D_{OL}$ of the hardwired open-loop line is an unknown delay, which the M-DLL measures by varying the reference delay $D_{DLL}$ of the R-DLL (bottom of Figure 2.18). The MUX-1 setting N is varied until $D_{DLL}$ equals $D_{OL}$ at $N = N_m$. $N_m$ is a digital output that faithfully represents T and corresponds to the normalized delay seen earlier.
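The measurement step can be sketched as a search over the MUX-1 setting. The example below assumes a plain linear sweep that stops at the first tap whose reference delay $N\Delta_0$ reaches $D_{OL}$; the actual search strategy of [2.22] is not described here, and the function name, the 128-tap limit, and the delay values in picoseconds are hypothetical.

```python
def measure_delay_code(d_ol_ps, delta0_ps, n_max=128):
    """Return the smallest tap N with N * delta0 >= D_OL (saturating at n_max).

    d_ol_ps   : unknown open-loop delay in picoseconds (hypothetical unit)
    delta0_ps : constant R-DLL cell delay in picoseconds
    n_max     : number of selectable taps (assumed, e.g. 2**7 for 7 bits)
    """
    for n in range(1, n_max + 1):
        if n * delta0_ps >= d_ol_ps:
            return n
    # D_OL exceeds the largest available reference delay.
    return n_max


if __name__ == "__main__":
    # A longer open-loop delay (higher temperature) gives a larger code.
    for d_ol in (800, 1000, 1005, 1200):
        print(d_ol, measure_delay_code(d_ol, delta0_ps=10))
```

Because $\Delta_0$ is fixed by the crystal reference, the resulting code expresses the open-loop delay in units that do not drift with temperature or process.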



#### 2.4.3 Sub-µW Embedded CMOS Temperature Sensor

In [2.23], an ultra-low power embedded CMOS temperature sensor based on serially connected sub-threshold MOS operation is implemented in a 0.18 $\mu$m CMOS process for passive RFID food monitoring applications. Employing serially connected sub-threshold MOS devices as the sensing element enables a reduced minimum supply voltage for further power reduction, which is of utmost importance in passive RFID applications. Both proportional-to-absolute-temperature (PTAT) and complementary-to-absolute-temperature (CTAT) signals can be obtained through proper transistor sizing. With the sensor core working under 0.5 V and the digital interfacing under 1 V, the sensor dissipates a measured total power of 119 nW at 333 samples/s and achieves an inaccuracy of +1/-0.8°C from -10°C to 30°C after calibration. The sensor is embedded inside the fabricated passive UHF RFID tag. Measurement of the sensor performance at the system level is also carried out, illustrating proper sensing operation for passive RFID applications.

Figure 2.19 shows the block diagram of the proposed temperature sensor in the RFID tag. The proposed temperature sensor first generates the PTAT and CTAT signals from the sensor core, which uses MOS devices operating in the sub-threshold region. These signals are converted to delays through the corresponding delay generators. The resultant temperature-modulated output pulse PW, which is level-shifted to a 1 V swing for interfacing at the output of the delay generators, is then digitized using the clock signal. This time-domain readout scheme eliminates the power-hungry ADC to reduce power consumption. For system-level implementation, the sensor reuses the existing supply voltages and clock signals available in the tag to reduce the power and area overhead. The sensor supply voltages, $V_{DDL} = 0.5V$ and $V_{DDH} = 1V$, are provided by the on-chip power management unit through LDOs with filtering to reduce both noise at the RF frequency and the ripple voltage, as required by the sensor and other building blocks to ensure robust tag operation. The clock generator generates the system clock, which the sensor uses for quantization. This quantization clock is generated through injection locking, so its frequency is referenced to the incident RF input and should be only weakly dependent on process and temperature variation. The sensor control signals are generated, and the digitized temperature data received, by the digital baseband. To further reduce power consumption, a Done signal is asserted at the end of each conversion period to shut down the analog building blocks and to acknowledge the baseband. The digital data Dout is then ready and can be read out.

Figure 2.19 illustrates the PTAT and CTAT delay generators that convert the temperature-modulated signal from the voltage domain to the time domain for simple and power-efficient processing. Without loss of generality, we first consider the CTAT delay generator. Transistors $M_{P7-8}$, together with the resistor $R_{PT}$ and the amplifier, convert the input voltage $V_{PTAT}$ into the current $I_{PTAT}$. Low-voltage operation is sustained by implementing the amplifier using a simple current mirror architecture. Stacking of transistors is avoided by having the amplifier output directly driving. The scaled current from is mirrored through $M_{P9-10}$. Transistors $M_{P10-15}$ operate as a single-slope ADC, which also performs level-shifting from 0.5 V to 1 V for interfacing with other digital circuits. Similarly, the PTAT delay generator converts $V_{CTAT}$ into $I_{CTAT}$, followed by another single-slope ADC. At the start of each integration cycle, the $V_{ST}$ signal is asserted, shutting down $M_{P11}$ ($M_{C11}$). The temperature-modulated current signal (which is converted from $V_{PTAT}$ or $V_{CTAT}$) is integrated through capacitor $C_{PT}$ ($C_{CT}$). Upon reaching the switching threshold of $M_{P12-13}$ ($M_{C12-13}$), a rising edge is triggered and buffered by $M_{P14-15}$ ($M_{C14-15}$). As the discharging currents are temperature dependent, the delay between the two rising edges is also temperature dependent. The two rising-edge signals from the CTAT and PTAT delay paths are XORed to generate a temperature-modulated pulse width PW (Figure 2.21), which is then quantized using the ripple counter and the system clock. The whole block is shut down through the feedback signal (refer to Figure 2.19), which also indicates the end of conversion.



Figure 2.20 Block diagram of the proposed temperature sensor with interfacing in the RFID tag system [2.23].



The high integration densities achieved by current submicron technologies come at the price of increasing static power dissipation, with a corresponding rise in heat density. Dynamic Thermal Management (DTM) techniques provide thermally efficient solutions to balance or equally distribute possible on-chip hot spots. Accurate sensing of on-chip temperature requires optimally allocating smart temperature sensors in the silicon. [2.24] introduces an ultra low-power (1.05–65.5nW at 5 samples/s), tiny (10250 $\mu$m<sup>2</sup>) CMOS smart temperature sensor based on the thermal dependency of the leakage current. The proposed sensor outperforms all previous works as far as area and power consumption are concerned (more than 85% reduction in both cases), while still meeting the accuracy constraints imposed by the target application domains. Furthermore, a specific interface based on a logarithmic counter has been implemented to digitize the temperature sensing. These facts, in conjunction with the full compatibility of the sensor with standard CMOS processes, allow the easy integration of many of these tiny sensors in any VLSI layout, making them especially suitable for modern DTM implementations.

The structure of the sensor is displayed in Figure 2.22. Similarly to what happens in dynamic gates, when the input receives a low-to-high transition and transistor M1 goes from an "on" to an "off" condition, capacitor  $C_L$  stores a charge that ideally would remain untouched, but that actually will gradually leak away due to leakage currents.

Figure 2.23 shows the sources of leakage for transistors M1 and M2. Sources (1) and (2) are sub-threshold leakages of M2 and M1, respectively. As mentioned before, these two components will dominate the behavior of the sensor. Sources (3) and (4) are reverse-biased diode leakages of M2 and M1, respectively. Sources (1) and (3) discharge  $C_L$ , whereas sources (2) and (4) charge it. When  $C_L$  is charged, the transistor M2 drain-source voltage ( $V_{DS,M2}$ ) equals VDD and its sub-threshold current ( $I_{DSUB,M2}$ ) reaches a maximum. At the same time,  $V_{DS,M1}$  and  $I_{DSUB,M1}$  equal 0.



Figure 2.22 Sub-threshold current thermal sensor [2.24].



Figure 2.23 Leakage current mechanisms in the thermal sensor [2.24].


Figure 2.24 shows the implementation of the sensor along with the logarithmic counter. A clock, clock\_in, drives the input of the sensor. After $C_L$ is charged by a narrow low pulse, the low-to-high transition of clock\_in leaves the intermediate node floating and marks the point at which the logarithmic count must start. When, due to the leakage process, the voltage on $C_L$ crosses the threshold voltage of inverter M3-M4, the logarithmic counter receives a low-to-high transition at its load input, and the count ends and is registered. Note that the loading rate of the register, i.e., the conversion rate of the sensor, is equal to the frequency of clock\_in. This frequency is bounded by the pulse width at the minimum temperature that the sensor needs to measure. An external control decides this conversion rate depending on the precision and power requirements of the system.



Figure 2.24 Implementation of the sensor along with the logarithmic counter [2.24].

### 2.6 Frequency-to-Digital Converter Based Temperature Sensor



The temperature sensor proposed in [2.25] operates with an FDC and a controller, as shown in Figure 2.25. Temperature is measured by the frequency difference between the temperature-sensitive oscillator (TSO) and the temperature-insensitive oscillator (TIO). A digital output consisting of 10b coarse and 3b fine binary codes is extracted from the FDC. To reduce power consumption, the TIO and the TSO operate alternately depending on the signals S1 and S2 sent by the controller, which are also used to synchronize both oscillators. The control signals have enough margin to account for the settling time of both oscillators, which is necessary for switching between them. Because the difference between the TSO and TIO frequencies is used to measure temperature, the countable temperature range for a given number of bits can be increased, as shown in Figure 2.26. The frequency variation due to process variation can also be canceled.



Figure 2.25 Block diagram of the temperature sensor proposed [2.25].



Figure 2.26 Enhanced performance of temperature measurement with the TIO [2.25].

Both of the TSO and TIO are constructed from ring oscillators using a current starved delay cell shown in Figure 2.27. However, the temperature insensitive bias circuit is adopted only in the TIO. The TSO has the frequency range of 400MHz (at  $-40^{\circ}$ C) ~ 250MHz (at 110°C), which has linear relation to the temperature variation. The TIO, whose mean frequency is 200MHz, can serve as the reference for the TSO.

Figure 2.28 shows the block diagram of the FDC which consists of a 10b up-down counter, a sampler, and the fine code generator (FCG) and shows the timing diagram.

When Up-Down is high, the MUX selects the TSO output as the FDC's input clock, and the counter counts the rising edges of the input clock upward. Conversely, when Up-Down is low, the MUX selects the TIO output as the FDC's input clock, and the counter counts the rising edges of the input clock downward. Therefore, the sampler can register the frequency difference between the TSO and the TIO with S3. For the next measurement, S4 resets the counter, which completes one temperature-to-digital conversion. One period of Up-Down is the same as one conversion time. In order to prevent an overflow and to maximize the capability of the counter, the frequency of Up-Down should be as in (2.24).
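The up-down counting can be illustrated with a small model. It assumes an idealized counter that counts TSO edges up for half of the Up-Down period and TIO edges down for the other half, and it flags results that do not fit the 10-bit coarse code. The function name and the 5-µs half period are hypothetical; the oscillator frequencies are the ones quoted above for [2.25].

```python
def fdc_coarse_code(f_tso_mhz, f_tio_mhz, t_half_us, bits=10):
    """Idealized coarse code of the up-down counter.

    f_tso_mhz, f_tio_mhz : oscillator frequencies in MHz (integers)
    t_half_us            : half of the Up-Down period in microseconds (assumed)
    MHz * microseconds gives a plain number of rising edges.
    """
    up_edges = f_tso_mhz * t_half_us      # counted while Up-Down is high
    down_edges = f_tio_mhz * t_half_us    # counted while Up-Down is low
    code = up_edges - down_edges
    if not 0 <= code < 2 ** bits:
        raise OverflowError("frequency difference does not fit the counter")
    return code


if __name__ == "__main__":
    # TSO: 400 MHz at -40 C down to 250 MHz at 110 C; TIO: 200 MHz.
    for f_tso in (400, 325, 250):
        print(f_tso, fdc_coarse_code(f_tso, 200, t_half_us=5))
```

Counting the difference rather than the TSO frequency alone keeps the code within the counter range over the whole temperature span, which is why the TIO extends the countable range for a given number of bits.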



Figure 2.27 Ring type oscillator with current starved delay cells [2.25].



### 2.7 Summary

Figure 2.29 shows the fishbone diagram of temperature sensor patents (US patents). A search for the term "temperature sensor" returns 2770 results, the majority of which are BJT-based and analog CMOS-based temperature sensors. These two types of temperature sensors utilize BJT bias and band-gap references to sense temperature variation. TDC-based and FDC-based temperature sensors are in the minority; they utilize inverter chains and ring oscillators to generate a time pulse or a frequency that varies linearly with temperature. Figure 2.30 and Table 2.1 compare several temperature sensors from the previous sections. As the conversion speed goes up, more current is needed; therefore, the power consumption is compared with respect to the measurement bandwidth. The voltage- and current-based temperature sensors [2.4], [2.5], [2.14]-[2.16] have large power consumption per conversion rate and low area efficiency because they make use of ADCs. The power consumption per conversion rate of the TDC-based temperature sensors [2.19]-[2.21] is lower, but the delay chain occupies a large area. [2.23] consumes less power but has a smaller sensing range (-10°C~30°C), which is suitable for RFID. [2.24] has extremely low power because of its slow conversion rate, but it is affected significantly by process variation. While consuming the lowest power per conversion rate, the FDC-based temperature sensor [2.25] shows moderate resolution.



Figure 2.29 The fishbone diagram of temperature sensors.



Figure 2.30 Comparison of temperature sensors.

Table 2.1 Comparison of each type temperature sensors.

| Type | Paper | Advantage | Disadvantage |
|------|-------|-----------|--------------|
| BJT Based | [2.5], [2.14]-[2.16] | High resolution and small inaccuracy. | Large area and large power. Can't be operated in low voltage. |
| CMOS Based | [2.17], [2.18] | Better linearity, lower supply voltage and power than BJT-based. | Need additional ADC and bias circuits. |
| Delay Based | [2.19]-[2.23] | Don't need additional ADCs; easier to implement. | Large area of delay line. Can't be operated in low voltage. |
| Leakage Based | [2.24] | Small power and area. | Low conversion rate. Additional log counter is required. Can't be operated in low voltage. |
| Frequency Based | [2.25] | High conversion rate. | Worse inaccuracy. |

# Chapter 3 0.5V~0.25V Process, Voltage and Temperature Sensors with Adaptive Voltage Selection

The 0.5V~0.25V process, voltage and temperature (PVT) sensors with adaptive voltage selection are proposed for temperature measurement in energy harvesting dynamic voltage and frequency scaling (DVFS) systems. The design is composed of process, voltage and temperature sensors. The process and voltage (PV) sensor monitors the process variation and voltage variation continuously and provides the variation information for temperature compensation. The temperature sensor has six TSROs, each generating a frequency proportional to the measured temperature at a suitable supply voltage, and converts the frequency into a digital code. The sensor was designed in TSMC 65nm CMOS technology. It operates over an ultra-low voltage range from 0.25V to 0.5V with $2.3\mu$W power consumption, $0.15^{\circ}$C resolution and a 50k samples/sec conversion rate.

# **3.1 Introduction**

In recent years, numerous portable electronic products have been launched with considerable market growth. Energy efficiency of electronic circuits is a critical concern in every application. Lowering the supply voltage and frequency is one of the attractive approaches to reducing power consumption. Furthermore, a dynamic voltage and frequency scaling (DVFS) system achieves highly efficient energy saving by adjusting the system supply voltage and frequency according to a workload monitor [3.1]. As the supply voltage is reduced to ultra-low levels at which the transistors reach the near-/sub-threshold regions, circuits become more sensitive to process, voltage and temperature (PVT) variations than in the super-threshold region [3.2]-[3.4].



Figure 3.1 The DVFS system of energy harvesting.

With continuous process scaling, the high level of integration also introduces the problem of self-heating, which results from increased power density. The environmental variations are so large that variation-aware near-/sub-threshold circuit design is necessary to prevent functional failure [3.5]. Besides, power efficiency is a critical factor for energy harvesting systems, whose power sources include solar, RF and thermal energy [3.6]-[3.7]. As a result, the power consumption of temperature sensors should be as low as possible for them to be applicable to energy harvesting DVFS systems. As shown in Figure 3.1, the system is constructed from several DVFS domains, a DC-DC converter and a DVFS controller.

Traditionally, temperature sensors were constructed from proportional to absolute temperature (PTAT) and complementary to absolute temperature (CTAT) sensors, which were usually fabricated in bipolar processes. To be more compatible with standard CMOS technologies, the substrate bipolar transistor was used instead for thermal sensing [3.8]–[3.9]. These sensors needed extra analog-to-digital converters (ADCs), which took up more chip area and consumed more power. Overall, these analog techniques led to complex architecture, slow conversion rate, and large area and power overhead. Therefore, delay-based temperature sensors have recently been proposed to replace the analog circuits, with a time-to-digital converter (TDC) utilized in [3.10]-[3.11]. However, a TDC requires hundreds of inverters to obtain enough pulse delay for sufficient temperature resolution, so it occupies a large area and consumes high power.

The rest of this chapter is organized as follows. The design principle of temperature sensor in ultra-low voltage will be discussed in section 3.2. A novel architecture of 0.5V~0.25V PVT sensors with adaptive voltage selection will be proposed in section 3.3. Simulation results of proposed sensor will be given in section 3.4. Finally, section 3.5 concludes this chapter.


### 3.2 Design Principles in Ultra-Low Voltage

### 3.2.1 Challenges of Temperature Sensor in Ultra-Low Voltage

A temperature-to-delay-difference generator [3.10] was designed to produce an output pulse whose width is linearly proportional to the measured temperature. As shown in Figure 3.2(a), the START signal went through two different delay lines. One was temperature sensitive, and the other was temperature insensitive. The difference in propagation delay between the two delay lines, $T_{d1} - T_{d2}$, was extracted by the XOR gate to form the temperature-dependent output pulse width. The temperature insensitive delay line (TIDL) was inserted to avoid a large DC offset. However, the characteristics of the temperature sensitive delay line (TSDL) become different as the supply voltage scales down. There are three operation regions of the MOSFETs: the super-, near-, and sub-threshold regions. The corresponding current equations are listed as follows.

Super-threshold region: $(V_{GS} \gg V_{th})$

$$I_{D\_sp} = \frac{1}{2} \mu^{*} C_{OX} \frac{W}{L} (V_{GS} - V_{th})^{2} (1 + \lambda V_{DS}) \tag{3.1}$$

Near-threshold region: $(V_{GS} \sim V_{th})$

$$I_{D\_near} = \mu^{*} C_{OX} \frac{W}{L} V_{DS} \left( V_{GS} - V_{th} - \frac{1}{2} V_{DS} \right) \tag{3.2}$$

Sub-threshold region: $(V_{GS} < V_{th})$

$$I_{D\_sb} = \mu^{*} C_{OX} \frac{W}{L} (m-1) U_T^{2} \exp\left(\frac{V_{GS} - V_{th}}{m U_T}\right) \tag{3.3}$$



where  $V_{th}$  denotes threshold voltage and  $\mu^*$  denotes the effective channel mobility. The thermal voltage is represented by  $U_T$ . Considering the transistor figure of merit for temperature sensing, the temperature coefficient of current (TCC) [3.12] was used. For a long channel transistor, the TCC in the super-threshold region of operation based on (3.1) is given by

$$TCC_{sp} = \frac{1}{I_{D_{sp}}} \frac{dI_{D_{sp}}}{dT} = \frac{1}{\mu^*} \frac{d\mu^*}{dT} - \frac{2}{V_{GS} - V_{th}} \frac{dV_{th}}{dT}$$
(3.4)

The relative change of $TCC_{sp}$ is a negative few thousandths per degree because the negative mobility sensitivity dominates. In the sub-threshold region, the TCC based on (3.3) (assuming $V_{DS}$ is much larger than $U_T$) is given by

$$TCC_{sb} = \frac{1}{I_{D\_sb}} \frac{dI_{D\_sb}}{dT} = \frac{1}{\mu^*} \frac{d\mu^*}{dT} + \frac{2}{T} - \frac{1}{mU_T} \left(\frac{dV_{th}}{dT} + \frac{V_{GS} - V_{th}}{T}\right) \tag{3.5}$$
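
The step to (3.5) can be checked by logarithmic differentiation. As a sketch, assume the sub-threshold current has the form that is substituted into (3.9), with $V_{GS}$ held constant and with the slope factor $m$ and the geometry terms taken as temperature independent (an assumption made here only for the illustration):

$$\ln I_{D\_sb} = \ln \mu^{*} + 2\ln T + \frac{V_{GS} - V_{th}}{m U_T} + \text{const}, \qquad U_T = \frac{k_B T}{q}$$

$$\frac{d}{dT}\left(\frac{V_{GS} - V_{th}}{m U_T}\right) = -\frac{1}{m U_T}\frac{dV_{th}}{dT} - \frac{V_{GS} - V_{th}}{m U_T\,T}$$

Adding the derivative of the first two terms, $\frac{1}{\mu^*}\frac{d\mu^*}{dT} + \frac{2}{T}$, reproduces the right-hand side of (3.5), with the slope factor written as $m$.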

The relative change of $TCC_{sb}$ is now positive because the negative threshold voltage sensitivity dominates in the sub-threshold region due to the exponential dependence upon it. As the transistor goes deeper into weak inversion, $TCC_{sb}$ reaches 6% per degree and more. Based on (3.4) and (3.5), the relationship of the TSDL propagation delay versus temperature in the super-/sub-threshold regions is shown in Figure 3.3. The TSDL propagation delay in the super-threshold region increases with temperature, whereas that in the sub-threshold region decreases with temperature. However, the linearity of the TSDL propagation delay in the sub-threshold region is much worse, as shown in Figure 3.3. Therefore, the characteristic of the TSDL in the sub-threshold region is not suitable for ultra-low voltage temperature measurement.



Figure 3.3 The linearity of temperature sensitive delay line (TSDL) in super-threshold, near-threshold and sub-threshold region.

On the other hand, the TIDL in [3.10] is also hard to implement when the supply voltage is lowered to the near-/sub-threshold region. The design principle of the TIDL was setting $\partial I_D / \partial T = 0$ to yield a thermally independent conduction current. The first challenge is that the conduction current equation in the super-threshold region is very different from that in the sub-threshold region, especially the power of the $V_{th}$ term. The second is that the relative change of TCC in the sub-threshold region is several positive hundredths per degree, while that in the super-threshold region is a negative few thousandths per degree. The third is that the conduction current equation of the sub-threshold region shown in (3.3) is affected by the thermal voltage to the power of 2, $U_T^2$.

In [3.13], a temperature-to-frequency-difference generator was designed with the temperature sensitive ring oscillator (TSRO) as the clock source for up-counting and the temperature insensitive ring oscillator (TIRO) as the clock source for down-counting. With the same counting period, the output of the up-down counter was equal to the frequency difference of the two oscillators, $f_{o1} - f_{o2}$, as shown in Figure 3.2(b). The counter output, $f_{o1} - f_{o2}$, was designed to be linearly proportional to the measured temperature. It adopted a modified TIRO to solve the voltage headroom problem. However, the implementation of the TIRO was still based on setting $\partial I_D / \partial T = 0$ to acquire the minimum thermal sensitivity. Adopting the TIRO in the ultra-low voltage region encounters the same difficulty as the TIDL in [3.10]. To carry out temperature measurement at ultra-low voltage, the frequency-based technique is adopted.

### 3.2.2 Ultra-Low Voltage Frequency-Based Temperature Sensor

A frequency-based temperature sensor for ultra-low voltage temperature measurement is shown in Figure 3.4(a). It is composed of a sub-threshold temperature sensitive ring oscillator (SB-TSRO), a 2-input AND gate, a counter and a fixed pulse width generator. The proposed sensor is designed so that the frequency ratio between the SB-TSRO and the CLK of the fixed pulse width generator is proportional to the test temperature. Thus, the proposed temperature sensor can be regarded as a temperature-to-frequency-ratio generator. Once the *EN* signal is asserted, the fixed pulse width generator generates a pulse, W, that lasts N CLK cycles. The SB-TSRO is designed to generate a frequency, $f_{SB\_TSRO}$, linearly proportional to the measured temperature. Because of the 2-input AND gate, the clock output of the SB-TSRO can trigger the counter only within the pulse period, W. Therefore, the digital output of the S-bit counter is equal to $N \times f_{SB\_TSRO}/f_{CLK}$.
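
The gated counting described above can be illustrated with a minimal behavioural sketch. The function name `gated_count`, the integer-hertz arguments and the saturation at full scale are assumptions made for this illustration; they are not taken from the circuit in Figure 3.4.

```python
def gated_count(f_tsro_hz: int, f_clk_hz: int, n_cycles: int, counter_bits: int) -> int:
    """Ideal number of SB-TSRO rising edges inside a window of n_cycles CLK periods."""
    # The window lasts W = n_cycles / f_clk seconds, so the ideal count is
    # floor(W * f_tsro) = floor(n_cycles * f_tsro / f_clk).
    # Integer arithmetic avoids floating-point rounding at exact ratios.
    count = (n_cycles * f_tsro_hz) // f_clk_hz
    # Modelling choice (assumption): clamp at full scale instead of wrapping
    # around as a real binary counter would.
    return min(count, 2 ** counter_bits - 1)


if __name__ == "__main__":
    # Hypothetical frequencies, chosen only to show the N * f_TSRO / f_CLK ratio.
    print(gated_count(2_000_000, 1_000_000, 512, 12))
```

Doubling the oscillator frequency at a fixed CLK doubles the code until the counter reaches full scale, which is why the window length N has to be chosen together with the counter width S.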



Figure 3.4 (a) Ultra-low voltage frequency-based temperature sensor. (b) Inverter used in SB-TSRO.

One of the key components of the proposed sensor is the SB-TSRO. It should produce an output clock whose frequency is as linearly proportional to the measured temperature as possible. The inverter with enable function used in the proposed SB-TSRO is shown in Figure 3.4(b). The frequency of the SB-TSRO constructed from these inverters is proportional to the conduction current since $f = I_D / (V_{DD} \times C_{eq})$.

$$f_{SB\_TSRO} \propto I_{D\_sb} \tag{3.6}$$

Note that the supply voltage, $V_{DD}$, and the equivalent capacitance of an inverter, $C_{eq}$, are assumed to be temperature independent. The inversion layer effective mobility depends on temperature according to [3.14]

$$\mu^* = \mu_0 \left(\frac{T}{T_0}\right)^a \tag{3.7}$$

where a is typically between -1 and -2. Also, the thermal voltage,  $U_T$ , is equal to

$$U_T = \frac{k_B T}{q} \tag{3.8}$$

By substituting (3.7) and (3.8) into (3.3), the equation becomes

$$I_{D_{sb}} = \mu_0 C_{OX} \left(\frac{W}{L}\right) (m-1) \left(\frac{T}{T_0}\right)^a \left(\frac{k_B T}{q}\right)^2 \exp\left\{\frac{q[V_{GS} - V_{th}(T)]}{mk_B T}\right\}$$
(3.9)

Using Taylor series expansion for exponential function, the equation becomes

$$I_{D_{sb}} = \mu_0 C_{OX} \left(\frac{W}{L}\right) (m-1) \left(\frac{T}{T_0}\right)^a \left(\frac{k_B T}{q}\right)^2 \left\{ 1 + \frac{q [V_{GS} - V_{th}(T)]}{m k_B T} \right\}$$
(3.10)

After simplification,

$$I_{D_{sb}} \cong KT^{2+a} \left\{ 1 + \frac{q [V_{GS} - V_{th}(T)]}{m k_B T} \right\} , \qquad (3.11)$$





Figure 3.5 (a) The relationship between temperature and threshold voltage. (b) The relationship of SB-TSRO output frequency versus temperature.


There are two terms within the curly brackets of (3.11). The second term is related to threshold voltage,  $V_{th}$ , and thermal voltage,  $U_T$ . Based on [3.15], the  $V_{th}$  can be expressed as

$$V_{th}(T) = V_{th}(T_0) + \alpha(T - T_0) \quad , \tag{3.12}$$

where $\alpha$ is a negative coefficient; that is, the threshold voltage decreases as the temperature rises, as shown in Figure 3.5(a). Thus, the second term within the curly brackets of (3.11) is temperature independent since the temperature effect of the threshold voltage is cancelled by the *T* of the thermal voltage. Finally, the temperature effect of the conduction current depends on the term $KT^{2+a}$. When *a* equals -1, the conduction current becomes linearly proportional to temperature. It also means that the frequency of the SB-TSRO is proportional to temperature based on (3.6). In order to ensure that the proposed SB-TSRO operates in the sub-threshold region, the design principle for the SB-TSRO threshold voltage is

$$V_{th}(T) = V_{DD} \quad \text{when} \quad T > T_{MAX} \tag{3.13}$$

where the supply voltage, $V_{DD}$, is equal to $V_{GS}$, and $T_{MAX}$ represents the maximum temperature of the sensor's operating range.

Based on (3.13), the threshold voltage of the SB-TSRO MOSFETs at 125°C is designed to be $V_{DD}$ for design convenience. The relationship of the SB-TSRO output frequency versus temperature is a strictly increasing linear function, as shown in Figure 3.5(b).
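
To inspect the shape of the current-versus-temperature curve numerically, the sketch below evaluates the exponential form (3.9) directly (not the linearized form (3.11)), with $V_{th}(T)$ from (3.12) anchored so that $V_{th}(T_{MAX}) = V_{DD}$. The values of `alpha`, `m`, `t0_k` and the 0.4V supply are hypothetical placeholders, not extracted device parameters, and the common prefactor $\mu_0 C_{OX}(W/L)(m-1)$ is dropped.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant, J/K
Q = 1.602176634e-19   # elementary charge, C


def threshold_voltage(t_k, vdd=0.4, t_max_k=398.15, alpha=-1e-3):
    """Linear model (3.12), anchored so that Vth(T_MAX) = VDD (assumed alpha, V/K)."""
    return vdd + alpha * (t_k - t_max_k)


def relative_current(t_k, vdd=0.4, t_max_k=398.15, alpha=-1e-3, m=1.3, a=-1.0, t0_k=300.0):
    """Sub-threshold current of (3.9) with V_GS = VDD, up to a constant factor."""
    u_t = K_B * t_k / Q                      # thermal voltage (3.8)
    mobility = (t_k / t0_k) ** a             # mobility law (3.7), normalized
    vth = threshold_voltage(t_k, vdd, t_max_k, alpha)
    return mobility * u_t ** 2 * math.exp((vdd - vth) / (m * u_t))


if __name__ == "__main__":
    for t_c in range(-25, 126, 25):
        print(t_c, relative_current(t_c + 273.15))
```

Sweeping the assumed `alpha` and `m` shows how strongly the curvature depends on the device parameters; only the monotonic increase with temperature is guaranteed by this simplified model.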

# 3.3 PVT Sensors with Adaptive Voltage Selection

To perform dynamic thermal management in the DVFS system in Figure 3.1, the supply voltage and process variation must be considered besides temperature. The adaptive voltage selection range is from 250mV to 500mV, so a voltage sensor is utilized to monitor voltage variation. Because the process variation is extremely significant at ultra-low voltage, a process sensor is required as well. Therefore, the process and voltage (PV) sensor is utilized to compensate the temperature sensor, as shown in Figure 3.6. The whole circuit can be divided into six blocks: the finite state machine (FSM), PV sensor, temperature sensor, process register, voltage mapping table and PV compensation block. The FSM sends *EN\_ZTC* and *EN\_TSRO* to enable the PV sensor and the temperature sensor, respectively. The PV sensor measures process and voltage information and sends it to the process register and the voltage mapping table. According to the process register signal, P[4:0], the voltage mapping table decides the current voltage condition and sends V[2:0] to the temperature sensor. The temperature sensor measures the current temperature and sends the temperature information, T[11:0], to the PV compensation block. The PV compensation block compensates T[11:0] with P[4:0] and V[2:0] to cancel the effects of voltage and process variation.



The state diagram and signal waveform of the FSM are shown in Figure 3.7 and Figure 3.8, respectively. The implemented FSM circuit, shown in Figure 3.9, can operate at supply voltages from 250mV to 500mV. At the beginning, the *RESET* signal is pulled up and the FSM is initialized. Later, when the *RESET* signal is set low, the sensor starts sensing the process condition for one CLK cycle. After the process condition is sensed, the $P\_done$ signal is pulled up to alert the process register to store the process value. Then, the sensor is idle until the *EN* signal is asserted to monitor voltage and temperature. In the first cycle, the FSM resets the counters of the PV sensor and the temperature sensor with the reset signal *RESET\_CTR*. Then, the PV sensor is enabled by *EN\_ZTC* to sense the voltage condition in the second cycle. Next, the voltage mapping table decides the current voltage range according to the voltage condition sensed in the previous cycle and the value in the process register. Once the current voltage range is known, the temperature sensor measures temperature for four cycles. Finally, the PV compensation block calculates the temperature value according to the value in the process register and the voltage information. If *EN* is still high, the FSM returns to the first state. Otherwise, the FSM returns to the idle state.
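
The sequence of states described above can be sketched as a small cycle-level model. The state names, the `step` and `run` helpers and the one-state-per-cycle encoding are assumptions made for this illustration; the actual state encoding is the one in Figure 3.7.

```python
from enum import Enum, auto


class State(Enum):
    PROCESS = auto()    # one-cycle process sensing after RESET goes low
    IDLE = auto()       # wait for EN
    RESET_CTR = auto()  # first cycle: reset PV and temperature counters
    VOLTAGE = auto()    # second cycle: EN_ZTC enables voltage sensing
    MAP = auto()        # voltage mapping table decides the voltage range
    TEMP = auto()       # temperature sensing, four cycles
    COMP = auto()       # PV compensation


TEMP_CYCLES = 4


def step(state, temp_cycle, en):
    """Return (next_state, next_temp_cycle) for one CLK edge with RESET low."""
    if state is State.PROCESS:
        return State.IDLE, 0
    if state is State.IDLE:
        return (State.RESET_CTR if en else State.IDLE), 0
    if state is State.RESET_CTR:
        return State.VOLTAGE, 0
    if state is State.VOLTAGE:
        return State.MAP, 0
    if state is State.MAP:
        return State.TEMP, 1
    if state is State.TEMP:
        if temp_cycle < TEMP_CYCLES:
            return State.TEMP, temp_cycle + 1
        return State.COMP, 0
    # State.COMP: start another measurement if EN is still high, else go idle.
    return (State.RESET_CTR if en else State.IDLE), 0


def run(en_sequence):
    """Apply one EN sample per CLK edge and return the visited states."""
    state, temp_cycle = State.PROCESS, 0
    trace = [state]
    for en in en_sequence:
        state, temp_cycle = step(state, temp_cycle, en)
        trace.append(state)
    return trace


if __name__ == "__main__":
    for s in run([False] + [True] * 8 + [False]):
        print(s.name)
```

In this sketch one full voltage-and-temperature measurement takes eight cycles from leaving the idle state to finishing compensation; that figure follows from the assumed one-cycle mapping and compensation states and is not a number stated in the text.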



Figure 3.8 Signal waveform diagram of FSM.



Figure 3.9 Implement circuit of FSM.


#### **3.3.2 Process and Voltage Sensor**

The current of a transistor varies with temperature, process, and voltage. Mutual compensation of the mobility and threshold voltage temperature variations may result in a zero temperature coefficient (ZTC) bias point of a MOS transistor. In TSMC 65nm bulk CMOS technology, the ZTC points of NMOS and PMOS are at about 0.4V and 0.6V respectively, as shown in Figure 3.10. According to simulation results, the delay of a unit inverter does not change with temperature at 0.5V. This is because, at 0.5V supply voltage, the NMOS drain current decreases with temperature while the PMOS drain current increases with temperature. This mutual current compensation between the PMOS and NMOS keeps the output frequency of the ring oscillator constant over temperature.



Figure 3.10 ZTC point simulation of NMOS and PMOS.

The ZTC ring oscillator is the major component of the PV sensor, as shown in Figure 3.11. The inverter utilized in the ZTC ring has the same structure as the circuit in Figure 3.4(b) but a different size. Low threshold voltage (LVT) CMOS is adopted to construct the inverters and the NAND gate. The width/length ratio of the transistors is 120nm/300nm. When the FSM enters the process or voltage sensing state, the *EN\_ZTC* signal enables the 31-stage ZTC ring oscillator, and the 9-bit counter is triggered by the oscillator. Therefore, the digital output of the counter is also temperature invariant and is affected only by process variation. This work does not require a high digital output resolution, so only PV[8:4] is utilized for process measurement. The simulation results are shown in Figure 3.12; the digital output varies with process variation. When the process corner is SS, TT and FF, the digital output is 7, 11 and 16, respectively. The digital output is not affected significantly by temperature.
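
As a small illustration of how the 5-bit process code is taken from the 9-bit counter, the sketch below keeps bits [8:4] and, purely as an assumed post-processing step that is not part of the described circuit, labels the code with the nearest of the three simulated corner values quoted above.

```python
# Simulated PV[8:4] values quoted in the text for the SS, TT and FF corners.
CORNER_CODES = {"SS": 7, "TT": 11, "FF": 16}


def process_code(pv_count: int) -> int:
    """Keep bits [8:4] of the 9-bit ZTC counter value, giving a 5-bit code."""
    return (pv_count >> 4) & 0x1F


def nearest_corner(code: int) -> str:
    """Hypothetical helper: label a code with the closest simulated corner."""
    return min(CORNER_CODES, key=lambda corner: abs(CORNER_CODES[corner] - code))


if __name__ == "__main__":
    raw = 0b010110101  # example 9-bit counter value (made up for illustration)
    code = process_code(raw)
    print(code, nearest_corner(code))
```

Dropping the four least significant bits trades resolution for robustness: counter values that differ by fewer than 16 counts within the same 16-count bin map to the same process code.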



Figure 3.11 Implement circuit of PV sensor.

Moreover, the PV sensor can be utilized to sense the supply voltage condition when the supply voltage is dynamically scaled, because the frequency of the ZTC ring oscillator is also voltage dependent. When the EN\_ZTC pulse period is fixed, the digital output is proportional to voltage and only slightly affected by temperature, as shown in Figure 3.12(b). However, the voltage sensing (VS) digital output is still affected by process variation, so the digital output is compensated according to the process information stored in the process register by the circuits in the right part of Figure 3.13(a). The multiplexer chooses a compensating value according to P[4:0] and adds it to PV[8:0], as shown in the left part of Figure 3.13(a). The mapping table converts the compensated value to a simplified digital output, V[2:0], for the temperature sensor. The relationship between digital output and supply voltage is shown in Figure 3.13(c).



Figure 3.12 (a) The relationship between digital output and process variation. (b) The Monte Carlo simulation.



Figure 3.13 (a) Compensation circuits and mapping table of voltage sensor. (b) Voltage sensor digital output under variation. (c) Digital output after process compensation.

#### **3.3.3 Temperature Sensor**

The temperature sensor is composed of six TSROs, a 12-bit counter, multiplexers and a decoder, as shown in Figure 3.14. The TSROs are controlled by EN05, EN045, EN04, EN035, EN03 and EN025 and operate only when EN\_TSRO is high. After the voltage mapping state, the V[2:0] signal is sent to the decoder and multiplexer to choose the TSRO suitable for the current voltage. Based on (3.13), the threshold voltage of a TSRO must equal the supply voltage at the maximum temperature of the operating range. Therefore, TSROs with different threshold voltages are designed for different supply voltages (i.e., 250mV to 500mV). The threshold voltage can be adjusted by using multi-threshold CMOS (MTCMOS) devices and by increasing the effective channel length [3.16]. Besides, the number of stages of each TSRO is chosen so that the digital output slopes are the same, which improves resolution. The simulation results of the temperature sensor can be roughly separated into three groups, the FF, TT and SS corners, as shown in Figure 3.15. For each corner, the digital output slopes of the different TSROs are roughly identical for convenience of compensation.
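
The TSRO selection can be pictured as a one-hot decoder gated by EN\_TSRO. In the sketch below, the assignment of V[2:0] codes to enable lines (code 0 for EN025 up to code 5 for EN05) is a hypothetical ordering chosen for illustration; the actual mapping is defined by the voltage mapping table and is not reproduced here.

```python
# Enable lines named in the text, ordered from the lowest to the highest supply.
# The code-to-line assignment below is an assumption for illustration only.
ENABLES = ("EN025", "EN03", "EN035", "EN04", "EN045", "EN05")


def decode(v_code: int, en_tsro: bool) -> dict:
    """Return a one-hot enable map for the TSRO matching v_code, gated by EN_TSRO."""
    if not 0 <= v_code < len(ENABLES):
        raise ValueError(f"V[2:0] code {v_code} has no TSRO assigned in this sketch")
    return {name: bool(en_tsro) and index == v_code for index, name in enumerate(ENABLES)}


if __name__ == "__main__":
    print(decode(3, True))
```

Only one oscillator runs per measurement, so the five unused TSROs do not contribute switching power while the selected one is counted.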



Figure 3.14 Implement circuit of temperature sensor.



# **3.3.4 PV-Compensation**

When the FSM enters the PV compensation state, the digital output of the temperature sensor is compensated according to P[4:0] stored in the process register and V[2:0] sent from the voltage mapping table, as shown in Figure 3.16. A 12-bit adder and subtractor are utilized to calculate the temperature digital output. The multiplexers choose the appropriate compensating value based on the current process and voltage information. Simulation results of the compensated digital output are shown in Figure 3.17. Compared with Figure 3.15, the voltage and process variation of the digital output are reduced significantly. As a result, temperature sensing is accomplished under process and voltage variation at ultra-low voltage.



Figure 3.16 PV-compensation circuits and compensation value tables.


#### **3.4 Simulation Results**

The proposed process, voltage and temperature sensors are implemented in TSMC general purpose 65-nm CMOS technology. The supply voltage is adaptively scaled from 0.25V to 0.5V. The simulated temperature error is $-1.76^{\circ}$C to $+1.96^{\circ}$C over the adaptive voltage range, as shown in Figure 3.18. The maximum error occurs when the supply is 0.25V and 0.5V. The effective resolution is $0.15^{\circ}$C/LSB at a 50k samples/sec conversion rate. The minimum power consumption is about $2.3\mu$W at 0.25V supply voltage. Table 3.1 lists the comparison of recent temperature sensors [3.8], [3.11], [3.13], [3.17]-[3.18] and the proposed temperature sensor. The proposed temperature sensor has ultra-low DVS operation ability, a high conversion rate and ultra-low power consumption. The temperature inaccuracy of the proposed temperature sensor is sufficient for dynamic thermal management applications.



Figure 3.18 Simulation error of proposed temperature sensor.

Table 3.1 Temperature sensor comparisons

| Sensor   | Technology | Sensor Type   | Supply (V) | Power (µW) | Conv. Rate (sample/s) | Resolution (°C) | Inaccuracy (°C) | Temp. Range (°C) |
|----------|------------|---------------|------------|------------|-----------------------|-----------------|-----------------|------------------|
| [3.8]    | 0.7µm      | BJT-Based     | 2.5-5.5    | 62.5       | 10                    | 0.025           | ±0.25 (3σ)      | -70~130          |
| [3.11]   | 0.35µm     | Delay-Based   | 3.3        | 36.7       | 2                     | 0.092           | -0.25~0.35      | 0~90             |
| [3.13]   | 65nm       | Freq.-Based   | 1.2        | 400        | 366k                  | 0.043           | -2.9~2.75       | -40~110          |
| [3.17]   | 32nm       | BJT-Based     | 1.05       | 1600       | 1k                    | 0.45            | <5              | -10~110          |
| [3.18]   | 0.35µm     | Leakage-Based | 3.3        | 0.265      | 5                     | 0.28            | ±1.97 (3σ)      | 20~100           |
| Proposed | 65nm       | Freq.-Based   | 0.25-0.5   | 2.3        | 50k                   | 0.15            | -1.75~+1.96     | -25~125          |
## **3.5 Summary**

The frequency-based process, voltage and temperature sensors without any ADC are proposed for on-chip temperature measurement in energy harvesting DVFS systems. The design is composed of process, voltage and temperature sensors. The process and voltage (PV) sensor monitors the process variation and voltage variation continuously and provides the variation information for temperature compensation. The temperature sensor has six TSROs, each generating a frequency proportional to the measured temperature at a suitable supply voltage, and converts the frequency into a digital code. The sensor was designed in TSMC 65nm CMOS technology. It operates over an ultra-low supply voltage range from 0.25V to 0.5V. The power consumption is $2.3\mu$W at 0.25V supply voltage and a 50k samples/sec conversion rate. These characteristics make the proposed sensor especially applicable to energy-harvesting miniature portable platforms.

# Chapter 4 0.4V Fully Integrated Process Invariant Temperature Sensor

# **4.1 Introduction**

This chapter describes an ultra-low voltage fully integrated process invariant frequency-based temperature sensor. The proposed temperature sensor utilizes two temperature sensitive ring oscillators (TSROs) to build a temperature-to-frequency-ratio generator capable of operating at a 0.4V supply voltage. One operates in the near-threshold region, named the Near-TSRO, and generates a fixed pulse width that forms the denominator. The other operates in the sub-threshold region, named the SB-TSRO, and provides the required frequency as the numerator. The ratio of the SB-TSRO frequency to the Near-TSRO frequency is designed to increase linearly with the measured temperature. Because of the different near-/sub-threshold conduction current characteristics of the MOSFETs, the effect of process variation is significantly reduced by the proposed temperature sensor.

The rest of this chapter is organized as follows. The design concepts of the ultra-low voltage process invariant temperature sensor are discussed in section 4.2. The specific architecture of the temperature sensor is presented in section 4.3. Simulation and experimental results are given in section 4.4. Finally, section 4.5 concludes this chapter.

# 4.2 Design Concepts of Process Invariant Temperature Sensor

Although the process sensor in the previous chapter can sense the process corner information and compensate the temperature sensor, the granularity of its process variation classification is not fine enough. As shown in Figure 3.4, the digital output of the temperature sensor still suffers from process variation. As a result, one function, adaptive voltage scaling, is sacrificed here to keep the design simple.

In order to remove the effect of process variation, the *CLK* in Figure 3.4 is replaced by a near-threshold temperature sensitive ring oscillator (Near-TSRO), as shown in Figure 4.1. The frequency of the Near-TSRO is $f_{o1}$. The S-bit counter is still triggered by the SB-TSRO, whose frequency is $f_{o2}$. Hence, the output pulse width of the fixed pulse width generator becomes $2^{N-1}/f_{o1}$, and the corresponding digital output of the S-bit counter is $2^{N-1}f_{o2}/f_{o1}$.



Figure 4.1 Block diagram of the proposed ultra-low voltage frequency-based temperature sensor with process variation immunity enhancement.

There are two temperature sensitive ring oscillators (TSROs) in the proposed temperature-to-frequency-ratio generator with process variation immunity. One TSRO operates in the sub-threshold region, called the SB-TSRO, and its frequency is proportional to the conduction current, $I_{D\_sb}$. The other TSRO operates in the near-threshold region, called the Near-TSRO, and its frequency is also proportional to the conduction current based on $f=I_D/(V_{DD}\times C_{eq})$.

$$f_{Near\_TSRO} \propto I_{D\_near} \tag{4.1}$$

Based on (4.1), the digital output of S-bit counter can be represented by

$$2^{N-1} f_{o2} / f_{o1} \propto 2^{N-1} I_{D_{sb}} / I_{D_{near}}$$
(4.2)

By substituting (3.2) and (3.3), the equation becomes

$$\frac{I_{D\_sb}}{I_{D\_near}} = \frac{(m-1)U_T^2 \exp\left(\frac{V_{GS} - V_{th2}}{mU_T}\right)}{V_{DS}(V_{GS} - V_{th1} - V_{DS}/2)} , \qquad (4.3)$$

where $V_{th1}$ is the device threshold voltage in the Near-TSRO, and $V_{th2}$ is the device threshold voltage in the SB-TSRO. Note that the $\mu^{*} C_{OX}\left(\frac{W}{L}\right)$ term is cancelled. Given $V_{GS}=V_{DS}=V_{DD}$, the above equation can be simplified as

$$\frac{I_{D\_sb}}{I_{D\_near}} = \frac{(m-1)\left(\frac{k_BT}{q}\right)^2 \exp\left\{\frac{q[V_{DD} - V_{th2}(T)]}{mk_BT}\right\}}{V_{DD}\left[\frac{1}{2}V_{DD} - V_{th1}(T)\right]}$$
(4.4)

where  $U_T = k_B T/q$ . Using Taylor series expansion for exponential function, the equation becomes

$$\frac{I_{D\_sb}}{I_{D\_near}} = \frac{(m-1)\left(\frac{k_BT}{q}\right)^2 \left\{1 + \frac{q[V_{DD} - V_{th2}(T)]}{mk_BT}\right\}}{V_{DD}\left[\frac{1}{2}V_{DD} - V_{th1}(T)\right]}$$
(4.5)


According to (3.12)

$$V_{DD} - V_{th2}(T) \propto T \tag{4.6}$$

$$\frac{1}{2}V_{DD} - V_{th1}(T) \propto T \tag{4.7}$$

There are two terms within the curly brackets of the numerator in (4.5). As in (3.11), the second term is temperature independent since the temperature effect of the threshold voltage in (4.6) is cancelled by that of the thermal voltage. The remaining terms of the numerator in (4.5) are proportional to $T^2$. Meanwhile, the denominator of (4.5) is proportional to $T$ based on (4.7). Therefore, the output of the proposed temperature sensor with enhanced process variation immunity becomes

$$2^{N-1} \frac{f_{o2}}{f_{o1}} \propto \frac{I_{D\_sb}}{I_{D\_near}} \propto T \tag{4.8}$$

Equation (4.8) is only valid provided that  $f_{o2}$  is generated in sub-threshold region whereas  $f_{o1}$  is generated in near-threshold region. In order to ensure the SB-TSRO ( $f_{o2}$ ) and the Near-TSRO ( $f_{o1}$ ) operate in sub-threshold and near-threshold region, respectively, the design principles of the device threshold voltage within the two TSROs for the proposed temperature sensor with enhanced process variation immunity are

$$V_{th2}(T) = V_{DD} \quad \text{when} \quad T > T_{MAX} \tag{4.9}$$

$$V_{th1}(T) = \frac{1}{2} V_{DD} \quad \text{when} \quad T < T_{MIN} \tag{4.10}$$

where $T_{MAX}$ and $T_{MIN}$ represent the maximum and minimum temperatures of the sensor's operating range, respectively.

On the other hand, the enhanced process variation immunity is achieved by the temperature-to-frequency-ratio structure. Some process parameters of $I_{D\_sb}$ are cancelled by those of $I_{D\_near}$, including the inversion layer mobility, gate oxide capacitance, effective channel width, and effective channel length. The simulation results of the proposed temperature sensor under process variation are shown in Figure 4.2. The effect of process variation is reduced significantly.
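
The cancellation in (4.3) can be checked numerically with a minimal sketch. Here `k` stands for the shared factor $\mu^{*} C_{OX}(W/L)$; the threshold voltages, the slope factor `m` and the supply value are hypothetical numbers chosen for illustration, and the result is the current ratio of (4.2) scaled by $2^{N-1}$, not a simulated sensor code.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant, J/K
Q = 1.602176634e-19   # elementary charge, C


def i_sub(k, t_k, vdd, vth2, m=1.3):
    """Sub-threshold current, numerator of (4.3), with V_GS = VDD."""
    u_t = K_B * t_k / Q
    return k * (m - 1) * u_t ** 2 * math.exp((vdd - vth2) / (m * u_t))


def i_near(k, vdd, vth1):
    """Near-threshold current (3.2) with V_GS = V_DS = VDD."""
    return k * vdd * (vdd - vth1 - vdd / 2)


def ratio_code(k, t_k, vdd=0.4, vth1=0.15, vth2=0.45, n_bits=10):
    """2^(N-1) * I_D_sb / I_D_near, the quantity on the right of (4.2)."""
    return 2 ** (n_bits - 1) * i_sub(k, t_k, vdd, vth2) / i_near(k, vdd, vth1)


if __name__ == "__main__":
    # Two different values of the shared factor k give the same ratio.
    print(ratio_code(1e-4, 300.0), ratio_code(3e-4, 300.0))
```

The shared factor `k` drops out of the ratio, whereas the threshold voltages are passed in explicitly and are not cancelled by this step.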



#### 4.3 Specific Architecture of Process Invariant Temperature Sensor

An ultra-low voltage process invariant frequency domain temperature sensor is implemented in TSMC 65nm bulk CMOS technology. The block diagram is shown in Figure 4.3. The SB-TSRO uses regular threshold voltage (RVT) CMOS. For design convenience, the effective device length of the RVT CMOS is adjusted so that its threshold voltage equals $V_{DD}$ at 125°C, satisfying (4.9). The clock of the fixed pulse width generator is provided by the Near-TSRO instead of the system clock. Low threshold voltage (LVT) CMOS is adopted to construct the inverters within the Near-TSRO. The effective device length of the LVT CMOS is adjusted so that its threshold voltage equals one half of $V_{DD}$ at -25°C, based on (4.10). In order to achieve sufficient temperature resolution, the Near-TSRO has 51 stages, while the SB-TSRO has 13 stages.



Figure 4.3 The implementation of the proposed process invariant temperature sensor.

With a 0.4V supply voltage, the proposed temperature sensor has two input signals, CLK and START. CLK is provided directly from the system clock and is flexible: the only requirement is that it be faster than 500kHz. That is sufficient for the control unit since the simulated maximum conversion rate of the proposed temperature sensor is 50kHz. START triggers the sensor to perform an on-chip temperature measurement. Each positive edge of START enables one measurement and asserts the Q output of the D flip-flop. S<sub>rst</sub> is then asserted for one CLK cycle to reset the 11-bit digital output counter, and RDY is reset to 0. PW then becomes 1 to enable both the SB-TSRO and the Near-TSRO. The SB-TSRO serves as the clock of the 11-bit digital output counter, while the Near-TSRO serves as the clock of the 10-bit counter. The 10-bit counter of the fixed pulse width generator continues counting until its most significant bit, $Q_{msb}$, is asserted, which resets the D flip-flop so that Q becomes 0. The control unit then resets PW to 0 and asserts N<sub>rst</sub> to reset the 10-bit fixed pulse width generator. Meanwhile, RDY is asserted after several CLK periods to ensure that the 11-bit digital output, TS, is ready. TS equals $512 \times f_{o2}/f_{o1}$ and is proportional to temperature according to (4.8). The timing diagram of the proposed temperature sensor is shown in Figure 4.4.



Figure 4.4 The timing diagram of the proposed process invariant temperature sensor.
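The counting scheme described above can be summarized with a small behavioral model. The sketch below is not the RTL of the sensor; it only mirrors the arithmetic stated in the text: the PW pulse lasts 512 Near-TSRO cycles (until $Q_{msb}$ of the 10-bit counter goes high), and the 11-bit output counter counts SB-TSRO cycles during that window. The function names, the example frequencies, and the choice to raise an error on overflow are assumptions made for illustration.

```python
import math

PW_CYCLES = 512      # Near-TSRO cycles until Q_msb of the 10-bit counter is set
OUT_BITS = 11        # width of the digital output counter TS[10:0]


def ts_code(f_o1_hz, f_o2_hz):
    """Ideal TS code: SB-TSRO cycles counted while PW is high."""
    if f_o1_hz <= 0 or f_o2_hz < 0:
        raise ValueError("f_o1 must be positive and f_o2 non-negative")
    count = math.floor(PW_CYCLES * f_o2_hz / f_o1_hz)
    # Overflow behaviour is not described in the text; flag it (assumption).
    if count >= 1 << OUT_BITS:
        raise OverflowError("count does not fit in the 11-bit output counter")
    return count


def conversion_window_s(f_o1_hz):
    """Duration of the PW pulse, set only by the Near-TSRO frequency."""
    return PW_CYCLES / f_o1_hz


# Hypothetical frequencies, chosen only to exercise the arithmetic.
print(ts_code(f_o1_hz=25.6e6, f_o2_hz=12.8e6))
print(conversion_window_s(25.6e6))
```

Because the window length depends only on $f_{o1}$ and the count depends only on $f_{o2}$ within that window, the output is a pure frequency ratio and needs no external time reference.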

#### 4.4 Simulation and Experimental Results

To verify the effectiveness and capabilities of the proposed temperature sensor with enhanced process variation immunity, it was designed with full-custom EDA tools and fabricated in a TSMC general purpose 65-nm one-poly ten-metal (1P10M) CMOS process. The impact of process and voltage variations on the proposed temperature sensor is also evaluated in this section. The area of the proposed sensor core is only $55\mu m \times 18\mu m$ without I/O pads, as shown in Figure 4.6. The proposed process invariant temperature sensor is composed of a near-threshold ring oscillator, a sub-threshold ring oscillator, a fixed pulse width generator, counters, and a control unit. A double guard ring surrounds the near-threshold and sub-threshold ring oscillators to shield them from interference by other circuits, at the cost of a slight area increase. Figure 4.5(a) shows that the digital output TS[10:0] remains almost the same across corners in post-layout simulation. The measurement error over 0°C~100°C is within -2.8°C~+3.0°C as shown in Figure 4.5(b), which demonstrates good process immunity for the proposed sensor. The effective resolutions of all test chips spread over $0.25^{\circ}$C.








Figure 4.6 Microphotograph of proposed process invariant temperature sensor

The proposed sensor shares I/O pads with other designs within the 0.94mm × 0.94mm chip. For measurement convenience, a PCB was designed as shown in Figure 4.7. Several regulator circuits are included to filter bouncing noise. In addition, a jumper and a switch select between the DC-DC converter and the temperature sensor. An SMA terminal is used to receive the *START* signal, because the sampling frequency is relatively high.



Figure 4.7 PCB board design.

The measurement environment was set up as shown in Figure 4.8. Before measuring each test chip, the programmable temperature and humidity chamber EZ040-72001 was set to 0°C and left for one hour so that the chamber temperature could stabilize. For the 0°C measurement, the CLK signal for the control unit of the test chip was generated by a pulse/function generator 8116A. Meanwhile, the START signal was issued to reset the test chip and activate the sensor conversion. After the counters of the test chip complete one operation, the RDY signal is asserted by the control unit of the test chip. The 11-bit digital output TS was then recorded by a logic analyzer 16900A. It is worth noting that the test chips were not fully packaged, so the bare die is visible, as shown in Figure 4.9. This setting helps stabilize the core temperature of the test chips during measurement. The sensor was measured in 5°C steps over the 0°C~100°C temperature range. A 0.5°C/min heating slope was set to raise the chamber temperature smoothly. Each measurement was recorded after holding the desired temperature point for 10 minutes.



Figure 4.8 Measurement environment for the test chips.



Figure 4.9 Bare die of the test chip on PCB board.

The supply voltage for the test chips is 0.4V. The measurement errors are $-1.81^{\circ}C \sim +1.52^{\circ}C$ for 12 test chips after one-point calibration, as shown in Figure 4.10. To ease chip realization, the one-point calibration was performed offline by linear curve fitting with the digital outputs at 80°C. The corresponding 3$\sigma$ inaccuracy is $-2.79^{\circ}C \sim +2.78^{\circ}C$. The average effective resolution of the test chips is measured to be 0.49°C/LSB. The average power consumption is 520nW at 0.4V supply voltage and 45k samples/sec conversion rate. The measurement results of the 12 test chips, shown in Figure 4.11, exhibit excellent linearity and demonstrate the ability of the proposed sensor to suppress the effect of process variation. To reveal the effect of voltage variation, the corresponding measurement errors are depicted in Figure 4.12 for 0.36V~0.44V (±10% supply voltage variation). The inaccuracy of temperature measurement under voltage variation is within $-6^{\circ}C \sim +8^{\circ}C$.



Figure 4.10 Measured error curves for 12 test chips.



Figure 4.11 Measured result curves for 12 test chips.



Figure 4.12 Measurement error curves for voltage variations.
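One-point calibration fixes a single per-chip offset while reusing a common slope. The following sketch illustrates the idea under stated assumptions: the slope is taken as the reported average effective resolution of 0.49°C/LSB, the calibration point is 80°C, and the TS codes in the example are hypothetical values, not measured data. The thesis performs the fit offline by linear curve fitting; this is only a minimal stand-in for that procedure.

```python
CAL_TEMP_C = 80.0          # calibration temperature used in the text
SLOPE_C_PER_LSB = 0.49     # reported average effective resolution


def make_converter(code_at_cal, slope=SLOPE_C_PER_LSB, cal_temp_c=CAL_TEMP_C):
    """Return a function mapping a TS code to temperature for one chip."""
    def code_to_temp(code):
        # Linear model anchored at the single calibration point.
        return cal_temp_c + (code - code_at_cal) * slope
    return code_to_temp


# Hypothetical chip whose output reads 1200 at 80 degC.
to_temp = make_converter(code_at_cal=1200)
print(to_temp(1200))   # calibration point itself
print(to_temp(1100))   # 100 LSB below the calibration code
```

Only `code_at_cal` differs from chip to chip in this model, which is why a single measurement per chip suffices when the slope is stable across process corners.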

In Table 4.1, the achieved performance of the proposed ultra-low voltage process invariant frequency-domain temperature sensor is compared with recent temperature sensors. The ultra-low voltage operation of the proposed sensor achieves an extremely low power consumption per conversion rate of only 11.6pJ/sample.

| Sensor    | Technology | Power          | Conv. Rate (sample/s) | Power / Conv. Rate (µJ/sample) | Resolution (°C) | Inaccuracy (°C) | Temp. Range (°C) |
|-----------|------------|----------------|-----------------------|--------------------------------|-----------------|-----------------|------------------|
| [4.1]     | 0.7µm      | 25µA@2.5V-5.5V | 10                    | 8.25                           | 0.025           | ±0.25 (3σ)      | -70~130          |
| [4.2]     | 0.7µm      | 75µA@2.5V-5.5V | 10                    | 24.75                          | 0.01            | ±0.1 (3σ)       | -55~125          |
| [4.3]     | 0.35µm     | 10µW@3.3V      | 10k                   | 0.001                          | 0.16            | -0.7~0.9        | 0~100            |
| [4.4]     | 0.35µm     | 36.7µW@3.3V    | 2                     | 18.35                          | 0.092           | -0.25~0.35      | 0~90             |
| [4.5]     | 0.13µm     | 1.2mW@1.2V     | 5k                    | 0.24                           | 0.66            | -1.8~2.3        | 0~100            |
| [4.6]     | 65nm       | 400µW@1.2V     | 366k                  | 0.0013                         | 0.043           | -2.9~2.75       | -40~110          |
| [4.7]     | 32nm       | 1.6mW@1.05V    | 1k                    | 1.6                            | 0.45            | <5              | -10~110          |
| This work | 65nm       | 520nW@0.4V     | 45k                   | 0.000012                       | 0.49            | -1.81~1.52      | 0~100            |

 Table 4.1 Performance Comparison of Recent Temperature Sensors.
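The energy-per-sample figure of merit used in the comparison is simply average power divided by conversion rate. The short sketch below reproduces that arithmetic from the power and conversion-rate values listed for this work and for two of the cited sensors; the function name is an assumption made for illustration.

```python
def energy_per_sample_j(power_w, rate_sps):
    """Energy per conversion in joules: average power / conversion rate."""
    if rate_sps <= 0:
        raise ValueError("conversion rate must be positive")
    return power_w / rate_sps


# Values reported for this work: 520 nW at 45k samples/s.
this_work = energy_per_sample_j(520e-9, 45e3)
print(f"{this_work * 1e12:.1f} pJ/sample")

# Same calculation for two table entries, [4.3] and [4.5].
print(f"{energy_per_sample_j(10e-6, 10e3) * 1e9:.2f} nJ/sample")
print(f"{energy_per_sample_j(1.2e-3, 5e3) * 1e6:.2f} uJ/sample")
```

The first line corresponds to the 11.6pJ/sample figure quoted for the proposed sensor.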


#### 4.5 Summary

A process invariant frequency-domain temperature sensor has been presented to enable on-chip temperature measurement. The sensor was designed to achieve ultra-low voltage operation. It is composed of two temperature sensitive ring oscillators (TSROs). One operates in the near-threshold region (Near-TSRO) and serves as the clock source of the proposed fixed pulse width generator. The other operates in the sub-threshold region (SB-TSRO) and serves as the clock source of the digital output counter. With a 2-input AND circuit, the digital output of the proposed temperature sensor is proportional to the ratio of the SB-TSRO frequency to the Near-TSRO frequency, $f_{o2}/f_{o1}$. Owing to the different conduction currents in the near- and sub-threshold regions, the effect of process variation on the proposed sensor is greatly suppressed. Meanwhile, $f_{o2}/f_{o1}$ increases linearly with temperature.

The realization in TSMC general purpose 65nm CMOS technology meets the target of 0.4V supply voltage operation over the temperature range of 0°C to 100°C. The area of the sensor core (without I/O pads) is only $990\mu m^2$. The power consumption per conversion rate is 11.6pJ/sample, which is a hundredfold improvement over previous work [4.4], [4.6]. All these characteristics make the proposed sensor especially applicable to energy-limited miniature portable platforms.

# Chapter 5 Temperature-Aware DRAM Refresh Controller in TSV 3D-IC

## **5.1 Introduction**

Through-silicon via (TSV) has emerged as a promising solution for building 3D stacked devices. It is a technology in which vertical interconnects are formed through the wafer to enable communication among the stacked chips [5.1], [5.2]. There are also other wafer level processing technologies to form 3D structures, including the single-crystal Si layer stacking method [5.3], [5.4]. TSV technology is believed to have the potential to open up many new horizons in the semiconductor industry in the near future, because it provides many benefits including high density, high bandwidth, low power, and small form factor [5.5], [5.6]. Also, as we near the limit of technology scaling, it is believed to be a promising solution to overcome the scaling limit.

Another possible application is "logic+memory" combination, where a single or multiple memories are directly stacked on top of a logic chip [5.7], [5.8]. Here, the logic chip and the memory can communicate through thousands of IOs allowing high-bandwidth with low power. Also heterogeneous integration circuits and 3D logic chip applications are expected to emerge in the future. In the former application, TSVs are used to interconnect logic, memory, analog, RF sensor and MEMS chips among others. In the latter one, a logic chip itself such as CPU, can be built 3-dimensionally [5.9]. Figure 5.1 is a conceptual schematic of a hyper-integrated 3D-IC combined with a contemporary flip chip package and heat sink technology.



Figure 5.1 3D circuit architecture connected to a conventional heat removal device [5.16].

However, for multi-level 3D-ICs, the high level of integration introduces thermal and self-heating problems, which result from increased power density. Although the power consumption of a die within a 3D-IC is expected to decrease due to the shorter interconnects, heat removal from a 3D-IC is much more difficult than from a 2D-IC. The reason is that the ambient environment of the die of a 2D-IC is the cooling material, whereas the ambient environment of a die within a 3D-IC may be another die that also generates heat. Therefore, the thermal issue of a 3D-IC is much more severe than that of a 2D-IC, and the circuits in a 3D-IC must operate adaptively according to the thermal condition of each layer.

This chapter proposes a temperature-aware refresh controller for the dynamic random access memory (DRAM) in the intra layer of 3D-ICs. Previous works on DRAM refresh mechanisms are also discussed. To analyze the data retention time accurately, a 1Kb DRAM block is built in the TSMC 65nm CMOS process. In addition, the process invariant frequency-domain temperature sensor proposed in chapter 4 is used to measure the DRAM block temperature and adaptively control the refresh frequency for DRAM thermal monitoring and power consumption control.

The rest of this chapter is organized as follows. The thermal issues and solutions in 3D DRAM are discussed in section 5.2. In section 5.3, the system architecture of the heterogeneous 3D integration is built. Next, the temperature-aware refresh controller of the DRAM layer in the 3D-IC is proposed in section 5.4. Simulation results of the proposed architecture are given in section 5.5. Finally, section 5.6 concludes this chapter.

## 5.2 Thermal Issues and Solutions in 3D-IC and DRAM Refresh

#### 5.2.1 Thermal Issues in 3D-IC

To study the thermal impact of hot spot size and power density on 3D stack design, thermal finite element simulations were performed in [5.10]. Two simulation setups were used. The fine-grain simulation of [5.11] takes into account the complete back-end-of-line (BEOL) and layout structure, whereas the FEM simulation of [5.12] uses simplified models with volume-averaged material properties. These finite element simulations were calibrated with a test structure that consists of heaters integrated with thermal sensors (diodes). Heaters with sizes of $50 \times 50 \ \mu\text{m}^2$ and $100 \times 100 \ \mu\text{m}^2$ are located in the metal 2 layer of the BEOL in the top tier of the 3D chip-stack, as well as in a 2D reference die. In both the top and the bottom die of the stack, a set of five diodes at different distances from the hot spot centre is integrated below the heater. This configuration of diodes allows the local temperature peak due to the hot spot power dissipation to be captured. The simulation results and experimental validation [5.13] (Figure 5.2) indicate that power dissipation in a 3D stacked structure causes a maximum temperature increase approximately three times higher than in the 2D reference case, requiring thermal-aware floor-planning to avoid thermal problems in the stack.



Figure 5.2 Temperature increase on the top die in a 3D chip-stack caused by a  $100 \times 100 \mu m^2$  hot spot is approximately three times higher than the temperature increase in a 2D SoC chip [5.10].

To implement the thermal-aware floor-planning in 3D stacks, a thermal compact model has been developed [5.14]. With this model, the temperature distribution is calculated in each die, using the power maps of the heat generation in each tier as input. This compact model allows studying the thermal interaction of heat sources in the 3D stack, both on the same die as well as on other levels of the stack. Furthermore, the compact model allows thermal optimization of the placement of the heat sources as a function of the geometrical and material properties of the interface and interconnects structures. Figure 5.3 shows the graphical interface of this thermal compact model.



Figure 5.3 Graphical interface of the thermal compact model for 3D stacked structures [5.10].

#### 5.2.2 The 3D-IC with Interlayer Cooling

In CMOSAIC [5.15], a multi-disciplinary team will jointly conduct experimental research, develop the necessary modeling tools, simulate 3D-IC stacks and test various prototype stacks to develop practical methods for heat removal in high performance 3D-ICs.

Figure 5.4 depicts a simplified schematic diagram of a 3D-IC with the chips assembled on top of each other and with vertical TSVs between layers. Microchannel cooling elements are etched into the lower face of each chip to remove the heat dissipated locally by each chip. Two different types of coolants will be evaluated for heat removal: a single-phase water based nano-fluid and an environmentally friendly, two-phase evaporating refrigerant. The temperatures within the 3D-IC system have to remain below 90°C during operation to avoid damage to the chip. The objective of the coolant is to maintain the chip's temperature at or below this value while dissipating heat fluxes per layer up to 100-150 W/cm<sup>2</sup> and targeting an inlet coolant temperature of 30-40°C.



Figure 5.4 Scheme of 3D-IC stack with microchannel [5.15].

Figure 5.5 summarizes the overall objective: to build a 3D-IC having more than three high power-density logic layers, with channels etched on the backside of the chips in between the TSVs that provide very large heat transfer coefficients for removal of 100-150 W/cm<sup>2</sup> per layer between $15 \times 15 \text{ mm}^2$ chips. The 3D-IC is embedded in a silicon case that provides the manifold structure for fluid input and output and that also allows external contact to a carrier using conventional C4 flip chip bonding.

Challenges to build such a system are huge and diverse, requiring development of the TSV etching and plating processes, the channel etching processes, the bonding processes between the layers, the sealing methods, the development of single-phase and two-phase compatible channel network designs, the integration of the chip stacks into a sealed case, the connection to the carrier, and a fluid delivery system.



Figure 5.5 3D-IC with TSVs and inter-layer cooling channels that is enclosed in a sealed manifold [5.15].

On the other hand, an analysis is performed in [5.16] to simulate 3D-IC cooling performance with microchannels fabricated between two silicon layers using deep reactive ion etching and wafer bonding techniques. Figure 5.6 illustrates four different 3D stack schemes for a given flow direction. To simulate nonuniform power distributions in practical 3D-ICs, the device is divided into logic circuitry and memory, where 90% of the total power is dissipated by the logic and 10% by the memory. This work assumes that the heat generation comes from the power dissipated in the junctions and from interconnect Joule heating. For case (a), the logic circuit occupies the whole of device layer 1, while the memory is on device layer 2. In the other cases, each layer is equally divided into memory and logic circuitry. For case (b), a high heat generation area is located near the inlet of the channels, while it is near the exit of the channels for case (c). Case (d) has a combined thermal condition in which layer 1 has high heat flux and layer 2 has low heat dissipation near the inlet. The total circuit area is $4 \text{ cm}^2$, while the total power generation is 150 W.



Figure 5.6 Two-layer 3D circuit layouts for evaluating the performance of microchannel cooling. The areas occupied by memory and logic are the same and the logic dissipates 90% of the total power consumption [5.16].

Figure 5.7 compares the thermal performance of the microchannels and conventional heat sinks and plots the predicted junction temperature distributions along the flow direction. In case of Figure 5.7 (a), the heat generation from each layer is uniform and the junction temperature profile with conventional heat sink is symmetric. The microchannel cooling has distinct characteristics of a nonuniform temperature distribution, even under a uniform heating condition. The temperature increases along the channel in the liquid phase region due to sensible heating, and decreases in the two-phase region due to decrease of the fluid saturation pressure along the channel.

The junction temperature peaks at the onset of boiling due to the dramatic change in convective heat transfer coefficient from the liquid-phase region to the two-phase region. The temperature difference between layers is reduced by more than 10°C using microchannels because of the small thermal resistance of direct heat removal from the layers.

In cases of (b) and (c), identical junction temperature distributions are presented for conventional fin heat sinks. Using microchannels, however, the temperature distribution is quite different, because of the convection nature of flow direction dependence. In both cases, the conventional heat sink presents highly nonuniform junction temperatures of about 25 and 45°C differences for layer 1 and layer 2, respectively, due to the concentrated heat flux. With microchannels, if more heat is applied to the upstream region, boiling occurs earlier resulting in increased pressure drop in the channel. Thus case (c) has a lower pressure drop, lower average junction temperature, and more uniform temperature field than case (b). In case (c), water is gradually heated up in the upstream region, where lower power dissipation is located, and downstream water boils and absorbs heat from the higher power region with low thermal resistance. Since the length of the two-phase region in case (c) is shorter than that in case (b), the overall junction temperature is lower due to a smaller pressure drop. An interesting result for case (c) is that the junction temperature distribution is quite uniform even with highly nonuniform power dissipation, which is one of the powerful merits of the two-phase microchannel cooling.

In case (d), the microchannel heat sink has almost the same pressure drop (26.3 kPa) as in case (a). In both cases, the flow has an identical wall heat rate from the silicon wall to the fluid and the channel fluid temperature profiles are almost identical. The junction temperature is determined by the heat flux and convective thermal resistance from the wall to the fluid. Layer 1 has a high temperature hump near the inlet due to high heat flux and low convective heat transfer coefficient in the single-phase region. The highest temperature in layer 2 is lower than that in layer 1, because of the convective nature of the flow direction dependence and high two-phase convective heat transfer. Except for the temperature hump of layer 1, the overall temperature profile with a microchannel heat sink is more uniform than that using the conventional fin heat sink. In all cases with conventional cooling, the temperature of layer 2 is always higher than that of layer 1 due to larger thermal resistance to the environment.





Figure 5.7 Comparison of junction temperatures in a two-layer stacked circuit for the cases of an integrated microchannel heat sink and a conventional heat sink. The total flow rate of the liquid water is 15 ml/min and the mass flux is  $1.36 \times 10^{-5}$  kg/s [5.16].

#### 5.2.3 Previous Works of DRAM Refresh Control

In low-power DRAMs, since the thermometer is only used during the self-refresh mode and the self-refresh current is very small, a single thermometer at any location can safely be used. Another concern is that the thermometer consumes a large current, including dc current in the analog circuits. This problem is solved by the control scheme shown in Figure 5.8 [5.17]. Figure 5.8(a) shows a self-refresh enable signal generating the burst refresh signal shown in Figure 5.8(b). The burst refresh signal starts 8K refresh cycles with a 1-$\mu$s refresh period (T<sub>1</sub>), shown in Figure 5.8(d), and the last refresh cycle stops the burst operation. The burst refresh operation at the beginning of self-refresh mode is required to initialize all the cell data to Vdd and Vss so that the cell refresh characteristics no longer depend on the previous data, which are largely lost through noisy read and write operations. When the burst refresh operation is finished, the thermometer is turned on and measures the temperature. The refresh operation is then executed with the refresh period $T_2$ determined by the measured temperature. The thermometer is turned on again when 8K refresh cycles are finished, and the process continues until the self-refresh mode ends. Since the temperature does not change abruptly, the non-temperature-measurement period T<sub>3</sub>, which is less than one second in this design, is short enough to follow temperature variation with an error much smaller than 1 °C. In summary, even though the thermometer draws as much as 2.4 mA during the temperature measurement period, the average current is less than 1 µA, since only one measurement cycle of 32 µs ($T_4$) is executed during the entire 8K refresh cycles.



Figure 5.8 Self-refresh and thermometer control scheme [5.17].
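The current saving of this scheme comes from duty cycling: the thermometer draws its full current only for the short measurement time and is off for the rest of the period. The sketch below computes the resulting time-averaged current. The 2.4 mA figure is from the text; the measurement time $T_4$ is taken as 32 µs and the idle period $T_3$ as 0.5 s, and both the microsecond unit and the $T_3$ value are assumptions for illustration.

```python
def average_current_a(i_active_a, t_active_s, t_idle_s, i_idle_a=0.0):
    """Time-averaged current of a block that is on for t_active_s, off for t_idle_s."""
    period = t_active_s + t_idle_s
    if period <= 0:
        raise ValueError("total period must be positive")
    return (i_active_a * t_active_s + i_idle_a * t_idle_s) / period


I_MEASURE_A = 2.4e-3   # thermometer current while measuring (from the text)
T4_S = 32e-6           # measurement time, ASSUMED to be 32 microseconds
T3_S = 0.5             # hypothetical non-measurement period (< 1 s per the text)

avg = average_current_a(I_MEASURE_A, T4_S, T3_S)
print(f"average thermometer current: {avg * 1e6:.3f} uA")
```

Under these assumptions the average stays well below 1 µA, which is the order of magnitude the control scheme is designed to reach.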

Figure 5.9 shows the concept of the self-refresh scheme with an on-chip thermometer proposed in [5.17]. The self-refresh scheme consists of a conventional self-refresh circuit, thermometer, temperature comparator, fuse boxes, refresh period generator, and DQ interface. The thermometer block consists of a temperature sensor that drives an analog-to-digital converter (ADC) to generate a digital representation of the on-chip temperature, which is stored in the registers. The thermometer registers act as a counter and have fuse options to compensate for $\pm 8$ °C of temperature offset due to errors in the sensor. A temperature comparator compares the measured temperature with the reference temperatures in a reference table. There are eight reference temperatures and eight 6-bit fuse boxes, so that the measured temperatures can be arranged in 10°C increments between 20°C and 90°C. Once the temperature range is known, the 6-bit fuse box corresponding to the temperature is selected. The fuse boxes are preset to certain refresh times based on extensive test results or programmed through a predetermined conversion table of refresh times. The output of the fuse box gives the refresh time at that temperature, and a proper refresh period corresponding to this refresh time is chosen from the refresh period generator. The measured on-chip temperature in the temperature registers can be read out to the DQ pads by a special test mode command.



Figure 5.9 Block diagram of a self-refresh scheme [5.17].
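The comparator-and-fuse-box selection described above amounts to a range lookup. The sketch below illustrates it with the eight reference temperatures (20°C to 90°C in 10°C steps) and the ±8°C offset compensation from the text. The selection rule (pick the lowest reference at or above the compensated temperature) and the function name are assumptions; the actual contents of the fuse boxes are not given in the text and are therefore not modeled.

```python
import bisect

# Eight reference temperatures in 10 degC steps between 20 and 90 degC.
REFERENCE_TEMPS_C = [20, 30, 40, 50, 60, 70, 80, 90]
MAX_OFFSET_C = 8   # fuse-trimmable sensor offset range, +/-8 degC


def select_fuse_box(measured_temp_c, offset_c=0):
    """Index (0..7) of the fuse box for the compensated temperature.

    Assumption: the lowest reference at or above the temperature is chosen,
    so the refresh period is never longer than the hotter bound allows.
    """
    if abs(offset_c) > MAX_OFFSET_C:
        raise ValueError("offset outside the +/-8 degC compensation range")
    temp = measured_temp_c + offset_c
    idx = bisect.bisect_left(REFERENCE_TEMPS_C, temp)
    return min(idx, len(REFERENCE_TEMPS_C) - 1)


for t in (15, 25, 60, 95):
    box = select_fuse_box(t)
    print(t, "->", "fuse box", box, "reference", REFERENCE_TEMPS_C[box])
```

Rounding up to the hotter reference is the conservative choice for data retention, since retention time shortens as temperature rises.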

Figure 5.10 depicts the circuit implementation of the self-refresh control scheme for a mobile DRAM with a temperature sensor in [5.18]. It is composed of a ring oscillator having 5-stage inverters, a level converter for full-swing signal recovery, a set of resistors and capacitors for controlling the propagation delay, and a temperature sensor for measuring on-chip temperature. After input signal Refresh\_EN becomes active, oscillator output SELF\_OSC starts oscillation with a period decided by the latencies of inverter stages. For setting a proper oscillation frequency of the oscillator under a given temperature, control signals P4~P0 and N4~N0 from the temperature sensor adjust the conductive distance between the power supply and the active devices. SELF\_OSC is then fed into the counter to generate command signal SELF\_REF for periodically invoking self-refresh operations in the DRAM core. With this configuration, the circuit can effectively change the timing period between consecutive self-refresh operations to reliably retain DRAM cell data based on on-chip temperature. For this circuit to operate reliably, an on-chip sensor for measuring the temperature needs to be designed. To achieve low cost, it must have a moderate resolution, occupy as small silicon area as possible, and consume the lowest power.



Figure 5.10 Block diagram and timing diagram for the self-refresh period control with temperature sensor [5.18].
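The relation between stage delay and self-refresh interval in such a scheme can be sketched as follows. The model uses the standard ring oscillator relation (period equals twice the number of stages times the stage delay) and assumes the counter simply divides SELF\_OSC by a fixed ratio to produce SELF\_REF. The stage delays and the divide ratio are hypothetical numbers, not values from [5.18].

```python
def ring_osc_period_s(stage_delay_s, stages=5):
    """Period of an inverter ring: a transition travels the loop twice."""
    if stages % 2 == 0:
        raise ValueError("an inverter ring needs an odd number of stages")
    return 2 * stages * stage_delay_s


def self_refresh_period_s(stage_delay_s, divide_by, stages=5):
    """Interval between SELF_REF pulses when a counter divides SELF_OSC."""
    return ring_osc_period_s(stage_delay_s, stages) * divide_by


# Hypothetical stage delays: a longer delay stretches the refresh interval.
for label, delay in (("short delay", 100e-9), ("long delay", 400e-9)):
    print(label, self_refresh_period_s(delay, divide_by=16))
```

In this picture, the control signals from the temperature sensor act on the stage delay, so the refresh interval scales in direct proportion to the delay they select.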

The ring oscillator is generally a preferred structure in DRAMs for the implementation of important building blocks such as counters and pumping circuits. The oscillation frequency of a ring oscillator controlled by a temperature sensitive bias current can be used effectively to monitor the temperature of a chip. The CMOS temperature sensor proposed in [5.18] utilizes the temperature dependency of poly resistance to generate a temperature dependent bias current, and a set of ring oscillators to convert this bias current to a digital code. Using this temperature sensor, a temperature-driven, low-cost, small-area DRAM self-refresh control scheme with ultra-low standby power consumption can be implemented.

## **5.3 Heterogeneous 3D Integration**

Slow cache memory systems and low memory bandwidth present a major bottleneck in the performance of modern microprocessors. As TSV 3D integration matures, Jacob *et al.* discussed the advantages of moving the memory hierarchy to independent tiers on multi-core processors to mitigate the memory wall effects [5.20]. Such an architecture requires multiple wide structures that are feasible only with 3D chip stacking using ultra-small, dense vertical TSVs. To demonstrate the benefits discussed above, a heterogeneous 3D integration of a processor-memory stack is built. The heterogeneous integration consists of a 16-core processor tier, an SRAM tier, a DRAM tier, and a front-end circuit tier, as shown in Figure 5.11.

The multi-core processor with L1 cache is located on the bottom layer. A 1Mb SRAM-based L2 cache, which is smaller but faster, is sandwiched between the processor chip and the main memory chip. A 128Mb DRAM chip, which holds more data but operates more slowly, is stacked on top of the L2 cache. Such a 3D structure with multiple levels of memory hierarchy alleviates the memory wall problem and increases the throughput of the multi-core processor. Additionally, for wireless communication with the antenna and the digital baseband, a front-end RF/analog module is integrated into the processor-memory stack. Because the multi-core processor and the memory hierarchy form an inseparable architecture, the front-end circuit can only be stacked on the top stratum.

On the other hand, the largest VDD/VSS noise occurs in refresh modes. When a device draws a large amount of current at once, VDD and GND bounce can occur, which can result in functional failures. The noise peak is determined by the combined effect of the current variation rate within a given time (di/dt), the package inductance (L), and the current-resistance (IR) voltage drop. Power noise is a bigger concern in the DRAM layer of a 3D-IC than in a normal DRAM. In [5.19], TSVs are placed on the chip edge to reduce 3D power noise, and ground bounce is reduced significantly. As a result, we place the TSVs in the same way.

The system floor plan of the proposed 3D stack architecture is shown in Figure 5.11 and Figure 5.12. The voltage regulation module (VRM) contains a voltage regulator and a DC-DC converter. The power TSVs deliver supply power to the VRM, which regulates the voltage and converts the high voltage to a lower voltage to supply each layer.



Figure 5.11 Heterogeneous integration of multi-core, SRAM, DRAM, front-end circuits stacking.



Figure 5.12 Detailed floor planning of each layer.

In the first and second layer partitions, the SRAM chip is divided into 16 cores, and each SRAM core is connected to a CPU core by a group of I/Os. This system has two advantages. First, each pair of cores (i.e., a CPU core and an SRAM core) can be operated independently by using dedicated I/O. Thanks to TSV technology, each CPU core can have a dedicated SRAM core that can be accessed independently, because the number of I/O pins is increased and the pins can be placed anywhere on the chip. This memory architecture greatly increases the flexibility of memory system design. The second advantage is an ultra-high-speed interface. Small TSVs are used to integrate thousands of I/O pins on a chip, and interconnect parasitics between stacked chips are greatly reduced. These features make a high-speed interface between chips possible.

The third and fourth layer partitions are the DRAM and the RF/analog circuits, respectively. The DRAM is used as main memory for a system with limited memory capacity. The 128 Mb DRAM is divided into two memory cell arrays, and the two identical arrays are placed symmetrically on each side of the DRAM stratum. The peripheral circuits (decoders and sense amplifiers) are placed in the middle of the chip for optimal distribution of control signals to all the arrays in this tier. The data and addresses are then routed between the DRAM peripheral circuits and the L2 cache through signal TSVs. To monitor temperature variation effectively and reduce refresh power, the 128 Mb DRAM is divided into several sub-blocks equipped with the proposed temperature-aware refresh controller. The temperature-aware refresh controller will be discussed in section 5.4.

## 5.4 Temperature-Aware Refresh Controller of DRAM Layer

### 5.4.1 Data Retention Time Analysis



Figure 5.13 Two steps of signal amplification. (a) The selected WL turns on and charge is shared between $C_{cell}$ and $C_{BL}$. (b) The sense amplifier amplifies the small voltage swing on the BL.

For the DRAM refresh operation, signal amplification in the sense amplifiers takes two steps, as shown in Figure 5.13. First, the selected word line (WL) turns on, so the stored charge is shared between the cell capacitance $C_{cell}$ and the bit-line parasitic capacitance $C_{BL}$. As a result of this charge sharing, the signal voltage developed on the floating data line (bit line) can be expressed by the following equations:

$$V_{BL} = \frac{C_{cell} \times V_{SN} + C_{BL} \times \left(V_{DD} / 2\right)}{C_{cell} + C_{BL}} \tag{5.1}$$

$$\Delta V_{BL} = \left| V_{BL} - V_{DD} / 2 \right| \tag{5.2}$$

$$V_{BL} = \Delta V_{BL} ("1") + V_{DD} / 2 \tag{5.3}$$

where $V_{BL}$ is the bit-line voltage, $V_{SN}$ is the storage-node voltage in the DRAM cell, and $\Delta V_{BL}("1")$ is the voltage difference on the bit line after reading out a cell that stores "1". Because a cell storing "1" is the one most severely affected by leakage current, we discuss only that case. Solving (5.1) for $V_{SN}$ and then substituting (5.3) gives:

$$\begin{aligned} V_{SN} &= V_{BL} \left(1 + \frac{C_{BL}}{C_{cell}}\right) - \frac{C_{BL}}{C_{cell}} \left(\frac{V_{DD}}{2}\right) \\ &= \left(1 + \frac{C_{BL}}{C_{cell}}\right) \Delta V_{BL} ("1") + \frac{V_{DD}}{2} \end{aligned} \tag{5.4}$$

In this work, we implement a 1-Kb DRAM in the TSMC 65nm 1P9M CMOS process. The DRAM cell uses a MIM capacitor in the logic general-purpose process with a 1.2V supply voltage. The 1T-1C cell and a folded-array sense amplifier are used to construct the block, which is organized as 128 bits per bit line and 8 bits per word line. From simulation, $C_{cell}$ and $C_{BL}$ are approximately 5.75fF and 17.57fF, respectively. Substituting these values into (5.4) yields

$$V_{SN(\min)}("1") = 4.056 \times \Delta V_{BL}("1") + 0.6 \tag{5.5}$$

where $V_{SN(min)}("1")$ is the minimum storage-node voltage at which a stored "1" can still be read safely.
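As a quick numerical check of (5.4) and (5.5), the short Python sketch below plugs in the simulated capacitances and the 1.2V supply quoted above. The function names are illustrative and not part of the thesis, and the sample swings of 80mV to 140mV are illustrative inputs.

```python
# Capacitances (simulation values from the text) and the 1.2 V supply.
C_CELL_F = 5.75e-15
C_BL_F = 17.57e-15
VDD = 1.2


def bitline_voltage(v_sn, c_cell=C_CELL_F, c_bl=C_BL_F, vdd=VDD):
    """Eq. (5.1): bit-line voltage after charge sharing."""
    return (c_cell * v_sn + c_bl * vdd / 2) / (c_cell + c_bl)


def min_storage_voltage(delta_v_bl, c_cell=C_CELL_F, c_bl=C_BL_F, vdd=VDD):
    """Eq. (5.4): storage-node voltage needed for a given bit-line swing."""
    return (1 + c_bl / c_cell) * delta_v_bl + vdd / 2


if __name__ == "__main__":
    print(f"1 + C_BL/C_cell = {1 + C_BL_F / C_CELL_F:.3f}")
    for dv_mv in (80, 100, 120, 140):
        v_sn = min_storage_voltage(dv_mv / 1000)
        # Feeding V_SN back through (5.1) should recover the requested swing.
        swing_mv = (bitline_voltage(v_sn) - VDD / 2) * 1000
        print(f"dV_BL = {dv_mv} mV -> V_SN(min) = {v_sn:.3f} V "
              f"(round trip: {swing_mv:.1f} mV)")
```

The round trip through (5.1) confirms that (5.4) is just the charge-sharing relation solved for the storage-node voltage.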

To obtain the data retention time, we measure the time it takes $V_{SN}$ to decrease from $V_{DD}$ to $V_{SN(min)}("1")$. As a result, the data retention time varies with the sensitivity of the sense amplifier: the larger the required $\Delta V_{BL}$, the shorter the measured data retention time. The worst case and best case occur at the FF corner and the SS corner, respectively, so we measure the data retention time at the FF corner. The simulation result is shown in Figure 5.14. The sense amplifier operates normally when $\Delta V_{BL}$ is in the range of 80mV to 140mV. However, the refresh period still needs to be confirmed, so we run a Monte Carlo simulation at $\Delta V_{BL}$=120mV to check the distribution of the data retention time. As shown in Figure 5.15, 5k trials are simulated at each of 125°C, 75°C, 25°C and -25°C. The data retention time is distributed between 10µs and 45µs, and we design the overall system according to this result. The circuits are discussed in detail in Section 5.4.2.
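The trend that a larger required $\Delta V_{BL}$ shortens the retention time can be illustrated with a deliberately simple model. The Python sketch below assumes a constant leakage current discharging $C_{cell}$. This is an assumption made here for illustration and is not the simulation setup of this work; the leakage value `I_LEAK_A` is hypothetical, and real leakage depends strongly on voltage, temperature and process corner.

```python
C_CELL_F = 5.75e-15   # cell capacitance from the text
C_BL_F = 17.57e-15    # bit-line capacitance from the text
VDD = 1.2             # supply voltage from the text
I_LEAK_A = 50e-12     # HYPOTHETICAL constant leakage current, illustration only


def min_storage_voltage(delta_v_bl):
    """Eq. (5.4) evaluated with the capacitances above."""
    return (1 + C_BL_F / C_CELL_F) * delta_v_bl + VDD / 2


def retention_time(delta_v_bl, i_leak=I_LEAK_A):
    """Time for V_SN to fall from VDD to V_SN(min) under constant leakage."""
    droop = VDD - min_storage_voltage(delta_v_bl)
    if droop <= 0:
        return 0.0  # the required swing cannot be met even by a fresh cell
    return C_CELL_F * droop / i_leak


if __name__ == "__main__":
    for dv_mv in (80, 100, 120, 140):
        t = retention_time(dv_mv / 1000)
        print(f"dV_BL = {dv_mv} mV -> retention ~ {t * 1e6:.2f} us (toy model)")
```

In this toy model the allowed voltage droop shrinks linearly as the required swing grows, so the retention time falls with it. The absolute values carry no meaning beyond the assumed leakage current.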



Figure 5.14 The data retention time of DRAM cell in different sensitivity of sense amplifier.



Figure 5.15 Monte-Carlo simulation of data retention time when $\Delta V_{BL}$ = 120mV.



To achieve a small self-refresh current, the DRAM requires an on-chip thermometer together with a self-refresh scheme. When a thermometer is implemented in a memory chip, many factors must be considered, including its number, location, accuracy, area, and power consumption. Among these, the number, location, area penalty, and power consumption are the main items restricting the use of thermometers in memory chips. The number and location are especially important for chips with high power consumption. Moreover, timing control is needed to refresh the DRAM cells at the right time, so a refresh period generator is required. An on-chip, low-power, small-area temperature sensor is therefore needed for DRAM thermal monitoring and power control.

As shown in Figure 5.16(a), the DRAM array is separated into several sub-blocks, which share temperature sensors with neighboring blocks. The temperature sensors are located at the corners of each sub-block, as shown in Figure 5.16(b). In this way, the fewest temperature sensors monitor the largest DRAM array region. In addition, the VRM, composed of a voltage regulator and a DC-DC converter, provides the various voltages in this layer. The higher supply voltage (VDD), in the range of 1.2V to 1.5V, is provided by the regulator. The lower supply voltage (VDDL), in the range of 0.4V to 1.2V, is provided by the DC-DC converter.

The temperature sensor is placed in the scheme to monitor the sub-block temperature. After measuring the temperature, the sensor sends the result to the temperature mapping table. Because the proposed temperature sensor operates at *VDDL* (0.4V), a level shifter is needed to shift its output to the DRAM operating voltage, *VDD* (1.2V). The mapping table converts the digital output of the temperature sensor into the *Ctrl* [1:0] signal that controls the refresh CLK generator. The refresh CLK generator divides *CLK*<sub>IN</sub> into slower frequencies according to *Ctrl* [1:0], as shown in Table 5.1, and sends the divided clock to the refresh counter. The refresh counter output drives the row decoder to select which row is refreshed. Each word line is refreshed once every 128 *CLK*<sub>REF</sub> periods.

| Temperature(°C) | Ctrl [1:0] | CLK <sub>REF</sub> (MHz) |
|-----------------|------------|--------------------------|
| 100-75          | 11         | 20                       |
| 75-50           | 10         | 10                       |
| 50-25           | 01         | 5                        |
| 25-0            | 00         | 2.5                      |

Table 5.1 The relation between control signal and refresh frequency
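The mapping of Table 5.1, together with the rule that each word line is refreshed once every 128 CLK<sub>REF</sub> periods, can be sketched in a few lines of Python. This is a behavioral illustration only, not the mapping-table circuit itself. The function name and the choice to assign a boundary temperature (for example exactly 75°C) to the hotter, faster-refresh bin are assumptions.

```python
CLK_IN_MHZ = 20.0        # temperature-independent input clock from the text
PERIODS_PER_ROW = 128    # each word line is refreshed every 128 CLK_REF periods

# (lower bound in deg C, Ctrl[1:0], divide ratio), following Table 5.1
MAPPING = [
    (75, 0b11, 1),
    (50, 0b10, 2),
    (25, 0b01, 4),
    (0, 0b00, 8),
]


def refresh_setting(temp_c):
    """Return (ctrl, CLK_REF in MHz, per-word-line refresh interval in us)."""
    if not 0 <= temp_c <= 100:
        raise ValueError("Table 5.1 only covers 0 to 100 deg C")
    for lower, ctrl, divide in MAPPING:
        if temp_c >= lower:
            clk_ref = CLK_IN_MHZ / divide
            # periods divided by MHz gives microseconds
            return ctrl, clk_ref, PERIODS_PER_ROW / clk_ref
    raise AssertionError("unreachable for temperatures in range")


if __name__ == "__main__":
    for t in (90, 60, 30, 10):
        ctrl, f_mhz, interval_us = refresh_setting(t)
        print(f"{t:3d} C -> Ctrl={ctrl:02b}, CLK_REF={f_mhz} MHz, "
              f"refresh interval={interval_us} us")
```

Because the interval scales inversely with CLK<sub>REF</sub>, each step down in the temperature range doubles the time between refreshes of the same word line.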



Figure 5.16 (a) The DRAM layer in 3D-IC. (b) The refresh controller of DRAM sub-block.

The refresh CLK generator shown in Figure 5.17(a) can divide the input frequency by 2, 4 or 8, depending on how many flip-flops are in the loop. For example, when Ctrl [1:0] equals 10, the clock loop propagates through only one flip-flop, so the output frequency is CLK<sub>IN</sub> divided by 2. Table 5.1 lists the relation between Ctrl [1:0] and the CLK<sub>REF</sub> frequency. The 20MHz CLK<sub>IN</sub> is provided by a stable, temperature-independent clock source. In each $CLK_{REF}$ period, the sense amplifier controller shown in Figure 5.17(b) sends the SAN, SAP and PRE signals to control the sense amplifier. The SAN signal swings between ground and VDD/2, and the SAP signal swings between VDD/2 and VDD. A delay line made of an inverter chain sets the refresh timing so that the sense amplifier operates normally and cell data is not corrupted.



Figure 5.17 (a) Refresh CLK generator. (b) The sense amplifier control circuit.

### 5.4.3 Simulation Results

The proposed temperature-aware refresh controller for the DRAM layer of the 3D-IC is implemented in TSMC 65nm 1P9M CMOS technology. To verify the design, we implement a 1-Kb DRAM block. For convenience, this 1-Kb block is used to analyze the standby power of a 2Mb DRAM in the third layer. The operating waveform of the sub-block system is shown in Figure 5.18, where $V_{SN00}$, $V_{SN10}$ and $V_{SN20}$ are the storage-node voltages of the first bit cell on WL<sub>0</sub>, WL<sub>1</sub> and WL<sub>2</sub>, respectively.



Figure 5.18 The operation waveform of sub-block system.

Figure 5.19 shows the standby power analysis at 25°C, 50°C, 75°C and 100°C. The standby power is dominated by the 2Mb array and increases with temperature. The power overhead, which includes the temperature sensor, mapping table, sense amplifier controller and refresh clock generator, is not significant: it is about 26% at 25°C and 15.39% at 100°C.



Figure 5.19 Standby power analysis of 2Mb DRAM.

The power reduction is shown in Figure 5.20. The curve without the controller uses a fixed refresh period based on the data retention time at 100°C. The other curve uses the proposed refresh control scheme with a variable refresh period and achieves up to 67.67% standby power reduction compared with the design without the controller. The proposed temperature-aware DRAM refresh controller therefore reduces standby power significantly.



Figure 5.20 Standby power reduction of 2Mb DRAM.

## 5.5 Summary



This chapter first discusses thermal issues in 3D-ICs and some solutions, along with some conventional DRAM refresh approaches. Next, the heterogeneous architecture containing CPU, SRAM, DRAM and analog circuits is presented. To prevent hot spots within a layer and reduce DRAM refresh power, we propose a refresh controller that uses the process-invariant temperature sensor of Chapter 4. Thanks to the tiny power consumption of the temperature sensor, the controller reduces standby power by up to 67.67% without much power overhead.

# Chapter 6 Conclusions and Future Work

## 6.1 Conclusions

Advanced CMOS processes make it possible to integrate many designs into a single chip. As a result, areas of the chip with high switching activity can generate a localized high-temperature region called a "hotspot." In system-in-package designs with 3D integrated-circuit technology or stacked dies, the situation becomes even worse. In this thesis, we focus on temperature sensor design to address hotspots and thermal issues in 3D integrated circuits. To achieve ultra-low power, we propose two types of temperature sensor that operate at ultra-low voltage: 0.5V~0.25V PVT sensors with adaptive voltage selection and a 0.4V fully integrated process-invariant temperature sensor.

The 0.5V~0.25V PVT sensors with adaptive voltage selection comprise process, voltage and temperature sensors. The process and voltage (PV) sensor continuously monitors process and voltage variation and supplies this information for automatic temperature compensation. The sensors operate over an ultra-low supply voltage range of 0.25V to 0.5V. The power consumption is $2.3\mu$W at a 0.25V supply voltage and a 50k samples/sec conversion rate. These characteristics make the proposed sensor especially applicable to energy-harvesting miniature portable platforms.

Next, an ultra-low-voltage, fully integrated, process-invariant frequency-based temperature sensor is proposed. The effect of process variation is significantly reduced. The implementation meets the target of operating at a 0.4V supply voltage over the temperature range of 0°C to 100°C. The area of the sensor core (without I/O pads) is only 990µm<sup>2</sup>. The energy consumed per conversion is 11.6µJ/sample. The high area and energy efficiency make the proposed sensor applicable to energy-limited miniature portable platforms.

Finally, a heterogeneous 3D integration containing CPU, SRAM, DRAM and analog circuits is presented. To prevent hot spots within a layer and reduce DRAM refresh power, we propose a refresh controller that uses the process-invariant temperature sensor. Thanks to the tiny power consumption of the temperature sensor, the controller reduces standby power by up to 67.67% without much power overhead.


## 6.2 Future Work

Wireless medical micro-sensors usually have two operating modes, *low-power mode* and *performance mode*, because the main characteristics of cardiac signals are well known. Sensor nodes spend more than 99% of their operating time in *low-power mode*, recording various physiological signals throughout their lifetime, and less than 1% in *performance mode*, processing and transmitting real-time cardiovascular parameters to a host. In this *low-power-mode*-dominated scenario, total energy consumption can be reduced further by applying dynamic voltage and frequency scaling (DVFS). The benefit of DVFS comes from the quadratic savings in active $CV_{DD}^2 f$ power.
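To make the quadratic dependence concrete, the following Python sketch evaluates $CV_{DD}^2 f$ relative to a reference operating point. The 0.5V and 0.25V supply values come from the sensor operating range in this thesis. The frequency ratio in the second case is a hypothetical example, the function names are illustrative, and leakage power is ignored.

```python
def active_power(c_eff_f, vdd_v, freq_hz):
    """Active (switching) power C * VDD^2 * f, in watts."""
    return c_eff_f * vdd_v ** 2 * freq_hz


def relative_power(vdd_v, freq_hz, vdd_ref_v, freq_ref_hz):
    """Active power relative to a reference point; C cancels out."""
    return (vdd_v / vdd_ref_v) ** 2 * (freq_hz / freq_ref_hz)


if __name__ == "__main__":
    # Voltage scaling only: 0.5 V -> 0.25 V at the same clock frequency.
    print(relative_power(0.25, 1e6, 0.5, 1e6))
    # Voltage and frequency scaled together (hypothetical halving of f).
    print(relative_power(0.25, 0.5e6, 0.5, 1e6))
```

Halving the supply voltage alone cuts the active power to one quarter, and any accompanying frequency reduction multiplies that saving linearly.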

The proposed 0.5V~0.25V PVT sensors can be used in a DVFS system operating in the sub/near-threshold region. Figure 6.1 shows such a system, which is composed of two switched-capacitor (SC) DC-DC converters, decoupling capacitors (DeCaps), the proposed clock generator, level shifters (LS), a DVFS controller, the PVT sensors, a supply switch, and a near/sub-threshold 8T SRAM-based FIFO. The PVT sensors measure process, voltage and temperature variation. The DVFS controller uses this information to switch the supply voltage and scale the operating frequency.



Figure 6.1 Sub/near-threshold DVFS system.

The process-invariant temperature sensor can be used to monitor temperature variation in 3D ICs. Its process-invariant property makes it especially suitable for sensing temperature in different layers without much loss of accuracy. The conceptual image of heterogeneous 3D integration is shown in Figure 6.2. Based on the discussion in Section 5.2.2, the thermal issue is taken into account. Microchannel cooling elements are placed between the faces of adjacent chips to remove the heat dissipated locally by each chip. Cold fluid is injected into the microchannels, and the heat of the chips is carried away as the fluid warms. If the fluid flow can be controlled using the proposed temperature sensor, the temperature in the 3D-IC can be kept as stable as possible. In this way, the power consumption of the system would be reduced significantly.




# Vita

# 林上圓 Shang-Yuan Lin

### PERSONAL INFORMATION

Birth Date: Aug. 12, 1987

Birth Place: Kaohsiung, TAIWAN.

E-Mail Address: exad7758@gmail.com

#### **EDUCATION**

07/2009 – 09/2011 M.S. in Electronics Engineering, National Chiao Tung University

Thesis: Ultra-low Dynamic Voltage Scaling Frequency-Ratio-Based PVT Sensor Design and Applications

09/2005 – 06/2009 B.S. in Engineering Science, National Cheng Kung University.

#### PUBLICATIONS

Ming-Hung Chang, Jung-Yi Wu, Wei-Chih Hsieh, Shang-Yuan Lin, You-Wei Liang, and Wei Hwang, "High Efficiency Power Management System for Solar Energy Harvesting Applications," Asia Pacific Conference on Circuits and Systems, Dec. 2010.

#### PATENTS

Shang-Yuan Lin, Shi-Wen Chen, Ming-Hung Chang, Wei Hwang, and Kun-Ru Cai, "Fully On-Chip All Digital Process Invariant Temperature Sensor," US/TW Patent Pending.