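# National Chiao Tung University

# Department of Electronics Engineering & Institute of Electronics, Master's Program

Master's Thesis

Power Integrity for TSV 3D Integration

Advisor: Prof. Wei Hwang

Student: Po-Jen Yang

September 2011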

## Power Integrity for TSV 3D Integration

Advisor: Prof. Wei Hwang

Student: Po-Jen Yang



Submitted to Department of Electronics Engineering & Institute of Electronics, College of Electrical Engineering and Computer Engineering, National Chiao Tung University, in partial fulfillment of the requirements for the degree of Master in Electronics Engineering

September 2011

Hsinchu, Taiwan, Republic of China


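### Power Integrity for TSV 3D Integration

Student: Po-Jen Yang

Advisor: Prof. Wei Hwang

Department of Electronics Engineering & Institute of Electronics, National Chiao Tung University

### Abstract (Chinese version)

This thesis proposes a hierarchical power delivery system for through-silicon-via (TSV) three-dimensional integrated circuits. The system incorporates several noise suppression techniques to provide circuits with a stable supply. The proposed hierarchical power delivery uses voltage regulator modules to separate the global and local power delivery networks. This hierarchical architecture reduces the circuits' demand for decoupling capacitors and provides more flexible voltage sources. In addition, to obtain good noise suppression on the global and local power delivery networks, we adopt an active switched decoupling capacitor circuit and an adaptively biased low-dropout regulator, respectively. For substrate-propagated noise and TSV coupling noise, we also propose an effective substrate noise suppression technique to achieve better power delivery quality. Furthermore, we propose a design methodology for TSV placement and count that plans the TSV diameter and number for minimum area under a reasonable voltage drop.

We use a heterogeneous TSV 3D integration system as a case study for power integrity. Simulation results show that the proposed hierarchical power delivery system suppresses 71.10% of the noise on the power delivery network at a cost of only 1.11% additional power consumption. Such a hierarchical power delivery system, which offers high power delivery quality without requiring major changes to the design architecture, is believed to be of considerable help to heterogeneous TSV 3D integration.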

**Power Integrity for TSV 3D Integration** 

Student: Po-Jen Yang

Advisor: Prof. Wei Hwang

Department of Electronics Engineering & Institute of Electronics

National Chiao Tung University

**ABSTRACT** 

In this thesis, a hierarchical power delivery system is proposed for the power integrity of through-silicon-via (TSV) 3D integrations using various noise reduction techniques. The proposed hierarchical power delivery system decouples the global power network from the local power networks, which both reduces the required decoupling capacitors (DECAPs) and provides flexible power sources. To further reduce power noise in the global and local power networks, active switching DECAPs and adaptively biased low-dropout regulators are adopted as the global regulator and the local regulators, respectively. Additionally, a substrate noise suppression technique is presented to enhance power integrity by reducing both substrate and TSV coupling noises. Moreover, a design methodology for area-efficient power TSV planning is proposed to optimize area occupancy and voltage drop performance.

The simulation results of a heterogeneous TSV 3D integration demonstrate that, with the proposed hierarchical power delivery system, the noise on power supply pairs (VDD + GND) is suppressed by up to 71.10% with only 1.11% power overhead. Therefore, the proposed hierarchical power delivery system is very useful for the power integrity of heterogeneous integration in TSV 3D-ICs.


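## Acknowledgments

I am deeply grateful to my advisor, Prof. Wei Hwang, for his careful guidance and instruction. He spared no effort in correcting my research direction, shaping my thinking, and providing information. I benefited greatly from this, and it made the completion of this master's thesis possible. I offer him my sincerest thanks.

I also thank the oral examination committee members, Prof. 莊景德 and Prof. 陳冠能, for their suggestions and corrections, which made this thesis more complete and rigorous.

I further thank my senior labmates 黃柏蒼, 謝維致, 張銘宏, 楊皓義, and 林天鴻 for their help and advice on my research, and for patiently pointing out and correcting my oversights.

I also thank my classmates 杜威宏, 陳建亨, and 林上圓 for the academic exchange and mutual encouragement during my studies at National Chiao Tung University, which made my two years as a master's student fulfilling and colorful.

Finally, I dedicate this thesis to my dearest parents and my girlfriend. I thank my parents for their painstaking upbringing and their constant care and support, and my girlfriend for her long-standing support and encouragement, which allowed me to concentrate on my studies and research and bring this thesis to completion. I wish to share this achievement with my family.

> Po-Jen Yang, Institute of Electronics, National Chiao Tung University, September 2011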

# **Contents**

- Chapter 1 Introduction
  - 1.1 Motivation
  - 1.2 Research Goals and Major Contributions
  - 1.3 Organization
- Chapter 2 Overview of 3D Integration Technologies
  - 2.1 Why 3D?
  - 2.2 Categories of 3D Integration Technology
  - 2.3 Key Technologies of TSV 3D Integration
  - 2.3.1 Stacking Approach
  - 2.3.2 Stacking Orientation
  - 2.3.3 Bonding Methods
  - 2.3.4 Wafer Types
  - 2.3.5 TSV Formation
  - 2.3.6 Categories of TSV Scheme
  - 2.5 Power Delivery in 3D-ICs
  - 2.5.1 The Basics of Power Delivery
  - 2.5.2 3D-IC Power Delivery: Modeling and Challenges
  - 2.5.3 Design Techniques for Controlling Power Delivery Network Noise
- Chapter 3 Hierarchical Power Delivery System and Area-Efficient Power TSV Planning for TSV 3D Integrations
  - 3.1 Hierarchical Power Delivery System for TSV 3D-ICs
  - 3.2 Power Design Flow of Hierarchical Power Delivery System
  - 3.2.1 Layer Order Planning and Power Domain Partition
  - 3.2.2 Current Density Analyses
  - 3.2.3 Intra-Layer Power Network Design and Vertical Global Power Network Planning
  - 3.2.4 Substrate Noise Cancellation
  - 3.3 Area-Efficient Power TSV Planning
  - 3.3.1 Modeling for TSV 3D Integration
  - 3.3.1.1 Physical and Electrical Modeling of TSV
  - 3.3.1.2 Closed-Form Expression of TSV
  - 3.3.2 Power Grids Noise Estimation
  - 3.3.3 Power TSV Structure
  - 3.3.3.1 Area Function of Power TSV Structure
  - 3.3.3.2 Parasitic Impedance Computation
  - 3.3.4 Power Noise Estimation of TSV 3D Integrations
  - 3.3.5 Design Methodology for Area-Efficient TSV Planning
  - 3.4 Active Decoupling Capacitor for Supply Noise Regulation of TSV 3D Integration
  - 3.4.1 Power Noise Suppression of 3D Integrity
  - 3.4.1.1 Switched DECAPs
  - 3.4.1.2 Low Pass Filter
  - 3.4.1.3 Latch-Based Comparator
  - 3.4.1.4 Charge Pump with Improving Body Effect
  - 3.4.2 Simulation Results
  - 3.5 Summary
- Chapter 4 Intra-Layer Power Delivery Network and Voltage Regulation Analysis
  - 4.1 Review of Low Dropout Voltage Regulator
  - 4.2 Wide Bandwidth Variable Output Voltage Regulator
  - 4.2.1 Variable Output Voltage Regulator with Adaptively Biasing Technique
  - 4.2.2 Stability Analysis
  - 4.2.3 Simulation Results
  - 4.3.1 Optimum Sizing of Power Grids for IR Drop
  - 4.3.2 P/G Grids Analysis of LdI/dt Drop on Power Grids
  - 4.3.3 Power Delivery Network Modeling
  - 4.4 On-Chip Power Distribution Network Analysis
  - 4.5 Summary
- Chapter 5 Substrate Noise Suppression for Power Integrity of TSV 3D Integration
  - 5.1 Substrate Noise Reduction Techniques
  - 5.1.1 Noise Cancelling Technique Using Power di/dt Detector
  - 5.1.2 Active Substrate Noise Canceller with Decoupling Amplifier
  - 5.2 Substrate Noise Analysis and Modeling in TSV 3D Integration
  - 5.3 Active Substrate Decoupler (ASD) Design
  - 5.4 ASD Placing for Noise Suppression
  - 5.4.1 Mixed-Signal Layer in TSV 3D Integrations
  - 5.4.2 Separated Analog Layer in TSV 3D Integrations
  - 5.4.3 ASD Placing in TSV 3D Integrations
  - 5.5 Summary
- Chapter 6 Power Integrity for Heterogeneous 3D Integration (Case Study)
  - 6.1 Various 3D Chips Stacking
  - 6.2 Heterogeneous 3D Integration of a Processor Memory Stack
  - 6.2.1 A Prototype System of Processor Memory Stack
  - 6.2.2 Architecture of the Prototype System
  - 6.3 Power Delivery for the Processor Memory Stack
  - 6.3.1 Hierarchical Power Delivery System
  - 6.3.2 Power Delivery Model and Current Profiling Model
  - 6.3.3 TSV Planning
  - 6.4 Simulation Results
  - 6.5 Summary
- Chapter 7 Conclusion and Future Work
  - 7.1 Conclusion
  - 7.2 Future Work
- References



# **List of Figures**

- Fig. 1.1. 3D integrations achieve performance, form factor and cost requirement for future ICs [1.1].
- Fig. 1.2. Hierarchical distributed power delivery architecture.
- Fig. 1.3. Research goals and major contributions.
- Fig. 2.1. Conceptual view of a 3D stacked SoC [2.1].
- Fig. 2.2. Different approaches for combining logic and memory [2.1].
- Fig. 2.3. 3D System from ICs and 3D ICs [2.2].
- Fig. 2.4. Schematic illustration of various 3D integration technologies [2.4].
- Fig. 2.5. Advanced packing trends [2.3].
- Fig. 2.6. Enable technology of TSV 3D integration [2.3].
- Fig. 2.7. Stacking approaches of wafer-to-wafer integration and die-to-wafer integration.
- Fig. 2.8. (a) Face-to-face stacking. (b) Face-to-back stacking [2.10].
- Fig. 2.9. Wafer bonding techniques for wafer-level 3D integration [2.10].
- Fig. 2.10. Wafer selection.
- Fig. 2.11. Fabrication of TSV [2.3].
- Fig. 2.12. 3D TSV technologies as function of TSV diameter and aspect ratio and classification in three categories, 3D-SIC intermediate and global and 3D WLP bondpad with their key attributes [2.11].
- Fig. 2.13. (a) Conventional power delivery architecture. (b) On-chip power grid [2.15].
- Fig. 2.14. (a) Simulation of supply noise spectrum. (b) Measurement results of supply noise [2.17].
- Fig. 2.15. Distributed model for 3D IC [2.18]. (a) Division of the power grid into independent cells. (b) A model for one such cell.
- Fig. 2.16. (a) Cross section of 3D FD-SOI process. (b) Simplified via resistance model aligned with a cross-sectional SEM photograph [2.20].
- Fig. 2.17. (a) Simplified PSN models for comparing impedance response in 2D and 3D. (b) Impedance response comparison between 2D and 3D. (c) Impedance response of the three tiers in a 3D IC [2.19].
- Fig. 2.18. Insertion of a DC–DC converter near the load [2.21].
- Fig. 2.19. Z-axis power delivery based on monolithic power conversion [2.15].
- Fig. 2.20. The (a) conventional and (b) multistory power delivery schemes [2.19].
- Fig. 2.21. Tapered Stacked (TAP) 3D Configuration for improving IR drop and Ldi/dt droop [2.23].
- Fig. 3.1. A conceptual 3D-IC stacking with TSV connection [3.1].
- Fig. 3.2. The hierarchical power delivery system.
- Fig. 3.3. Simulation results of effective supply voltage versus DECAP size.
- Fig. 3.4. Block diagram of proposed hierarchical power delivery system.
- Fig. 3.5. Power design flow of hierarchical power delivery system.
- Fig. 3.6. Electrical model of TSV considering the coupling terms [3.9].
- Fig. 3.7. (a) Sample 3X3 power grid of orthogonal connection. (b) TSV physical model [3.10].
- Fig. 3.8. Equivalent circuit model of power grid [3.11].
- Fig. 3.9. A long strip power TSV structure.
- Fig. 3.10. Power noise estimation of multi-layer TSV structure.
- Fig. 3.11. Flowchart of area-efficient power TSV planning.
- Fig. 3.12. An example of area-efficient power TSV planning.
- Fig. 3.13. Power Integrity for TSV 3D Integration.
- Fig. 3.14. Architecture of the Noise Suppression Technique.
- Fig. 3.15. Resonant noise suppression using switched DECAPs.
- Fig. 3.16. Latch-based comparator with High-Vt device.
- Fig. 3.17. Modified Dickson charge pump.
- Fig. 3.18. Noise suppressions of the active and passive DECAPs for (a) high performance IC (b) TSV 3D integration.
- Fig. 3.19. Layout view of the noise suppression circuit.
- Fig. 4.1. Conventional analog style linear regulator [4.5].
- Fig. 4.2. The concept of adaptively biased regulator.
- Fig. 4.3. Schematic of adaptively biased regulator.
- Fig. 4.4. Resistor-string voltage divider.
- Fig. 4.5. Typical structure of a low-dropout regulator with an intermediate buffer stage [4.4].
- Fig. 4.6. Source-follower implementation of the intermediate buffer stage [4.4].
- Fig. 4.7. Frequency response of classical LDO.
- Fig. 4.8. Simulation waveform of output state changing.
- Fig. 4.9. 0.9V output under different process and temperature conditions.
- Fig. 4.10. Unit gain frequency improvement of adaptive current biasing.
- Fig. 4.11. Current efficiency and quiescent current with adaptive biasing technique.
- Fig. 4.12. Simulation waveform of load transient response.
- Fig. 4.13. Simulation waveform of line transient response.
- Fig. 4.14. Frequency response at light load operation.
- Fig. 4.15. Frequency response at heavy load operation.
- Fig. 4.16. Power supply rejection of the adaptively biased regulator.
- Fig. 4.17. Power grids of (a) odd power lines (b) even power lines (c) reduced odd power lines model, and (d) reduced even power lines model [4.16].
- Fig. 4.18. Three designs of power distribution grids. (a) interdigitated power grid, (b) single paired power grids, and (c) multi-paired power grid [4.17].
- Fig. 4.19. (a) P/G mesh and (b) RL model network.
- Fig. 4.20. The unit resistance with different power line pitch and width.
- Fig. 4.21. Different voltage supply scenarios. (a) scenario 1: direct voltage connection from outside power TSV bundles, (b) scenario 2: voltage regulators are placed out of the PDN as supply sources. (c) scenario 3: voltage regulators are placed inside the PDN as supply sources.
- Fig. 4.22. Maximum voltage drop of the 4<sup>th</sup> PDN with different voltage supply scenarios while (a) the pitch of power lines is 200 µm, and (b) the pitch of power lines is 100 µm.
- Fig. 4.23. Effective supply voltage for the power distribution grid. (a) scenario 1: direct voltage connection from outside power TSV bundles. (b) scenario 2: voltage regulators are placed out of the PDN as supply sources. (c) scenario 3: voltage regulators are placed inside the PDN as supply sources.
- Fig. 5.1. Substrate noise canceller using di/dt detector [5.6].
- Fig. 5.2. Active decoupling amplifier circuits [5.9].
- Fig. 5.3. Noise coupling calculation results [5.9].
- Fig. 5.4. Two propagation paths of substrate noises in 3D ICs.
- Fig. 5.5. Schematic of active substrate decoupler (ASD).
- Fig. 5.6. Block diagram of mixed-signal circuit in TSV 3-D integration.
- Fig. 5.7. Noise suppression effect of ASD planning for mixed-signal circuit in 3D structure.
- Fig. 5.8. Noise suppression effect on mixed-signal layer.
- Fig. 5.9. TSV in 3-D integration. (a) Analog circuit on the top layer. (b) Analog circuit on the bottom layer.
- Fig. 5.10. Noise comparison with different analog layers.
- Fig. 5.11. Noise suppression effect. (a) Analog circuit on the top layer. (b) Analog circuit on the bottom layer.
- Fig. 6.1. A chip-stacked memory using 3D packing technology [6.1].
- Fig. 6.2. A high-speed, low-power 3D-SRAM architecture [6.2].
- Fig. 6.3. 3D integrated SRAM with TFLOP processor [6.3].
- Fig. 6.4. Floorplan of the 3D processor stack combining CPU and L1 cache on the bottom tier with three tiers of L2 cache stacked on top of it [6.4].
- Fig. 6.5. Heterogeneous integration of multi-core, SRAM, DRAM, front-end circuits stacking.
- Fig. 6.6. Multi-core processor that features core-to-core connection with TSVs.
- Fig. 6.7. A 128Mb DRAM stratum as a main memory for multi-core processor.
- Fig. 6.8. Hierarchical power delivery system applied to the processor memory stack.
- Fig. 6.9. Power delivery network model.
- Fig. 6.10. Current load model.
- Fig. 6.11. Simulation waveforms of voltage performance while using active DECAPs (the blue line) and that without active DECAPs (the grey line).
- Fig. 6.12. Simulation waveforms of voltage performance while using local voltage regulators (the blue line) and that connecting to power TSV directly (the grey line).
- Fig. 6.13. Simulation waveforms of voltage performance while using ASDs to suppress substrate noises (the blue line) and that without ASDs (the grey line).
- Fig. 6.14. Simulation waveforms of voltage performance while using the hierarchical power delivery system (the blue line) and that without hierarchical power delivery system (the grey line).
- Fig. 6.15. Noise reductions of each power supply pair while (a) with active DECAPs only, (b) with active DECAPs and voltage regulators, (c) with active DECAPs, voltage regulators, and ASDs.
- Fig. 6.16. Effective supply voltages across the 3D structure. (a) the effective supply voltages for processor (1.0V), and (b) the effective supply voltages for front-end circuits (1.2V).
- Fig. 6.17. Power overhead breakdown of each power component.
- Fig. 7.1. Hierarchical power delivery system for wide voltage range heterogeneous integrations.
- Fig. 7.2. Temperature-power management diagram.

# **List of Tables**

- Table 2.1. Comparison of bonding methods.
- Table 3.1. Curve fitting parameter of effective inductance.
- Table 3.2. Curve fitting parameter of effective resistance.
- Table 3.3. Threshold size of TSV diameter.
- Table 3.4. Comparisons of active DECAPs.
- Table 4.1. Design parameters of regulator.
- Table 4.2. Comparison with previous works.
- Table 4.3. Comparison of voltage drop performance.
- Table 5.1. ASD placing under different TSV 3D structure.
- Table 6.1. Supply voltages of each power domain.
- Table 6.2. Parameters of current load model.
- Table 6.3. Parameters and results of TSV planning.



## Chapter 1

## Introduction

Moore's law describes a long-term trend in the history of integrated circuit technology, in which the number of transistors that can be placed inexpensively on an integrated circuit has doubled approximately every two years. However, Moore's law will ultimately hit a wall as lithography becomes limited by the wavelength of light. Hence, three-dimensional (3D) integration is regarded as a way to keep pace with the performance improvement projected by Moore's law.

Among the different 3D technologies, through-silicon via (TSV) has the potential to achieve the greatest interconnect density, but at the greatest cost. TSV 3D integration also provides substantial advantages for future generations of ICs: a small form factor, improved system performance, reduced power consumption, and flexible heterogeneous integration [1.1], as shown in Fig. 1.1. Therefore, TSV 3D integration is recognized as a trend for future ICs.



Fig. 1.1. 3D integrations achieve performance, form factor and cost requirement for future ICs [1.1].

#### 1.1 Motivation

Although 3D ICs offer many advantages over 2D ICs, many challenges must be overcome before volume production of TSV-based 3D ICs. For example, with advanced 3D IC technologies, the average wire length can be decreased by a factor of $N^{1/2}$, where $N$ is the number of stacked strata in the 3D chip [1.2]. The wire resistance and capacitance drop proportionally. As a result, power consumption drops by a factor of $N^{1/2}$. However, the power density per unit area in a 3D stacked chip increases by a factor of $N^{1/2}$ due to the reduced footprint [1.3]. Moreover, for the same circuitry, the reduced footprint of the 3D die effectively increases the package parasitics: since the ratios of the number of supply pins and bonding wires to the supply current are reduced, the role of the package resistance and inductance grows. The increased power density and increased package parasitics in 3D integration lead to worse IR drop noise than in 2D ICs, because of the fewer supply pins and the additional resistance from TSVs [1.4].
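The scaling claims above are consistent with one another, as a short derivation shows. This is only a sketch under an assumption that the text implies but does not state: stacking $N$ strata divides the footprint of a 2D design by $N$. The symbols $P$ and $A$ (power and footprint area of the 2D design) are introduced here for illustration.

$$
P_{3D} \approx \frac{P}{N^{1/2}}, \qquad A_{3D} \approx \frac{A}{N}, \qquad \frac{P_{3D}}{A_{3D}} \approx \frac{P / N^{1/2}}{A / N} = N^{1/2}\,\frac{P}{A}.
$$

For instance, with $N = 4$ strata, total power would roughly halve while the power density per unit footprint area would roughly double.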

Generally, increasing the TSV cross-sectional area and density lowers the impedance of the power delivery network and thereby mitigates the IR drop noise. However, increasing the TSV dimensions and density reduces the routable area of the stacked dies. In addition, more decoupling capacitors (DECAPs) are required to suppress the Ldi/dt noise as the load current varies. The use of on-chip passive DECAPs is limited by two major constraints: a large gate tunneling leakage and a large area occupation [1.5]. In view of these constraints, robust power delivery is one of the critical challenges in 3D chips.
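The trade-off between IR drop and TSV area described above can be illustrated with a first-order calculation. The sketch below is not the TSV model developed in this thesis. It assumes each power TSV is a uniform copper cylinder with DC resistance $R = \rho h / (\pi r^2)$, that identical TSVs share the current equally in parallel, and that coupling and inductance are ignored. The function names and the dimensions (50 µm height, 5 µm diameter, 1 A load) are hypothetical values chosen for illustration.

```python
import math

# Bulk copper resistivity at room temperature, in ohm*m.
COPPER_RESISTIVITY = 1.68e-8


def tsv_resistance(height_m, diameter_m, resistivity=COPPER_RESISTIVITY):
    """DC resistance of one cylindrical TSV: R = rho * h / (pi * r^2)."""
    radius = diameter_m / 2.0
    return resistivity * height_m / (math.pi * radius ** 2)


def ir_drop(current_a, height_m, diameter_m, n_tsv):
    """IR drop across n identical TSVs in parallel sharing the current equally."""
    return current_a * tsv_resistance(height_m, diameter_m) / n_tsv


def tsv_area(diameter_m, n_tsv):
    """Total silicon area occupied by the TSV cross sections (no keep-out zone)."""
    return n_tsv * math.pi * (diameter_m / 2.0) ** 2


if __name__ == "__main__":
    # Hypothetical example: 1 A load, 50 um tall, 5 um diameter TSVs.
    for n in (10, 20, 40):
        drop_mv = ir_drop(1.0, 50e-6, 5e-6, n) * 1e3
        area_um2 = tsv_area(5e-6, n) * 1e12
        print(f"{n:3d} TSVs: IR drop = {drop_mv:.2f} mV, area = {area_um2:.0f} um^2")
```

Under these assumptions, doubling the number of TSVs halves the IR drop but doubles the occupied area, which is the tension that a TSV planning methodology has to resolve.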

In this thesis, power integrity for TSV 3D integration is investigated. To enhance the quality of power delivery, a hierarchical power delivery architecture for TSV 3D ICs is proposed. As shown in Fig. 1.2, the central concept is that the global and local power networks are decoupled. Power domains are defined on the local power networks, and each power domain is powered by a dedicated voltage regulator module at its requested voltage. Since the TSVs in the proposed decoupled power structure do not supply circuits directly, the stability constraint on the global supply can be relaxed, and the DECAPs required to stabilize the global power supply can be greatly reduced. The hierarchical power delivery structure also incorporates several voltage stabilization techniques. Active switching DECAPs are adopted as a global regulator to suppress the resonant noise caused by package parasitics, whereas wide-bandwidth voltage regulators are used as local regulators to provide a clean supply voltage for each power domain. Active substrate decouplers deal with the coupling noises propagated through the shared substrate and TSVs. In addition, a design methodology for area-efficient power TSV planning is proposed to find the best trade-off between area occupancy and voltage drop performance. These techniques are adopted in place of simple decoupling capacitors to achieve better power noise suppression in both the global and local power networks. Such a hierarchical power delivery system is believed to be very useful for heterogeneous integration in 3D IC chips.



Fig. 1.2. Hierarchical distributed power delivery architecture.

#### 1.2 Research Goals and Major Contributions

The research goal of this thesis is to enhance the power integrity of TSV 3D integrations. To this end, we study robust power delivery systems and noise reduction designs. The research goals and major contributions of this work can be summarized as a brief power design flow for power integrity, as shown in Fig. 1.3. We introduce a power delivery system and develop a corresponding power design flow. To deal with coupling noises and improve the quality of power delivery, several noise reduction techniques are combined into the power delivery system, including active DECAPs, active substrate decouplers, adaptively biased voltage regulators, and power TSV optimization.



Fig. 1.3. Research goals and major contributions.

The major contributions are listed as follows.

1. A new concept of hierarchical power delivery architecture and its corresponding power design flow for TSV 3D integrations are introduced (in chapter 3).
2. A design methodology for area-efficient power TSV planning is proposed to find the best trade-off between area occupancy and voltage drop performance (in chapter 3).
3. A wide-band variable output voltage regulator with an adaptive biasing technique is proposed to improve transient performance at heavy load and keep the quiescent current low at light load (in chapter 4).
4. To explore the voltage fluctuations in the entire system, power delivery network analyses considering the placement of voltage regulator modules and the size of the power delivery grid are also investigated (in chapter 4).
5. A substrate noise suppression technique is presented for the power integrity of TSV 3D-ICs, considering both substrate and TSV coupling noises. To further improve noise reduction, the placement of active substrate decouplers (ASDs) is also presented for different 3D structures (in chapter 5).
6. A case study for power integrity of heterogeneous 3D integration is simulated using current profiling models. In this case study, power integrity with the proposed hierarchical power delivery system and with a general power delivery structure are analyzed and compared (in chapter 6).

### 1.3 Organization

The rest of this thesis is organized as follows. An overview of 3D integration technology is given in chapter 2. In section 2.1, we present the advantages and evaluation of 3D integration. Different kinds of fabrication for 3D integration are introduced in section 2.2. The key technologies of TSV 3D integration are discussed in section 2.3. Although 3D ICs offer many advantages over 2D ICs, many challenges must be overcome before volume production of TSV 3D ICs, and these challenges are introduced in section 2.4. In section 2.5, we describe power delivery in 3D ICs.

The power delivery system and power TSV planning for TSV 3D integration are presented in chapter 3. In this chapter, the hierarchical power delivery system for multiple-supply heterogeneous 3D integration is described first. Its corresponding power construction flow is developed in section 3.2. To obtain the best trade-off between area occupancy and voltage drop performance, an area-efficient power TSV planning is proposed in section 3.3. In addition, the active switching DECAPs for supply noise suppression in TSV 3D integration are introduced in section 3.4.

The intra-layer power delivery network design and noise regulation analysis are presented in chapter 4. In this chapter, variants of low-dropout voltage regulators in the literature are described first. Subsequently, we propose a wide-bandwidth variable output voltage regulator with an adaptive biasing technique in section 4.2. To further explore the voltage fluctuations within the entire plane, a method of power/ground grid construction is introduced in section 4.3. Finally, power delivery network analyses considering the placement of voltage regulator modules and the size of the power delivery grid are investigated in section 4.4.

The substrate noise suppression for power integrity of TSV 3D integration is presented in chapter 5. In this chapter, the widely used substrate noise reduction techniques are described first. Subsequently, the modeling of the TSV 3D structure considering the coupling substrate is introduced in section 5.2. In section 5.3, we describe the active substrate decoupler (ASD) design. To further reduce the coupling noises, ASD placement for noise suppression under different 3D structures is presented in section 5.4.

A case study of power integrity for heterogeneous 3D integration is investigated in chapter 6. In this chapter, variants of 3D chip stacking are described first. Subsequently, the heterogeneous 3D integration of a processor memory stack is built in section 6.2. In section 6.3, the techniques presented in chapters 3-5 are combined into a hierarchical power delivery system that provides multiple low-noise power supplies to the processor memory stack. The simulation results of the case study are shown in section 6.4. Finally, we conclude the thesis and describe future work in chapter 7.



## Chapter 2

## **Overview of 3D Integration Technologies**

#### 2.1 Why 3D?

As the semiconductor roadmap strides on, packaging and interconnection technologies are required to follow. To keep pace with system demands on scaling, performance and functionality, 3D integration is gaining a lot of interest as a solution [2.1]. The reasons and requirements for 3D integration are, however, very diverse and often application specific.

A basic reason for 3D integration is system-size reduction. Traditional assembly technologies are based on 2D planar architectures. Die are individually packaged and interconnected on a planar interconnect substrate, mainly printed circuit boards. The area-packaging efficiency (ratio of die area to package area) of individually packaged die is generally rather low (e.g. a 5x5 mm die in a 7x7 mm package: 50% area efficiency), and an additional spacing between components on the board is typically required, further reducing the area efficiency (for the example above with a 1 mm clearance: 30% area efficiency). If we consider the volumetric packaging density, the packaging efficiency drops to very low levels. If, in the previous example, we consider the active layer of a die to be about $10~\mu m$ thick and the combined package and board thickness to be 2 mm, the volumetric packaging density is only 0.15%. There is clearly room for improvement of the packaging density.
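The arithmetic behind these figures can be reproduced with a short sketch. The function names are illustrative, and the sketch assumes square outlines and a clearance applied on every side of the package, which is one reading of the example that yields roughly the quoted 30%.

```python
# Illustrative sketch: reproduces the packaging-efficiency arithmetic above.
# Assumptions (not stated in the text): square die/package outlines, and the
# 1 mm board clearance applied on every side of the package.

def area_efficiency(die_mm, footprint_mm):
    """Ratio of die area to footprint area for square outlines."""
    return (die_mm ** 2) / (footprint_mm ** 2)

def volumetric_density(area_eff, active_thickness_um, total_thickness_mm):
    """Area efficiency scaled by the active-to-total thickness ratio."""
    return area_eff * (active_thickness_um * 1e-3) / total_thickness_mm

die = 5.0        # die edge in mm
package = 7.0    # package edge in mm
clearance = 1.0  # board clearance per side in mm (assumed)

pkg_eff = area_efficiency(die, package)                    # 25/49, about 0.51
board_eff = area_efficiency(die, package + 2 * clearance)  # 25/81, about 0.31
vol = volumetric_density(0.30, 10.0, 2.0)                  # 0.0015, i.e. 0.15%

print(f"package area efficiency: {pkg_eff:.3f}")
print(f"board area efficiency:   {board_eff:.3f}")
print(f"volumetric density:      {vol:.4%}")
```

The text rounds the first two values to 50% and 30%, and the volumetric figure uses the rounded 30%.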

A different reason for looking at 3D integration is performance driven. Interconnects in a 3D assembly are potentially much shorter than in a 2D configuration, allowing for a higher operating speed and smaller power consumption.

This is of particular interest for advanced computing applications. Due to rising on-chip clock speeds, a signal can travel only a limited distance in a synchronous operating mode. Using 3D-IC stacking techniques, more circuits may be packed into a single synchronous region. This requires 3D interconnects with low parasitics; in particular, low capacitance and inductance are needed to avoid additional signal delay. The interconnection of circuit elements can be performed at several levels of the on-chip hierarchy. Of particular interest is 3D stacking at the so-called "tile level". As shown in Fig. 2.1, typical system-on-chip (SOC) devices are constructed from a number of functional blocks, or tiles. The longest on-chip lines are those used to interconnect these tiles. These lines are typically in the top on-chip interconnect layers and are referred to as "global" interconnects in the on-chip wiring hierarchy. Within the tiles, the "local" and "intermediate" wiring hierarchy levels are mainly used. In a 3D approach, the functional tiles of the large die are rearranged on a number of smaller, vertically interconnected die, and the 3D interconnects serve as much shorter "global" interconnects between the tiles on the different die. As this interconnect goes one or more levels below the traditional IC-pad level, a very high 3D interconnect density is required for such an application.



Fig. 2.1. Conceptual view of a 3D stacked SoC [2.1].

A third, and maybe most important, reason to consider 3D integration is so-called hetero-integration. As silicon semiconductor technologies continue to scale (vertical scaling), the realization of true SOC devices with a large variety of functional blocks becomes very difficult to achieve. Technologies need specific optimization for logic, analog, memory etc. to reach the desired performance levels and circuit density.

Furthermore, the substrates used to build active devices may vary significantly between technologies, including non-silicon substrates, e.g. compound semiconductors. Systems may also contain other planar components, such as MEMS and integrated passive devices. Besides the 'vertical' scaling, we are also experiencing a 'horizontal' scaling. Realizing the full system on a single SOC die is becoming increasingly difficult and is often not economically justified. If, however, a high-density 3D technology is available, a "3D-SOC" device could be manufactured, consisting of a stack of heterogeneous devices. This device would be smaller, lower power and higher performance than a monolithic SOC approach. Such an approach is the obvious choice for many sensor-array applications. Many sensor applications, such as IR and X-ray sensing, use particular substrate materials that are incompatible with Si-CMOS processing. These applications nevertheless require high-density circuits to read out the signals from individual sensor pixels, a requirement best met with advanced CMOS technologies. The solution therefore consists of flip-chip (3D) mounting the sensor array on a read-out electronics chip. Another possible application of this approach is the combination of logic and memory, shown in Fig. 2.2. The left panel shows a 2D interconnect between separate logic and memory die, the center panel shows the present combined logic and memory device (2D-SOC), and the right panel shows "heterogeneous 3D-SOC" stacking of a memory and a logic device with 3D interconnects between individual logic tiles and memory banks.



Fig. 2.2. Different approaches for combining logic and memory [2.1].

Most applications require a combination of logic and memory. When large amounts of memory are needed, the memory is realized as a separate die, using a high density, optimized memory technology. Due to the use of large busses on the logic and memory die and the use of off-chip interconnects, only a relatively slow and power-hungry interconnect between memory and logic is possible. To overcome these limitations, e.g. for real-time data processing applications, a SOC approach is typically used. Although not optimal for the integration of high-density memory, the IC logic technology is used for integrating large amounts of memory. This allows for allocating smaller pieces of memory (memory-banks) to specific logic blocks. Distance between logic and memory is short, resulting in the required performance.

The integrated memory is, however, not of the same performance as dedicated memory technologies would offer. In particular, a much larger die area is consumed by the memory cells, resulting in a die area that is significantly larger than in the two-die solution. 3D interconnect technology may solve this problem by allowing logic 'tiles' on a first die to directly access memory banks on a memory chip. In this case the number of 3D connections required from the memory die to the logic die will increase by an order of magnitude compared to the I/O count of standard memory devices.

A new electronics era has begun to emerge, the focus of which is on 3D ICs instead of monolithic integration of heterogeneous functions. While the impact of this approach is profound, it addresses a small part of the system. Therefore, another paradigm shift is illustrated in Fig. 2.3. 3D systems are leading to unparalleled miniaturization, functionality and cost at system level [2.2].



Fig. 2.3. 3D System from ICs and 3D ICs [2.2].

To conclude, there are different motivations for the development of 3D IC solutions:

- *Form factor*: 3D integration increases density and achieves the highest capacity-to-volume ratio.
- *Increased electrical performance*: Shorter interconnect lengths improve device speed, and better electrical insulation reduces electrical parasitics in RF applications.
- *Heterogeneous integration*: Different functions can be integrated in a 3D IC (RF + memory + logic + sensors + imagers + different substrate materials + ...).
- *Cost*: 3D integration may be cheaper than continuing to shrink 2D design rules following the ITRS roadmap and Moore's law.

#### 2.2 Categories of 3D Integration Technology

3D integration is generally defined as fabrication of stacked and vertically interconnected device layers. The large spectrum of 3D integration technologies can be reasonably classified mainly in three categories [2.3]-[2.9]:

- 1. Stacking of packages and Die stacking (without TSVs)
- 2. TSV technology
- 3. Monolithic 3D

Fig. 2.4 is a representative schematic illustration of the 3D integration technologies that have been proposed to date, grouped into three categories. The first category consists of 3D stacking technologies that do not utilize TSVs and is shown in Fig. 2.4 (a)-(c). The second category consists of 3D integration technologies that require TSVs (Fig. 2.4 (d)-(e)). The third category consists of monolithic 3D systems (Fig. 2.4 (f)) that make use of semiconductor recrystallization to form active levels that are vertically stacked (possibly with on-chip interconnects in between). Of course, a combination of all these technologies is possible.



Fig. 2.4. Schematic illustration of various 3D integration technologies [2.4].

#### A. Stacking of packages and Die stacking (without TSVs)

The non-TSV 3D systems span a wide range of integration methodologies [2.4], [2.5]. Fig. 2.4 (a) illustrates stacking of fully packaged dice. Although this approach is low cost, the simplest to adopt and the fastest to market, and offers a modest form-factor reduction, the overhead in interconnect length and the low density of interconnects between the two die do not allow the advantages of 3D integration to be fully exploited. Fig. 2.4 (b) illustrates the most common method for stacking memory die, which is based on wire bonds. This 3D technology is suitable only for low-power and low-frequency chips because of the adverse effects of wire bond length, low density, and peripheral-limited pad locations on signaling and power delivery.

On the other hand, Fig. 2.4 (c) illustrates the use of wireless signal interconnection between different levels using inductive coupling (capacitive coupling is also possible, but more limiting). This approach is quite elegant for low-power chips that require high-data rate signaling (without the need for TSVs). Power delivery, however, requires use of wire bonds for top dice in the stack, which are not applicable for high-performance/power chips. There are several derivatives to the topologies described above, such as the die embedded in polymer approach. This approach, although different from others discussed, makes use of a redistribution layer and vias through the polymer film, and thus is a hybrid die/package level solution. It is important to note that all non-TSV approaches rely on stacking at the die/package level (die-on-wafer possible for inductive coupling and wire bond) and thus do not utilize wafer-scale bonding. This may serve to impose limits on economic gains from 3D integration due to cost of the serial assembly process.

#### B. TSV technology

Fig. 2.4 (d)-(e) illustrate 3D integration based on TSVs. Fig. 2.4 (d) illustrates bonding of dice with C4 bumps and TSVs. The short interconnect lengths and high interconnect density that this approach offers are important: it enables a number of interconnects that is several orders of magnitude larger. Although it is possible to bond at the wafer level, this approach is most suitable for die-level bonding (using a flip-chip bonder) and thus faces some of the same economic issues described above. Fig. 2.4 (e) illustrates 3D stacking based on thin-film bonding (metal-metal or dielectric-dielectric). Not only are solder bumps eliminated in this approach, but increased interconnect density and tighter alignment accuracy can also be achieved compared to the previous approach, because it is based on wafer-scale bonding and thus uses semiconductor-based alignment and manufacturing techniques.

#### C. Monolithic 3D

Finally, Fig. 2.4(f) illustrates a purely semiconductor manufacturing (non-packaging) approach to 3D integration. The main enabler to this approach is the ability to deposit an amorphous semiconductor film (Si or Ge) on a wafer during the IC manufacturing process and re-crystallize to form a single-crystal film using a number of techniques. Ultimately, this approach may offer the most integrated system with least interconnects possible but may not provide chip-size areas for device fabrication in the stack.


Additionally, Fig. 2.5 shows the functionality and density of the advanced packaging technologies mentioned above. TSVs can deliver the highest performance and functionality while remaining cost effective. It is important to note that none of the 3D integration technologies described above addresses the need for cooling in a 3D stack of high-performance chips. This is a significant omission and constrains the ability to fully utilize the benefits of 3D technology. As such, new 3D integration technologies are needed for such applications.



Fig. 2.5. Advanced packaging trends [2.3].

### 2.3 Key Technologies of TSV 3D Integration

Among different 3D technologies, through-silicon via (TSV) has the potential to achieve the greatest interconnect density but also the greatest cost. Thus, this chapter will only focus on TSV 3D integration. Generally, TSV 3D integration can be classified into different categories based on the following differentiators [2.3], [2.6]-[2.10], as shown in Fig. 2.6:

- 1. Stacking approach: chip-to-chip, chip-to-wafer or wafer-to-wafer;
- 2. Stacking orientation: face-to-face or back-to-face stacking;
- 3. Bonding method: metal-to-metal, dielectric-to-dielectric or hybrid bonding;
- 4. Wafer types: bulk, silicon-on-insulator (SOI) or glass wafers;
- 5. TSV formation: via first, via middle or via last.



Fig. 2.6. Enable technology of TSV 3D integration [2.3].

#### 2.3.1 Stacking Approach

Depending on the level of chip singulation, 3D integration can take place at three different stages: die-to-die (D2D), die-to-wafer (D2W) and wafer-to-wafer (W2W) stacking. In wafer-level 3D integration, permanent bonding can be done by either die-to-wafer (also called chip-to-wafer, C2W) or wafer-to-wafer stacking, as shown in Fig. 2.7. Because known-good dies (KGD) can be used on the substrate wafer if pre-stacking testing is available, D2W integration has a higher yield than W2W integration. However, the D2W and D2D approaches have two main shortcomings: handling problems and low throughput [2.10].



Fig. 2.7. Stacking approaches of wafer-to-wafer integration and die-to-wafer integration.

#### 2.3.2 Stacking Orientation

Based on the stacking orientation of two device wafers, there are two different ways of wafer stacking: face-to-face (F2F) and face-to-back (F2B), where face refers to the surface on which transistors and the primary interconnect layers are formed and back refers to the Si substrate side of a die. The effects of wafer stacking orientation are clearly seen in terms of circuit symmetry, fabrication complexity, capacitance of interconnection and alignment consideration [2.10]. Both types of stacking methods have been applied in 3D integration applications.

Fig. 2.8 shows a schematic illustration of 3D chip stacking in which the left stack is bonded face-to-face and the right stack is bonded face-to-back. In the face-to-face (or 'face down') stacking orientation, two wafers are aligned and bonded such that their circuitry faces each other, as shown in Fig. 2.8(a). From the fabrication technology point of view, this type of integration is easy to apply and does not require an additional handle wafer. However, circuit symmetry needs to be taken into consideration at the design stage [2.10].

For face-to-back (or 'face up') wafer stacking, the top wafer (or upper wafer) is thinned from the substrate side while its front side is temporarily attached to an additional handle wafer. When the required final thickness of the top wafer is reached, it is bonded to the substrate wafer and the handle wafer is released. Compared with the face-to-face version, this approach increases the process complexity. However, the wafer-to-wafer symmetry issues are eliminated [2.10].



Fig. 2.8. (a) Face-to-face stacking. (b) Face-to-back stacking [2.10].

#### 2.3.3 Bonding Methods

A major 3D bonding architectural choice is between dielectric bonding (oxide-to-oxide or polymer-to-polymer) and metallic bonding (metal-to-metal), as illustrated in Fig. 2.9 [2.10]. In addition to the differences in bonding materials, this choice also has a substantial impact on the details of the interstratum connections. In dielectric bonding, the interstratum connections are completed after bonding by using TSVs to pass through the top die and connect to the conventional interconnect in the adjacent strata. In metallic bonding, the interstratum connections are completed by bonding pre-existing microconnects, and the interstratum connection may include TSVs. Another major option in the bonding of strata is the choice among wafer-to-wafer, die-to-wafer, and die-to-die bonding. Dielectric bonding typically uses wafer-to-wafer bonding, while metallic bonding is commonly associated with any of the three. Other detailed characteristics are listed in Table 2.1.

Fig. 2.9. Wafer bonding techniques for wafer-level 3D integration [2.10]. Oxide-to-oxide and polymer-to-polymer bonding have a similar process flow.

Table 2.1. Comparison of bonding methods

| Bonding method     | Pros                                                                                                                             | Cons                                                                                           |
|--------------------|----------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------|
| Metal-to-Metal     | 1. Metal bonding can be used as an extra metal layer 2. Better heat dissipation 3. Lower cleanliness requirement                 | 1. Large pitch (misalignment) 2. How to handle the non-Cu area?                                |
| Oxide-to-Oxide     | 1. Tight pitch possible 2. Everywhere is oxide-bonded                                                                            | 1. High cleanliness requirement 2. Heat dissipation                                            |
| Polymer-to-Polymer | 1. Tight pitch possible 2. Everywhere is polymer-bonded 3. Stronger bond strength than oxide                                     | 1. Good cleanliness requirement 2. Heat dissipation 3. Possible polymer contamination issue    |

#### 2.3.4 Wafer Types

Two kinds of wafers are in use today. One is the bulk wafer (Si, Ge, or GaAs), and the other is the SOI wafer, as shown in Fig. 2.10. Bulk Si wafers require high-aspect-ratio TSVs, with a target TSV length of 50µm. This is the most developed approach today because of its cost and process maturity. On the other hand, SOI simplifies TSV formation, avoids the need for a temporary carrier, and allows extremely thin layers to be stacked. The buried oxide (BOX) layer can be used as a stopping layer, so the thickness of the $2^{nd}$ layer is more uniform. However, SOI wafers are very expensive, so this approach does not appear to be cost-effective.



Fig. 2.10. Wafer selection.

#### 2.3.5 TSV Formation

A TSV is the vertical electrical interconnection between ICs on different planes. Forming a TSV usually involves drilling a via in the wafer and filling it with a conductive material. The order of TSV fabrication and insertion in a 3D IC process is related to interconnect density, material selection and application. TSVs can be formed at various stages of the 3D IC process, as shown in Fig. 2.11. TSV fabrication can be divided into via-first and via-last, depending on whether the via is fabricated before or after the BEOL process. The via-first approach is challenging for the CMOS process: the vias must withstand the subsequent CMOS steps at different temperature ranges, so the materials must be CMOS compatible. However, it has no yield issue because only good wafers are used, and it has a lower cost than via-last. The via-last approach does not bring thermal stress issues, but the via etching must be done carefully. Moreover, the yield of the TSV process affects the full process and lowers the total yield.



Fig. 2.11. Fabrication of TSV [2.3].

#### 2.3.6 Categories of TSV Scheme

As shown in Fig. 2.12, the proposed 3D integration schemes can be categorized by their most important features: via diameter/pitch and via aspect ratio [2.11]. Three categories are distinguished. The large 3D-WLP (wafer-level packaging) TSVs have diameters larger than 10µm and serve as bondpad I/O interconnects in systems. They are typically manufactured post-foundry and are compatible with both wafer-to-wafer and die-to-wafer stacking schemes. Because of their rather large diameter, small aspect ratios of around one or two enable integration in wafers with a thickness of 70µm or more, greatly easing wafer and die handling.

The medium-size 3D-SIC (3D stacked IC) TSVs have diameters between $2\mu m$ and $10\mu m$ and serve as global interconnects. They are manufactured at the foundry and are compatible with wafer-to-wafer and die-to-wafer stacking schemes. An aspect ratio of 5 or higher leads to wafer thicknesses between $25\mu m$ and $70\mu m$, making wafer and die handling challenging. The 3D-SIC TSVs are an emerging technology and are expected to appear in applications in the coming years. The smallest 3D-IC TSVs, with diameters of 2µm and smaller, target intermediate-level interconnects. Even with aspect ratios above 20, they require extremely thinned dies. Their stacking scheme is typically wafer-to-wafer to avoid complex and difficult thin-die handling. The 3D-SIC intermediate-level interconnect TSVs are considered a risk technology at this time.



Fig. 2.12. 3D TSV technologies as function of TSV diameter and aspect ratio and classification in three categories, 3D-SIC intermediate and global and 3D WLP bondpad with their key attributes [2.11].
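As a rough illustration of this classification, the sketch below maps a TSV diameter to the interconnect level it targets and estimates the via depth from the usual definition of aspect ratio (depth divided by diameter). The function names and the handling of the exact 2µm and 10µm boundaries are assumptions made for the example, not part of [2.11].

```python
# Illustrative sketch of the three TSV categories described above.
# Boundary handling at exactly 2 um and 10 um is an assumption.

def classify_tsv(diameter_um):
    """Return the interconnect level targeted by a TSV of the given diameter."""
    if diameter_um > 10.0:
        return "bondpad"        # large 3D-WLP TSVs
    if diameter_um > 2.0:
        return "global"         # medium-size 3D-SIC TSVs
    return "intermediate"       # smallest TSVs, 2 um and below

def tsv_depth_um(diameter_um, aspect_ratio):
    """Via depth implied by aspect ratio = depth / diameter."""
    return diameter_um * aspect_ratio

for d, ar in [(35.0, 2.0), (5.0, 5.0), (1.0, 20.0)]:
    print(d, ar, classify_tsv(d), tsv_depth_um(d, ar))
```

The depth returned here is only a lower bound on the thinned wafer thickness the via has to cross; actual process limits are outside the scope of this sketch.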

#### 2.4 Challenges of TSV 3D Integration

Although 3D ICs offer many advantages over 2D ICs, many challenges should be overcome before volume production of TSV-based 3D ICs becomes possible. These challenges include technological challenges, yield and test challenges, thermal challenges, infrastructure challenges [2.12], etc.

- 1. Thermal issue: Although the power consumption of a die within a 3D IC is expected to decrease due to the shorter interconnects, heat removal from a 3D IC is much more difficult than from a 2D IC. The reason is that the ambient environment of the die in a 2D IC is the cooling material, whereas the ambient environment of a die within a 3D IC may be another die that also generates heat. Therefore, the thermal issue of a 3D IC is much more severe than that of a 2D IC.
- 2. Yield issue: 3D integration technology may benefit the yield of 3D ICs in some respects but deteriorate it in others. For W2W bonding technology, the yield of a 3D IC is the product of the yields of the multiple die and the yield of the stacking process. When combining $n$ untested die from wafers with a die yield $Y_i$, the compound yield of the 3D structure $Y_m$ can be expressed as $Y_m = Y_s^{n-1} \times Y_i^{n}$, where $Y_s$ is the yield of the stacking process. The yield of a 3D IC is therefore dramatically reduced. For D2W and D2D bonding technologies, however, the yield of 3D ICs can remain at a high level if known-good-die (KGD) testing is performed. On the other hand, 3D integration technology inherently increases the yield because heterogeneous structures can be fabricated on separate wafers using individually optimized fabrication processes and materials, which is impossible when integrating these heterogeneous structures in a 2D IC.
- 3. Test issue: To achieve a high yield of 3D ICs, high-quality KGD testing must be performed. Wafer-level KGD testing for 3D ICs is more difficult than existing KGD approaches for system-in-package (SiP). The reason is that a die for SiP has I/O pads, but a die for a 3D IC may only have TSVs. The pitch between TSVs is much smaller than that of I/O pads, so probing TSVs in wafer-level testing becomes a challenge. In addition, the circuit on each die before bonding is only a partial circuit, which makes pre-bond testing a challenge as well. Furthermore, test optimization and integration for pre-bond and post-bond testing are also an important issue.
- 4. Technological issue: As mentioned above, different bonding technologies have different impacts on the final yield of 3D ICs. In addition, each step of the overall 3D integration process has a heavy impact on the final yield, for example alignment accuracy, wafer thinning, and TSV formation. All of these should be investigated and developed further before 3D integration technology is mature enough for high-volume production.
- 5. Infrastructure issue: Although many 3D integration technologies have been investigated and demonstrated, an effective design flow for 3D ICs has not been developed. Computer-aided design (CAD) algorithms and tools for 3D ICs are thus required. For example, floorplanning, placement, and routing tools for 3D ICs must be developed.
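The compound yield expression in the yield item above can be evaluated directly. The following is a minimal sketch; the die and stacking yields used in the example are hypothetical values chosen only to show how quickly the compound yield falls as tiers are added.

```python
# Minimal sketch of the W2W compound yield Y_m = Y_s^(n-1) * Y_i^n.
# The numeric yields below are hypothetical, not measured data.

def compound_yield(n, die_yield, stack_yield):
    """Yield of a stack of n untested die with n - 1 stacking steps."""
    if n < 1:
        raise ValueError("a stack needs at least one die")
    return stack_yield ** (n - 1) * die_yield ** n

die_yield = 0.90    # hypothetical Y_i
stack_yield = 0.95  # hypothetical Y_s

for tiers in (1, 2, 3, 4):
    print(tiers, round(compound_yield(tiers, die_yield, stack_yield), 4))
```

With these example numbers, a three-tier stack already drops to roughly 66% yield, which illustrates why KGD testing matters for D2W and D2D flows.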

#### 2.5 Power Delivery in 3D-ICs

Despite the recent surge in 3D IC research, there has been very little work from the circuit design and automation community on power delivery issues for 3D ICs. On-chip power supply noise has worsened in modern systems because scaling of the power supply network (PSN) impedance has not kept up with the increase in device density and operating current, owing to limited wire resources and constant RC per wire length; as stated earlier, this situation is worse in 3D ICs. The increased IR and Ldi/dt supply noise in 3D chips may cause a larger variation in operating speed, leading to more timing violations. The supply noise overshoot due to inductive parasitics may aggravate reliability issues such as oxide breakdown, hot carrier injection (HCI), and negative bias temperature instability (NBTI), which are also negatively affected by elevated temperatures. Consequently, on-chip power delivery will be a critical challenge for 3D ICs [2.13].

#### 2.5.1 The Basic of Power Delivery

According to scaling roadmaps, future high-performance ICs will need multiple sub-1V supply voltages, with total currents exceeding 100 A/cm$^2$ even for 2D chips [2.14]. Conventional power delivery methods for high-performance ICs employ a DC-DC converter known as a voltage regulator module (VRM). The VRM is typically mounted on the motherboard, with external interconnects providing the power to the chip, as depicted in Fig. 2.13(a) [2.15]. The intrachip power delivery network is shown in Fig. 2.13(b), which shows a part of the modeled PSN of a microprocessor [2.16]. The package parasitics, contributed by the I/O pads and bonding wires, are modeled as an inductance and a resistance in series. The decoupling capacitors (DECAPs) shown in the figure are intended to damp out transient noise and include the external decap as well as the capacitance of the various circuit components, such as the MOS gate capacitance.

The chip acts as a distributed noise source drawing current at different locations and at different frequencies, causing imperfections in the delivered supply. The supply that reaches the processor is affected by the IR and Ldi/dt drops across the package, which constitute the supply noise; the package impedance has largely remained unaffected by technology scaling. Scaling does, however, result in some unwanted effects on-chip, namely increased currents and faster transients from one technology node to the next. The former aggravates the IR drop, while the latter worsens the Ldi/dt drop [2.13]. Over and above these effects is the issue of global resonant noise, in which the supply impedance is excited to produce large drops on the supply at or near the resonant frequency. With these increased levels of noise and reduced noise margins as Vdd levels scale down, reliable power delivery to power-hungry chips has become a major challenge.



Fig. 2.13. (a) Conventional power delivery architecture. (b) On-chip power grid [2.15].
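The two supply-noise components named above, the IR drop and the Ldi/dt drop, can be sketched for a series resistance and inductance as follows. The function name and all numeric values are hypothetical and serve only to show the relative contribution of each term.

```python
# Minimal sketch: supply drop across a series R-L package model,
# V_drop = I * R + L * dI/dt. All values below are hypothetical.

def supply_drop_v(current_a, resistance_ohm, inductance_h, di_dt_a_per_s):
    """Sum of the resistive (IR) and inductive (L di/dt) supply drops."""
    ir_drop = current_a * resistance_ohm
    ldi_dt_drop = inductance_h * di_dt_a_per_s
    return ir_drop + ldi_dt_drop

# Hypothetical example: 10 A through 1 mOhm, 10 pH with a 1 A/ns current ramp.
drop = supply_drop_v(current_a=10.0, resistance_ohm=1e-3,
                     inductance_h=10e-12, di_dt_a_per_s=1e9)
print(f"supply drop: {drop * 1e3:.1f} mV")
```

In this example both terms contribute equally; a larger current raises the first term, while a faster transient raises the second, which matches the scaling trend described above.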

The noise spectrum for a typical power grid is shown in Fig. 2.14(a). The DC component of the noise is given by the IR drop across the package and power grid. The first peak in the figure corresponds to the resonant frequency, given by $f_{\rm res} = 1/(2\pi\sqrt{LC})$, which typically appears in the range of 100–300 MHz. An excitation at this frequency can be triggered during microprocessor loop operations or wakeup. Several other peaks are seen in the figure due to switching at the clock frequency and its higher harmonics or due to local resonance; the corresponding noise is typically an order of magnitude smaller than the resonant peak. Fig. 2.14(b) shows a measured supply impedance profile of a separate test structure, which validates the simulation model developed in Fig. 2.13. The noise at a particular frequency is estimated by multiplying the impedance by the current component at that frequency [2.17].



Fig. 2.14. (a) Simulation of supply noise spectrum. (b) Measurement results of supply noise [2.17].

#### 2.5.2 3D-IC Power Delivery: Modeling and Challenges

A model for 3D ICs, based on distributed models of the on-chip and package power supply structures, is shown in Fig. 2.15 [2.18]. Power is fed from the package through power I/O bumps distributed over the bottom-most tier and travels to the upper tiers using TSVs. The footprint of the chip can be divided into cells, which are identical square regions between a pair of adjacent power and ground pads, as shown in Fig. 2.15(a). The cells are connected in Fig. 2.15(b) in the form of a grid formed by several subcells between adjacent TSVs. Electrically, each TSV is modeled as a series combination of resistance and inductance. The planar square cells use a lumped model, where  $R_{si}$ ,  $J_i$ , and  $C_{di}$  represent, respectively, the grid resistance, effective current density, and chip decap on a per-unit basis. Since each pad is shared by four independent cells, the package parameters are normalized by a factor of four. The subcell can be then repeated multiple times to realize the complete 3D IC functional block.



Fig. 2.15. Distributed model for 3D IC [2.18]. (a) Division of the power grid into independent cells. (b) A model for one such cell.

The power grid model must necessarily be tied to a real 3D process. Fig. 2.16(a) depicts a 3D IC cross-sectional model of a production-level 0.18µm 3D process from MIT Lincoln Laboratory [2.19]. This process has three tiers. The bonding pads are on the top tier, while the heat sink is typically below the bottom tier. Processors or other power-intensive circuits would ideally be placed on the bottom tier in close proximity to the heat sink. The tiers are interconnected through TSVs for electrical and thermal conduction. Fig. 2.16(b) shows the cross-sectional scanning electron microscope (SEM) photograph of a stacked TSV connecting the back metal of the top tier with the top-level metal of the bottom tier. A simplified resistance model is superimposed. Based on actual parameter extraction [2.18], each stacked cone-shaped TSV has a resistance of  $1\Omega$  in this process. The top and middle tiers are aligned face-to-back, while the middle and bottom tiers are face-to-face, making the path from the top to the middle tier longer and more resistive. This configuration can be modeled by breaking up the total  $1\Omega$  stacked via resistance into chunks of  $0.25\Omega$ ,  $0.5\Omega$ , and  $0.25\Omega$ , as shown in Fig. 2.16(b). The TSV inductance and capacitance can be ignored, as their values, found experimentally, are fairly small.



Fig. 2.16. (a) Cross section of 3D FD-SOI process. (b) Simplified via resistance model aligned with a cross-sectional SEM photograph [2.20].

The TSV resistance in the supply path potentially imposes new challenges in 3D power delivery compared with the conventional 2D case [2.20]. First, the lower tiers experience worse supply noise due to the increased resistance in the PSN. Moreover, power-intensive circuits have to be placed at the bottom tier, which makes reliable power delivery even more difficult.

In 3D, there are two significant points of departure, in comparison with models for conventional 2D chips. First, for the same circuitry, the reduced footprint of the 3D die effectively increases the package parasitics: since ratios of the number of supply pins and bonding wires to the supply current are reduced, the role of the package resistance and inductance is increased. Second, the noise characteristics in each tier are affected by the additional TSV resistance in the supply path.

Fig. 2.17 shows the circuit models developed to compare the 3D and 2D cases. The models are based on curve fits with the impedance profile of a distributed supply network model, along with typical decap and package parasitic values. In 3D, we see that the supply path would be dominated by the TSVs. The overall chip capacitance (3nF in the 2D case) within an equal footprint is assumed to be split equally in the 3D IC between its three tiers. Moreover, due to the reduced footprint of the 3D die, the number of power pins is assumed to be a third of the 2D case, leading to a 3X increase in package parasitic inductance and resistance values.



Fig. 2.17. (a) Simplified PSN models for comparing impedance response in 2D and 3D. (b) Impedance response comparison between 2D and 3D. (c) Impedance response of the three tiers in a 3D IC [2.19].

Since the noise at the bottom tier is predictably worst, we compare the impedance response of this tier with the 2D case. The normalized impedance comparison is shown in Fig. 2.17(b), which illustrates the following:

- *Low-frequency impedance*: At low frequencies, the capacitors and inductors are open and short circuited, respectively. Therefore, the 2D model has an impedance of  $2(0.01+0.03)=0.08\Omega$ , while the 3D model has an impedance of  $2(0.03+0.05+0.1+0.05)=0.46\Omega$ . This indicates that for the same amount of current, the 3D chip will have 5.75X more IR drop compared to 2D.
- *Resonant peak impedance*: The resonant peak is determined by the amount of damping and the value of inductance. Here, the increased role of inductance in 3D is counteracted by the increased damping provided by the larger resistance drop to the bottom tier, and the peaks show comparable values.
- *Resonant frequencies*: Two-dimensional circuits typically have a resonant frequency of around 50–300 MHz, given by  $f_{res} = 1/(2\pi\sqrt{LC})$ . If the equivalent capacitance in 3D is the same as in our model, the increased L shifts the peak to a lower frequency, as seen in Fig. 2.17(c).
- *High-frequency impedance*: At high frequencies, the 2D and 3D impedances become comparable. This is attributed to the shielding effect of the bottom tier capacitance, which becomes virtually a short circuit at high frequencies.
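The low-frequency comparison above is simple arithmetic and can be checked with a short Python sketch. The resistance values are the ones appearing in the sums quoted in the text; the variable names, and the reading of the factor of two as covering the supply and return paths, are assumptions made for illustration.

```python
# Series resistances (ohms) along one rail of each simplified model,
# as they appear in the sums quoted in the text.
R_RAIL_2D = [0.01, 0.03]
R_RAIL_3D = [0.03, 0.05, 0.1, 0.05]


def dc_loop_resistance(rail_resistances):
    """DC impedance with capacitors open and inductors shorted.

    The factor of two is assumed here to account for the supply and
    return (ground) paths having the same series resistance.
    """
    return 2 * sum(rail_resistances)


z_2d = dc_loop_resistance(R_RAIL_2D)
z_3d = dc_loop_resistance(R_RAIL_3D)
ir_drop_ratio = z_3d / z_2d

print(f"2D: {z_2d:.2f} ohm, 3D: {z_3d:.2f} ohm, ratio: {ir_drop_ratio:.2f}X")
```

For equal load current, the IR drop scales directly with this DC resistance, which is where the 5.75X figure comes from.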

Clearly, it can be seen that DC supply noise becomes a greater concern in 3D designs as compared to its 2D counterpart. To understand the supply noise behavior in different tiers, we analyze the impedance spectrum (see Fig. 2.17(c)) across different tiers obtained by simulating the 3D IC model. The key results are as follows:

- *Low-frequency impedance*: As expected, the DC and low-frequency impedances, which are governed by the TSV resistances, show a worsening trend for the lower-level tiers.
- *High-frequency impedance*: At high frequencies, the top tier has the largest impedance, while the middle tier has the minimum AC impedance. Although this seems counterintuitive, it can be explained by the shielding/decap effect of the adjacent tier capacitances, which causes the effective damping resistance to be the largest for the middle tier and smallest for the top tier. This trend is more noticeable at high frequencies beyond the resonance peak.
- *Resonant behavior*: Since the shielding effect mentioned above is not significant at mid-frequencies, the resonance peak follows the low-frequency trend, with the bottom tier being the worst case. However, there is a reduced noise offset as noted

In summary, the AC impedance is worst for the bottom tier until the resonant frequency, while beyond this point, the top tier has a slightly larger impedance value. Since thermal constraints dictate that the bottom tier is likely to contain circuit blocks with large current consumption, the supply noise in the bottom tier (i.e., the product of current and impedance) will become a significant concern for 3D implementations.

The aim of the above discussion was to provide some quantitative understanding of power delivery in 3D ICs. It should be pointed out that these numbers are tied to a specific process and will change depending on the process. For example, if the technology allows TSVs with much lower resistance or area, then the impedance bottleneck in a path may be due to the supply pads, and the PSN models should account for that. However, regardless of this, it remains likely that PSN will be a key problem in 3D designs.

#### 2.5.3 Design Techniques for Controlling Power Delivery Network Noise

The presence of severe power delivery bottlenecks necessitates a look at entirely novel power delivery schemes for 3D chips. In this section, we introduce several possible approaches for this purpose.

#### A. On-Chip Voltage Regulation

One way of dealing with the power delivery problem in 3D ICs (and also in conventional 2D ICs) is to bring the DC–DC converter module closer to the processor, as conceptually shown in Fig. 2.18 [2.21]. Boosting the external voltage and locally downconverting it ensures that the current through the external package,  $I_{\rm ext}$ , is small, and relaxes the scaling requirement on the external package impedance. Moreover, this point-of-load (PoL) regulation isolates the load from the global resonant noise of the external package and decap. Traditionally, the efficiency of monolithic DC–DC converters has been limited by the small physical inductors allowed on-chip. In order to increase the power efficiency, an on-chip switching DC–DC converter using a distributed filter to replace the traditional LC filter was proposed in [2.22].



Fig. 2.18. Insertion of a DC–DC converter near the load [2.21].
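The benefit of boosting the external voltage can be made concrete with a small Python sketch: for a fixed load power, the package current falls in proportion to the external voltage and the resistive loss in the package falls with its square. The function name, the ideal converter efficiency, and the numbers (100 W load, 1 mΩ package resistance, 1 V versus 12 V) are assumptions for illustration, not values from [2.21].

```python
def package_current_and_loss(load_power_w, v_ext, r_pkg_ohm, efficiency=1.0):
    """Return (I_ext, I_ext^2 * R_pkg) for a load fed through a converter.

    `efficiency` is the assumed conversion efficiency of the on-chip
    DC-DC converter (1.0 means an ideal, lossless converter).
    """
    i_ext = load_power_w / (efficiency * v_ext)
    return i_ext, i_ext ** 2 * r_pkg_ohm


# Assumed example: the same 100 W load delivered at 1 V and at 12 V.
i_low, loss_low = package_current_and_loss(100.0, 1.0, 1e-3)
i_high, loss_high = package_current_and_loss(100.0, 12.0, 1e-3)

print(f"1 V : I_ext = {i_low:.1f} A, package loss = {loss_low:.2f} W")
print(f"12 V: I_ext = {i_high:.2f} A, package loss = {loss_high:.3f} W")
```

The same reduction applies to the Ldi/dt noise across the package inductance, since the current transients seen by the package shrink by the same voltage ratio.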

On the other hand, typical off-chip DC–DC conversion requires high-Q inductors of the order of 1–100  $\mu$H, which are difficult to implement on-chip due to their area requirements. With growing power delivery problems, the focus has been on building compact inductors through technologies like thin-film inductors or on more efficient, but costly, DC–DC converters through multiphase/interleaving topologies. Clearly, there is an onus to incorporate these on-chip, which calls for a different process altogether. The possibility to stack different wafers with heterogeneous technologies, as offered by three-dimensional wafer-level stacking in 3D ICs, is thus the natural solution for realizing on-chip switching converters.

#### B. Z-axis Power Delivery

Z-axis or 3D power delivery, in which the PSN is vertically integrated with the processor in a 3D stack, promises an attractive solution for on-chip DC–DC conversion. Fig. 2.19 shows the schematic visualization [2.15] of such a Z-axis power delivery technique using wafer–wafer integration. This still requires that all passives, including the inductors and output capacitors, be monolithically integrated with the power switches and control circuitry. The idea is gaining traction in research, and an implementation of such a structure, using two interleaved buck converter cells each operating at a 200 MHz switching frequency and delivering 500 mA of output current, has been reported [2.15]. In the future, we may see a 3D IC with several tiers, with one whole tier dedicated to voltage regulation, incorporating various passives and other circuitry.



Fig. 2.19. Z-axis power delivery based on monolithic power conversion [2.15].

One main issue with Z-axis power delivery is the area overhead in dedicating a tier to an on-chip DC–DC converter, whose footprint should be on par with the processor in a wafer–wafer 3D process. Moreover, high-efficiency switching regulators for DC–DC conversion require monolithic realization of bulky passive components. On the other hand, typical linear regulators, though less bulky, suffer from efficiency loss.

#### C. Multistory Power Delivery

A promising technique for achieving high-efficiency on-chip DC–DC conversion and supply noise reduction is the multistory power delivery (MSPD) scheme [2.20]. It has been demonstrated in [2.19] that the idea becomes particularly attractive for 3D IC structures involving stacked processors and memories. Fig. 2.20 demonstrates the basic concept of MSPD. A schematic of a conventional supply network is shown in Fig. 2.20(a), where all circuits draw current from a single power source. Fig. 2.20(b) shows the multistory supply network, with subcircuits operating between two supply stories. The concept of a "story" is merely an abstraction to illustrate the nature of the power delivery scheme, as opposed to the 3D IC architecture, where circuits are physically stacked in tiers. In this scheme, current consumed in the "2Vdd-Vdd story" is subsequently recycled in the "Vdd-Gnd story." Due to this internal recycling, half as much current is drawn compared to the conventional scheme, with almost the same total power consumption. A reduced current is beneficial since it cuts down the supply noise. Thus, in the best case, if the currents in the two subcircuits are completely balanced, the middle supply path will sink zero current. This results in minimal noise on that rail, as also illustrated in Fig. 2.20.

However, the main issue with this technique is the requirement of separate body islands, which may be difficult to provide in typical bulk processes. In 3D ICs, by contrast, the tiers are inherently separated electrically, which makes MSPD particularly attractive.
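The current recycling idea can be sketched in a few lines of Python. The model below is an idealized assumption: both stories see exactly Vdd across them, the middle rail is an ideal source or sink, and the function and variable names are hypothetical.

```python
def conventional(i_top, i_bottom, vdd):
    """Both subcircuits draw from a single Vdd rail."""
    supply_current = i_top + i_bottom
    return supply_current, supply_current * vdd


def multistory(i_top, i_bottom, vdd):
    """Top subcircuit sits between 2Vdd and Vdd, bottom between Vdd and Gnd.

    The 2Vdd rail supplies the top story; the middle (Vdd) rail only
    carries the mismatch between the two story currents.
    """
    top_rail_current = i_top
    mid_rail_current = i_bottom - i_top  # > 0 sourced, < 0 sunk by the Vdd rail
    power = 2 * vdd * top_rail_current + vdd * mid_rail_current
    return top_rail_current, mid_rail_current, power


# Assumed example: two perfectly balanced 1 A subcircuits at Vdd = 1 V.
conv_current, conv_power = conventional(1.0, 1.0, 1.0)
ms_top, ms_mid, ms_power = multistory(1.0, 1.0, 1.0)

print(f"conventional: {conv_current} A drawn, {conv_power} W")
print(f"multistory  : {ms_top} A drawn, {ms_mid} A on middle rail, {ms_power} W")
```

In the balanced case the supply current halves and the middle rail carries no current, while the total power stays the same; any imbalance between the stories appears as current on the middle rail.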



Fig. 2.20. The (a) conventional and (b) multistory power delivery schemes [2.19].

#### D. Best Practices for 3D PDN Design and Optimization

Various techniques impact the quality of power delivery in 3D ICs. These include through-silicon via (TSV) size and spacing, controlled collapse chip connection (C4) spacing, and a combination of dedicated and shared power delivery. In [2.23], the evaluation system is composed of a quad-core chip multiprocessor, memory, and an accelerator engine, each running representative SPEC benchmark traces. The authors present a set of guidelines for designing and optimizing power delivery networks in future 3D designs:

- Locality in the vertical dimension impacts both IR drop and Ldi/dt voltage droop trends in a 3D PDN. A voltage droop at a node in 3D can get current from decoupling caps in the vertical neighbors as well as from the ones in the same plane. The resulting behavior depends on the locality of the droop as well as the state of the neighboring nodes. Therefore, a detailed 3D PDN analysis with architecture- or module-level placement using representative workloads is necessary during 3D chip design.
- A critical observation in [2.23] is the saturation trend of IR drop in 3D PDNs with increased TSV size. This suggests the need for first finding the optimal TSV size given the on-chip grids in the 3D stacked layers, such that the least silicon area penalty is incurred.
- While it is generally expected that the power delivery would be affected most in the die stacked furthest away from the C4 connections, the authors report that the percentage degradation in power delivery is in fact worse in lower-level dies closer to the C4s. This is particularly true when a highly active module, such as PROC, is placed next to the heat sink for thermal reasons and furthest away from the C4 connections. Therefore, 3D PDN analysis needs to carefully consider the impact in all the dies while optimizing the grid.
- Increasing the TSV granularity, or equivalently decreasing the TSV spacing, in a 3D PDN improves the standard deviation in IR drop and Ldi/dt voltage droop most, with marginal improvements in the maximum and average values. Therefore, physical design for a 3D PDN must consider this impact and choose the TSV granularity accordingly.
- Despite selecting the optimal TSV size and TSV spacing, a 3D PDN performs worse in both IR drop and Ldi/dt voltage droop compared to a 2D PDN if the pitch or granularity of the package connection, such as C4, is kept the same as in the 2D case. This study shows that improving the off-chip component of the 3D PDN, for example by reducing the C4 pitch to obtain a higher number of C4s, has the highest relative impact on power grid metrics and enables a 3D PDN of 2D-like or even better quality.
- A combination of shared and dedicated TSV power delivery can be used, as illustrated in the 3D TAP configuration in Fig. 2.21, to achieve improvements in both IR drop and Ldi/dt voltage droop.



Fig. 2.21. Tapered Stacked (TAP) 3D configuration for improving IR drop and Ldi/dt droop [2.23].



### Chapter 3

# Hierarchical Power Delivery System and Area-Efficient Power TSV Planning for TSV 3D Integrations

Heterogeneous 3D integration has exacerbated the requirement for multiple, wide-range, and well-controlled power supplies. In view of this, we present a new concept of a hierarchical power delivery system in which the operating supply voltages can be chosen for different parts of TSV 3D-ICs. Noise reduction circuits, such as active switching DECAPs and substrate noise cancellers, can be integrated into the power delivery system to improve power integrity. Based on the system, a corresponding power design methodology is also developed.

Since a power TSV is much larger than a transistor, using an excessive number of TSVs will reduce the routable chip area. Therefore, an area-efficient power TSV planning method is proposed to minimize the area occupied by power TSVs under a tolerable voltage drop. Furthermore, to reduce the supply noise transmitted on the power TSVs, the active switching DECAP circuit used to deal with the supply impedance response of the package is presented at the end of this chapter.

#### 3.1 Hierarchical Power Delivery System for TSV 3D-ICs

Among different 3D technologies, through-silicon via (TSV) has the potential to achieve the greatest interconnect density but also incurs the greatest cost. An example of die stacking in a TSV 3D integration [3.1] is shown in Fig. 3.1. In practice, the die with the highest heat generation, i.e., the highest power, is placed nearest to the heat sink. The other dies are then organized between the top stratum and the package substrate according to their connections and functional relations. The power supply and the signals are transmitted between strata by TSV connections. Controlled collapse chip connection (C4) bumps connect the lowest stratum of the chip to the package substrate. Such an arrangement is necessary to comply with the thermal requirement of the chip.



Fig. 3.1. A conceptual 3D-IC stacking with TSV connection [3.1].

With advanced 3D-IC technologies, the average wire length can be decreased by a factor of  $N^{1/2}$ , where N is the number of stacked strata in the 3D chip [3.2]. The wire resistance and capacitance drop proportionally. As a result, power consumption drops by a factor of  $N^{1/2}$ , whereas the wire RC delay drops by a factor of N. However, the power density per unit area in a 3D stacked chip increases by a factor of  $N^{1/2}$  [3.1]. The power delivery requirements therefore increase with the number of stacked strata in 3D chips.
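The scaling relations in this paragraph can be tabulated with a short Python sketch. It assumes, for illustration, that the same circuitry is folded onto N strata of equal area, so the footprint shrinks by a factor of N; the function name and dictionary keys are hypothetical.

```python
def stacking_scale_factors(n_strata):
    """Relative changes (3D value / 2D value) for N stacked strata.

    Assumes the same circuitry is folded onto N strata of equal area,
    so the chip footprint is 1/N of the 2D footprint.
    """
    wire_length = n_strata ** -0.5       # R and C per wire scale the same way
    wire_power = wire_length             # power follows the wire capacitance
    rc_delay = wire_length * wire_length # R * C, i.e. 1/N
    footprint = 1.0 / n_strata
    power_density = wire_power / footprint  # N / sqrt(N) = sqrt(N)
    return {
        "wire_length": wire_length,
        "wire_power": wire_power,
        "rc_delay": rc_delay,
        "power_density": power_density,
    }


factors = stacking_scale_factors(4)
for name, value in factors.items():
    print(f"{name:14s} x{value:.2f}")
```

For four strata, the wire length and power halve and the RC delay drops to a quarter, but the power density per unit footprint doubles, which is the trend that drives the power delivery requirement.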

Many studies have addressed the quality of power delivery [3.1], [3.3]-[3.4]. A high-to-low voltage conversion near (or integrated into) the chip using a switching DC-DC converter is suggested in [3.3] to reduce the input current from the off-chip supply. As a result, the off-chip resources needed to maintain power integrity can be reduced. A multi-story power delivery technique was also presented to employ the charge recycling concept in the 3D power network [3.4].

It was suggested in [3.1] that the TSV architecture has a small impact on Ldi/dt noise, whose major source is the off-chip passive components. The IR drop noise, however, becomes worse as the number of stacked strata increases because of the additional resistive and capacitive paths of the TSV connections. Since the most power-hungry die is located furthest from the power source, it will suffer the most IR drop noise as well as Ldi/dt noise. Increasing the total number of TSVs reduces the IR drop noise at the cost of more blockages for device and metal layer placement. More decoupling capacitors are also required to suppress the Ldi/dt noise as the load current increases, which incurs a large area penalty. To reduce the area overhead and gate tunneling leakage, efficient placement and optimization of decoupling capacitance were studied in [3.5], [3.6].

On the other hand, heterogeneous integration of 3D-ICs is another challenge for power delivery. A single supply voltage across the entire 3D chip stack was assumed in most previous works. However, the best scenario is that not only the fabrication processes but also the operating supply voltages can be chosen for different parts of the system to obtain the best trade-off between cost, power, and performance. This certainly cannot be satisfied with a single off-chip supply source. It is also costly to have multiple off-chip supply sources and power networks in the 3D-IC chip. Therefore, a new concept of a multi-layer power delivery structure for TSV 3D-ICs is proposed in this section. Noise reduction techniques, such as active switching DECAPs and the active substrate decoupler (see Chapter 5), can also be integrated into the power delivery structure.



Fig. 3.2. The hierarchical power delivery system.

An illustration of the proposed hierarchical power delivery structure is shown in Fig. 3.2. The major concept is that the global and the local power networks are decoupled. Power domains can be defined on the local power networks, and each power domain is powered by a dedicated voltage regulator with the requested voltage. In the power hierarchy, the global power network is the first layer and the voltage regulators are the second-layer devices. Linear regulators are used because they consume less area.

In Fig. 3.2, the global power network is connected to the off-chip power source and distributed by TSVs to the different strata. Note that the number and location of the TSVs are flexible, even though only one in the center is shown in the figure. The IR drop noise exists as mentioned. Meanwhile, the inserted voltage regulator will induce a voltage drop. Therefore, the voltage source of the global power network should be raised in the proposed structure to accommodate these voltage drops. The cost of such a decoupled power delivery structure is the increased number of voltage regulators. However, the decoupling capacitance required to maintain a stable supply voltage can be greatly reduced.
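A first-order way to budget the raised global supply is to add the regulator dropout and the resistive drop of the global path to the requested local voltage. The Python sketch below illustrates this; the function names and all numeric values are assumptions for illustration and do not come from the simulations in this chapter. The efficiency bound uses the standard property of a linear regulator that its efficiency cannot exceed the ratio of output to input voltage when the quiescent current is neglected.

```python
def required_global_voltage(v_local, v_dropout, i_load, r_path_ohm):
    """Global supply needed so the regulator input stays above V_local + dropout."""
    return v_local + v_dropout + i_load * r_path_ohm


def linear_regulator_efficiency_bound(v_out, v_in):
    """Upper bound on linear-regulator efficiency (quiescent current neglected)."""
    return v_out / v_in


# Assumed example: 1.0 V local domain, 0.2 V dropout, 5 A through 10 mOhm.
v_global = required_global_voltage(1.0, 0.2, 5.0, 0.01)
eta_max = linear_regulator_efficiency_bound(1.0, v_global)

print(f"V_global >= {v_global:.2f} V, efficiency <= {eta_max:.0%}")
```

The sketch makes the trade-off visible: every extra millivolt of headroom reserved for IR drop or dropout lowers the best achievable regulator efficiency.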

The conventional power architecture connects TSVs to off-chip supply sources and supplies the load circuits directly through these TSVs, so the fluctuations on the TSVs are directly seen by the circuits. In order to demonstrate the advantage of the proposed decoupled power delivery structure, an extended analysis is performed based on Lin's work [3.7] on TSV 3D-IC power integrity. The height and the diameter of the copper TSV are set to  $50\mu$m and  $25\mu$m, respectively. Power and ground each have an 8×8 TSV matrix. Every stratum is assumed to have a ring oscillator that consumes 5 A of current. The worst effective supply voltage is measured at the stratum farthest from the off-chip power source. That work reports that, to obtain a minimum effective supply voltage of 0.8V from a 0.9V off-chip supply, a decoupling capacitor of 9nF is required in every stratum.

Fig. 3.3 shows the analysis when using the proposed decoupled power delivery structure. The right axis is the size of the decoupling capacitor (DECAP) at the TSV node. The left axis shows the total capacitance of the regulator, including a fixed 1nF internal capacitor and the decoupling capacitor at the regulator output. The z-axis represents the minimum effective supply voltage obtained under different combinations of TSV DECAP and regulator DECAP. There are two reference planes in the figure. The horizontal one indicates the required 0.8V minimum effective supply voltage, whereas the vertical one shows the location of a total of 9nF of DECAPs. Therefore, it can be easily seen whether the proposed decoupled power structure is better.



Fig. 3.3. Simulation results of effective supply voltage versus DECAP size.

The first observation from Fig. 3.3 is that the required 0.8V minimum effective supply voltage is easily met by inserting voltage regulators. It also shows that the DECAP size at the regulator output has a much larger effect on suppressing the noise, whereas the size of the DECAP at the TSV node is almost irrelevant. This is reasonable since the minimum effective supply voltage is measured at the regulator output, which is the supply node of the load circuit. The 0.8V effective supply is achieved when the regulator DECAP is larger than 1.5nF (a total capacitance of 2.5nF) even when the TSV DECAP is zero. Compared with the 9nF baseline, this is a 72.2% capacitance reduction. If only half of the supply fluctuation is allowed, i.e., a 0.85V effective supply, the proposed decoupled power structure still achieves a reduction of over 30% in decoupling capacitance.
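The capacitance saving quoted above follows directly from the numbers in the text, as the short Python check below shows. The constant and function names are chosen here for illustration only.

```python
BASELINE_DECAP_NF = 9.0       # per-stratum DECAP in the conventional structure
REGULATOR_INTERNAL_NF = 1.0   # fixed internal capacitor of the regulator
REGULATOR_OUTPUT_NF = 1.5     # DECAP at the regulator output
TSV_NODE_NF = 0.0             # DECAP at the TSV node


def decap_reduction(baseline_nf, proposed_nf):
    """Fractional reduction of decoupling capacitance relative to the baseline."""
    return 1.0 - proposed_nf / baseline_nf


total_proposed_nf = REGULATOR_INTERNAL_NF + REGULATOR_OUTPUT_NF + TSV_NODE_NF
reduction = decap_reduction(BASELINE_DECAP_NF, total_proposed_nf)

print(f"total = {total_proposed_nf} nF, reduction = {reduction:.1%}")
```

By the same formula, a reduction of more than 30% corresponds to a total capacitance below 6.3 nF per stratum.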

It is also observed during the analyses that, after inserting voltage regulators, the currents flowing through the TSVs are smoother. The resulting Ldi/dt noise is much smaller (less than 50mV, depending on the size of the DECAP) than in the original architecture, since most of the inductance comes from off-chip components. Meanwhile, since the TSVs in the proposed decoupled power structure do not supply circuits directly, the stability constraint can be relaxed. Therefore, the off-chip passive resources used to stabilize the global power supply can be greatly reduced as well.

Simple passive decoupling capacitors are used in the previous analyses. Many supply voltage stabilization techniques in the literature can also be used. For example, an active decoupling capacitor [3.8] and active substrate noise cancellers can be adopted instead of simple capacitors to achieve better power noise suppression in both the global and local power networks.

A block diagram of the proposed hierarchical power delivery system, after employing active switching DECAPs and substrate noise cancellers in a 4-layer stacking structure, is shown in Fig. 3.4. This system contains four noise reduction techniques: an active switching DECAP to reduce the resonant noise caused by the package, an area-efficient power TSV optimization method for appropriate TSV planning, linear voltage regulators to provide a clean local supply voltage, and substrate noise cancellers to suppress coupling noise through the TSVs and the shared substrate. Together, these supply stabilization techniques provide better power noise suppression in both the global and local power networks.



Fig. 3.4. Proposed hierarchical power delivery system for TSV 3D integration.

#### 3.2 Power Design Flow of Hierarchical Power Delivery System

Based on this hierarchical power delivery system, a corresponding power design methodology is developed, as shown in Fig. 3.5. The power delivery networks are constructed step by step with the aforementioned noise reduction techniques. The overall flow is divided into four parts: layer order planning and power domain partition, current density analyses, intra-layer power network design and global TSV planning, and substrate noise cancellation. Detailed descriptions are presented in the following subsections.



Fig. 3.5. Power design flow of hierarchical power delivery system.

#### 3.2.1 Layer Order Planning and Power Domain Partition

The stacking order of the chips is decided in the first step. In practice, the die with the highest heat generation, i.e., the highest power, is placed nearest to the heat sink. The other dies are then organized between the top stratum and the package substrate according to their connections and functional relations. Such an arrangement is necessary to comply with the thermal requirement of the chip. Moreover, neither the width nor the height of a die on top can be larger than that of the underlying die. Therefore, the shapes of the dies cannot change arbitrarily.

According to the operating supply voltages of the different circuit parts, multiple power domains can be defined. Under this power domain partition, the logical connections are naturally preserved within a power domain. Each power domain is decoupled and powered by a dedicated voltage regulation module with the requested voltage. Therefore, the power delivery networks can be easily optimized.

#### 3.2.2 Current Density Analyses

Stacking multiple chips in a TSV 3D integration increases the total current within the same footprint as a 2D chip. The larger current density and the narrower wire width lead to a severe *electromigration* (EM) problem. Therefore, both the local and the global maximum current density should be checked against the EM limit and the maximum driving ability of the interconnection. If the local density is invalid, the power domains on the local power networks are repartitioned to adjust the current density. On the other hand, if the global density is larger than the maximum current density, the driving ability of the interconnection is overloaded. To satisfy this requirement, the stacking order is reorganized to reduce the current variation and the wires are resized to reduce the global current density. After the current analyses are completed, the intra-layer power network design and the vertical global power network planning proceed separately.
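The decision logic of this step can be sketched in Python as follows. The function names, the returned action labels, the wire dimensions, and the limit of 1e10 A/m² (1 MA/cm²) are all hypothetical placeholders; the actual limits depend on the process and are not specified here.

```python
def current_density(current_a, width_m, thickness_m):
    """Average current density (A/m^2) in a wire of rectangular cross section."""
    return current_a / (width_m * thickness_m)


def next_step(local_j, global_j, j_max):
    """Pick the follow-up action of the current density analysis stage."""
    if local_j > j_max:
        return "repartition_power_domains"
    if global_j > j_max:
        return "reorder_stack_and_resize_wires"
    return "proceed_to_network_design"


J_MAX = 1e10  # assumed EM limit in A/m^2, for illustration only

# Assumed example wires: a local strap and a global trunk.
local_j = current_density(0.03, 2e-6, 1e-6)
global_j = current_density(0.5, 100e-6, 2e-6)

print(next_step(local_j, global_j, J_MAX))
```

In this assumed example the local strap exceeds the limit, so the sketch asks for the power domains to be repartitioned before the global check matters.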


#### 3.2.3 Intra-Layer Power Network Design and Vertical Global Power Network Planning

To ensure that the supply voltage within the whole power domain is tolerable, the power grid is constructed with appropriate power line widths and pitches. A dedicated voltage regulation module then powers the power grid. The type of voltage regulation module and the amount of DECAP are chosen according to the requested voltage and current loads.

On the other hand, the power supply is transmitted between strata by TSV connections. To reduce the voltage drop on the TSVs, the size and count of the power TSVs should be considered carefully. Therefore, an area-efficient power TSV planning method optimizes the trade-off between the area overhead and the voltage drop on the power TSVs. Furthermore, the main supply impedance response is dominated by both the TSVs and the package. In view of this, active switching DECAPs are used to suppress the global power noise.

Once the intra-layer power network design and global power network planning are complete, the TSVs and voltage regulation modules are connected to convert the power supply from the off-chip voltage source to the on-chip power delivery network.

#### 3.2.4 Substrate Noise Cancellation

In addition to the simultaneous switching noise from the package, large coupling noises, such as ground bounce and substrate noise, are also coupled through the shared substrate or the TSVs. Thus, substrate noise suppression using active substrate decouplers (ASDs) is presented for the power integrity of TSV 3D-ICs.

In this stage, ASDs should be placed near the sensitive analog/RF circuits to obtain the maximum substrate noise reduction. To further reduce the noise coupled between strata through the TSV connections, the ASDs should be distributed along the noise propagation path near the power TSVs.

Finally, if there are no voltage violations within any power domain, the whole procedure ends with a reliable power delivery network. Otherwise, if there are voltage violations in any power domain, the power delivery network should be redesigned from the initial stage.
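The iterative structure of the flow described in Section 3.2 can be summarized as a loop over the four stages that restarts from the first stage whenever a voltage violation remains. The Python skeleton below is only a structural sketch: the stage functions are empty placeholders, and `run_power_design_flow`, `make_stage`, `max_passes`, and the toy violation check are hypothetical names that do not correspond to any tool used in this work.

```python
def run_power_design_flow(stages, has_voltage_violation, max_passes=5):
    """Run all stages in order; restart from the first stage on a violation."""
    for attempt in range(1, max_passes + 1):
        design = {"attempt": attempt, "log": []}
        for stage in stages:
            design = stage(design)
        if not has_voltage_violation(design):
            return design
    raise RuntimeError("no violation-free design found within max_passes")


def make_stage(name):
    """Build a placeholder stage that only records that it ran."""
    def stage(design):
        design["log"].append(name)
        return design
    return stage


STAGE_NAMES = [
    "layer_order_and_power_domain_partition",
    "current_density_analyses",
    "intra_layer_network_and_global_tsv_planning",
    "substrate_noise_cancellation",
]

# Toy check: pretend the first pass violates and the second pass is clean.
result = run_power_design_flow(
    [make_stage(name) for name in STAGE_NAMES],
    has_voltage_violation=lambda design: design["attempt"] < 2,
)
print(result["attempt"], result["log"])
```

A real flow would replace each placeholder with the corresponding analysis or planning step and the toy check with the voltage verification of every power domain.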

#### 3.3 Area-Efficient Power TSV Planning

Increasing the size of the power TSVs reduces their parasitic impedance and therefore mitigates the voltage noise transmitted on the TSVs. However, because a power TSV is much larger than a transistor (about 100X), using an excessive number of TSVs reduces the routable chip area. Therefore, an area-efficient power TSV planning method is proposed to minimize the area occupied by power TSVs under a tolerable voltage drop.

#### 3.3.1 Modeling for TSV 3D Integration

#### 3.3.1.1 Physical and Electrical Modeling of TSV

Roshan et al. [3.9] present closed-form equations to extract the resistive, capacitive, and inductive components, including coupling terms, as a function of the TSV geometry. This was achieved by simulating all structures in a 3-D/2-D quasi-static electromagnetic-field solver, which is specifically used for parasitic extraction of electronic components. Based on the closed-form equations, the TSV electrical structure is modeled as a regular matrix using the extracted parameters from [3.9], as shown in Fig. 3.6.



Fig. 3.6. Electrical model of TSV considering the coupling terms [3.9]

The TSV connection mechanism in a 3D structure follows [3.10]. Fig. 3.7(a) illustrates an example in which the power supply is transmitted between two adjacent chips through a 3X3 TSV matrix. Each node on a chip is connected to its n neighboring nodes on the same chip through supply line segments and to the corresponding node on the other chip through a TSV. Fig. 3.7(b) shows TSVs interconnecting three chips; each TSV includes a bump and pad pair at each side, which is not shown for the sake of simplicity.



Fig. 3.7. (a) Sample 3X3 power grid of orthogonal connection. (b) TSV physical model. [3.10]

#### 3.3.1.2 Closed-Form Expression of TSV

In [3.9], closed-form expressions of the resistance, inductance, and capacitance of a 3-D via show good agreement with full-wave electromagnetic simulation. Errors of less than 6% between the closed-form models and simulation have been demonstrated for both the resistance and inductance of a 3-D via, and errors of less than 8% have been reported for the capacitance. The use of these closed-form expressions rather than full-wave electromagnetic simulations to estimate the 3-D via impedances enhances the system-level design process. These models of the via impedance are accurate over a wide range of diameters, lengths, dielectric thicknesses, and spacings.

In the DC case, a closed-form expression of the 3-D via resistance is given in (3.1). The DC resistance depends only on the length L, radius R, and conductivity $\sigma$. Expressions for the DC partial self-inductance ($L_{11}$) and mutual inductance ($L_{12}$) are provided in (3.2)-(3.3), where H is the via height, D is the via diameter, and P is the pitch. The self-inductance model includes a fitting parameter $\alpha$ to adjust for inaccuracies in the Rosa expressions.

$$R_{DC} = \frac{1}{\sigma} \frac{L}{\pi R^2} \tag{3.1}$$

$$L_{11} = \alpha \frac{\mu_0}{2\pi} \left[ \ln \left( \frac{H + \sqrt{H^2 + \left(\frac{D}{2}\right)^2}}{\frac{D}{2}} \right) H + \frac{D}{2} - \sqrt{H^2 + \left(\frac{D}{2}\right)^2} + \frac{H}{4} \right] \tag{3.2}$$

where $\alpha = 1 - e^{\frac{-4.3 \, H}{D}}$

$$L_{12} = \frac{\mu_0}{2\pi} \left[ \ln \left( \frac{H + \sqrt{H^2 + P^2}}{P} \right) H + P - \sqrt{H^2 + P^2} \right] \tag{3.3}$$

Equation (3.4) accounts for both the formation of a depletion region surrounding a p-type bulk substrate and the termination of the electrical field lines on a ground plane below the 3-D via. The termination of the field lines from the 3-D via to the ground plane forms a capacitance to the on-chip metal interconnect,

$$C = \alpha \beta \frac{\varepsilon_{SiO_2}}{t_{diel} + \frac{\varepsilon_{SiO_2}}{\varepsilon_{Si}} X_d T_p} 2\pi RH \tag{3.4}$$

Note that (3.4) depends on the depletion region depth $X_dT_p$ in doped p-type silicon (the acceptor concentration $N_A$ is $10^{21}$ m<sup>-3</sup> in this case). The depletion region depth, in turn, depends on the p-type silicon work function $\phi_{fp}$. The intrinsic carrier concentration $n_i$ is $1.5 \times 10^{16}$ m<sup>-3</sup>, and the silicon permittivity $\epsilon_{Si}$ is $11.7 \times (8.85 \times 10^{-12})$ F/m. The thermal voltage $V_{th} = kT/q$ at T = 300 K is 25.9 mV, where q is the electron charge $(1.6 \times 10^{-19} \text{ C})$ and k is the Boltzmann constant, $1.38 \times 10^{-23}$ J/K.

$$X_{\rm d}T_{\rm p} = \sqrt{\frac{4\epsilon_{\rm Si}\phi_{\rm fp}}{{\rm qN_A}}} \tag{3.5}$$

$$\phi_{f_p} = V_{th} \ln \left( \frac{N_A}{n_i} \right) \tag{3.6}$$
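As a quick numerical illustration, the sketch below evaluates (3.6) and (3.5) with the constants quoted above. The function and constant names are chosen here for illustration only and do not come from [3.9].

```python
import math

# Constants quoted in the text (SI units).
Q = 1.6e-19               # electron charge, C
EPS_SI = 11.7 * 8.85e-12  # silicon permittivity, F/m
V_TH = 25.9e-3            # thermal voltage kT/q at 300 K, V
N_A = 1e21                # acceptor concentration, m^-3
N_I = 1.5e16              # intrinsic carrier concentration, m^-3


def fermi_potential(n_a=N_A, n_i=N_I, v_th=V_TH):
    """phi_fp from (3.6), in volts."""
    return v_th * math.log(n_a / n_i)


def depletion_depth(n_a=N_A):
    """Depletion region depth from (3.5), in metres."""
    return math.sqrt(4 * EPS_SI * fermi_potential(n_a) / (Q * n_a))


phi_fp = fermi_potential()
x_d = depletion_depth()
print(f"phi_fp = {phi_fp:.4f} V, depletion depth = {x_d * 1e6:.3f} um")
```

With these constants the work function term comes out at roughly 0.29 V and the depletion depth at roughly 0.86 µm, which is the quantity that enters the denominator of (3.4).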

The fitting parameters $\alpha$ and $\beta$ are used to adjust the capacitance for the two physical factors. The $\beta$ parameter adjusts the capacitance of a 3-D via since a smaller component of the capacitance is contributed by the portion of the 3-D via farthest from the ground plane. A decrease in the growth of the capacitance therefore occurs as the aspect ratio increases. The $\alpha$ term is used to adjust the capacitance based on the distance to the ground plane $S_{gnd}$. As $S_{gnd}$ increases, the capacitance of the 3-D via decreases. The $\alpha$ and $\beta$ terms are

$$\alpha = \left(-0.0351 \frac{H}{D} + 1.5701\right) S_{\text{gnd}_{\mu m}}^{0.0111 \frac{H}{D} - 0.1997} \tag{3.7}$$

$$\beta = 5.8934 D_{\mu m}^{-0.553} \left(\frac{H}{D}\right)^{-(0.0031D_{\mu m} + 0.43)} \tag{3.8}$$

The coupling capacitance between two 3-D vias over a ground plane is expressed as

$$C_{c} = 0.4\alpha\beta\gamma \frac{\epsilon_{Si}}{S}\pi DH \tag{3.9}$$

The 0.4 multiplier in (3.9) adjusts the sheet capacitance between two TSVs when assuming that all electric field lines originating from half of the surface of one TSV terminate on the other TSV. Each fitting parameter ($\alpha$, $\beta$, and $\gamma$) is used to adjust the coupling capacitance for a specific physical factor. The pitch P is the sum of the spacing between the two vias and a single TSV diameter (P = S + D).

$$\alpha = 0.225 \ln \left( 0.97 \frac{H}{D} \right) + 0.53 \tag{3.10}$$

$$\beta = 0.5711 \left(\frac{H}{D}\right)^{-0.988} \ln\left(S_{gnd_{\mu m}}\right) + \left(0.85 - e^{-\frac{H}{D} + 1.3}\right) \tag{3.11}$$

$$\gamma = 1, \text{ if } \frac{S}{P} \le 1 \tag{3.12}$$

#### 3.3.2 Power Grids Noise Estimation

In [3.11], an equivalent circuit model to investigate the noise behavior in terms of the transition time is shown in Fig. 3.8, where R, L represent the power and ground impedances, C is the decoupling capacitor. The parasitic resistance and inductance of the power and ground networks are assumed to be equal due to the symmetry of these two TSV bundles. Note that this model does not consider the feedback effect of the power noise since in this model the current is independent of this noise.



Fig. 3.8. Equivalent circuit model of power grid [3.11].

The switching current $I_{\rm swi}$ is supplied by the decoupling capacitor, $I_C(t)$, and by the power supply, $I_L(t)$. Therefore, the total current can be obtained from the following equations.

$$I_{C}(t) = -C \frac{\partial V_{C}}{\partial t} \tag{3.13}$$

$$I_{L}(t) = \frac{1}{L} \int_{0}^{t} V_{L}(t) \, dt \tag{3.14}$$

where $V_C(t)$ and $V_L(t)$ are the voltages across the capacitor C and the inductor L, respectively.

$$V_{C}(t) = V_{dd} - 2\Delta n(t) \tag{3.15}$$

$$V_{L}(t) = \Delta n(t) - R\,I_{L}(t) \tag{3.16}$$

A ramp function is assumed for the noise  $\Delta n(t)$  as

$$\Delta n(t) = \frac{V_{\text{noise}}}{t_{r,v}}t \tag{3.17}$$

where  $V_{noise}$  is the peak noise voltage and  $t_{r,v}$  is the transition time of the noise spike.

Substituting eq. (3.17) into eq. (3.15) gives

$$V_{C}(t) = V_{dd} - 2\frac{V_{noise}}{t_{r,v}}t \tag{3.18}$$

Substituting eq. (3.18) into eq. (3.13) gives

$$I_{C}(t) = -C\frac{\partial}{\partial t}\left(V_{dd} - 2\frac{V_{noise}}{t_{r,v}}t\right) = \frac{2CV_{noise}}{t_{r,v}} \tag{3.19}$$

Similarly, substituting eq. (3.17) and eq. (3.16) into eq. (3.14) and differentiating with respect to time, we obtain

$$\frac{\partial I_L(t)}{\partial t} = \frac{V_{\text{noise}}\,t}{t_{r,v}L} - \frac{R}{L}I_L(t) \tag{3.20}$$

By solving eq. (3.20) with the initial conditions $I_C(0) = 0$ and $I_L(0) = 0$, we obtain

$$I_{L}(t) = V_{\text{noise}} \left[ \frac{t}{t_{r,v}R} - \frac{L}{t_{r,v}R^{2}} \left(1 - e^{\frac{-tR}{L}}\right) \right] \tag{3.21}$$

Assuming that the peak noise occurs when the current reaches its maximum at $t = t_r$ (writing $t_r$ for the transition time $t_{r,v}$), we can derive the $V_{\rm noise}$ equation as

$$I_{swi} = I_{C}(t_{r}) + I_{L}(t_{r}) = V_{noise} \left\{ \frac{2C}{t_{r}} + \left[ \frac{t_{r}}{t_{r}R} - \frac{L}{t_{r}R^{2}} \left(1 - e^{\frac{-t_{r}R}{L}}\right) \right] \right\} \tag{3.22}$$

$$V_{\text{noise}} = \frac{I_{\text{swi}} t_{\text{r}} R^{2}}{2CR^{2} + t_{\text{r}} R - L\left(1 - e^{\frac{-t_{\text{r}} R}{L}}\right)} \tag{3.23}$$

We found that $\frac{t_r R}{L} < 1$ and $2CR^2 \ll t_r R$. Hence, the $2CR^2$ term is dropped and the exponential in equation (3.23) is expanded in a third-order Taylor series, which gives equation (3.24)

$$V_{\text{noise}} = \frac{I_{\text{swi}} t_{\text{r}} R^{2}}{t_{\text{r}} R - L \left[ 1 - \left( 1 + \frac{\left( -\frac{t_{\text{r}} R}{L} \right)}{1!} + \frac{\left( -\frac{t_{\text{r}} R}{L} \right)^{2}}{2!} + \frac{\left( -\frac{t_{\text{r}} R}{L} \right)^{3}}{3!} \right) \right]} \tag{3.24}$$

Rewriting equation (3.24), we obtain the simplified equation (3.25)

$$V_{\text{noise}} = \frac{6I_{\text{swi}}L^2}{3t_rL - t_r^2R} \tag{3.25}$$
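The simplification from (3.23) to (3.25) can be checked numerically. The following is a minimal sketch that evaluates both expressions at one operating point; the component values are hypothetical and were picked only so that $t_r R / L$ is well below 1 and $2CR^2 \ll t_r R$.

```python
import math


def v_noise_full(i_swi, t_r, r, l, c):
    """Peak noise from (3.23), keeping the exponential and the 2CR^2 term."""
    x = t_r * r / l
    # -expm1(-x) equals 1 - exp(-x) but is more accurate for small x.
    denom = 2 * c * r**2 + t_r * r - l * (-math.expm1(-x))
    return i_swi * t_r * r**2 / denom


def v_noise_simplified(i_swi, t_r, r, l):
    """Peak noise from the simplified closed form (3.25)."""
    return 6 * i_swi * l**2 / (3 * t_r * l - t_r**2 * r)


# Hypothetical operating point: 0.25 A, 1 ns, 5 mOhm, 24 pH, 10 pF.
I_SWI, T_R, R, L, C = 0.25, 1e-9, 5e-3, 24e-12, 10e-12

full = v_noise_full(I_SWI, T_R, R, L, C)
simple = v_noise_simplified(I_SWI, T_R, R, L)
rel_err = abs(simple - full) / full
print(f"(3.23): {full * 1e3:.3f} mV, (3.25): {simple * 1e3:.3f} mV, "
      f"relative difference {rel_err:.2%}")
```

At this operating point the two expressions agree to within about one percent. The agreement loosens as $t_r R / L$ approaches 1, because the truncated Taylor series then departs further from the exponential.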

## 3.3.3 Power TSV Structure

To minimize the area occupied by the power TSVs, an area-efficient power TSV planning is proposed. Compared to Lin's work [3.7], only one TSV pair is added at a time in our work. For simplicity, we assume that the added power TSV pair is placed in parallel with the existing TSV matrix. Therefore, a long strip power TSV structure is chosen for TSV planning, as shown in Fig. 3.9.



Fig. 3.9. A long strip power TSV structure.

#### 3.3.3.1 Area Function of Power TSV Structure

Due to the rectangular structure, the area occupied by power TSVs is a combination of the diameter, the spacing, and the pair count of TSVs. This area function can be expressed as follows

$$A(N, D) = [D + S + D] \times [N \cdot D + (N - 1) \cdot S] \tag{3.26}$$

where D is the diameter of the TSV, S is the spacing between TSVs, and N is the total number of power TSV pairs.

Here, we assume that the spacing between TSVs is equal to the TSV diameter, which means that the TSV pitch is twice the diameter. Under this assumption, equation (3.26) can be simplified as equation (3.28)

$$A(N,D)|_{S=D} = 3D^2 \times [2N-1] \tag{3.28}$$

#### 3.3.3.2 Parasitic Impedance Computation

Since the equivalent inductance and resistance of the power network significantly affect the power noise, a more accurate estimate of the parasitic impedance makes the estimated noise behavior more realistic. Therefore, the equivalent inductance and resistance are derived for this long strip power TSV structure. For each TSV, the effective inductance under this interlaced TSV structure is

$$L = L_0 + 2M_{S=\sqrt{2}D} - 3M_{S=D} \tag{3.29}$$

where $M_{S=\sqrt{2}D}$ represents the in-phase mutual inductance at a spacing of $\sqrt{2}D$, and $M_{S=D}$ represents the anti-phase mutual inductance at a spacing of D.

Substituting eq. (3.2) and (3.3) into eq. (3.29), with the pitch P = S + D for each neighbor, the effective inductance can be expressed as

$$\begin{split} L &= \alpha \frac{\mu_0}{2\pi} \Bigg[ \ln \Bigg( \frac{H + \sqrt{H^2 + \left(\frac{D}{2}\right)^2}}{\frac{D}{2}} \Bigg) H + \frac{D}{2} - \sqrt{H^2 + \left(\frac{D}{2}\right)^2} + \frac{H}{4} \Bigg] \\ &+ 2 \frac{\mu_0}{2\pi} \Bigg[ \ln \Bigg( \frac{H + \sqrt{H^2 + 5.8D^2}}{2.4D} \Bigg) H + 2.4D - \sqrt{H^2 + 5.8D^2} \Bigg] \\ &- 3 \frac{\mu_0}{2\pi} \Bigg[ \ln \Bigg( \frac{H + \sqrt{H^2 + 4D^2}}{2D} \Bigg) H + 2D - \sqrt{H^2 + 4D^2} \Bigg] \end{split} \tag{3.30}$$

Note that only the major mutual inductances of the surrounding TSVs are considered here. Additionally, if we only consider TSV diameters from 10µm to 50µm at a fixed TSV height for copper, or from 0.1µm to 2µm at a fixed TSV height for tungsten, the effective inductance behaves as a linear function of the TSV diameter. Therefore, curve fitting is used to convert the complex equation (3.30) to the linear function (3.31) for different feasible TSV heights. The detailed fitting parameters are shown in Table 3.1.

$$L = (aD_{\mu m} + b) \times 10^{-12} \tag{3.31}$$

Table 3.1. Curve fitting parameters of effective inductance

| Parameter | H=50µm | H=60µm | H=70µm | H=80µm | H=90µm | H=100µm |
|-----------|--------|--------|--------|--------|--------|---------|
| a         | -0.099 | -0.103 | -0.109 | -0.115 | -0.118 | -0.122  |
| b         | 14.05  | 14.63  | 17.23  | 19.84  | 22.44  | 24.99   |
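To illustrate how the linear fit relates to the closed-form model, the sketch below evaluates (3.29) using (3.2) and (3.3) and compares it with (3.31) for H = 100µm. It assumes that each mutual term uses the center-to-center pitch P = S + D, so the in-phase neighbor sits at $(\sqrt{2}+1)D$ and the anti-phase neighbor at 2D. The function names are illustrative.

```python
import math

MU0_OVER_2PI = 2e-7  # mu_0 / (2*pi) in H/m


def self_inductance(h, d):
    """Partial self-inductance L11 from (3.2); h and d in metres."""
    alpha = 1 - math.exp(-4.3 * h / d)
    r = d / 2
    root = math.sqrt(h**2 + r**2)
    return alpha * MU0_OVER_2PI * (math.log((h + root) / r) * h + r - root + h / 4)


def mutual_inductance(h, p):
    """Partial mutual inductance L12 from (3.3); p is the pitch in metres."""
    root = math.sqrt(h**2 + p**2)
    return MU0_OVER_2PI * (math.log((h + root) / p) * h + p - root)


def effective_inductance(h, d):
    """Effective inductance (3.29), assuming pitch P = S + D for each neighbour."""
    m_in_phase = mutual_inductance(h, (math.sqrt(2) + 1) * d)  # S = sqrt(2) * D
    m_anti_phase = mutual_inductance(h, 2 * d)                 # S = D
    return self_inductance(h, d) + 2 * m_in_phase - 3 * m_anti_phase


def fitted_inductance(d_um, a=-0.122, b=24.99):
    """Linear fit (3.31) with the H = 100 um column of Table 3.1, in henries."""
    return (a * d_um + b) * 1e-12


H = 100e-6
for d_um in (10, 20, 30, 40, 50):
    exact = effective_inductance(H, d_um * 1e-6)
    fit = fitted_inductance(d_um)
    print(f"D = {d_um:2d} um: closed form {exact * 1e12:6.2f} pH, fit {fit * 1e12:6.2f} pH")
```

Under these assumptions the closed form and the fitted line stay within a few percent of each other over the 10µm to 50µm copper range, which supports treating the effective inductance as linear in D there.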

On the other hand, the effective resistance derived from equation (3.1) is exactly inversely proportional to the square of the TSV diameter. Unlike the inductance, the effective resistance strongly depends on the filling material of the TSV. Therefore, equation (3.1) is rewritten as the function (3.32) for different filling materials and TSV heights. The detailed fitting parameters are shown in Table 3.2.

$$R = \frac{k}{D_{\mu m}^2} \times 10^{-3} \tag{3.32}$$

Table 3.2. Curve fitting parameter of effective resistance

| Parameter "k"       | H=50µm | H=60µm | H=70µm | H=80µm | H=90µm | H=100µm |
|---------------------|--------|--------|--------|--------|--------|---------|
| Filling by Tungsten | 3366.4 | 4039.7 | 4713   | 5386.3 | 6059.6 | 6732.9  |
| Filling by Copper   | 1018.5 | 1222.3 | 1426.1 | 1629.8 | 1833.5 | 2037.2  |
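Because (3.1) is linear in the via length, the parameter k in Table 3.2 should scale linearly with the TSV height. The sketch below checks this on the tabulated values and evaluates (3.32) for one geometry. The dictionary layout and function names are illustrative choices.

```python
# Fitting parameter k from Table 3.2 (R in milliohms when D is in micrometres).
K_TABLE = {
    "tungsten": {50: 3366.4, 60: 4039.7, 70: 4713.0, 80: 5386.3, 90: 6059.6, 100: 6732.9},
    "copper": {50: 1018.5, 60: 1222.3, 70: 1426.1, 80: 1629.8, 90: 1833.5, 100: 2037.2},
}


def tsv_resistance(material, h_um, d_um):
    """Effective resistance (3.32) in ohms."""
    return K_TABLE[material][h_um] / d_um**2 * 1e-3


def k_per_height(material):
    """k / H for every tabulated height; (3.1) predicts a constant."""
    return [k / h for h, k in K_TABLE[material].items()]


for material in K_TABLE:
    ratios = k_per_height(material)
    spread = (max(ratios) - min(ratios)) / min(ratios)
    print(f"{material}: k/H = {ratios[0]:.3f} per um, spread {spread:.3%}")

r_example = tsv_resistance("copper", 100, 10)
print(f"Copper, H = 100 um, D = 10 um: {r_example * 1e3:.2f} mOhm")
```

The ratio k/H is constant to within rounding for both materials, so k for an intermediate height can reasonably be interpolated linearly from the table.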

## 3.3.4 Power Noise Estimation of TSV 3D Integrations

The effective inductance and resistance of power TSVs under the interlaced power TSV structure are derived in section 3.3.3.2. By substituting the equations (3.31) and (3.32) into equation (3.25), we can obtain the power noise in an intra layer as equation (3.33).

$$V_{\text{noise}} = 2 \times I_{\text{swi}} \times \frac{(aD_{\mu m} + b)^2}{t_{r(ns)}(aD_{\mu m} + b) - t_{r(ns)}^2 \frac{k}{3D_{\mu m}^2}} \tag{3.33}$$

To expand the power noise estimation to TSV 3D integrations, the equation (3.33) is modified with a T multiplier, where T means the number of stacking chips in a 3D integration.

$$V_{\text{noise}} = 2 \times I_{\text{swi}} \times T \times \frac{(aD_{\mu m} + b)^{2}}{t_{r(ns)}(aD_{\mu m} + b) - t_{r(ns)}^{2} \frac{k}{3D_{\mu m}^{2}}} \tag{3.34}$$



Fig. 3.10. Power noise estimation of multi-layer TSV structure.

An example of power noise estimation is shown in Fig. 3.10. $I_{swi}$ represents the current flowing in each TSV pair, and $I_{supply}$ is the total current of the multi-layer structure. Therefore, $I_{swi}$ can be expressed in terms of the number of TSV pairs and the total current of the structure ($I_{swi} = I_{supply} / N$). Finally, the power noise in a T-layer 3D structure is estimated as equation (3.35)

$$V_{noise} = 2 \times \frac{I_{supply}}{N} \times T \times \frac{(aD_{\mu m} + b)^{2}}{t_{r(ns)}(aD_{\mu m} + b) - t_{r(ns)}^{2} \frac{k}{3D_{\mu m}^{2}}} \tag{3.35}$$

## 3.3.5 Design Methodology for Area-Efficient TSV Planning

Power supply noise in a 3D structure can be estimated by equation (3.35). In that equation, the TSV filling material, total number of stacking layers (T), transition time of the circuitry ($t_r$), and total current load ($I_{supply}$) are known once the functionality of the 3D structure has been decided. Therefore, the power supply noise is a function of $D_{\mu m}$ (diameter of TSV) and N (number of TSV pairs).

According to (3.35), $V_{noise}$ decreases as the TSV diameter or the number of TSV pairs increases. However, for a fixed number of TSV pairs, $V_{noise}$ saturates once the diameter exceeds a threshold size ($D_{limit}$). Increasing the TSV diameter beyond this point yields little further noise reduction while incurring a larger area penalty. For this reason, the threshold diameter $D_{limit}$ is found by solving the following equation

$$\frac{\partial V_{\text{noise}}(mV)}{\partial D_{\mu m}} = -1 \tag{3.36}$$

The results are shown in Table 3.3. We can observe that the threshold size of the TSV diameter is related more to the filling material than to the height of the TSV. Furthermore, the magnitude of the current flowing in a TSV pair also affects the threshold size: the more current flows in a TSV pair, the larger the threshold size.

Table 3.3. Threshold size of TSV diameter

| TSV height | Filling by Copper                           | Filling by Tungsten                          |
|------------|---------------------------------------------|----------------------------------------------|
| H=50µm     | $D_{limit} = 5.1 \times I_{swi} + 10.2$     | $D_{limit} = 7.5 \times I_{swi} + 17.8$      |
| H=60µm     | $D_{limit} = 5.5 \times I_{swi} + 10.3$     | $D_{limit} = 8.0 \times I_{swi} + 17.9$      |
| H=70µm     | $D_{limit} = 5.9 \times I_{swi} + 10.4$     | $D_{limit} = 8.6 \times I_{swi} + 17.9$      |
| H=80µm     | $D_{limit} = 6.3 \times I_{swi} + 10.5$     | $D_{limit} = 9.2 \times I_{swi} + 17.9$      |
| H=90µm     | $D_{limit} = 6.7 \times I_{swi} + 10.6$     | $D_{limit} = 9.7 \times I_{swi} + 18.0$      |
| H=100µm    | $D_{limit} = 7.1 \times I_{swi} + 10.7$     | $D_{limit} = 10.3 \times I_{swi} + 18.1$     |

Hence, a methodology of area-efficient power TSV planning is proposed by combining the above equations. The design flowchart consists of three stages, as shown in Fig. 3.11. The number (N) and diameter (D) of the power TSVs are initialized at the initial stage. According to the process technologies, the minimum size of a power TSV is set to 10µm for copper filling material and 0.1µm for tungsten filling material.

In the first stage, we attempt to find the first (N, D) combination that keeps the supply noise below the tolerable voltage drop. If enlarging the TSV diameter cannot achieve the desired result, the TSV number is increased and the iteration restarts with the adjusted TSV number. Once the goal is achieved, the number and diameter of the TSVs are recorded, and the area occupied by this (N, D) combination is temporarily regarded as the optimized solution.

However, the solution found in the first stage is not guaranteed to have the minimum area. Consider two feasible combinations (N1, D1) and (N2, D2), where N1 < N2 and D1 > D2. If the area of (N2, D2) is smaller than that of (N1, D1), the area occupied by power TSVs can be reduced by using more TSV pairs with a smaller diameter. For this reason, the second stage searches for other feasible (N, D) combinations beyond the first solution to minimize the power TSV area.



Fig. 3.11. Flowchart of area-efficient power TSV planning

Similar to the first stage, the second stage iteratively adjusts the (N, D) parameters to comply with the tolerable voltage drop. If the area occupied by the new (N, D) combination is smaller than the current one, the new combination replaces it as the optimized solution. Otherwise, the flow ends when the area occupied by the new (N, D) combination is larger than that of the current solution.

An example of area-efficient power TSV planning is shown in Fig. 3.12. The total number of stacking layers, total current load, transition time of the current load, and tolerable voltage drop are assumed to be 4, 1A, 1ns, and 50mV, respectively. The filling material of the TSVs is copper and the height is 100µm. For such a 3D structure, the power noise can be estimated from these parameters according to equation (3.35).



Fig. 3.12. An example of area-efficient power TSV planning

Following the flowchart step by step, we obtain the first (N, D) combination that complies with the tolerable voltage drop. Here, the (4, 19) combination is recorded as the temporary optimized solution with an area of 7581µm². After the first stage, further analysis begins in the second stage. When the TSV number increases to five pairs, (5, 11) is another eligible solution with an area of 3267µm², so the optimized solution is updated to (5, 11). The next iteration then restarts with six TSV pairs. However, the area occupied by six power TSV pairs is larger than that of the (5, 11) combination even at the smallest TSV diameter. Therefore, the design flow ends, and the (5, 11) combination is the final solution to comply with 50mV tolerant voltage drop with
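The two-stage search can be sketched in a few lines of code. The version below is an illustrative reading of the flow, not the exact flowchart of Fig. 3.11: it assumes integer diameters in micrometres, an upper diameter bound of 50µm (the copper range quoted for the linear inductance fit), and a hypothetical cap `n_max` on the number of pairs. It uses (3.35) for the noise and (3.28) for the area.

```python
def v_noise_mv(n, d_um, t_layers, i_supply, t_r_ns, a, b, k):
    """Supply noise in mV from (3.35); D in um, t_r in ns, current in A."""
    l_ph = a * d_um + b
    denom = t_r_ns * l_ph - t_r_ns**2 * k / (3 * d_um**2)
    return 2 * (i_supply / n) * t_layers * l_ph**2 / denom


def area_um2(n, d_um):
    """Area of the long strip structure from (3.28), with S = D."""
    return 3 * d_um**2 * (2 * n - 1)


def smallest_diameter(n, v_max_mv, d_range, **params):
    """Smallest diameter in d_range that meets the noise budget, else None."""
    for d_um in d_range:
        if v_noise_mv(n, d_um, **params) <= v_max_mv:
            return d_um
    return None


def plan_power_tsv(v_max_mv, d_range, n_max=64, **params):
    """Return (first_solution, best_solution) as (N, D, area) tuples."""
    first = best = None
    for n in range(1, n_max + 1):
        d_um = smallest_diameter(n, v_max_mv, d_range, **params)
        if d_um is None:
            continue                  # stage 1: enlarging D failed, add a pair
        candidate = (n, d_um, area_um2(n, d_um))
        if first is None:
            first = best = candidate  # stage 1 result, provisional optimum
        elif candidate[2] < best[2]:
            best = candidate          # stage 2: a smaller area was found
        else:
            break                     # stage 2: area grew, stop searching
    return first, best


first, best = plan_power_tsv(
    v_max_mv=50, d_range=range(10, 51),
    t_layers=4, i_supply=1.0, t_r_ns=1.0,
    a=-0.122, b=24.99, k=2037.2,      # copper, H = 100 um (Tables 3.1 and 3.2)
)
print("first feasible:", first)
print("final solution:", best)
```

With the example parameters (4 layers, 1A, 1ns, 50mV, copper, H = 100µm) this sketch returns (4, 19) with 7581µm² after the first stage and (5, 11) with 3267µm² as the final solution, matching the values quoted above.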

# 3.4 Active Decoupling Capacitor for Supply Noise Regulation of TSV 3D Integration

Fig. 3.13 shows the power integrity for the TSV 3D integration. Heavy current density of TSVs and packages exists in the power network and further increases the power supply noise. Moreover, the supply impedance response is dominated by both the TSVs and packages. In view of these, noise suppression will become one of the critical design problems of TSV 3D integration.



Fig. 3.13. Power Integrity for TSV 3D Integration.

To suppress the power noise, decoupling capacitors (DECAPs) are widely used. DECAPs act as a local reservoir of charge, which is released when the current load varies. Since the inductance of packages scales slowly, DECAPs significantly affect the design of the power/ground (P/G) networks in high performance ICs and TSV 3D integration. At higher frequencies, DECAPs are distributed on chip to effectively manage the power supply noise. However, the use of on-chip passive DECAPs is limited by two major constraints: a large gate tunneling leakage and a large area occupation [3.12]. Existing suppression techniques therefore reduce the power supply noise by suppressing the resonant supply noise via delay-line-based and OP-based detection circuits with switched DECAPs [3.12], [3.13]. However, the efficiency of these noise suppression techniques would be reduced significantly by the leakage current in nano-scale technologies. In this section, a noise suppression technique is proposed for TSV 3D integration based on UMC 65nm CMOS technology. This noise suppression technique reduces the supply noise using a latch-based comparator and switched DECAPs.

## 3.4.1 Power Noise Suppression of 3D Integrity

Owing to the heavy current loading of the P/G networks in 3D integration, supply noise is a serious problem for power integrity. In view of this, Fig. 3.14 shows the proposed architecture of the noise suppression technique. This architecture contains four blocks: a low pass filter, a latch-based comparator, a charge pump, and switched DECAPs. The first three blocks detect the resonant supply noise and control the switches of the switched DECAPs. The details of each block are described as follows.



Fig. 3.14. Architecture of the Noise Suppression Technique.

#### 3.4.1.1 Switched DECAPs

The switched DECAPs are designed to suppress the resonant supply noise. Fig. 3.15 illustrates the resonant noise suppression using switched DECAPs. If the power supply overshoots Vdd\_DC, the excess charge is transferred to the capacitors Cd1 and Cd2. On the contrary, if the power supply undershoots Vdd\_DC, the DECAPs are connected in series, so the boosted voltage is twice Vdd\_DC and the additional charge is supplied from the DECAPs to the power supply to reduce the supply noise. Additionally, the hysteresis voltage levels are the high and low boundary conditions for switching the DECAPs from series to parallel and from parallel to series, respectively. The DECAPs are switched only when the supply voltage is higher than the upper hysteresis level or lower than the lower one. In other words, the hysteresis voltage provides a tolerance interval and avoids frequent switching when the supply noise is small.



Fig. 3.15. Resonant noise suppression using switched DECAPs.
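The switching rule with hysteresis can be expressed as a small state machine. The sketch below is a behavioral illustration only, not a model of the circuit: the mode names, the 1.0 V reference, the 17 mV hysteresis, and the sample values are illustrative assumptions.

```python
def next_mode(mode, vdd, vdd_dc, v_hys):
    """Return the DECAP configuration after one comparison.

    mode is "parallel" (absorbing charge) or "series" (boosting the supply).
    """
    if vdd > vdd_dc + v_hys:
        return "parallel"  # overshoot: store the excess charge
    if vdd < vdd_dc - v_hys:
        return "series"    # undershoot: stack the capacitors to release charge
    return mode            # inside the hysteresis window: keep the current state


def simulate(samples, vdd_dc, v_hys, mode="parallel"):
    """Apply next_mode to a sequence of sampled supply voltages."""
    history = []
    for vdd in samples:
        mode = next_mode(mode, vdd, vdd_dc, v_hys)
        history.append(mode)
    return history


# Hypothetical supply samples around a 1.0 V reference with a 17 mV hysteresis.
samples = [1.000, 0.990, 0.975, 0.995, 1.010, 1.030, 1.005]
history = simulate(samples, vdd_dc=1.0, v_hys=0.017)
print(history)
```

In this trace the 10 mV dips and bumps leave the configuration unchanged, and only the 25 mV undershoot and the 30 mV overshoot cause a switch, which is the behavior the hysteresis window is meant to provide.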

#### 3.4.1.2 Low Pass Filter

The RC low pass filter provides the reference voltage for the latch-based comparator. However, the current through the resistor of the RC filter induces an IR drop and decreases the reference voltage. This current consists of the leakage current of the capacitor and the load current of the comparator. Therefore, the current drawn by the comparator should be small to generate a low-drop reference voltage.

## 3.4.1.3 Latch-Based Comparator

In order to switch the DECAPs, a latch-based comparator is designed to detect the resonant noise. The supply voltage (Vdd) and the reference voltage (Vdd\_DC) are compared by the latch-based comparator. This comparator achieves both a good noise-tolerant interval and low power consumption. Fig. 3.16 shows the schematic of the latch-based comparator, which uses High-Vt devices to reduce the leakage power in nano-scale technologies. The operating voltage of this latch-based comparator is higher than the on-chip supply voltage so that it can detect the resonant supply noise. Additionally, the current demand of the comparator is very small because the reference voltage (Vdd\_DC) is connected to the gate of M1.

The frequency of the clock determines the sampling rate of the comparison results. When the clock is high, two discharging paths exist to pull down n1 and n2. After half a clock cycle, M3 is turned off by the low level of the clock, and the discharging paths disappear. Therefore, the data in the weak back-to-back inverters is determined by the charge on n1 and n2. Additionally, two sampling latches capture the comparison results at the positive edge of the clock.

The latch-based comparator switches the switched DECAPs with a hysteresis voltage. Assume that Vdd is increasing from the undershooting state to the overshooting state and that the initial voltage of n2 is high. While Vdd is only slightly larger than Vdd\_DC, the discharging time is not long enough to flip the data in the back-to-back inverters, although the drain current of M2 is larger than that of M1. Only when Vdd is larger than the reference voltage plus the hysteresis voltage does M2 have enough driving ability to flip the data. The data in the back-to-back inverters then changes, and the DECAPs are switched into the parallel mode. Conversely, if Vdd is decreasing from the overshooting state to the undershooting state, the switched DECAPs are changed to the series stack as soon as Vdd is smaller than the reference voltage minus the hysteresis voltage.



Fig. 3.16. Latch-based comparator with High-Vt device

## 3.4.1.4 Charge Pump with Improving Body Effect

For the latch-based comparator, an additional higher voltage (VDDH) is applied to ensure that the transistors M1 and M2 operate in the saturation region. Additionally, the hysteresis voltage level is influenced by the level of VDDH and the widths of M1 and M2. As VDDH increases, both the hysteresis voltage and the power consumption increase. Therefore, a modified Dickson charge pump is designed to pump VDDH to 1.6V. Fig. 3.17 shows the circuit modified from the Dickson charge pump [3.14]. The bodies of MN1, MN2 and MN3 are connected to their drains to adjust the threshold voltage and further increase the efficiency of the charge pump.



Fig. 3.17. Modified Dickson charge pump.

When the clock is low, the node n1 is charged to Vdd- $V_{tn1-low}$ , where  $V_{tn1-low}$  is shown as Eq. (3.37). Therefore,  $V_{tn1-low}$  is smaller than the threshold voltage when the body is connected to ground.

$$V_{\text{tn1-low}} = V_{\text{t0}} + \gamma \left[ \sqrt{(-V_{\text{DD}}) + 2\phi_{\text{f}}} - \sqrt{2\phi_{\text{f}}} \right] \tag{3.37}$$

When CLK is high, the node n1 is pumped to $2Vdd-V_{tn1-low}$. The threshold voltage of MN1 is then adjusted as Eq. (3.38), which is smaller than the threshold voltage when the body is connected to ground. In view of this, the leakage current from n1 to Vdd is reduced. In the following stages of the charge pump, the threshold voltage of the NMOS in each stage is the same as $V_{tn1}$. The body bias of the NMOS devices can both improve the pumping efficiency and reduce the leakage current by adjusting the threshold voltage.

$$V_{\text{tn1-high}} = V_{\text{t0}} + \gamma \left[ \sqrt{(V_{\text{DD}} - V_{\text{tn1-low}}) + 2\phi_{\text{f}}} - \sqrt{2\phi_{\text{f}}} \right] \tag{3.38}$$

### 3.4.2 Simulation Results

The noise suppression technique with low power active DECAPs is implemented via UMC 65nm CMOS technology and the TSV model [3.15]. According to the speed of the resonant noise, the frequency of the comparator is 2GHz. This clock source is also provided to the charge pump at 1GHz by a frequency divider.



Fig. 3.18. Noise suppressions of the active and passive DECAPs for (a) high performance IC (b) TSV 3D integration.

Fig. 3.18(a) and (b) show the suppressed noises resonating at 100MHz and 40MHz, respectively. The two pad configurations are set as L=0.75nH, C=1.69nF, R=0.14Ω and L=4.5nH, C=1.69nF, R=0.28Ω, which represent a typical supply impedance for high performance chips [3.13] and the footprint and TSV impedance in 3D integration, respectively. The inductance is larger in the 3D case because the number of power pins is limited in 3D integration, so the package inductance is larger than that of 2D-ICs. Compared with a passive capacitance of the same value, the active DECAP achieves noise reductions of 55% (6.9dB) and 57.6% (7.4dB) for the high performance IC and the TSV 3D integration, respectively. Additionally, to evaluate the boost factor of the proposed active DECAP, the amount of passive DECAP is increased until a similar noise suppression is achieved. Passive DECAPs of 3400 pF and 2400 pF are required for similar noise regulation, so boost factors of 17X and 12X are achieved by the proposed noise suppression circuit. Moreover, the leakage power is reduced by 71% and 59% due to the reduced DECAP area.
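The quoted figures can be cross-checked with a few lines of arithmetic. The sketch below assumes that the supply loop sees the power and ground inductances in series (2L), in line with the symmetric power/ground model of Section 3.3.2; this is an interpretation, not something stated with the simulation setup. The 200 pF active DECAP value is the total DECAP capacitance of the implemented circuit.

```python
import math


def resonant_frequency(l_rail, c):
    """Resonance of the supply loop, assuming equal power and ground inductance."""
    return 1 / (2 * math.pi * math.sqrt(2 * l_rail * c))


def reduction_db(fraction_removed):
    """Noise reduction in dB for a given fractional amplitude reduction."""
    return -20 * math.log10(1 - fraction_removed)


def boost_factor(c_passive_equivalent, c_active):
    """Passive capacitance needed for similar suppression per unit active DECAP."""
    return c_passive_equivalent / c_active


C_DIE = 1.69e-9
f_2d = resonant_frequency(0.75e-9, C_DIE)
f_3d = resonant_frequency(4.5e-9, C_DIE)
print(f"resonance: {f_2d / 1e6:.1f} MHz (2D), {f_3d / 1e6:.1f} MHz (3D)")
print(f"reduction: {reduction_db(0.55):.1f} dB, {reduction_db(0.576):.1f} dB")
print(f"boost: {boost_factor(3400e-12, 200e-12):.0f}X, {boost_factor(2400e-12, 200e-12):.0f}X")
```

Under the 2L assumption the two configurations resonate close to 100MHz and 40MHz, and the percentage reductions and capacitance ratios reproduce the quoted dB values and boost factors to within rounding.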

Table 3.4 lists the comparisons between the proposed active DECAP and the other approaches [3.12] and [3.13]. The hysteresis voltage of the latch-based comparator is 17mV, which avoids continuous switching when the supply noise is small. The hysteresis ranges from 9mV to 31mV across process corners and temperatures from -50 to 125 °C. The delay-line-based active DECAP uses two delay lines with biased starved inverters to detect the supply noise. However, the delay line with this bias scheme is too sensitive when the noise is large: the comparison becomes wrong once the difference between the two delay lines exceeds one clock cycle. Moreover, this approach faces leakage problems when scaled to nano-scale technologies. The static power of the proposed scheme is 0.55mW. In the latch-based comparator, the current drawn from the RC filter is small because Vdd\_DC is connected to the gate of M1, as shown in Fig. 3.16. For the delay-line-based comparator, the current of the constant delay line flows through the resistor and induces a 40mV drop, which not only affects the switching timing but also decreases the boost factor. Therefore, the proposed scheme achieves both a good noise-tolerant interval and low power consumption.

Table 3.4. Comparisons of active DECAPs

| Parameter          | Delay Line [JSSC'09][3.13] | Analog OP [JSSC'09][3.12] | Latch-based Comparator |
|--------------------|----------------------------|---------------------------|------------------------|
| Technology         | 0.13 μm                    | 90 nm                     | 65 nm                  |
| Static Power       | 0.65 mW                    | 2.9 mW                    | 0.55 mW                |
| Hysteresis Voltage | 0mV                        | 5mV                       | 17mV                   |
| Triggered Voltage  | 40mV                       | 50mV                      | 17mV                   |

Fig. 3.19 shows the layout view and the floorplan of the noise suppression circuit. The total value of the DECAPs is 200pF. Moreover, 84% of the area is occupied by the switched DECAPs, which are implemented by MOS capacitors. The size of the proposed noise suppression circuit is $170 \times 230\ \mu m^2$.



Fig. 3.19. Layout view of the noise suppression circuit.

## 3.5 Summary

For the TSV 3D-IC applications, a hierarchical power delivery system is proposed to decouple the global and local power networks. The decoupled power delivery structure can greatly reduce the size of decoupling capacitor especially on the global power network. Moreover, the operating supply voltages can be chosen for different parts of the system to have the best trade-off between cost, power, and performance. Such a hierarchical power delivery system is believed to be very useful for heterogeneous integration in 3D-IC chips.

In order to reduce the area occupied by the power TSV matrix and preserve the routable chip area, an area-efficient power TSV planning method is also proposed. Based on this TSV planning, an appropriate TSV diameter and TSV count can be chosen to deliver a reliable power supply while minimizing the area occupied by the power TSV matrix.

Additionally, a noise suppression technique for TSV 3D integration is proposed to reduce the supply noise using latch-based active DECAPs. Based on UMC 65nm CMOS technology, the proposed scheme achieves a maximum supply noise reduction of 7.4dB and a 12X boost factor at the resonant frequency. Therefore, the proposed noise suppression circuit can provide a stable supply for power integrity in TSV 3D integration.



# Chapter 4

# Intra-Layer Power Delivery Network and Voltage Regulation Analysis

An on-chip wide bandwidth linear voltage regulator with an adaptive biasing technique is presented in this chapter. This adaptively biased regulator enhances transient response by increasing the bias current under heavy load, while keeping a low quiescent current to maintain high current efficiency under light load. To further explore the voltage fluctuations in the entire planar system, power delivery network analyses considering the placement of voltage regulator modules and the power line width and pitch are also introduced in this chapter.

# 4.1 Review of Low Dropout Voltage Regulator

Reliable power supply voltage is essential for system-on-chip (SoC) integration and 3D applications. Since low dropout (LDO) voltage regulators occupy a small chip area and provide accurate voltage conversion, they are widely used for on-chip voltage regulation. Fig. 4.1 shows a conventional linear regulator. It is composed of an error amplifier, an analog voltage buffer, and large power transistors as the output device. The voltage $V_{OUT}$ is determined by the resistor divider and the reference voltage $V_{REF}$. An error amplifier in the feedback loop compares the feedback voltage $V_{FB}$ with $V_{REF}$ and adjusts the gate voltage of the output device accordingly. However, the large parasitic capacitance at the gate of a power transistor degrades the slew rate of the error amplifier. The transient response is therefore limited by the bandwidth and the internal slew rate of the LDO regulator. Since the bandwidth is proportional to the bias current of the error amplifier, a simple way to improve transient performance is to increase the bias current. However, a trade-off exists between bandwidth and current efficiency. Hence, adaptively biased regulators have been proposed to provide fast transient response and high current efficiency at the same time [4.1], [4.2]. Additionally, several other techniques have been presented for improving transient response to maintain a reliable supply voltage [4.3]-[4.8].



Fig. 4.1. Conventional analog style linear regulator [4.5].

Replica-controlled circuits are also a useful technique to enhance transient speed [4.9]-[4.11]. Since no large load is connected to the replica circuit, fewer stability issues are involved, and a wider loop bandwidth can be used to achieve fast tracking. The local feedback technique is also applied to the load regulation. With much smaller non-dominant poles than conventional single-pole approaches, the load regulation loop exhibits fast transient response when the load is disturbed. However, replica-based control yields inaccurate regulation because the feedback node is not the real output.

In recent years, research on digitally controlled voltage regulators has been emerging [4.12], [4.13]. These regulators replace the analog blocks with their digital counterparts. Therefore, digital regulators have greater immunity against PVT variations than traditional analog regulators. The digital circuit is also easy to migrate from one technology to another. However, a discontinuous output voltage and the large amount of capacitance required are the drawbacks of digital regulators.

On the other hand, low-noise regulator design is a critical issue for sensitive analog/RF circuits. In order to effectively decouple the supply noise from $V_{DD}$ to $V_{OUT}$ and improve the power supply rejection (PSR), a regulator that uses a feedforward amplifier to cancel the noise propagated from the voltage supply is proposed in [4.14].

## 4.2 Wide Bandwidth Variable Output Voltage Regulator

System heterogeneity offered by 3D integration has exacerbated the requirement for multiple, wide range, and well controlled power supplies. In view of these, a wide bandwidth variable output voltage regulator is designed.

The voltage regulator uses an adaptive biasing circuit to enhance transient response by increasing the bias current under heavy load, while keeping a low quiescent current to maintain high current efficiency under light load. In order to provide a variable output voltage, a resistor-string voltage divider is also used. In the following, we describe the details of the adaptive biasing network, the resistor-string voltage divider, and the stability analysis of the proposed regulator.

# 4.2.1 Variable Output Voltage Regulator with Adaptively Biasing Technique



Fig. 4.2. The concept of adaptively biased regulator.

The concept of the adaptive biasing technique is illustrated in Fig. 4.2. Unlike most regulators, which are biased with a fixed bias current, the adaptive biasing circuit feeds an extra bias current to the error amplifier through a simple current mirror. By making the bias current proportional to the load current, the bandwidth of the regulator is improved. A large additional bias current results in fast transient response under heavy load, whereas a small additional bias current keeps the conversion efficient under light load.



Fig. 4.3. Schematic of adaptively biased regulator.

Fig. 4.3 shows the detailed schematic of the adaptively biased regulator. It is composed of a bias current generator, a two-stage error amplifier (EA), an adaptive biasing network, an output device, and a voltage divider. The voltage $V_{OUT}$ is determined by the voltage divider and the reference voltage $V_{REF}$. The EA in the feedback loop compares the feedback voltage $V_{fb}$ with $V_{REF}$ and adjusts the gate voltage of the output device accordingly. When the feedback voltage $V_{fb}$ is lower than the reference voltage $V_{REF}$, the EA decreases the gate voltage of the output device. The lower gate voltage of the power transistor produces a large current that restores $V_{OUT}$ to the nominal voltage.

An ideal bias circuit should have infinite output resistance so that the bias current stays at its nominal value under process, voltage, and temperature variations. Therefore, a cascode current mirror was chosen to generate the fixed bias current for the EA because of its large output resistance.

High-precision regulation requires an amplifier with high loop gain. A high-gain EA can be realized with a multi-stage or a folded cascode amplifier scheme. In low-voltage operation, a cascode-type EA does not guarantee that all the transistors operate in the saturation region. To achieve high gain, a two-stage EA is therefore used in this design. The EA, which is modified from [4.2], consists of $M_0$ to $M_8$. The Miller capacitor $C_f$ is connected between the output node of the regulator and the output node of the differential stage. As the gain from the differential stage to $V_{OUT}$ is large, pole splitting is effective.

The adaptive biasing network is implemented with two current mirror pairs, ($M_P$, $M_S$) and ($M_{ab1}$, $M_{ab2}$). The sensing transistor $M_S$ senses the gate voltage of the power transistor and changes its drain current according to the load current. The sensing current is then conducted to $M_{ab2}$ and mirrored as a small additional bias current in $M_{ab1}$. The magnitude of the additional bias current depends on the width ratio of the sensing transistor to the power transistor. However, the drain-to-source voltage of the sensing transistor is higher than that of the power transistor ($V_{ds,Ms} > V_{ds,Mp}$), so the sensing current is larger than the nominal aspect ratio implies. Hence, the sensing transistor should be made smaller in the actual design to account for channel length modulation.

The reference voltage $V_{REF}$, which is 0.6V, is assumed to come from a bandgap reference circuit. In order to produce a variable output voltage without using multiple bandgap references, a resistor-string voltage divider is necessary to divide each desired regulated voltage down to 0.6V. Fig. 4.4 depicts the resistive voltage divider. It takes $V_{OUT}$ as input and provides three divided voltages with different dividing ratios. Only one at a time is passed to $V_{fb}$ by the switch. For example, when a 0.9V regulated voltage is demanded, the decoder block activates the first switch. A 6/9 ratio divides 0.9V down to 0.6V, which can then be compared with $V_{REF}$.
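The tap ratios follow from requiring $V_{fb} = V_{REF}$ in regulation. The Python sketch below computes them for the 0.7V, 0.8V, and 0.9V output levels reported in the simulation section. The function name `divider_ratio` is illustrative, and ideal (unloaded) resistors are assumed.

```python
from fractions import Fraction

V_REF = Fraction(6, 10)  # 0.6 V bandgap reference, as stated in the text


def divider_ratio(v_target: Fraction) -> Fraction:
    """Return the tap ratio V_fb / V_OUT that maps v_target onto V_REF."""
    return V_REF / v_target


# Output levels reported for this regulator: 0.7 V, 0.8 V and 0.9 V.
targets = [Fraction(7, 10), Fraction(8, 10), Fraction(9, 10)]
ratios = {float(v): divider_ratio(v) for v in targets}

for v_out, ratio in ratios.items():
    v_fb = float(ratio) * v_out
    print(f"V_OUT = {v_out:.1f} V -> ratio = {ratio}, V_fb = {v_fb:.2f} V")
```

The 0.9V entry reduces to 2/3, which is the 6/9 ratio quoted above in lowest terms.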



Fig. 4.4. Resistor-string voltage divider.

## 4.2.2 Stability Analysis

A typical structure of a low-dropout regulator is shown in Fig. 4.5. It consists of an error amplifier comparing the output voltage to the reference voltage $V_{ref}$, a PMOS pass transistor $M_p$, and an output buffer stage driving $M_p$. There are three different poles in the voltage regulator structure, located at the output node of the error amplifier (N1), the output node of the buffer (N2), and the output node of the voltage regulator ($V_{out}$). In particular, these poles are given by

$$p_1\big|_{N_1} = \frac{1}{r_{o1}C_1} \tag{4.1}$$

$$p_2\big|_{N_2} = \frac{1}{r_{ob}C_p} \tag{4.2}$$

$$p_o\big|_{N_{out}} = \frac{1}{r_{oeq}C_L} \tag{4.3}$$



Fig. 4.5. Typical structure of a low-dropout regulator with an intermediate buffer stage [4.4].

Here $r_{o1}$ is the output resistance of the error amplifier, $C_1$ is the equivalent capacitance at N1, which is dominated by the input capacitance of the buffer $C_{ib}$, $r_{ob}$ is the output resistance of the buffer, $C_p$ is the input capacitance of $M_p$, and $r_{oeq}$ is the equivalent resistance seen at the output of the voltage regulator. Ideally, both $C_{ib}$ and $r_{ob}$ should be very small in order to achieve a single-pole loop response by locating both $p_1$ and $p_2$ at frequencies much higher than the unity-gain frequency of the regulation loop.

In order to construct the required output buffer stage, a simple PMOS source-follower is first considered; its structure is shown in Fig. 4.6. The PMOS source-follower provides nearly complete shutdown of the pass device under light-load conditions. Because the output resistance $r_{ob}$ of the source-follower is given by $1/gm_{21}$, it is necessary to increase $gm_{21}$ in order to decrease $r_{ob}$ and allow $p_2$ to be located at frequencies much higher than the unity-gain frequency of the regulation loop. The transconductance $gm_{21}$ can only be increased by using a larger W/L ratio for transistor $M_{21}$, by increasing the DC biasing current $I_{21}$ through $M_{21}$, or both. However, increasing $I_{21}$ increases the total quiescent current of the regulator and degrades its current efficiency. Using a larger W/L ratio for $M_{21}$ increases the input capacitance $C_{ib}$ of the buffer, which in turn pushes $p_1$ to a lower frequency and degrades stability. A simple PMOS source-follower therefore needs to be designed carefully.
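The opposing movement of the two poles can be illustrated numerically. The sketch below is only a toy model: every component value is a hypothetical placeholder rather than a value from this design, the poles of (4.1) and (4.2) are converted from rad/s to Hz, and a square-law device at fixed bias current is assumed so that $gm_{21}$ grows with the square root of W/L while $C_{ib}$ grows linearly with W.

```python
import math


def pole_hz(resistance_ohm: float, capacitance_f: float) -> float:
    """Pole frequency in Hz of an RC node: 1 / (2*pi*R*C)."""
    return 1.0 / (2.0 * math.pi * resistance_ohm * capacitance_f)


# All values below are hypothetical placeholders, not taken from the design.
R_O1 = 500e3        # output resistance of the error amplifier (ohm)
C_P = 5e-12         # input (gate) capacitance of the pass transistor (F)
C_IB_BASE = 20e-15  # buffer input capacitance at the reference W/L (F)
GM21_BASE = 200e-6  # transconductance of M21 at the reference W/L (S)

for scale in (1, 4, 16):
    # Fixed bias current: gm ~ sqrt(W/L) (square-law assumption),
    # while the buffer input capacitance scales linearly with W.
    gm21 = GM21_BASE * math.sqrt(scale)
    c_ib = C_IB_BASE * scale
    p1 = pole_hz(R_O1, c_ib)       # pole at N1, set by r_o1 and C_ib
    p2 = pole_hz(1.0 / gm21, C_P)  # pole at N2, with r_ob = 1/gm21
    print(f"W/L x{scale:2d}: p1 = {p1 / 1e6:7.2f} MHz, p2 = {p2 / 1e6:7.2f} MHz")
```

Under these assumptions $p_2$ rises only with the square root of the sizing factor while $p_1$ falls linearly, which is the trade-off described above.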



Fig. 4.6. Source-follower implementation of the intermediate buffer stage [4.4].

The stability of a classical LDO relies on dominant-pole compensation with pole-zero cancellation, as shown in Fig. 4.7. The second pole $p_2$ is cancelled by the zero $z_1$ created by the ESR of the output capacitor. With a large output capacitance, LDO stability is achieved by locating $p_3$ beyond the unity-gain frequency of the loop gain to provide sufficient phase margin. However, when the loop gain is too high, $p_3$ falls below the unity-gain frequency, and an even larger output capacitance is required to retain LDO stability.

Moreover, the power PMOS transistor in the classical LDO must operate in the saturation region to avoid stability problems at different input voltages. The change in voltage gain due to different drain-source voltages is not substantial when the transistor operates in the saturation region. However, if the transistor operates in the linear region at dropout, it moves into the saturation region as the input voltage increases, and the loop gain rises. As mentioned before, when the loop gain increases, the classical LDO based on dominant-pole compensation may become unstable. Hence, the power PMOS transistor needs to operate in the saturation region throughout the entire range of input voltage, so a large transistor size is required to provide a small saturation voltage at the maximum output current.



Fig. 4.7. Frequency response of classical LDO.

### **4.2.3 Simulation Results**

The wide bandwidth variable output voltage regulator is simulated in a UMC 65nm standard CMOS technology. Only normal threshold voltage devices are used. The input voltage is 1.0V with a 0.1nF decoupling capacitor. The output voltage is regulated to 0.7V, 0.8V, or 0.9V according to the control signal of the voltage divider. The design parameters are shown in Table 4.1. The maximum output load current is 200mA. $C_f$, $C_C$, and $R_C$ act as the frequency compensator. The width ratio of the sensing transistor to the output device is set to 3.2$\mu$m : 4000$\mu$m. Although the ratio is only 1/1250 at the design stage, it increases to 1/300 after channel length modulation is taken into account.
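As a rough illustration of the sensing ratio, the sketch below evaluates the nominal width ratio and estimates the sensed current at the two ends of the load range. It assumes equal channel lengths for the two transistors and a sensed current that scales linearly with the load at the effective 1/300 ratio. The additional bias current actually delivered to the EA also depends on the ($M_{ab1}$, $M_{ab2}$) mirror ratio, which is not modeled here.

```python
W_SENSE_UM = 3.2           # width of sensing transistor Ms (from the text)
W_POWER_UM = 4000.0        # width of power transistor Mp (from the text)
EFFECTIVE_RATIO = 1 / 300  # ratio after channel length modulation (from the text)

# Nominal mirror ratio, assuming both devices share the same channel length.
nominal_ratio = W_SENSE_UM / W_POWER_UM


def sensed_current_a(i_load_a: float, ratio: float = EFFECTIVE_RATIO) -> float:
    """Estimate the sensing current, assuming it scales linearly with the load."""
    return i_load_a * ratio


print(f"nominal ratio = 1/{1 / nominal_ratio:.0f}")
for i_load in (1e-3, 200e-3):
    i_sense_ua = sensed_current_a(i_load) * 1e6
    print(f"I_LOAD = {i_load * 1e3:5.0f} mA -> sensed current ~ {i_sense_ua:6.1f} uA")
```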

Table 4.1. Design parameters of regulator

| Parameter        | Value                |
|------------------|----------------------|
| Technology       | UMC 65nm             |
| $V_{IN}$         | 0.95V to 1.5V        |
| $V_{OUT}$        | 0.7V to 0.9V         |
| $V_{REF}$        | 0.6V                 |
| $I_{LOAD}$       | 1mA to 200mA         |
| $C_{L}$          | 100pF                |
| $C_f, C_C, R_C$  | 3.5pF, 0.4pF, 2.0kΩ  |
| R1, R2           | 10kΩ, 20kΩ           |
| Ms:Mp            | 3.2μm : 4000μm       |

The output mode change is shown in Fig. 4.8. The minimum error is 1mV in the 0.7V output state, whereas the maximum error is 2mV in the 0.9V output state. The error comes from the limited EA loop gain. The output state changes from 0.8V to 0.9V within 70ns. For simplicity, only the 0.9V output simulation results are presented in the following. The steady-state output voltage under different process and temperature conditions is shown in Fig. 4.9. The maximum voltage error is less than 3mV in the temperature range from -40°C to 125°C.



Fig. 4.8. Simulation waveform of output state changing.



Fig. 4.9. 0.9V output under different process and temperature conditions

High bandwidth is required for excellent line and load transient performance. Without an adaptive biasing technique, high bandwidth can only be achieved by using a large quiescent current. In this work, an adaptively biased regulator is designed to extend the bandwidth in heavy load operation while keeping a low quiescent current in light load. The bandwidth improvement with adaptive current biasing is shown in Fig. 4.10. The adaptive biasing technique increases the unity-gain frequency from 6MHz to 15.3MHz over the 200mA load current range. The maximum improvement is 5.52X at 100mA load current.



Fig. 4.10. Unity-gain frequency improvement of adaptive current biasing.

The current efficiency and quiescent current using the adaptive biasing technique are shown in Fig. 4.11. The basic quiescent current is fixed at $60\mu A$. As the output load increases from 1mA to 200mA, the adaptive biasing network feeds an additional bias current to the EA. The quiescent current therefore changes from $63\mu A$ to $741\mu A$. This adaptively biased regulator achieves 94.05% current efficiency at 1mA load current and 99.63% at 200mA load current.
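The reported efficiencies can be checked with the usual definition of current efficiency, $I_{LOAD}/(I_{LOAD}+I_Q)$. The sketch below applies it to the two operating points quoted above; the helper name `current_efficiency` is illustrative. Because the quoted quiescent currents are rounded, the light-load result may differ from the reported figure by a few hundredths of a percent.

```python
def current_efficiency(i_load_a: float, i_q_a: float) -> float:
    """Current efficiency = I_LOAD / (I_LOAD + I_Q)."""
    return i_load_a / (i_load_a + i_q_a)


# Operating points quoted in the text: (load current, quiescent current).
light = current_efficiency(1e-3, 63e-6)
heavy = current_efficiency(200e-3, 741e-6)

print(f"light load (1 mA):   {light:.2%}")
print(f"heavy load (200 mA): {heavy:.2%}")
```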



Fig. 4.11. Current efficiency and quiescent current with adaptive biasing technique.

The simulated load step response of the regulated voltage is shown in Fig. 4.12. A 10mA to 100mA current step with 100ps rise and fall times is used. The overshoot/undershoot voltage is 70mV/138mV, and the output recovers to the regulated voltage within 140ns/80ns. The error voltage is 2mV and the load regulation is $22.22\mu V/mA$.



Fig. 4.12. Simulation waveform of load transient response.

The line step response of the regulated voltage is shown in Fig. 4.13. A 10% voltage variation from 0.95V to 1.1V with 100ps rise and fall times is used. The overshoot/undershoot voltage is 112mV/108mV, and the output recovers to the regulated voltage within 100ns/100ns. The error voltage is 0.43mV and the line regulation is 2.86mV/V.
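Both regulation figures are the steady-state output error divided by the size of the applied step. The sketch below reproduces them from the quoted step sizes and error voltages; the function name `regulation` is illustrative.

```python
def regulation(delta_v_out: float, delta_stimulus: float) -> float:
    """Steady-state output change per unit change of the stimulus."""
    return delta_v_out / delta_stimulus


# Load step: 10 mA -> 100 mA with a 2 mV steady-state error.
load_reg_v_per_a = regulation(2e-3, 100e-3 - 10e-3)
# Line step: 0.95 V -> 1.1 V with a 0.43 mV steady-state error.
line_reg_v_per_v = regulation(0.43e-3, 1.1 - 0.95)

# 1 V/A equals 1000 uV/mA, and 1 V/V equals 1000 mV/V.
print(f"load regulation ~ {load_reg_v_per_a * 1e3:.2f} uV/mA")
print(f"line regulation ~ {line_reg_v_per_v * 1e3:.2f} mV/V")
```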



Fig. 4.13. Simulation waveform of line transient response.

To ensure the stability of the LDO, frequency response analyses of light and heavy load operation are also required. Fig. 4.14 and Fig. 4.15 show the frequency responses when the regulator operates in light load (1mA) and heavy load (200mA), respectively. The DC gain of the regulator is 56.9dB and 25.8dB, and the phase margin (PM) is 89° and 79.26°, in light load and heavy load operation, respectively. In both light load and heavy load operation, the regulator is stable with sufficient phase margin (PM>60°).



Fig. 4.14. Frequency response at light load operation.



Fig. 4.15. Frequency response at heavy load operation.

As shown in Fig. 4.16, the regulator has high power supply rejection (PSR) thanks to the extended bandwidth. The PSR is 56dB at 100kHz and still 40dB at 1MHz, which means that even at 1MHz the output ripple is only 1% of the supply ripple.
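The 1% figure is simply the 40dB rejection expressed as a voltage ratio. A minimal sketch of the conversion, with the helper name `rejection_to_ratio` chosen for illustration:

```python
def rejection_to_ratio(psr_db: float) -> float:
    """Convert a rejection in dB (voltage ratio) to output/supply ripple ratio."""
    return 10 ** (-psr_db / 20)


for freq, psr_db in (("100 kHz", 56.0), ("1 MHz", 40.0)):
    ratio = rejection_to_ratio(psr_db)
    print(f"PSR = {psr_db:.0f} dB at {freq}: output ripple = {ratio:.3%} of supply ripple")
```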



Fig. 4.16. Power supply rejection of the adaptively biased regulator.

A comparison of the adaptively biased regulator with previous works is shown in Table 4.2. Because our design uses the smallest decoupling capacitor without an excessive quiescent current, the adaptively biased regulator has the best figure of merit (FOM) among these works.

Table 4.2. Comparison with previous works.

|                              | M. Al-Shyoukh,<br>JSSC'07 [4.4] | P. Hazucha,<br>JSSC'07 [4.5] | M. El-Nozahi,<br>JSSC'10 [4.14] | This Work                    |
|------------------------------|---------------------------------|------------------------------|---------------------------------|------------------------------|
| Technology                   | 0.35 μm                         | 90 nm                        | 130 nm                          | 65 nm                        |
| Input Voltage                | 2.0V                            | 2.4V                         | 1.15V                           | 1.0V                         |
| Output Voltage               | 1.8V                            | 1.2V                         | 1.0V                            | 0.9V                         |
| Output droop $\Delta V_{OUT}$ | 54mV                           | 120mV                        | 10mV                            | 129mV                        |
| Max. Current                 | 200mA                           | 1A                           | 25mA                            | 200mA                        |
| $I_Q$ (quiescent current)    | 20μA                            | 25.7mA                       | 50μA                            | 63μA @1mA<br>741μA @200mA    |
| Current Efficiency           | 99.80%                          | 97.50%                       | 99.80%                          | 94.05% @1mA<br>99.63% @200mA |
| Decoupling Cap.              | 1000nF                          | 2.4nF                        | 4000nF                          | 0.1nF                        |
| FOM [4.5] (figure of merit)  | 27ps                            | 7.4ps                        | 3.2ns                           | 239fs                        |
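The FOM formula itself is not written out in this section. Assuming the definition commonly attributed to [4.5], $FOM = T_R \cdot I_Q / I_{MAX}$ with response time $T_R = C \cdot \Delta V_{OUT} / I_{MAX}$, the FOM row can be reproduced from the other rows of Table 4.2. The sketch below does so; taking the heavy-load quiescent current (741μA) for this work is also an assumption, chosen because it matches the tabulated 239fs.

```python
def fom_seconds(c_f: float, dv_v: float, i_max_a: float, i_q_a: float) -> float:
    """FOM = T_R * I_Q / I_MAX, with T_R = C * dV / I_MAX (assumed definition)."""
    t_r = c_f * dv_v / i_max_a  # response time in seconds
    return t_r * i_q_a / i_max_a


# (decoupling cap [F], output droop [V], max current [A], quiescent current [A])
designs = {
    "[4.4]": (1000e-9, 54e-3, 200e-3, 20e-6),
    "[4.5]": (2.4e-9, 120e-3, 1.0, 25.7e-3),
    "[4.14]": (4000e-9, 10e-3, 25e-3, 50e-6),
    "This work": (0.1e-9, 129e-3, 200e-3, 741e-6),  # I_Q at 200 mA load
}

fom = {name: fom_seconds(*params) for name, params in designs.items()}
for name, value in fom.items():
    print(f"{name:>9}: FOM = {value * 1e12:10.3f} ps")
```

A smaller FOM is better under this definition, so the small decoupling capacitor dominates the result for this work.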

### 4.3 Power/Ground Grid Construction

As technology advances towards Gigascale Integration (GSI), chips require higher current densities and lower supply voltages in their power distribution networks. The tolerable power supply noise of circuits, as a result, decreases and makes design of power distribution networks more challenging.

IR-drop and Simultaneous Switching Noise (SSN) are the two main components of power supply noise. IR-drop results from the supply current passing through the parasitic resistance of the power distribution network. SSN is caused by the inductance of the power delivery system and occurs when a group of circuits switch simultaneously. Among the three distinct droops of SSN, the first droop has the shortest duration and the largest magnitude, so it influences chip performance most severely [4.15].

In order to mitigate on-chip voltage fluctuations, the optimum sizing of power grids for reducing IR-drop is proposed in [4.16]. As for SSN reduction, an interdigitated power/ground (P/G) network, in which a few wide lines are replaced by a large number of narrow lines, is often used to reduce the inductive effect [4.17]-[4.19]. The interdigitated structure is shown to achieve the greatest reduction in LdI/dt drop. A brief introduction to the aforementioned works and to the power grid modeling method is given in the following subsections.

## 4.3.1 Optimum Sizing of Power Grids for IR Drop

Since [4.16] is concerned only with IR drop, only the resistive nature of the power grid lines is considered, and a constant DC current load represents the average load current. The grid structure exhibits a high degree of symmetry around the origin.

The power grid now has a group of square-shaped concentric nodes. Grouping parallel resistors and current sources in Fig. 4.17(a) and Fig. 4.17(b), the corresponding reduced grid models in Fig. 4.17(c) and Fig. 4.17(d) are obtained.

The work in [4.16] targets the initial design of power grids. It presents simple reduced models based on an equipotential-node approximation and a simple yet accurate analytical formula that calculates the IR drop with an error of less than 0.1%. The models and the IR drop formula accurately capture the effects of the design parameters on IR drop, so they have high fidelity when comparing different grid designs. The work also derives the optimum sizing scheme of power grid lines for minimum IR drop at the grid center. With this technique, a uniform grid is optimum for uniform load current profiles. For real chip examples, it reports a 14% reduction in IR drop when the optimum sizing is used instead of the uniform one.



Fig. 4.17. Power grids of (a) odd power lines (b) even power lines (c) reduced odd power lines model, and (d) reduced even power lines model [4.16].

# 4.3.2 P/G Grids Analysis of LdI/dt Drop on Power Grids

With increased clock frequencies and power supply demands, on-chip inductance has become a significant factor in the total LdI/dt drop. In [4.17], the author presents two new power grid topologies that reduce the voltage drop induced by on-chip inductance. The original power grid design, the interdigitated topology, is shown in Fig. 4.18(a) and consists of alternating, equally spaced power and ground lines. However, due to the large spacing of the power grid wires, this topology results in large inductance and hence a high supply voltage drop. The topology shown in Fig. 4.18(b), referred to as the single-layer paired topology, reduces the spacing between power and ground lines by adding orthogonal routes between the Vdd pad and the Vdd wires at the top layer. According to [4.17], this configuration significantly reduces the total voltage drop. However, it also significantly reduces the routability of the top interconnect layer, since it requires orthogonal supply wires. Hence, the author proposes a multi-layer paired topology, shown in Fig. 4.18(c). In this topology the spacing between power and ground lines is reduced as in the single-layer paired topology, but routability is preserved because the supply lines are paired on different layers.

The voltage drop due to on-chip inductance is reduced by 70% using the proposed topology. The author also examines the layer dependency of the proposed paired topology. As expected, applying the paired power grid topology at the top layers yields the most significant reduction in LdI/dt drop.



Fig. 4.18. Three designs of power distribution grids: (a) interdigitated power grid, (b) single-layer paired power grid, and (c) multi-layer paired power grid [4.17].

## 4.3.3 Power Delivery Network Modeling

Previous power grid studies have used lumped models of the on-chip power delivery network to capture the mid-frequency resonance. The major limitation of these models is the global treatment of on-chip VDD/GND as a single node, which fails to capture local voltage variations across the chip. Thus, a distributed power delivery network (PDN) model of the on-chip grid is used here to capture the voltage variation over the whole chip.



Fig. 4.19. (a) P/G mesh and (b) RL model network

The P/G network is modeled as the mesh structure with *RL* network [4.20], as shown in Fig. 4.19. The pitch of the grid determines the distance between each power/ground line. In representing the P/G network as a graph, each intersection of the P/G wires is considered a node in the graph. To decrease the complexity of P/G analysis, each power and ground pin of the circuitry block is connected to the closest P/G node. The power pin from the IO pad or power TSV will supply the network node with the supply or ground voltages.

The typical sheet resistance ($\Omega/\Box$) in the 65 nm technology node for the top, medium, and bottom metal layers is 0.075, 0.170, and 0.230, respectively. In real chips, the PDN can be composed of all the metal layers, but it should be built mainly on the top two metal layers. For simplicity, we weight the three metal layers and set the modified sheet resistance to $(0.075\times0.4+0.170\times0.3+0.230\times0.3)=0.150\,\Omega/\Box$ in our simulation. The unit resistance in the PDN is then calculated by $R_{unit}=R_{\Box}\times\frac{L}{W}$, where $R_{\Box}$ is the modified sheet resistance, $L$ is the length of the power line, and $W$ is the width of the power line. Similarly, the unit inductance is obtained from this formula with a sheet inductance of 0.05pH.

The total number of power grid nodes is determined by the power line pitch and the power domain size. For example, if the pitch between two adjacent power lines is $100\mu m$ in a $1mm \times 1mm$ power domain, the PDN size is $10 \times 10$. The resistance per unit thus varies from $1.5\Omega$ to $0.15\Omega$ when the power line width changes from $1\mu m$ to $10\mu m$ in this example. Since the unit resistance is proportional to the line pitch and inversely proportional to the line width, as shown in Fig. 4.20, a wider and more finely pitched power line gives a smaller unit resistance.
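The grid parameters above can be sketched in a few lines of Python. The weights and sheet resistances are those quoted in the text, and the function names are illustrative. The segment length passed to `unit_resistance` is left as a free parameter, because the absolute value depends on how the mesh is segmented; only the scaling with line width is shown.

```python
SHEET_RES = {"top": 0.075, "medium": 0.170, "bottom": 0.230}  # ohm/square
WEIGHTS = {"top": 0.4, "medium": 0.3, "bottom": 0.3}


def modified_sheet_resistance() -> float:
    """Weighted sheet resistance of the three metal layer groups."""
    return sum(SHEET_RES[layer] * WEIGHTS[layer] for layer in SHEET_RES)


def unit_resistance(r_sheet: float, length_um: float, width_um: float) -> float:
    """R_unit = R_sheet * L / W for one power line segment."""
    return r_sheet * length_um / width_um


def grid_size(domain_um: float, pitch_um: float) -> int:
    """Number of power lines per side for a square power domain."""
    return round(domain_um / pitch_um)


r_sheet = modified_sheet_resistance()
n = grid_size(1000.0, 100.0)
segment_um = 100.0  # hypothetical segment length, used only for the ratio below
scaling = unit_resistance(r_sheet, segment_um, 1.0) / unit_resistance(r_sheet, segment_um, 10.0)

print(f"modified sheet resistance = {r_sheet:.3f} ohm/sq")
print(f"PDN size = {n} x {n}")
print(f"R_unit(1 um) / R_unit(10 um) = {scaling:.1f}")
```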



Fig. 4.20. The unit resistance with different power line pitch and width.

## 4.4 On-Chip Power Distribution Network Analysis

The effective voltage within a power domain is correlated with its power grid structure and the position of the power supply pins. To further explore the voltage fluctuations in the entire planar system, power delivery network analyses considering the placement of voltage regulator modules and the power line width and pitch are investigated.



Fig. 4.21. Different voltage supply scenarios. (a) scenario 1: direct voltage connection from outside power TSV bundles, (b) scenario 2: voltage regulators are placed out of the PDN as supply sources. (c) scenario 3: voltage regulators are placed inside the PDN as supply sources.

Fig. 4.21 illustrates three practicable voltage supply scenarios for the intra-layer PDN design of TSV 3D-ICs. In the first, which is similar to widely used 3D structures, the power grid is connected to the outside power TSV bundles as a direct voltage source, as shown in Fig. 4.21(a). In contrast, structures that use voltage regulators to decouple the load and the power TSVs for the intra-layer PDN are also compared. The structures in which regulators are placed outside the PDN and inside the PDN to provide a clean local voltage supply are shown in Fig. 4.21(b) and Fig. 4.21(c), respectively.

The current load is 500mA in each layer, leading to a maximum current consumption of 2A in the 1mm×1mm footprint. Each current is modeled as a triangular waveform with 0.5ns rise time, 0.5ns fall time, and 5ns cycle time. We assume that the current loads are evenly distributed within the PDN and that 4 adaptively biased voltage regulators provide the local power supply. The external VDD is set to 0.9V to deliver global power to the 4-layer stack. If any power domain is decoupled by regulators, the external VDD is raised to 1.0V to compensate for the 0.1V dropout voltage of the regulator. The power TSV bundle is constructed as an 8×8 rectangular array of VDD and GND pairs using a copper TSV process. The height, size, and pitch of the TSVs are 50μm, 25μm, and 50μm, respectively.
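A load waveform of this kind can be sketched as a periodic triangular pulse. The timing values below are those quoted above. Treating 500mA as the peak of the pulse, and letting the current return to zero for the remainder of each cycle, are assumptions made only for illustration, since the text does not state how the 500mA maps onto the waveform.

```python
def triangular_load_ma(t_ns: float, peak_ma: float = 500.0, rise_ns: float = 0.5,
                       fall_ns: float = 0.5, period_ns: float = 5.0) -> float:
    """Periodic triangular current pulse; zero outside the rise/fall window."""
    phase = t_ns % period_ns
    if phase < rise_ns:
        return peak_ma * phase / rise_ns
    if phase < rise_ns + fall_ns:
        return peak_ma * (rise_ns + fall_ns - phase) / fall_ns
    return 0.0


def average_load_ma(peak_ma: float = 500.0, rise_ns: float = 0.5,
                    fall_ns: float = 0.5, period_ns: float = 5.0) -> float:
    """Cycle-averaged current: triangle area divided by the period."""
    return peak_ma * (rise_ns + fall_ns) / (2.0 * period_ns)


for t in (0.0, 0.25, 0.5, 0.75, 1.0, 3.0, 5.5):
    print(f"t = {t:4.2f} ns -> I = {triangular_load_ma(t):6.1f} mA")
print(f"average over one cycle = {average_load_ma():.1f} mA (assumed 500 mA peak)")
```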

Due to the longest TSV conduction path, the maximum IR drop occurs at the topmost layer of the 3D-IC. For simplicity, only the worst PDN (4<sup>th</sup> layer) is considered in our simulation. Fig. 4.22 shows the simulated maximum voltage drop for the three voltage supply scenarios. In all three scenarios, the shape of the power delivery network is regular except for holes reserved for signal TSV connections. The maximum voltage drop steadily decreases as the power line width increases, because the resistance and inductance per unit become smaller. The worst voltage performance appears in the general power delivery structure, in which power TSV bundles deliver a global supply directly. Since large SSN significantly degrades the voltage on global power TSVs, the power TSV count and decoupling capacitor insertion should be considered carefully to cope with coupling noise in this scenario.



Fig. 4.22. Maximum voltage drop of the 4<sup>th</sup> PDN with different voltage supply scenarios while (a) the pitch of power lines is 200μm, and (b) the pitch of power lines is 100μm

On the other hand, if the load and power TSVs are decoupled by regulators, the voltage performance improves. With the power line pitch fixed at 200µm, the maximum voltage drop of scenario 2 falls from 197mV to 46mV when the power line width increases from 1µm to 10µm. Compared to the general power delivery type, the maximum voltage drop is suppressed by 8% to 28%. By placing the regulators inside the power domain, the circuits can obtain their supply from neighboring voltage sources over the shortest power line length. This short path minimizes voltage losses on the power lines, so the voltage drop of scenario 3 is further reduced under the same conditions. The maximum voltage drop is 35mV at 10µm line width, a 45% drop suppression.

Reducing the power line pitch also improves the voltage performance. The maximum voltage drop suppressions reach 33.3% and 56.3% when the power line pitch is 100µm. The maximum voltage drops of scenarios 2 and 3 are almost 20mV and 41mV smaller than that of scenario 1, respectively. A 50mV supply fluctuation budget (5% VDD) over the entire power domain is easily met in scenarios 2 and 3 when the power line width is larger than 6µm.

Fig. 4.23(a)-(c) show the effective supply voltage of the power distribution networks with a grid of 10µm line width and 100µm line pitch. As expected, the maximum voltage drop appears at the center of the PDN when the power supplies come from the four corners of the power domain. In contrast, with the regulators placed inside the PDN there are four voltage peaks. These voltage peaks inside the PDN increase the effective supply voltage and improve the PDN quality. The effective supply voltages of the PDNs are in the ranges of 840mV~856mV, 860mV~873mV, and 872mV~876mV for scenarios 1, 2, and 3, respectively. In terms of voltage drop, the best power supply structure is scenario 3, in which the regulators are placed inside the power domain and decouple the load from the power TSV bundles. However, it severely increases the routing complexity of the PDN.

The overall comparison of voltage drop performance is shown in Table 4.3. The maximum voltage drop steadily decreases as the power line width increases or the power line pitch decreases. According to the simulation results, the maximum voltage drop of scenario 2 approaches that of scenario 3 as the power line width increases, regardless of the power line pitch. More metal rails introduce more mesh nodes in the area, so if the total current load in this area remains the same, each mesh node carries less current. Halving the power line pitch is thus useful for reducing the voltage variation range within the PDN. However, the maximum voltage drop is strongly related to the parasitic impedance of the power lines. Therefore, widening the power lines reduces the voltage drop more than narrowing the power line pitch does.



Fig. 4.23. Effective supply voltage for the power distribution grid. (a) scenario 1: direct voltage connection from outside power TSV bundles. (b) scenario 2: voltage regulators are placed out of the PDN as supply sources. (c) scenario 3: voltage regulators are placed inside the PDN as supply sources.

In view of these results, using wider power lines and an appropriate power line pitch in scenario 2 appears to be the better choice, since it maintains a high-quality PDN while keeping the routing complexity low.

Table 4.3. Comparison of voltage drop performance.

| scenario | width (µm) | pitch (µm) | max drop (mV) | P/G area (µm²) |
|----------|------------|------------|---------------|----------------|
| 1        | 5          | 200        | 81            | 60000          |
| 2        | 5          | 200        | 63            | 60000          |
| 3        | 5          | 200        | 43            | 60000          |
| 1        | 5          | 100        | 73            | 110000         |
| 2        | 5          | 100        | 53            | 110000         |
| 3        | 5          | 100        | 40            | 110000         |
| 1        | 10         | 200        | 64            | 120000         |
| 2        | 10         | 200        | 46            | 120000         |
| 3        | 10         | 200        | 41            | 120000         |
| 1        | 10         | 100        | 60            | 220000         |
| 2        | 10         | 100        | 40            | 220000         |
| 3        | 10         | 100        | 38            | 220000         |
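The entries of Table 4.3 can be cross-checked with a few lines of arithmetic. The sketch below uses only the values listed in the table and the 50 mV budget (5% VDD) quoted in the text; the helper names `suppression_pct` and `meets_budget` are introduced here purely for illustration.

```python
# Maximum voltage drop in mV from Table 4.3, keyed by (width_um, pitch_um, scenario).
MAX_DROP_MV = {
    (5, 200, 1): 81, (5, 200, 2): 63, (5, 200, 3): 43,
    (5, 100, 1): 73, (5, 100, 2): 53, (5, 100, 3): 40,
    (10, 200, 1): 64, (10, 200, 2): 46, (10, 200, 3): 41,
    (10, 100, 1): 60, (10, 100, 2): 40, (10, 100, 3): 38,
}
BUDGET_MV = 50  # 5% of VDD, as stated in the text

def suppression_pct(width, pitch, scenario):
    """Drop reduction of a scenario relative to scenario 1 on the same grid."""
    base = MAX_DROP_MV[(width, pitch, 1)]
    return 100.0 * (base - MAX_DROP_MV[(width, pitch, scenario)]) / base

def meets_budget(width, pitch, scenario):
    """True if the tabulated maximum drop stays within the 50 mV budget."""
    return MAX_DROP_MV[(width, pitch, scenario)] <= BUDGET_MV

for (width, pitch, scenario), drop in sorted(MAX_DROP_MV.items()):
    if scenario == 1:
        continue
    print(f"w={width:>2} um, p={pitch} um, scenario {scenario}: {drop} mV, "
          f"{suppression_pct(width, pitch, scenario):.1f}% vs scenario 1, "
          f"within budget: {meets_budget(width, pitch, scenario)}")

# Doubling the width versus halving the pitch, starting from 5 um / 200 um.
gain_width = MAX_DROP_MV[(5, 200, 1)] - MAX_DROP_MV[(10, 200, 1)]
gain_pitch = MAX_DROP_MV[(5, 200, 1)] - MAX_DROP_MV[(5, 100, 1)]
print("scenario 1: doubling width saves", gain_width, "mV;",
      "halving pitch saves", gain_pitch, "mV")
```

For scenario 1, the tabulated values give a larger saving from doubling the line width than from halving the pitch, in line with the observation that line width has the stronger influence on the maximum voltage drop.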

#### 4.5 Summary

An on-chip wide-bandwidth linear voltage regulator with an adaptive biasing technique is presented in this chapter. This adaptively biased regulator enhances the transient response by increasing the bias current under heavy load, while keeping the quiescent current low to maintain high current efficiency under light load. The maximum load current is 200 mA, with a load regulation of 22.22 µV/mA and a line regulation of 2.86 mV/V.

To further explore the voltage fluctuations across the entire planar system, power delivery network analyses that consider the placement of the voltage regulator modules and the width and pitch of the power lines are also introduced in this chapter.

#### Chapter 5

## **Substrate Noise Suppression for Power Integrity of TSV 3D Integration**

In this chapter, a substrate noise suppression technique is proposed for power integrity of TSV 3-D integrations. This substrate noise suppression technique reduces both substrate and TSV coupling noises using active substrate decouplers (ASDs) to absorb the substrate noise current. Additionally, the ASD placing is also presented to suppress noises effectively for different 3D structures. The proposed substrate noise suppression technique can enhance the power integrity of TSV 3D-ICs by reducing the coupling substrate noises.

#### **5.1 Substrate Noise Reduction Techniques**

Three-dimensional (3D) integration technology can provide enormous advantages in achieving multi-functional integration, microminiaturizing form factor, improving system speed and reducing power consumption for future generations of ICs. In addition, through-silicon-via (TSV) has emerged as a solution in developing 3D integration. However, stacking multiple dies would face a severe challenge of the power integrity due to the increasing current density and parasitic impedance in TSV 3D-ICs [5.1], [5.2]. Therefore, we used active DECAPs to suppress simultaneous switching noises from packages [5.3]. The active DECAPs significantly affect the design of the power/ground (P/G) networks in TSV 3D integration.

In addition to the simultaneous switching noise from the package, large coupling noises, such as ground bounce noises and substrate noises, are also coupled through the shared substrate or the TSVs [5.1]. These noise signals significantly degrade signal integrity. In view of this, noise suppression will become a critical design issue in TSV 3D heterogeneous integrations.

To deal with substrate noise and maintain high-quality power distribution, many researchers have investigated related topics, such as the evolution of substrate noise generation mechanisms [5.4], substrate noise modeling [5.5], and noise reduction techniques [5.6]-[5.12]. Among passive noise isolation methods, guard rings are the most commonly used technique for suppressing coupling noise. However, they cannot effectively address high-frequency noise. Thus, several active decoupling techniques, including the feedforward method [5.6]-[5.8] and the decoupling amplifier circuit [5.9]-[5.12], have been proposed. These two circuit-level noise suppression techniques are described briefly in the following subsections.

### 5.1.1 Noise Cancelling Technique Using Power di/dt Detector

The feedforward technique [5.6] uses a di/dt detector to sense the noise current and then injects an anti-phase current into the substrate for noise cancellation. The detailed structure of the substrate noise canceller using a di/dt detector is shown in Fig. 5.1. The power supply current of the internal circuit flows through the parasitic inductance L<sub>1</sub> of the power supply line. A pickup inductance L<sub>2</sub> in the di/dt detector, coupled to L<sub>1</sub> with a coupling coefficient K, induces a voltage proportional to di/dt. A noise-tolerant amplifier amplifies the induced voltage and generates an anti-phase current proportional to di/dt. The di/dt detector can thus be applied to cancel the substrate noise by injecting the anti-phase noise current into the substrate. Unfortunately, this anti-phase injection method cannot exactly cancel the substrate noise current because of the limited tracking speed of the amplifier.
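As a brief aside, the voltage picked up by the detector follows from the standard mutual-inductance relation between two coupled inductors. The symbols $M$, $i_1$, and $v_2$ are introduced here only for illustration and are not taken from [5.6]; the sign depends on the winding orientation, so only the magnitude is written:

$$|v_2(t)| = M \left| \frac{d i_1(t)}{dt} \right|, \qquad M = K\sqrt{L_1 L_2}$$

Here $i_1$ is the supply current through L<sub>1</sub> and $v_2$ is the voltage induced across L<sub>2</sub>, which is why the detector output is proportional to di/dt.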



Fig. 5.1. Substrate noise canceller using di/dt detector [5.6].


#### **5.1.2 Active Substrate Noise Canceller with Decoupling Amplifier**

Active decoupling amplifier circuits, which use an operational amplifier and a negative feedback capacitor to absorb the substrate current through the Miller multiplication effect, are proposed in [5.9]-[5.12]. They also use virtual grounding to keep the substrate voltage stable. Fig. 5.2 shows the concept of the active substrate noise canceller. The capacitor, C, in the feedback loop of the opamp acts as a decoupling capacitor. Its capacitance is multiplied by the gain, A(w), through the Miller effect. The resulting negative feedback causes the guard band to be virtually shorted to the reference ground line. The noise current from the substrate, which is the crosstalk, is thereby absorbed by the opamp and flows into the pair of power supply lines, VDD and VSS, rather than into the reference ground line.



Fig. 5.2. Active decoupling amplifier circuits [5.9].

To better understand the concept of active decoupling, a simple calculation was carried out for both decoupling circuits [5.11]. The noise source voltage, $V_n$, is assumed to couple through the parasitic capacitance, $C_p$, to the voltage, $V_c$, of the node to which the decoupling capacitor C is connected. The level of noise coupling, or crosstalk, that appears at each node, $V_c/V_n$, is given by

$$\frac{V_c}{V_n} = \frac{C_p}{C}\cdot\frac{1+jwC(R+jwL)}{1+jwC_p(R+jwL)} \tag{5.1}$$

for the capacitor decoupling circuit and

$$\frac{V_c}{V_n} = \frac{C_p}{\{1+A(w)\}\,C} \tag{5.2}$$

for the active decoupling circuit, where R and L are the parasitic resistance and inductance, respectively, and $C_p$ is assumed to be negligibly smaller than C.



Fig. 5.3. Noise coupling calculation results [5.9].

Fig. 5.3 illustrates that, at low frequencies, the active decoupling scheme suppresses $V_n$ more effectively than the capacitor decoupling scheme, by a factor of $\{1+A(w)\}$. When the frequency exceeds the parasitic resonant frequency of $1/(2\pi\sqrt{LC})$, the crosstalk in the capacitor decoupling scheme increases and continues to rise toward its maximum level at the other resonant frequency, $1/(2\pi\sqrt{LC_p})$; it then approaches $V_n$ (0 dB) because of the large impedance, jwL. However, the active decoupling circuit is not subject to this problem and maintains a substantial decoupling ability.
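The frequency behaviour of Eqs. (5.1) and (5.2) can be explored numerically. The sketch below is only an illustration: the element values, the opamp DC gain, and the single-pole model used for A(w) are assumptions chosen to make the two resonances visible, and they are not the values behind Fig. 5.3.

```python
import numpy as np

# Hypothetical element values, chosen only to make the two resonances visible.
CP = 1e-12      # parasitic coupling capacitance Cp (F)
C = 100e-12     # decoupling capacitance C (F)
R = 1.0         # parasitic resistance (ohm)
L = 1e-9        # parasitic inductance (H)
A0 = 25.0       # assumed opamp DC gain
F_POLE = 400e6  # assumed single-pole corner frequency of the opamp (Hz)

def passive_crosstalk(f):
    """Vc/Vn of the capacitor decoupling circuit, Eq. (5.1)."""
    w = 2 * np.pi * f
    z = R + 1j * w * L
    return (CP / C) * (1 + 1j * w * C * z) / (1 + 1j * w * CP * z)

def opamp_gain(f):
    """Assumed single-pole model of A(w); the text does not specify one."""
    return A0 / (1 + 1j * f / F_POLE)

def active_crosstalk(f):
    """Vc/Vn of the active decoupling circuit, Eq. (5.2)."""
    return CP / ((1 + opamp_gain(f)) * C)

def to_db(x):
    """Magnitude of a complex ratio in dB."""
    return 20 * np.log10(np.abs(x))

f_dip = 1 / (2 * np.pi * np.sqrt(L * C))     # first resonance, 1/(2*pi*sqrt(L*C))
f_peak = 1 / (2 * np.pi * np.sqrt(L * CP))   # second resonance, 1/(2*pi*sqrt(L*Cp))

for f in (1e6, f_dip, f_peak, 1e12):
    print(f"{f:10.3e} Hz  passive {to_db(passive_crosstalk(f)):7.1f} dB  "
          f"active {to_db(active_crosstalk(f)):7.1f} dB")
```

With these assumed values the passive curve dips at the first resonance, peaks above 0 dB at the second, and settles at 0 dB at very high frequency, while the active curve stays below the low-frequency passive level by the factor {1+A(w)} until the assumed opamp gain rolls off. Eq. (5.2) as written contains no parasitic R or L, so the sketch does not model their effect on the active circuit.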

#### **5.2 Substrate Noise Analysis and Modeling in TSV 3D Integration**

In TSV stacked 3D-ICs, large coupling noises are induced by the increasing current density. The fast state transitions in digital circuits induce substantial switching noises that are coupled to analog circuits through the silicon substrates. Fig. 5.4 shows two propagation paths of the coupling noises, which significantly degrade the power integrity in heterogeneous integration. On Path-1, the shared silicon substrate provides a noise transmission medium because of the substrate contacts and parasitic junction capacitances. This conductive path is the major noise propagation path for mixed-signal circuits in system-on-chip (SoC) designs and TSV 3D-ICs. In addition to the conductive substrate, another noise propagation path in TSV 3D-ICs, Path-2, runs from the substrate of the digital circuit to the analog power TSVs. The SiO<sub>2</sub> layer surrounding each TSV for DC isolation results in a high parasitic capacitance between the TSV and the silicon substrate. Hence, the noises from the silicon substrate can be coupled to the TSV through this large capacitance [5.1].



Fig. 5.4. Two propagation paths of substrate noises in 3D ICs.

The power TSV bundle is constructed as a rectangular array of VDD and GND pairs and can be modeled through the extracted via parameters of resistance, inductance, and capacitance [5.13]. The TSV height, size, and pitch are 50 µm, 25 µm, and 50 µm, respectively. In addition to the power TSV bundle, the silicon substrate is modeled as a mesh topology, where the resistance is 15 Ω and the capacitance is 5 fF [5.14]. To account for the realistic coupling effect between the TSVs and the silicon substrate, the substrate model is modified with parasitic capacitances that couple noises between the TSVs and the silicon substrate.

#### 5.3 Active Substrate Decoupler (ASD) Design



Fig. 5.5. Schematic of active substrate decoupler (ASD)

To suppress the substrate noises and bouncing noises, an ASD is designed using a noise suppression amplifier [5.11], as shown in Fig. 5.5. The ASD consists of an operational amplifier and a negative feedback capacitor. The negative and positive inputs of the operational amplifier are connected to the substrate and the quiescent ground, respectively. This amplifier virtually shorts the substrate to the reference ground and keeps the shared substrate quiescent. Additionally, the output and the negative input of the operational amplifier are connected in a negative feedback loop through a decoupling capacitor, $C_L$. This capacitance is multiplied by the voltage gain through the Miller multiplication effect, so more substrate noise current can be absorbed by the decoupling capacitor. Since the output capacitor $C_L$ affects the bandwidth of the ASD, the effective suppression bandwidth is controlled by $C_L$. The bandwidth is proportional to the output capacitor $C_L$. Therefore, one can potentially increase the output capacitance to improve the noise suppression at low frequency. However, this requires more chip area. To achieve relatively good performance within a small area, 10 pF is chosen for the output capacitance $C_L$.

The differential inputs of the ASD are DC biased at 660 mV and AC coupled to the substrate or the reference ground through a small capacitor (1 pF). Based on UMC 65nm CMOS SP technology, the simulation results show that the DC gain, -3dB frequency, and power dissipation are 28dB, 404MHz, and 313μW, respectively. Moreover, connecting multiple ASDs in parallel can further improve the noise reduction. However, a trade-off exists between the area overhead and the noise suppression efficiency.
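To give a feel for the Miller multiplication, the effective low-frequency capacitance seen at the substrate node can be estimated from the reported DC gain and the chosen $C_L$. This is a back-of-the-envelope sketch under the ideal relation $(1+A)\,C_L$; it ignores the gain roll-off above the -3dB frequency, and the helper name `db_to_ratio` is introduced here for illustration.

```python
DC_GAIN_DB = 28.0   # simulated DC gain of the ASD amplifier (from the text)
C_L = 10e-12        # feedback decoupling capacitor (from the text), in farads

def db_to_ratio(gain_db):
    """Convert a voltage gain in dB to a linear ratio."""
    return 10 ** (gain_db / 20.0)

gain = db_to_ratio(DC_GAIN_DB)
c_eff = (1.0 + gain) * C_L   # ideal Miller-multiplied capacitance at low frequency

print(f"linear gain: {gain:.1f}")
print(f"effective capacitance: {c_eff * 1e12:.0f} pF")
```

Under these assumptions, a 10 pF capacitor behaves like roughly 261 pF of passive decoupling at low frequency, which illustrates the area advantage of the active scheme.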

#### **5.4 ASD Placing for Noise Suppression**

In TSV 3D-ICs, the performance of sensitive analog circuits is degraded because large simultaneous switching noises are induced by fast switching in the digital circuits and coupled to the analog power TSVs. For the power integrity of TSV 3D-ICs, the substrate noise suppression technique with ASD placing is described in this section. In the following analysis, an inverter matrix and differential amplifiers, based on UMC 65nm CMOS technology, serve as the digital and analog circuits, respectively.

#### **5.4.1 Mixed-Signal Layer in TSV 3D Integrations**

A TSV 3D integration with a mixed-signal layer is illustrated in Fig. 5.6. The structure of this 3D integration is extended to four stacked layers, consisting of one mixed-signal layer and three digital layers. The mixed-signal circuit is placed on the top layer, Layer 4. For the analog circuitry, the power noises propagate through the substrate and the analog power TSVs.



Fig. 5.6. Block diagram of mixed-signal circuit in TSV 3-D integration.

For the ASD placing, three ASDs are placed on Layer 4 because the conductive path through the substrate is the major noise propagation path. For Layers 1-3, one ASD is placed on each digital layer. Different scenarios of ASD placing across the layers are compared in Fig. 5.7. The noise suppression effect is defined as the ratio of the RMS voltage of the power noise with ASDs to that without ASDs. Placing 3 ASDs on the mixed-signal layer achieves 3 dB more noise reduction than placing 3 ASDs on the three digital layers. Moreover, an extra 2.1 dB of noise reduction can be achieved by using 6 ASDs (3 ASDs on Layer 4 and 3 ASDs on Layers 1-3). In view of these results, placing ASDs on the mixed-signal layer reduces the power noise significantly.
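The noise suppression effect defined above can be computed directly from two sampled noise waveforms. The sketch below is a minimal illustration with a synthetic tone; the function names and the waveform are assumptions, not simulation data from this work.

```python
import numpy as np

def rms(x):
    """Root-mean-square value of a sampled waveform."""
    x = np.asarray(x, dtype=float)
    return np.sqrt(np.mean(x ** 2))

def suppression_db(noise_with_asd, noise_without_asd):
    """Noise suppression effect in dB; negative values mean less noise with ASDs."""
    return 20.0 * np.log10(rms(noise_with_asd) / rms(noise_without_asd))

# Synthetic example: a 300 MHz noise tone whose amplitude is halved by the ASDs.
t = np.linspace(0.0, 1e-7, 2000, endpoint=False)
without_asd = 20e-3 * np.sin(2 * np.pi * 300e6 * t)
with_asd = 0.5 * without_asd

print(f"{suppression_db(with_asd, without_asd):.2f} dB")   # prints -6.02 dB
```

With this sign convention, a more negative value indicates stronger suppression, which matches the way the results are quoted in this chapter.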



Fig. 5.7. Noise suppression effect of ASD planning for mixed-signal circuit in 3D structure.

For the ASD placing on the mixed-signal layer, three ASDs are placed at various locations on the layer: near the noise source, at the midpoint between the digital and analog circuits, and near the sensitive analog circuit. The noise suppression effects of these placements are shown in Fig. 5.8.

At 300 MHz, the maximum suppression effects are -11.8 dB and -9.2 dB when the ASDs are placed near the analog and the digital circuit, respectively. Placing the ASDs near the analog circuit yields the largest noise suppression effect among the placements considered. Moreover, over the range from 100 MHz to 1 GHz, the noise suppression effect of placing the ASDs near the analog circuit is almost 2 dB larger than that of placing them near the digital circuit. Therefore, placing ASDs near the analog circuit is effective in reducing the power noise of the sensitive analog circuit. The substrate noise is reduced significantly when the ASDs are placed close to the sensitive circuits rather than close to the noise source. This phenomenon results from the virtual shorting characteristic of the ASD. Additionally, ASD placing on the mixed-signal layer is suitable not only for TSV 3D-ICs but also for SoC applications.



Fig. 5.8. Noise suppression effect on mixed-signal layer.

#### **5.4.2 Separated Analog Layer in TSV 3D Integrations**

TSV 3D structures with separated digital and analog layers are illustrated in Fig. 5.9: a four-layer stack with one analog layer and three digital layers. Noise propagated through a shared substrate does not exist in these structures. However, the simultaneous switching noises are propagated to the analog circuits through the analog power TSVs. The noise propagation paths depend on the stacking order of the analog layer in the TSV 3D-IC. Fig. 5.10 compares the noise for different positions of the analog layer. Clearly, placing the analog layer as the bottommost stratum achieves the most significant substrate noise reduction because of the long noise propagation path and the small impedance of the analog power TSVs.



Fig. 5.9. TSV in 3-D integration. (a) Analog circuit on the top layer. (b) Analog circuit on the bottom layer.

Fig. 5.9(a) and Fig. 5.9(b) present the block diagrams of placing the analog layer at the top (worst case) and at the bottom (best case), respectively. For the ASD placing, two options are evaluated: three ASDs placed on the analog layer, and three ASDs distributed over the digital layers. Fig. 5.11 compares the noise suppression in the best and worst cases. In the distributed option, one ASD is assigned to each digital stratum and placed near either the digital TSV bundles or the analog TSV bundles.

For the worst case, the noise suppression effect reaches -10.3 dB when the three ASDs are placed within the analog layer. On the other hand, the noise suppression effect is -5.0 dB when the ASDs are distributed on the three digital layers near the analog power TSVs.

Additionally, the noise suppression effect improves from -4.0 dB to -5.0 dB when the ASDs are moved from the vicinity of the digital power TSVs to that of the analog power TSVs. The simulation results for the best case, shown in Fig. 5.11(b), are similar to those for the worst case. However, in the best case, moving the ASDs from the vicinity of the digital power TSVs to that of the analog power TSVs weakens the noise suppression effect from -6.0 dB to -3.9 dB. In view of this, ASDs should be distributed on the noise propagation path near the sensitive circuits.



Fig. 5.10. Noise comparison with different analog layers



Fig. 5.11. Noise suppression effect. (a) Analog circuit on the top layer. (b) Analog circuit on the bottom layer.

#### **5.4.3 ASD Placing in TSV 3D Integrations**

With different 3D structures, the distribution of ASDs affects the efficiency of the ASD placing. The guideline for ASD placing is that ASDs should be distributed on the noise propagation path near the sensitive circuits. Table 5.1 summarizes the ASD placing for different 3D structures. Since the noise propagated within a layer is larger than the noise coupled from other layers, ASDs should be placed on the mixed-signal layer near the analog circuits to achieve significant noise reduction. To reduce the noise coupled between layers effectively, ASDs should be placed near the analog power TSVs when the analog layer is stacked on top, and near the digital power TSVs when it is stacked at the bottom.

Table 5.1. ASD placing under different TSV 3D structure.

| 3D structure                                             | ASD placing                                                                    |
|----------------------------------------------------------|--------------------------------------------------------------------------------|
| Mixed-Signal 2D ICs                                      | Place ASDs near analog circuits                                                |
| TSV 3D-ICs<br>(Mixed-Signal Layer)                       | Rule 1: Place ASDs in MS layers<br>Rule 2: Place ASDs near analog circuits     |
|                                                          | Suggestion: Analog layer at the bottom layer                                   |
| TSV 3D-ICs<br>(no mixed-signal layer)<br>(TSV coupling noises) | Analog layer at top layer: Place ASDs near analog power TSVs in each layer<br>Analog layer at bottom layer: Place ASDs near digital power TSVs in each layer |

#### 5.5 Summary

In this chapter, a substrate noise suppression technique is presented for TSV 3D-ICs that considers both substrate and TSV coupling noises. The technique reduces noise using ASDs, which utilize a decoupling capacitor to absorb the substrate noise current. To achieve effective noise reduction, ASD placing is also presented for different 3D structures. The proposed technique can therefore enhance the power integrity of TSV 3D-ICs.

#### Chapter 6

# Power Integrity for Heterogeneous 3D Integration (Case Study)

Three-dimensional integration is an emerging technology that vertically stacks and interconnects multiple materials, technologies, and functional components to form highly integrated systems. This 3D integration is expected to lead to an industry paradigm shift because of its tremendous benefits. However, stacking multiple dies poses a severe power integrity challenge because of the increasing current density and parasitic impedance in TSV 3D-ICs. The question of how to design and analyze such heterogeneous systems therefore arises.


In this chapter, a case study of power integrity for 3D heterogeneous integration is analyzed. The heterogeneous integration is simulated with current profiling models of multi-core processors, static random access memory (SRAM), dynamic random access memory (DRAM), and front-end analog circuits. Furthermore, the techniques presented in Chapter 3, Chapter 4, and Chapter 5 are combined into a hierarchical power delivery system that delivers multiple, low-noise, and well-controlled power supplies to the 3D heterogeneous integration. Finally, the power integrity of the proposed hierarchical power delivery system is compared with that of the general power delivery structure, in which only global supply voltages across the entire 3D chip are assumed.

#### 6.1 Various 3D Chips Stacking

Recently, 3-D integration of chips, such as a stacked reconfigurable SRAM on a SoC based on micro-bump technology [6.1], a high-speed low-power 3D SRAM based on TSV [6.2], a stack composed of an SRAM on a CPU based on TSV [6.3], a three-tier L2 cache structure for 3D processor-memory integration [6.4], multiple stacked DRAMs based on through-silicon-via (TSV) technology [6.5], and a 1Gb DRAM stacked on a multi-core processor [6.6] have been developed. These 3-D integration technologies increase the number of I/O pins between chips drastically because they can be placed anywhere on the chips. They also reduce the parasitic capacitance of the interconnection between chips. These two factors enable faster data rate and lower-power chip-to-chip communication compared with conventional 2-D interconnection.



Fig. 6.1. A chip-stacked memory using 3D packaging technology [6.1].

Fig. 6.1 shows 3-D packaging technology applied to a 1Mb chip-stacked memory [6.1]. The local memories on a SoC are moved to a memory chip, and the IP cores are moved to a logic chip. This allows for an increase in memory size, and the separate chip can be fabricated in a low-cost memory-specified fabrication process.

By using memory-specified network interconnects, area overhead of network interconnects for the memory chip is reduced by 63% and the latency overhead by 43%.

H. Nho *et al.* present a novel 3D-SRAM architecture [6.2] that can be used to extend the scaling of SRAM, as shown in Fig. 6.2. In this architecture, the local bit-lines are vertical and connect through select transistors to the global bit-lines routed on the bottom level. As a result, the length and capacitance of the global bit-lines depend only on the number of local bit-lines, not on the total number of cells. Thus, this architecture significantly reduces the bit-line capacitance and achieves a 3.4 times reduction in active power consumption and a 1.8 times reduction in access time.



Fig. 6.2. A high-speed, low-power 3D-SRAM architecture [6.2].

Fig. 6.3 illustrates the 3D integrated SRAM with a TFLOP processor [6.7] that delivers 1 TFLOP (Tera-FLOP) floating point compute performance [6.3]. The processor tiles are connected together with a bidirectional interface to the 3D integrated SRAM tile. The logic interface in the SRAM tile decodes memory access instructions and performs the memory operations instructed by the processor core. The SRAM tile sustains 12GB/sec memory bandwidth to each core, and provides 1TB/sec total bandwidth necessary to sustain overall TFLOP performance.



Fig. 6.3. 3D integrated SRAM with TFLOP processor [6.3]

The major bottlenecks in high-performance microprocessors are small memory capacity and large miss penalty time. To address these problems, a 3-tier 192-kB L2 cache for a 3D processor-memory stack is proposed in [6.4], as shown in Fig. 6.4. In this approach, the wordlines are divided into multiple layers, resulting in reduced wordline length and memory latency. As a result, the data transfer rate of this 3D processor-SRAM integration reaches up to 96GB/sec, and its average access time improves by 13%.



Fig. 6.4. Floorplan of the 3D processor stack combining CPU and L1 cache on the bottom tier with three tiers of L2 cache stacked on top of it [6.4].

#### **6.2 Heterogeneous 3D Integration of a Processor Memory Stack**

#### **6.2.1** A Prototype System of Processor Memory Stack

Slow cache memory systems and low memory bandwidth present a major bottleneck in the performance of modern microprocessors. Over the years, a number of architectural techniques have been proposed to overcome the penalties associated with slow memory systems. However, the basic problem of insufficient bandwidth for data transfer between the processor and the memory hierarchy has never been solved, and it continues to be a bottleneck in CPU performance. As TSV 3D integration matures, Jacob *et al.* discussed the advantages of moving the memory hierarchy to independent tiers above multi-core processors to mitigate the memory wall effects [6.8]. Such an architecture requires multiple wide buses that are feasible only with 3D chip stacking using ultra-small and dense vertical TSVs.

To demonstrate the benefits discussed above, a heterogeneous 3D integration of a processor memory stack is built. The heterogeneous integration consists of a 16-core processor tier, an SRAM tier, a DRAM tier, and a front-end circuit tier, as shown in Fig. 6.5. The multi-core processor with L1 cache is located on the bottom layer. A 1Mb SRAM-based L2 cache, which is smaller but operates faster, is sandwiched between the processor chip and the main memory chip. A 128Mb DRAM memory chip, which can hold more data but operates more slowly, is stacked on top of the L2 cache. Such a 3D structure with multiple levels of memory hierarchy alleviates the memory wall problem and increases the throughput of the multi-core processor.

Additionally, for wireless communication between the antenna and the digital baseband, a front-end RF/analog module is integrated into the processor-memory stack. Because the multi-core processor and the memory hierarchy form an inseparable architecture, the front-end circuit can only be stacked on the top stratum. This front-end RF/analog module consists of a receiver and a transmitter for wireless signal processing. The modulated/demodulated signals can then be exchanged with the multi-core processor and the memories.



Fig. 6.5. Heterogeneous integration of multi-core, SRAM, DRAM, front-end circuits stacking.

#### 6.2.2 Architecture of the prototype system

A well-known technique for creating high-bandwidth caches is to divide the cache into multiple independently addressed banks and interleave their data buses. However, as the cache size increases, this approach leads to longer interconnect wires and therefore larger wire delays. The problem can be solved by integrating the multiple memory arrays or banks that form the cache in the vertical direction. If the 3-D technology is aggressive enough, much wider address and data bus paths are possible, and many more ports for simultaneously accessing independent blocks or tiers of memory become feasible. Thus, a multi-core processor and SRAM-based L2 cache system is introduced in Fig. 6.6.

This system has three advantages. First, each pair of cores (i.e., a processor core and an SRAM core) can be operated independently by using bidirectional TSVs. This memory architecture greatly increases the flexibility of memory system design. The second advantage is an ultra-high-speed interface. Small TSVs are used to integrate thousands of I/O pins on a chip, and the interconnect parasitics between stacked chips are greatly reduced. These features make a high-speed interface between chips possible. The third advantage is the reduction of power consumption. TSVs shorten the interconnect length between the CPU and the SRAM chip. Moreover, since the SRAM chip is divided into multiple cores, the data bus on the SRAM chip also becomes short. The shorter interconnect length reduces the power consumption of off-chip data communication, and the shorter data bus reduces that of on-chip data communication.



Fig. 6.6. Multi-core processor that features core-to-core connection with TSVs

The 3D processor-memory stack architecture can be extended by stacking a main memory on top of the SRAM-based L2 cache, with a very wide data bus between the two. Therefore, a 128Mb DRAM stratum stacked on the third layer acts as the main memory for the multi-core processor. Fig. 6.7 shows the organization of the main memory. The 128 Mb DRAM is divided into two memory cell arrays, and the two identical arrays are placed symmetrically on each side of the DRAM stratum. The peripheral circuits, decoders, and sense amplifiers are placed in the middle of the chip for optimal distribution of the control signals to all the arrays in this tier. The data and addresses are then routed between the peripheral circuits of the DRAM cells and the L2 cache through signal TSVs.



Fig. 6.7. A 128Mb DRAM stratum as a main memory for multi-core processor.

#### **6.3 Power Delivery for the Processor Memory Stack**

#### **6.3.1 Hierarchical Power Delivery System**

3D integration offers novel architectural opportunities for microprocessors as discussed above. To support such a multiple voltage domain system, the techniques presented in Chapter 3, Chapter 4, and Chapter 5 are combined together to be a hierarchical power delivery system for delivering low-noise, and well-controlled power supplies to the 3D processor memory stack.

The hierarchical power delivery system applied to the processor memory stack is shown in Fig. 6.8. This system contains four noise reduction techniques: active switching DECAPs to reduce the resonant noise caused by the package, an area-efficient power TSV optimization method for appropriate TSV planning, linear voltage regulators to provide clean local supply voltages, and active substrate decouplers (ASDs) to suppress the coupling noise through the TSVs and the shared substrate. All these supply stabilization techniques are used to achieve better power noise suppression in both the global and the local power networks.

Since the active DECAPs act as a global noise regulator that reduces the resonant noise caused by the package and TSVs, they are sandwiched between the package and the multi-core processor. Moreover, the global and the local power networks are decoupled. Power domains can be defined on the local power networks, and each power domain is powered by a dedicated voltage regulator with the requested voltage. In the power hierarchy, the global power network (global power TSVs and active DECAPs) is the first layer and the voltage regulators are the second-layer devices. For simplicity, the ASDs are not shown in this figure.



Fig. 6.8. Hierarchical power delivery system applied to the processor memory stack.

The feasible supply voltages of each power domain used in this work are listed in Table 6.1. The first two layers, the 16-core processor and the 1Mb SRAM-based L2 cache, both operate at 1.0V. The DRAM cells use a lower VDD, 1.0V, to increase the write ability and reduce power consumption, whereas the peripheral circuits use a higher operating voltage, 1.35V, to increase the operation speed. As for the front-end circuits, the operating voltage is assumed to be 1.2V. Meanwhile, the inserted voltage regulators induce a voltage drop. Therefore, the voltage sources of the global power networks are raised (10% VDD<sub>int</sub>) in the proposed structure to accommodate these voltage drops.

Table 6.1. Supply voltages of each power domain.

| Stacking order | Circuits          | VDD <sub>ext</sub>            | VDD <sub>int</sub>             |
|----------------|-------------------|-------------------------------|--------------------------------|
| Layer 1        | 16-core processor | 1.1 V                         | 1.0 V                          |
| Layer 2        | 1Mb SRAM          | 1.1 V                         | 1.0 V                          |
| Layer 3        | 128Mb DRAM        | 1.1 V (cell)<br>1.5 V (peri.) | 1.0 V (cell)<br>1.35 V (peri.) |
| Layer 4        | Front-End circuit | 1.3 V                         | 1.2 V                          |
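
The relationship between the two voltage columns of Table 6.1 can be checked with a short sketch. The function name `external_supply` and the rounding to one decimal place are assumptions made here for illustration; the text only states that the global sources are raised by roughly 10% of VDD<sub>int</sub>.

```python
def external_supply(vdd_int, headroom=0.10):
    """Global (unregulated) supply for a regulated local supply.

    The 10% default headroom follows the text; everything else is illustrative.
    """
    return vdd_int * (1.0 + headroom)


# Regulated local supplies (VDD_int) taken from Table 6.1.
domains = {
    "16-core processor": 1.0,
    "1Mb SRAM": 1.0,
    "DRAM cell": 1.0,
    "DRAM peripheral": 1.35,
    "Front-end circuit": 1.2,
}

for name, vdd_int in domains.items():
    vdd_ext = external_supply(vdd_int)
    print(f"{name}: VDD_int = {vdd_int:.2f} V, VDD_ext = {vdd_ext:.3f} V "
          f"(about {round(vdd_ext, 1)} V)")
```

Rounded to one decimal place, the computed values agree with the VDD<sub>ext</sub> column of Table 6.1.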

#### 6.3.2 Power Delivery Model and Current Profiling Model

In order to analyze power integrity, we first build the power delivery model for the entire 3D processor memory stack system. The on-chip power delivery network consists of an array of uniformly spaced metal wires. Fed from the external power supplies, power is transmitted from the bottom tier to the top tier through power TSVs. Following the power hierarchy, the local voltage regulators then convert the global supply to the voltage requested by each local power domain, as shown in Fig. 6.9.

The power grid is modeled as a regular RL matrix. Here, we assume the footprint of the processor memory stack is 2mm x 2mm. The width and the pitch of the power lines are set to $10\mu m$ and $100\mu m$, respectively. The unit length inductance is 0.05pF and the sheet resistance is $0.150m\Omega$. Thus, the power grid of each layer can be constructed as a $20 \times 20$ mesh.
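
The grid resolution follows directly from the footprint and the power line pitch. The sketch below is illustrative only: it assumes that the 20 x 20 figure counts grid nodes and that every pair of neighbouring nodes is joined by one series RL branch, neither of which is stated explicitly in the text. The helper names are hypothetical.

```python
def grid_size(footprint_um, pitch_um):
    """Number of power lines per side for a square footprint."""
    return round(footprint_um / pitch_um)


def mesh_branches(n):
    """Nearest-neighbour branches in an n x n node mesh (assumed topology)."""
    horizontal = n * (n - 1)  # n rows, each with n - 1 segments
    vertical = n * (n - 1)    # n columns, each with n - 1 segments
    return horizontal + vertical


n = grid_size(footprint_um=2000, pitch_um=100)
print(f"grid: {n} x {n}, nodes: {n * n}, RL branches: {mesh_branches(n)}")
```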



Fig. 6.9. Power delivery network model.

By adjusting the power information of [6.1], [6.6], [6.9], [6.11] to fit our processor memory stack structure, we assume the maximum power consumption of the multi-core processor, SRAM-based L2 cache, DRAM memory, and front-end circuits to be 1.6W, 0.36W, 1.5W, and 0.25W, respectively. The ratio between dynamic power and static power is set to around 70%:30%. Table 6.2 shows the parameters of the current load models. Typically, switching circuits are represented by triangular waveforms [6.9], [6.10]. Thus, the current loads drawn from the grid are modeled as triangular waveforms except for the front-end circuits, as shown in Fig. 6.10. Since the front-end circuit handles analog signal reception and transmission, its current load is modeled as a sinusoidal waveform. Moreover, the current model is flexible enough to represent various switching activities and frequencies within switching circuits by assigning an appropriate peak current, rise and fall times, and frequency to each current source.

Evenly distributing the current profiling models to the corresponding power grids, we can capture the effect of supply noise on the power delivery network to mimic the actual switching circuits.

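| Circuits            | Current load model  | (T<sub>r</sub>, T<sub>f</sub>, T)                          | I<sub>MAX</sub>               | I<sub>Leakage</sub>           |
|---------------------|---------------------|------------------------------------------------------------|-------------------------------|-------------------------------|
| Per processor core  | Triangular waveform | (100ps, 150ps, 500ps)                                      | 100mA                         | 30mA                          |
| Per processor block | Triangular waveform | (1ns, 1.5ns, 5ns)                                          | 22.5mA                        | 9mA                           |
| 128Mb DRAM          | Triangular waveform | (1.5ns, 1.5ns, 5ns) (cell)<br>(1.5ns, 1.5ns, 5ns) (peri.)  | 750mA (cell)<br>555mA (peri.) | 250mA (cell)<br>165mA (peri.) |
| Front-End circuit   | Sinusoidal waveform | (N/A, N/A, 400ps)                                          | 200mA                         | 80mA                          |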

Table 6.2. Parameters of current load model
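
The current load models can be sketched as simple periodic functions. This is a minimal illustration, not the simulation setup of the thesis: it assumes that each waveform swings between I<sub>Leakage</sub> (floor) and I<sub>MAX</sub> (peak), which the text does not state explicitly, and the function names are hypothetical. Times are in picoseconds and currents in milliamperes.

```python
import math


def triangular_load(t, i_max, i_leak, t_r, t_f, period):
    """Triangular switching current: ramp up over t_r, down over t_f, then idle.

    Assumes the current rests at i_leak and peaks at i_max once per period.
    """
    tau = t % period
    if tau < t_r:
        return i_leak + (i_max - i_leak) * tau / t_r
    if tau < t_r + t_f:
        return i_max - (i_max - i_leak) * (tau - t_r) / t_f
    return i_leak


def sinusoidal_load(t, i_max, i_leak, period):
    """Sinusoidal current oscillating between i_leak and i_max (assumed range)."""
    mid = 0.5 * (i_max + i_leak)
    amplitude = 0.5 * (i_max - i_leak)
    return mid + amplitude * math.sin(2.0 * math.pi * t / period)


# One processor core from Table 6.2: (100ps, 150ps, 500ps), 100mA peak, 30mA leakage.
for t in (0, 50, 100, 175, 250, 400):
    print(t, "ps:", triangular_load(t, 100.0, 30.0, 100.0, 150.0, 500.0), "mA")

# Front-end circuit from Table 6.2: 400ps period, 200mA peak, 80mA floor.
print(sinusoidal_load(100.0, 200.0, 80.0, 400.0), "mA")
```

Changing the peak current, rise and fall times, or period per source is what gives the model the flexibility described above.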



Fig. 6.10. Current load model

#### 6.3.3 TSV planning

The power TSVs of the processor memory stack structure can be categorized into three groups by their supply voltages. The first group, named the major power TSV group, delivers the unregulated 1.1V supply to the local voltage regulators of the multi-core processor, SRAM, and DRAM array cells. The second and the third power TSV groups are used as the global power supply sources of the DRAM peripheral circuits and the front-end circuits, respectively.

In order to minimize the area occupied by the power TSV groups, an area-efficient power TSV planning method was proposed in Chapter 3. Recalling the equation derived in Section 3.3.4, the supply noise in a T-layer 3D structure can be estimated by the following equation.

$$V_{noise} = 2 \times \frac{I_{supply}}{N} \times T \times \frac{(aD_{\mu m} + b)^{2}}{t_{r(ns)}(aD_{\mu m} + b) - t_{r(ns)}^{2} \frac{K}{3D_{\mu m}^{2}}}$$
(6.1).

In this structure, the filling material of the power TSVs is assumed to be copper and the height of the power TSVs is set to 100µm. The other related design parameters are shown in Table 6.3. By substituting the parameters into equation (6.1), we can obtain the most appropriate number of pairs and diameter for each power TSV group.

Table 6.3. Parameters and results of TSV planning.

| TSV group          | Current loads                                      | T<sub>r</sub> | V<sub>tolerance</sub> | TSV<br>(Number, Size) | Area<sub>TSV</sub>   |
|--------------------|----------------------------------------------------|---------------|-----------------------|-----------------------|----------------------|
| Major              | 1600mA (core)<br>360mA (SRAM)<br>750mA (DRAM)      | 0.1ns         | 100mV                 | (107, 10)             | 63900µm<sup>2</sup>  |
| DRAM (peri.)       | 555mA                                              | 1.5ns         | 100mV                 | (6, 10)               | 3300µm<sup>2</sup>   |
| Front-End circuits | 200mA                                              | 0.05ns        | 50mV                  | (16, 10)              | 9300µm<sup>2</sup>   |

For example, to plan the third power TSV group, which is the global supply source of the front-end circuits,  $I_{supply}$=200mA and  $T_r$=0.05ns are substituted into equation (6.1). Since analog circuitry is much more sensitive to coupling noise than digital circuitry, the tolerable noise voltage of the front-end circuits is set to 50mV, whereas that of the digital circuits is set to 100mV. After multiple iterations with equation (6.1), the optimized number of pairs and diameter of the power TSV group used for the front-end circuits were chosen as 16 pairs and 10µm, as listed in Table 6.3. Similar results can be obtained for the major TSV group and the TSV group used for the DRAM peripheral circuits. Note that, although the current load of the DRAM peripheral circuits is about 2.8 times that of the front-end circuits (555mA versus 200mA), the second power TSV group still needs fewer pairs than the third. This is because the switching time has more influence on the supply noise than the total current load.
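
The iteration over equation (6.1) can be sketched as a search for the smallest number of TSV pairs that meets the noise tolerance at a given diameter. This is only an illustration of the procedure: the fitted coefficients `a`, `b`, and `k` come from Chapter 3 and are not reproduced in this chapter, so the values in the usage lines are arbitrary placeholders chosen to make the arithmetic easy to follow, and the function names and the validity check on the denominator are assumptions.

```python
def supply_noise(i_supply, n_pairs, layers, d_um, t_r_ns, a, b, k):
    """Evaluate equation (6.1) for one TSV group.

    a, b, k are the fitted coefficients of Chapter 3 (not given here).
    """
    x = a * d_um + b
    denom = t_r_ns * x - t_r_ns ** 2 * k / (3.0 * d_um ** 2)
    if denom <= 0:
        # Assumption: a non-positive denominator means the fit is out of range.
        raise ValueError("equation (6.1) is not valid for these inputs")
    return 2.0 * (i_supply / n_pairs) * layers * x ** 2 / denom


def min_pairs(v_tol, max_pairs=10000, **params):
    """Smallest pair count whose estimated noise does not exceed v_tol."""
    for n in range(1, max_pairs + 1):
        if supply_noise(n_pairs=n, **params) <= v_tol:
            return n
    return None  # tolerance not reachable within max_pairs


# Placeholder coefficients only; they are NOT the values used in the thesis.
placeholder = dict(i_supply=1.0, layers=1, d_um=10.0, t_r_ns=1.0,
                   a=1.0, b=0.0, k=0.0)
print(supply_noise(n_pairs=1, **placeholder))  # noise with a single pair
print(min_pairs(2.5, **placeholder))           # pairs needed for a 2.5 tolerance
```

Because the estimated noise scales with 1/N, the search always terminates once N is large enough; repeating it over a range of candidate diameters and keeping the combination with the smallest TSV area would mirror the planning step described above.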

Once the number of pairs and the diameter of each power TSV group have been decided, the three power TSV groups are applied to the processor memory stack structure. Since the architecture is symmetrical with respect to the x-axis, we reserve power TSV regions at both sides and in the middle of each tier for TSV placement. Considering the floorplan of the architecture, the power TSV group used for the DRAM peripheral circuits is placed in the middle of the stack. The power TSV group used for the front-end circuits is divided into two bundles, which are placed at the two sides of the stack. Finally, the major power TSV group is evenly distributed within the remaining power TSV regions.

#### **6.4 Simulation Results**

The 3D heterogeneous integration of the processor memory stack is simulated based on UMC 65nm CMOS technology and the TSV model [6.12]. The nominal supply voltage for the processor, SRAM blocks, and DRAM array cells is 1.0V. The nominal supply voltages for the DRAM peripheral circuits and the front-end circuits are 1.35V and 1.2V, respectively. There are 14 power pins surrounding this structure: 8, 4, and 2 power pins connect to the external 1.0V, 1.35V, and 1.2V power supplies, respectively. Four of the 1.0V power pins are located at the sides and the other four 1.0V power pins are located in the middle of the stack. All four 1.35V power pins are located in the middle of the stack, and the 1.2V power pins are located at the two sides, with one power pin on each side.

The footprint of this structure is assumed to be 2mm x 2mm, and the width and the pitch of the local power lines are 10µm and 100µm. Thus, the resolution of the processor memory stack is 20 x 20 for each layer. By evenly distributing the current profiling models to the corresponding power grids, we can capture the effect of supply noise on the power delivery network to mimic the actual switching circuits.

To support such a multiple voltage domain system, the techniques presented in Chapter 3, Chapter 4, and Chapter 5 are combined into a hierarchical power delivery system that delivers low-noise and well-controlled power supplies to the 3D processor memory stack. Based on the power design flow discussed in Chapter 3, the noise reduction techniques are applied to the stack sequentially.

As the first step, the active DECAPs are used to reduce the resonant noise caused by the package and TSVs. Fig. 6.11 compares the simulated power supply waveforms with and without active DECAPs. By using two active DECAPs for each VDD power pin on the bottom layer, the total noise of the 1.0V power pair (VDD and GND) is reduced from 75.86mV to 63.52mV. Similarly, the total noise of the 1.35V power pair is reduced from 75.62mV to 66.07mV and that of the 1.2V power pair from 21.81mV to 15.17mV. Here, noise means the root mean square (RMS) value of the difference between a VDD or GND voltage and its nominal voltage. For example, the supply noise on a nominal 1.0V power supply is calculated as |VDD<sub>i</sub>-1.0|<sub>RMS</sub>. Thus, the total noise of the 1.0V power pair is calculated as |VDD<sub>i</sub>-1.0|<sub>RMS</sub> + |GND<sub>i</sub>-0|<sub>RMS</sub>.
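
The noise metric and the resulting reductions can be reproduced with a few lines of arithmetic. The sketch below is illustrative: the helper names are hypothetical and the two-sample waveform is invented purely to exercise the RMS function, while the before/after noise values are the ones quoted above.

```python
import math


def rms(samples):
    """Root mean square of a sequence of values."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def rail_noise(samples, nominal):
    """RMS deviation of a sampled rail voltage from its nominal value."""
    return rms([s - nominal for s in samples])


def reduction_percent(before, after):
    """Relative noise reduction in percent."""
    return 100.0 * (before - after) / before


# Invented two-sample waveform, only to show how the metric is evaluated.
print(rail_noise([1.01, 0.99], nominal=1.0))  # about 0.01 V

# Total pair noise (mV) without and with active DECAPs, as quoted in the text.
pairs = {"1.0V": (75.86, 63.52), "1.35V": (75.62, 66.07), "1.2V": (21.81, 15.17)}
for name, (before, after) in pairs.items():
    print(f"{name} pair: {reduction_percent(before, after):.2f}% reduction")
```

The computed reductions agree, to within rounding in the last digit, with the active-DECAP-only values reported for Fig. 6.15.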



Fig. 6.11. Simulation waveforms of voltage performance while using active DECAPs (the blue line) and that without active DECAPs (the grey line).

Moreover, the global and the local power networks are decoupled by adaptively biased voltage regulators. Each power domain is powered by a dedicated voltage regulator with the requested voltage. According to the workloads on the local power networks, 8, 4, 8, and 2 voltage regulators are used on layer 1 to layer 4, respectively. Since the output voltage of a voltage regulator is locked to its reference voltage through an error amplifier, the power supplies provided by the local voltage regulators are more stable than those connected to the power TSVs directly, as shown in Fig. 6.12. The noise reductions of the 1.0V, 1.35V, and 1.2V power supplies are 75.71%, 53.76%, and 76.06%, respectively.

Additionally, the active substrate decouplers (ASDs) are used to suppress the substrate noise and coupling noise in the 3D structure. According to the simulation results in Chapter 5, the ASDs should be distributed along the noise propagation path for better noise reduction. Thus, 8 ASDs are distributed around the ground TSVs in each layer to reduce more coupling noise. Fig. 6.13 shows the simulation waveforms of the three ground supplies. By absorbing the substrate noise current and virtually shorting the substrate to the reference ground, the ASDs keep the ground supplies quiet. The noise reductions of the ground supplies for the processor, DRAM peripheral circuits, and front-end circuits are 59.05%, 22.89%, and 50.40%, respectively.



Fig. 6.12. Simulation waveforms of voltage performance while using local voltage regulators (the blue line) and that connecting to power TSV directly (the grey line).



Fig. 6.13. Simulation waveforms of voltage performance while using ASDs to suppress substrate noises (the blue line) and that without ASDs (the grey line).

The overall comparison of the voltage performance with and without the hierarchical power delivery system is shown in Fig. 6.14. Thanks to the voltage regulators and ASDs, the supply voltages and the ground voltages become more reliable, with smaller voltage fluctuations. The noise on the 1.0V, 1.35V, and 1.2V power supply pairs is reduced by 70.51%, 45.71%, and 71.10%, respectively.



Fig. 6.14. Simulation waveforms of voltage performance while using the hierarchical power delivery system (the blue line) and that without hierarchical power delivery system (the grey line).

Fig. 6.15 shows the noise reductions obtained as the hierarchical power delivery system is applied step by step. When only the active DECAPs are used, the noise reductions of the 1.0V, 1.35V, and 1.2V power supply pairs are 16.26%, 12.62%, and 30.44%, respectively. When the voltage regulators are introduced to the 3D structure, the noise reductions of the 1.0V, 1.35V, and 1.2V power supply pairs improve greatly to 47.38%, 41.29%, and 60.28%, respectively. Moreover, when the ASDs are adopted into the 3D structure, the noise reductions of the 1.0V, 1.35V, and 1.2V power supply pairs improve further to 70.52%, 45.71%, and 71.10%, respectively.

Fig. 6.15. Noise reductions of each power supply pair (a) with active DECAPs only, (b) with active DECAPs and voltage regulators, (c) with active DECAPs, voltage regulators, and ASDs.

For a fair comparison, the extra decoupling capacitance used in the active DECAPs, voltage regulators, and ASDs is replaced by an equivalent capacitance in the reference simulations. Taking Fig. 6.15(a) as an example, when the active DECAPs are removed from the processor memory stack, an equivalent capacitance of 400pF is filled into the freed area as an experimental control. Similarly, when the voltage regulators are removed, they are replaced by an equivalent capacitance of 100pF~200pF (depending on the total amount of decoupling capacitance used in each layer), and when the ASDs are removed, they are replaced by an equivalent capacitance of 60pF in each layer. All the equivalent capacitances are evenly distributed within the vacant area. As a result, the voltage regulators have the greatest effect on noise reduction in the hierarchical power delivery system.
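
One way to read the step-by-step results is to treat the three percentages quoted for each supply pair as cumulative reductions and to take their differences as the contribution of each added technique. This interpretation is an assumption of the sketch below (it is supported by the last stage matching the overall reductions of Fig. 6.14), and attributing gains in percentage points is only one of several possible conventions.

```python
# Cumulative noise reductions (%) per supply pair, as quoted for Fig. 6.15(a)-(c).
reductions = {
    "1.0V": (16.26, 47.38, 70.52),
    "1.35V": (12.62, 41.29, 45.71),
    "1.2V": (30.44, 60.28, 71.10),
}
steps = ("active DECAPs", "voltage regulators", "ASDs")


def increments(cumulative):
    """Percentage-point gain added by each successive technique."""
    gains, previous = [], 0.0
    for value in cumulative:
        gains.append(round(value - previous, 2))
        previous = value
    return gains


per_pair = {name: increments(values) for name, values in reductions.items()}
for name, gains in per_pair.items():
    print(name, dict(zip(steps, gains)))

# Average gain of each technique across the three supply pairs.
mean_gain = {
    step: sum(gains[i] for gains in per_pair.values()) / len(per_pair)
    for i, step in enumerate(steps)
}
best = max(mean_gain, key=mean_gain.get)
print("largest average contribution:", best)
```

Under this accounting the voltage regulators contribute the largest average gain, although on the 1.2V pair the active DECAP step and the regulator step are of similar size.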



Fig. 6.16. Effective supply voltages across the 3D structure. (a) the effective supply voltages for processor (1.0V), and (b) the effective supply voltages for front-end circuits (1.2V)

Furthermore, Fig. 6.16 shows the effective supply voltages (VDD<sub>min</sub> - GND<sub>max</sub>) of the power distribution networks across the 3D structure when the hierarchical power delivery system is used. For simplicity, only the simulation data of the bottom and the top layers are shown. The effective supply voltages across the processor tier are in the range of 0.968V~0.976V, as shown in Fig. 6.16(a), so the maximum voltage difference is only 8mV. The effective supply voltages across the front-end circuit tier are in the range of 1.185V~1.188V, as shown in Fig. 6.16(b), so the maximum voltage difference is only 3mV. Here, the magnitude of the effective supply voltage is affected by the locations of the switching circuits and the local voltage regulators: a node that is closer to a local voltage regulator has less voltage drop.

Fig. 6.17 shows the power overhead breakdown of each power component. The total power overhead is 41.27mW, of which the active DECAPs consume 7.99mW, the ASDs consume 13.17mW, and the adaptively biased regulators consume 20.11mW. This power overhead is only 1.11% of the total power consumption (3.7W) of the processor memory stack.
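
The overhead figures can be cross-checked with simple arithmetic. In the sketch below the total stack power is taken as the sum of the per-layer maxima given in Section 6.3.2 (1.6W, 0.36W, 1.5W, and 0.25W, i.e. 3.71W, which the text rounds to 3.7W); the variable names are illustrative.

```python
# Power overhead of each component in mW, from the Fig. 6.17 breakdown.
overhead_mw = {
    "active DECAPs": 7.99,
    "ASDs": 13.17,
    "adaptively biased regulators": 20.11,
}

# Maximum power of each layer in W, from Section 6.3.2.
stack_power_w = 1.6 + 0.36 + 1.5 + 0.25

total_overhead_mw = sum(overhead_mw.values())
overhead_percent = 100.0 * total_overhead_mw / (stack_power_w * 1000.0)

print(f"total overhead: {total_overhead_mw:.2f} mW")
print(f"stack power:    {stack_power_w:.2f} W")
print(f"overhead:       {overhead_percent:.2f} %")

for name, value in overhead_mw.items():
    print(f"  {name}: {100.0 * value / total_overhead_mw:.1f}% of the overhead")
```

The regulators account for roughly half of the overhead, which is consistent with their being the most numerous added blocks in the stack.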



Fig. 6.17. Power overhead breakdown of each power component.

### **6.5 Summary**

In this chapter, a case study of power integrity for 3D heterogeneous integration is analyzed. The heterogeneous integration is assumed to be a processor memory stack and is simulated with current profiling models. To support such a system with multiple supply voltages, the techniques presented in Chapter 3, Chapter 4, and Chapter 5 are combined into a hierarchical power delivery system that delivers multiple, low-noise, and well-controlled power supplies to the 3D heterogeneous integration. Furthermore, the power integrity of the proposed hierarchical power delivery system and that of a general power delivery structure are compared step by step. The comparison shows that the voltage regulators have the greatest effect on noise reduction in the hierarchical power delivery system, and that the noise on the power supply pairs (VDD+GND) is suppressed by up to 71.10%. Moreover, with an appropriate power delivery structure, the voltage difference is only 8mV within the entire processor tier. The power overhead of the hierarchical power delivery system is 1.11% of the power of the whole 3D processor memory stack.

## **Conclusion and Future Work**

### 7.1 Conclusion

Three-dimensional (3D) integration technology can provide enormous advantages in achieving multi-functional integration, microminiaturizing form factor, improving system speed and reducing power consumption for future generations of ICs. However, stacking multiple dies would face a severe challenge of power integrity due to the increasing current density and parasitic impedance in TSV 3D-ICs. Moreover, system heterogeneity offered by 3-D circuits has exacerbated the requirement for multiple, wide range, and well-controlled power supplies. In view of these, a hierarchical power delivery system is presented for the power integrity in TSV 3D-ICs.

The proposed hierarchical power delivery system decouples the global and local power networks by voltage regulator modules. The decoupled power delivery structure can reduce the required decoupling capacitors significantly. In addition, an area-efficient TSV planning for choosing appropriate diameter and counts of power TSVs is proposed to optimize the area-occupancy and voltage drop performance.

In order to reduce the resonant noise caused by the package and power TSVs, an active switching DECAP is adopted in the hierarchical power delivery system as the global regulator. Furthermore, a wide-bandwidth linear voltage regulator with an adaptive biasing technique is proposed to achieve a wide operating frequency range. This adaptively biased regulator enhances the transient response by increasing the bias current under heavy load, while keeping the quiescent current low to maintain high current efficiency under light load. To further explore the voltage fluctuations in the entire system, the placement of the voltage regulator modules and the sizing of the power delivery grids are also discussed.

In addition, a substrate noise suppression technique is presented for TSV 3D-ICs that considers both substrate and TSV coupling noise. This technique reduces noise using ASDs, which utilize a decoupling capacitor to absorb the substrate noise current. To achieve effective noise reduction, ASD placement is also presented for different 3D structures.

A case study for the power integrity of a heterogeneous TSV 3D integration is also investigated in this thesis. In the case study, the proposed hierarchical power delivery system reduces the supply noise by up to 71.10% with only 1.11% power overhead. The hierarchical power delivery system can be adopted in a heterogeneous TSV 3D integration with only a little modification of the local power networks. Therefore, the proposed hierarchical power delivery system is very useful for the power integrity of heterogeneous integration in TSV 3D-ICs.

### 7.2 Future Work

System heterogeneity offered by 3D integration usually requires different supply voltages for different function blocks, ranging from high (3.3V or higher) to ultra-low (sub-threshold operation) voltages. This multiple-voltage requirement can be met by adopting the proposed hierarchical power delivery structure. As shown in Fig. 7.1, the first-layer power TSVs are connected to the power source, supplying a high voltage. A clean high voltage is then provided to the high-voltage domain through a voltage regulator. Because of its inherent power efficiency limit, the linear regulator is not suitable for large voltage conversion ratios; an on-chip switching DC-DC converter is a better option [7.1]. The switching DC-DC buck converter can be positioned at the second layer of the power hierarchy, as shown in Fig. 7.1, to produce a lower voltage for further usage. The converted low voltage is then fed to the low-voltage domains. For ultra-low voltage (sub-threshold) domains, switched-capacitor DC-DC converters [7.2] can be adopted. By utilizing a combination of linear regulators, switching buck converters, and switched-capacitor DC-DC converters, high power efficiency can be achieved for heterogeneous integration over a wide range of conversion ratios.
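
The efficiency limit of a linear regulator follows from the fact that it passes the load current from input to output while dropping the voltage difference across its pass device. As a brief illustration (the symbols $I_{load}$ and $I_q$, the load current and the regulator quiescent current, are introduced here and are not taken from the thesis):

$$\eta_{linear} = \frac{V_{out} I_{load}}{V_{in} (I_{load} + I_q)} \le \frac{V_{out}}{V_{in}}$$

For the 1.1V-to-1.0V conversion used in the case study the bound is about 91%, whereas regulating 3.3V down to 1.0V would cap the efficiency at roughly 30%, regardless of how small the quiescent current is.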



Fig. 7.1. Hierarchical power delivery system for wide voltage range heterogeneous integrations.

In addition to power integrity, heat dissipation is also an extreme challenge in 3D-ICs due to the increased power density and the poor thermal conductivity of the bonding materials between tiers. Excessively high temperature can significantly degrade interconnect and device reliability and performance. In order to control hot spots, temperature sensors can be integrated into the 3D integration. With the thermal feedback from the temperature sensors, the temperature of the circuit can be kept below a safe upper bound by lowering the system operating frequency. Such temperature-power management can therefore be adopted for TSV 3D-ICs, as shown in Fig. 7.2.




# References

### Chapter 1

- [1.1] Yole Development. (2007). 3DIC & TSV Report [Online]. http://www.yole.fr/pagesan/products/reprot/sample/3dic.pdf
- [1.2] W. Davis, J. Wilson, S. Mick, J. Xu, H. Hua, C. Mineo, A. Sule, M. Steer, and P. Franzon, "Demystifying 3-D ICs: The pros and cons of going vertical," *IEEE Design & Test of Computers*, vol. 22, no. 6, pp. 498-510, Nov. 2005.
- [1.3] N. H. Khan, S. M. Alam, and S. Hassoun, "Power delivery design for 3-D ICs using different TSV technologies," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 19, no. 4, pp. 647-658, April 2011.
- [1.4] P. Jain, D. Jiao, X. Wang, and C. H. Kim, "Measurement, analysis and improvement of supply noise in 3D ICs," accepted by *IEEE VLSI Circuits Symposium*, 2011.
- [1.5] X. Meng and R. Saleh, "An Improved Active Decoupling Capacitor for Hot-Spot Supply Noise Reduction in ASIC Designs," *IEEE Journal of Solid-State Circuits*, vol. 44, no. 2, pp. 584-593, Feb. 2009.

### Chapter 2

- [2.1] E. Beyne, "The rise of the 3rd dimension for system integration," *IEEE International Interconnect Technology Conference*, 2006, pp.1-5.
- [2.2] R. R. Tummala, V. Sundaram, R. Chatterjee, P.M. Raj, N. Kumbhat, V. Sukumaran, V. Sridharan, A. Choudury, Q. Chen, and T. Bandyopadhyay, "Trend from ICs to 3D ICs to 3D Systems," in Proc. IEEE Conf. Custom Integrated Circuits Conference, pp. 439-444, Sept. 2009.
- [2.3] Yole Development. (2008). 3DIC & TSV Report [Online]. http://www.yole.fr/pagesan/products/report\_sample/3dic.pdf
- [2.4] M. S. Bakir, C. King, D. Sekar, H. Thacker, B. Dan, G. Huang, A. Naeemi, and J. D. Meindl, "3D heterogeneous integrated systems: liquid cooling, power delivery, and implementation," in *Proc. IEEE Conf. Custom Integrated Circuits Conference*, 2008, pp. 663-670.
- [2.5] T. Whipple, T. Kukal, K. Felton, and V. Gerousis, "IC-package co-design and analysis for 3D-IC designs," *IEEE International Conference on 3D System Integration*, 2009, pp. 1-6.
- [2.6] V.F. Pavlidis and E.G. Friedman, "Interconnect-based design methodologies for three-dimensional integrated circuits," in *Proceedings of the IEEE*, vol. 97 no. 1, pp. 123-140, Jan. 2009.
- [2.7] M. Motoyoshi, "Through-silicon via (TSV)," in *Proceedings of the IEEE*, vol. 97, no. 1, pp.43-48, Jan. 2009
- [2.8] M. Koyanagi, T. Fukushima, and T. Tanaka, "High-density through silicon vias for 3-D LSIs," in *Proceedings of the IEEE*, vol. 97 no. 1, pp. 49-59, Jan. 2009.
- [2.9] P. Marchal, B. Bougard, G. Katti, M. Stucchi, W. Dehaene, A. Papanikolaou, D. Verkest, B. Swinnen, and E. Beyne, "3-D technology assessment: path-finding the technology/design sweet-spot," in *Proceedings of the IEEE*, vol. 97, no. 1, pp. 96-107, Jan. 2009.
- [2.10] K. N. Chen, and C. S. Tan, "Integration schemes and enabling technologies for three-dimensional integrated circuits," *IET Computers & Digital Techniques*, vol. 5, no. 3, pp.160-168, May 2011.
- [2.11] G. Van der Plas et al., "Design issues and considerations for low-cost 3-D TSV IC technology," *IEEE Journal of Solid-State Circuits*, vol. 46, no. 1, pp. 293-307, Jan. 2011.
- [2.12] J.-F. Li and C.-W. Wu, "Is 3D integration an opportunity or just a hype," *Asia and South Pacific Design Automation Conference (ASP-DAC)*, 2010, pp. 541-543.
- [2.13] T. Zhang, R. Micheloni, G. Zhang, Z.-R. Huang, and J. J. Lu, "3-D data storage, power delivery, and RF/optical transceiver-case studies of 3-D integration from system design perspectives" in *Proceedings of the IEEE*, vol. 97, no. 1, pp. 161-174, Jan. 2009.
- [2.14] Semiconductor Industry Association. International technology roadmap for semiconductors (ITRS), 2004. http://public.itrs.net/
- [2.15] J. Sun, J. Lu, D. Giuliano, T. P. Chow, and R. J. Gutmann, "3D power delivery for microprocessors and high-performance ASICs." in *IEEE Applied Power Electronics Conference (APEC)*, Feb. 2007, pp. 127-133.
- [2.16] N. Na, T. Budell, C. Chiu, E. Tremble, and I. Wernple, "The effects of on-chip and package decoupling capacitors and an efficient ASIC decoupling methodology." in *Proceedings of the IEEE Electronic Components and Technology Conference*, 2007, pp. 556–567.
- [2.17] E. Hailu, D. Boerstler, K. Miki, J. Qi, M. Wang, and M. Riley, "A circuit for reducing large transient current effects on processor power grids," in *Proceedings of the IEEE International Solid-State Circuits Conference*, 2006, pp. 2238–2245.
- [2.18] G. Huang, M. Bakir, A. Naeemi, H. Chen, and J.D. Meindl, "Power delivery for 3D chip stacks: physical modeling and design implication," in IEEE *Electrical Performance of Electronic Packaging* Conference, 2007, pp. 205-208.
- [2.19] P. Jain, T.-H. Kim, J. Keane, and C.H. Kim, "A multi-story power delivery technique for 3D integrated circuits," *ACM/IEEE International Symposium on Low Power Electronics and Design (ISLPED)*, 2008, pp. 57-62.
- [2.20] J. Gu and C. H. Kim, "Multi-story power delivery for supply noise reduction and low voltage operation." *ACM/IEEE International Symposium on Low Power Electronics and Design (ISLPED)*, 2005, pp. 192-197.
- [2.21] G. Schrom, P. Hazucha, J. Hahn, V. Kursun, D. Gardner, S. Narendra, T. Karnik, and V. De, "Feasibility of monolithic and 3D-stacked DC-DC converters for microprocessors in 90nm technology generation," *ACM/IEEE International Symposium on Low Power Electronics and Design (ISLPED)*, 2004, pp. 263–268.
- [2.22] J. Rosenfeld, and E. G. Friedman, "Linear and switch-mode conversion in 3-D circuits," accepted by *IEEE Transactions on Very Large Scale Integration* (VLSI) Systems, 2011.
- [2.23] N. H. Khan, S. M. Alam, S. Hassoun, "System-level comparison of power delivery design for 2D and 3D ICs," *IEEE International Conference on 3D System Integration*, 2009, pp. 1-7.

### Chapter 3

- [3.1] N. H. Khan, S. M. Alam, and S. Hassoun, "Power delivery design for 3-D ICs using different TSV technologies," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 19, no. 4, pp. 647-658, April 2011.
- [3.2] W. Davis, J. Wilson, S. Mick, J. Xu, H. Hua, C. Mineo, A. Sule, M. Steer, and P. Franzon, "Demystifying 3-D ICs: The pros and cons of going vertical," *IEEE Design & Test of Computers*, vol. 22, no. 6, pp. 498-510, Nov. 2005.
- [3.3] G. Schrom, P. Hazucha, J. Hahn, V. Kursun, D. Gardner, S. Narendra, T. Karnik, and V. De, "Feasibility of monolithic and 3D-stacked DC-DC converters for microprocessors in 90nm technology generation," *ACM/IEEE International Symposium on Low Power Electronics and Design (ISLPED)*, 2004, pp. 263–268.
- [3.4] P. Jain, T.-H. Kim, J. Keane, and C.H. Kim, "A multi-story power delivery technique for 3D integrated circuits," *ACM/IEEE International Symposium on Low Power Electronics and Design (ISLPED)*, 2008, pp. 57-62.
- [3.5] X. Wang, Y. Cai, Q. Zhou, S. X.-D. Tan, and T. Eguia, "Decoupling capacitance efficient placement for reducing transient power supply noise," *ACM/IEEE International Conference on Computer-Aided Design (ICCAD)*, 2009, pp. 745-751.
- [3.6] P. Zhou, K. Sridharan, and S.S. Sapatnekar, "Optimizing decoupling capacitors in 3D circuits for power grid integrity," *IEEE Design & Test of Computers*, vol. 26, no. 5, pp. 15-25, Sept. 2009.
- [3.7] T.-H. Lin, "Power integrity in TSV 3D integration," Thesis of NCTU, July 2010.
- [3.8] T.-H. Lin, P.-T. Huang, and W. Hwang, "Power noise suppression technique using active decoupling capacitor for TSV 3D Integration," *IEEE International SOC Conference (SOCC)*, 2010, pp. 209-212.
- [3.9] R. Weerasekera, M. Grange, D. Pamunuwa, H. Tenhunen, and L.-R. Zheng, "Compact modelling of through-silicon vias (TSVs) in three-dimensional (3-D) integrated circuits," *IEEE International Conference on 3D System Integration*, 2009, pp. 1-8.
- [3.10] W. Ahmad, L.-R. Zheng, R. Weerasekera, Q. Chen, A.Y. Weldezion, and H. Tenhunen, "Power integrity optimization of 3D chips stacked through TSVs," *IEEE Conference on Electrical Performance of Electronic Packaging and Systems (EPEPS)*, 2009, pp. 105-108.
- [3.11] E. Salman, E. G. Friedman, R. M. Secareanu, and O. L. Hartin, "Worst case power/ground noise estimation using an equivalent transition time for resonance," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 56, no. 5, pp. 997-1004, May 2009.
- [3.12] X. Meng and R. Saleh, "An improved active decoupling capacitor for hot-spot supply noise reduction in ASIC Designs," *IEEE Journal of Solid-State Circuits*, vol. 44, no. 2, pp. 584-593, Feb. 2009.
- [3.13] J. Gu, H. Eom, and C. H. Kim, "On-chip supply noise regulation using a low-power digital switched decoupling capacitor circuit," *IEEE Journal of Solid-State Circuits*, vol. 44, no. 6, pp. 1765-1775, Jun. 2009.
- [3.14] J.-T. Wu and K.-L. Chang, "MOS charge pump for low-voltage operation," *IEEE Journal of Solid-State Circuits*, vol. 33, no. 4, pp. 592-597, April 1998.
- [3.15] I. Savidis and E. G. Friedman, "Closed-form expressions of 3-D via resistance, inductance, and capacitance," *IEEE Transactions on Electron Devices*, vol. 56, no. 9, pp. 1873-1881, Sept. 2009.

### Chapter 4

- [4.1] Y.-H. Lam, and W.-H. Ki, "A 0.9V 0.35 µm adaptively biased CMOS LDO regulator with fast transient response," in *Proceedings of International Solid-State Circuits Conference (ISSCC)*, 2008, pp. 442–444.
- [4.2] C.-C. Zhan and W.-H. Ki, "Output-capacitor-free adaptively biased low-dropout regulator for system-on-chips," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 57, no. 5, pp. 1017-1028, May 2010.
- [4.3] K.-H. Chen, H.-W. Huang, and S.-Y.Kuo, "Fast transient DC–DC converter with on-chip compensated error amplifier," *IEEE Transactions on Circuits and Systems II: Exp. Briefs*, vol. 54, no. 12, pp. 1150-1154, Dec. 2007.
- [4.4] M. Al-Shyoukh, H. Lee, and R. Perez, "A transient-enhanced low-quiescent current low-dropout regulator with buffer impedance attenuation," *IEEE Journal of Solid-State Circuits*, vol. 42, no. 8, pp. 1732-1742, Aug. 2007.
- [4.5] P. Hazucha, S.-T. Moon, G. Schrom, F. Paillet, D. Garner, S. Rajapandian, and T. Karnik, "High voltage tolerant linear regulator with fast digital control for biasing of integrated DC-DC converters," *IEEE Journal of Solid-State Circuits*, vol. 42, no. 1, pp. 66-73, Aug. 2007.
- [4.6] T.-Y. Man, P. Mok, and M. Chan, "A high slew-rate push–pull output amplifier for low-quiescent current low-dropout regulators with transient-response improvement," *IEEE Transactions on Circuits and Systems II: Exp. Briefs*, vol. 54, no. 9, pp. 755-759, Sept. 2007.
- [4.7] C.-H. Lin, K.-H. Chen, and H.-W. Huang, "Low-dropout regulators with adaptive reference control and dynamic push–pull techniques for enhancing transient performance," *IEEE Transactions on Power Electronics*, vol. 24, no. 4, pp. 1016–1022, April 2009.
- [4.8] P.-Y. Or, and K.-N. Leung, "An output-capacitorless low-dropout regulator with direct voltage-spike detection," *IEEE Journal of Solid-State Circuits*, vol. 45, no. 2, pp. 458-466, Feb. 2010.
- [4.9] P. Hazucha, T. Karnik, B. Bloechel, C. Parsons, D. Finan, and S. Borkar, "Area-efficient linear regulator with ultra-fast load regulation," *IEEE Journal of Solid-State Circuits*, vol. 40, no. 4, pp. 933-940, April 2005.
- [4.10] E. Alon, J. Kim, S. Pamarti, K. Chang, and M. Horowitz, "Replica compensated linear regulators for supply-regulated phase-locked loops," *IEEE Journal of Solid-State Circuits*, vol. 41, no. 2, pp. 413-424, Feb. 2006.
- [4.11] C.-Y. Tseng, L.-W. Wang, and P.-C. Huang, "An integrated linear regulator with fast output voltage transition for dual-supply SRAMs in DVFS systems," *IEEE Journal of Solid-State Circuits*, vol. 45, no. 11, pp. 2239-2249, Nov. 2010.
- [4.12] Y. Okuma, K. Ishida, Y. Ryu, X. Zhang, P.-H. Chen, K. Watanabe, M. Takamiya, and T. Sakurai, "0.5-V input digital LDO with 98.7% current efficiency and 2.7-μA quiescent current in 65nm CMOS," in *Proceedings of International Solid-State Circuits Conference (ISSCC)*, 2010, pp. 1-4.
- [4.13] W.-C. Hsien, and W. Hwang, "Low quiescent current variable output digital controlled voltage regulator," in *Proceedings of IEEE International Symposium on Circuits and Systems (ISCAS)*, 2010, pp. 609-612.
- [4.14] M. El-Nozahi, A. Amer, J. Torres, K. Entesari, and E. Sanchez-Sinencio, "High PSR low drop-out regulator with feed-forward ripple cancellation technique," *IEEE Journal of Solid-State Circuits*, vol. 45, no. 3, pp. 565-577, Mar. 2010.
- [4.15] G. Huang, D.C. Sekar, A. Naeemi, K. Shakeri, and J.D. Meindl, "Compact physical models for power supply noise and chip/package co-design of gigascale integration," in *Proceedings of Electronic Components and Technology Conference (ECTC)*, 2007, pp. 1659-1666.

- [4.16] D. E. Khalil, and Y. Ismail, "Optimum sizing of power grids for IR drop," in *Proceedings of IEEE International Symposium on Circuits and Systems (ISCAS)*, 2006, pp. 480-484.
- [4.17] W. H. Lee, S. Pant, and D. Blaauw, "Analysis and reduction of on-chip inductance effects in power supply grids," in *Proceedings of International Symposium on Quality Electronic Design (ISQED)*, 2004, pp. 131-136.
- [4.18] R. Jakushokas, and E.G. Friedman, "Multi-layer interdigitated power distribution networks," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 19, no. 5, pp. 774-786, May 2011.
- [4.19] M. Popovich, E.G. Friedman, M. Sotman, and A. Kolodny, "On-chip power distribution grids with multiple supply voltages for high-performance integrated circuits," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 16, no. 7, pp. 908-921, July 2008.
- [4.20] M. S. Gupta, J. L. Oatley, G.-Y. Wei, and D. M. Brooks, "Understanding voltage variations in chip multiprocessors using a distributed power-delivery network," *Design, Automation & Test in Europe Conference & Exhibition (DATE)*, 2007, pp. 1-6.

- [5.1] Jonghyun Cho, Jongjoo Shim, Eakhwan Song, Jun So Pak, Junho Lee, Hyungdong Lee, Kunwoo Park, and Joungho Kim, "Active circuit to through silicon via (TSV) noise coupling," *IEEE Conference on Electrical Performance of Electronic Packaging and Systems (EPEPS)*, 2009, pp. 97-100.
- [5.2] W. Ahmad, L.-R. Zheng, R. Weerasekera, Q. Chen, A.Y. Weldezion, and H. Tenhunen, "Power integrity optimization of 3D chips stacked through TSVs," *IEEE Conference on Electrical Performance of Electronic Packaging and Systems (EPEPS)*, 2009, pp. 105-108.
- [5.3] T.-H. Lin, P.-T. Huang, and W. Hwang, "Power noise suppression technique using active decoupling capacitor for TSV 3D Integration," *IEEE International SOC Conference (SOCC)*, 2010, pp. 209-212.
- [5.4] M. Badaroglu, P. Wambacq, G. V. Plas, S. Donnay, G. E. Gielen, and H.J. De Man, "Evolution of substrate noise generation mechanisms with CMOS technology scaling," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 53, no. 2, pp. 296-305, Feb. 2006.
- [5.5] A. Afzali-Kusha, M. Nagata, N. K. Verghese, and D. J. Allstot, "Substrate noise coupling in SoC design: modeling, avoidance, and validation," *Proceedings of the IEEE*, vol. 94, no. 12, pp. 2109-2138, Dec. 2006.

- [5.6] T. Nakura, M. Ikeda, and K. Asada, "Feedforward active substrate noise cancelling technique using power supply di/dt detector," *IEEE Symposium on VLSI Circuits, Digest of Technical Papers*, 2005, pp. 284-287.
- [5.7] T. Kazama, T. Nakura, M. Ikeda, and K. Asada, "Design of active substrate noise canceller using power supply di/dt detector," in *Proceedings of IEEE Asia and South Pacific Design Automation Conference (ASP-DAC)*, 2007, pp. 100-101.
- [5.8] T. Kazama, T. Nakura, M. Ikeda, and K. Asada, "Optimization of active substrate noise cancelling techniques using power line di/dt detector," in *Proceedings of Asian Solid-State Circuits Conference (ASSCC)*, 2006, pp. 239-242.
- [5.9] T. Tsukada, Y. Hashimoto, K. Sakata, H. Okada, and K. Ishibashi, "An on-chip active decoupling circuit to suppress crosstalk in deep-submicron CMOS mixed-signal SoCs," *IEEE Journal of Solid-State Circuits*, vol. 40, no. 1, pp. 67-79, Jan. 2005.
- [5.10] Haitao Dai, and R.W. Knepper, "Modeling and experimental measurement of active substrate-noise suppression in mixed-signal 0.18-μm BiCMOS technology," *IEEE Transactions on Computer-Aided Design (TCAD) of Integrated Circuits and Systems*, vol. 28, no. 6, pp. 826-836, June 2009.
- [5.11] G. Blakiewicz, "Active suppression of substrate noise in CMOS integrated circuits," *IEEE International Conference on Mixed Design of Integrated Circuits and Systems (MIXDES)*, 2007, pp. 219-224.
- [5.12] J. Le, C. Hanken, M. Held, M. S. Hagedorn, K. Mayaram, and T. S. Fiez, "Experimental characterization and analysis of an asynchronous approach for reduction of substrate noise in digital circuitry," accepted by *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, 2011.
- [5.13] I. Savidis and E.G. Friedman, "Closed-form expressions of 3-D via resistance, inductance, and capacitance," *IEEE Transactions on Electron Devices*, vol. 56, no. 9, pp. 1873-1881, Sept. 2009.
- [5.14] A. Shayan, X. Hu, A.E. Engin, and X. Chen, "3D stacked power distribution considering substrate coupling," *IEEE International Conference on Computer Design (ICCD)*, 2009, pp. 225-230.

- [6.1]. H. Saito, M. Nakajima, T. Okamoto, Y. Yamada, A. Ohuchi, N. Iguchi, T. Sakamoto, K. Yamaguchi, and M. Mizuno, "A chip-stacked memory for on-chip SRAM-rich SoCs and processors," *IEEE Journal of Solid-State Circuits*, vol. 45, no. 1, pp. 15-22, Jan. 2010.

- [6.2]. H. H. Nho, M. Horowitz, and S. S. Wong, "A high-speed, low-power 3D-SRAM architecture," *IEEE Custom Integrated Circuits Conference (CICC)*, 2008, pp. 201-204.
- [6.3]. S. Borkar, "3D integration for energy efficient system design," *IEEE Symposium on VLSI Technology*, 2009, pp. 58-59.
- [6.4]. A. Zia, P. Jacob, J.-W. Kim, M. Chu, R. P. Kraft, and J. F. McDonald, "A 3-D cache with ultra-wide data bus for 3-D processor-memory integration," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 18, no. 6, pp. 967-977, Jun. 2010.
- [6.5]. U. Kang, H.-J. Chung, S. Heo, D.-H. Park, H. Lee, J.-H. Kim, S.-H. Ahn, S.-H. Cha, J. Ahn, D. Kwon, J.-W. Lee, H.-S. Joo, W.-S. Kim, D.-H. Jang, N.-S. Kim, J.-H. Choi, T.-G. Chung, J.-H. Yoo, J.-S. Choi, C. Kim, and Y.-H. Jun, "8 Gb 3-D DDR3 DRAM using through-silicon-via technology," *IEEE Journal of Solid-State Circuits*, vol. 45, no. 1, pp. 111-117, Jan. 2010.
- [6.6]. T. Sekiguchi, K. Ono, A. Kotabe, and Y. Yanagawa, "1-Tbyte/s 1-Gbit DRAM architecture using 3-D interconnect for high-throughput computing," *IEEE Journal of Solid-State Circuits*, vol. 46, no. 4, pp. 828-837, April 2011.
- [6.7]. S. Vangal, J. Howard, G. Ruhl, S. Dighe, H. Wilson, J. Tschanz, D. Finan, A. Singh, T. Jacob, S. Jain, V. Erraguntla, C. Roberts, Y. Hoskote, N. Borkar, and S. Borkar, "An 80-Tile sub-100-W TeraFLOPS processor in 65-nm CMOS," *IEEE Journal of Solid-State Circuits*, vol. 43, no. 1, pp. 29-41, Jan. 2008.
- [6.8]. P. Jacob, A. Zia, O. Erdogan, P. M. Belemjian, J.-W. Kim, M. Chu, R. P. Kraft, J. F. McDonald, and K. Bernstein, "Mitigating memory wall effects in high clock rate and multi-core CMOS 3D ICs processor memory stacks," *Proceedings of the IEEE*, vol. 97, no. 1, pp. 108-122, Jan. 2009.
- [6.9]. A. Todri, and M. Marek-Sadowska, "Power delivery for multicore systems," accepted by *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, 2011.
- [6.10]. Q. Wu, and T. Zhang, "Design techniques to facilitate processor power delivery in 3-D processor-DRAM integrated systems," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 19, no. 9, pp. 1655-1666, Sept. 2011.
- [6.11]. M. Ingels, V. Giannini, J. Borremans, G. Mandal, B. Debaillie, P. V. Wesemael, T. Sano, T. Yamamoto, D. Hauspie, J. V. Driessche, and J. Craninckx, "A 5 mm² 40 nm LP CMOS transceiver for a software-defined radio platform," *IEEE Journal of Solid-State Circuits*, vol. 45, no. 12, pp. 2794-2805, Dec. 2010.

- [6.12]. I. Savidis and E.G. Friedman, "Closed-form expressions of 3-D via resistance, inductance, and capacitance," *IEEE Transactions on Electron Devices*, vol. 56, no. 9, pp. 1873-1881, Sept. 2009.

- [7.1]. J. Rosenfeld, and E. G. Friedman, "A distributed filter within a switching converter for application to 3-D integrated circuits," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 19, no. 6, pp. 1075-1085, June 2011.
- [7.2]. Y. K. Ramadass, and A. P. Chandrakasan, "Voltage scalable switched capacitor DC-DC converter for ultra-low-power on-chip applications," *IEEE Power Electronics Specialists Conference (PESC)*, 2007, pp. 2353-2359.



# Vita

## 楊博任 Po-Jen Yang

### PERSONAL INFORMATION

Birth Date: September 19, 1986.

Birth Place: Hsinchu, Taiwan.

E-Mail Address: balloon.yang@gmail.com

### **EDUCATION**

09/2009 - 09/2011 M.S. in Electronics Engineering, National Chiao Tung University

Thesis: Power Integrity for TSV 3D Integration

09/2004 - 06/2009 B.S. in Electrical Engineering, National Chung Cheng University

### **PUBLICATIONS**

Po-Jen Yang, Po-Tsang Huang, and Wei Hwang, "Substrate Noise Suppression Technique for Power Integrity of TSV 3D Integration," in *Proc. IEEE System on Chip Conference (SOCC)*, 2011. (Submitted)