

Stochastic Thermal Simulator Considering Within-die

Spatial Correlation under Process Variations

研究生:吴佳鴻

指導教授:李育民 教授

### 中華民國九十六年十月

考慮晶片上具有空間相關製程變異的

統計型晶片熱分佈模擬器

學生:吳佳鴻

指導教授:李育民 博士

國立交通大學電信工程學系碩士班

#### 摘 要

傳統上晶片上熱分佈分析主要是考慮不具隨機性的功率消耗的熱傳方程式, 然而,隨著製程的演進,導致在參數如電晶體通道長度和氧化層厚度的變異 波動對於電路的效能、功率消耗、可靠度上有重大的影響。在晶片設計階段 時忽略製程上的變異將會造成嚴重的良率問題。在這篇論文,我們提出一個 方法分析晶片上統計型溫度分析考慮晶片上具有空間相關的製程變異。這篇 論文是第一篇考慮晶片上具有空間相關製程變異的統計型晶片熱分佈模擬 器,利用卡洛轉換(Karhunent-Loeve transformation)處理具有空間相關隨機過 程並且利用正交多項式(Polynomial Chaos)和隨機加勒金法(Stochastic Galerkin method)解統計型熱傳方程式。與蒙地卡羅模擬法(Monte Carlo simulation)比較 來說明我們所提出方法的正確性和效率性。模擬的結果可以保證提供可靠的 溫度分佈良率,並且指引設計者去避免晶片熱毀壞的問題在次微米半導體時 代。最後我們更指出提供精確的晶片上溫度分佈不能忽略空間相關製程變異。

### Stochastic Thermal Simulator Considering Within-die Spatial Correlation under Process Variations

Student : Jia-Hong Wu

Advisor : Dr. Yu-Min Lee

Department of Communication Engineering National Chiao Tung University

#### ABSTRACT

Traditionally, the thermal analysis methods of chip have been conducted by solving the heat transfer equation with deterministic heat sources. However, the technology scaling leads to that the fluctuations in physical parameters such as channel length and oxide thickness have a substantial impact on circuit performance, power consumption, and reliability. Ignoring the manufactured process variations at the design stage can cause aggravated yield losses. In this paper, we present a method to analyze the statistical temperature distribution of full chip under considering process variations with a known within-die spatial correlation function. To the author's best knowledge, this is the first stochastic thermal simulator of full chip with considering within-die process variations. This work makes use of the Karhunen-Loeve transformation to deal with the physical parameters with spatial correlation and takes advantage of polynomial chaos and stochastic Galerkin method to tackle the stochastic heat transfer equation. We demonstrate the accuracy and efficiency of the proposed methodology in comparison to Monte Carlo simulation. The simulation results guarantee the robust thermal yield and can guide designers to avoid the thermal failure in nano-meter technology. Furthermore, we point out that the within-die spatial correlation can not be neglected for the accurate temperature estimation.

誌 謝

這篇論文能夠順利地完成,首先要由衷地感謝我的指導教授 李 育民博士,每當我困惑徬徨時,老師總能指引我一條光明的路,以致 能夠讓這篇論文完成。在這碩士的兩年裡,實驗室與老師所給予我的 訓練和教導,對於我之後就業或是進修的影響,是具有相當大的部份 ,在實驗室的這兩年,我發現我已成長不少,更學習到許多研究與處 理問題的方法,最後我要感謝老師與實驗室在這兩年來所帶給我的一 切。

關於實驗的進行,相當感謝李義明老師能夠提供我們台積電 65 奈米製程參數,如果沒有李義明老師的幫忙,實驗的進行將會變的相 當困難。在此深深致上對李老師的謝意。

在實驗室裡,感謝培育學長、柏毅學長寶貴的知識經驗傳承,以 及國富哥與實驗室學弟焯基、斯安、懷中、宗祐平日的關心與幫忙, 紓解生活壓力,一路相伴與成長。

最後要深深地感謝我的父母親,你們的辛苦及無悔的付出栽培我 能夠順利完成碩士學業,也感謝 Doci 的陪伴與關懷。僅在此將本論 文獻給你們,共享這份喜悅與榮耀。

# Contents

| 1 | Intro | oduction                                          | 1  |
|---|-------|---------------------------------------------------|----|
|   | 1.1   | Introduction                                      | 1  |
|   | 1.2   | Motivation                                        | 3  |
|   | 1.3   | Our Contributions                                 | 3  |
|   | 1.4   | Organization of the Thesis                        | 4  |
| 2 | Preli | iminaries                                         | 5  |
|   | 2.1   | Physical Design Flow                              | 5  |
|   | 2.2   | Parameter Modeling                                | 7  |
|   |       | 2.2.1 Karhunen-Loeve Expansion                    | 7  |
|   |       | 2.2.2 Spatial Correlation Modeling                | 9  |
|   | 2.3   | The Bases for Random Space : Polynomial Chaos     | 16 |
|   | 2.4   | Statistical Leakage Power Modeling                | 17 |
|   |       | 2.4.1 Gate Tunneling Leakage Current              | 17 |
|   |       | 2.4.2 Subthreshold Leakage Current                | 18 |
|   | 2.5   | Monte Carlo Technique                             | 20 |
| 3 | Stoc  | hastic Thermal Simulation Methodology             | 22 |
|   | 3.1   | Stochastic Thermal Simulation Problem Formulation | 22 |
|   | 3.2   | Stochastic Thermal Simulation Flowchart           | 24 |
|   | 3.3   | Stochastic Galerkin Procedure                     | 26 |
| 4 | Exp   | erimental Results                                 | 31 |
| 5 | Con   | clusion                                           | 41 |

# **List of Figures**

| 1.1 | Leakage variations [13]                                                                                      | 2        |
|-----|--------------------------------------------------------------------------------------------------------------|----------|
| 2.1 | Physical design flow                                                                                         | 6        |
| 2.2 | The eigenvalues decay rate                                                                                   | 13       |
| 2.3 | The exponential covariance kernel                                                                            | 14       |
| 2.4 | (a) 25-term approximation of covariance surface ; $\sigma = 4.3$ . (b) 25-term relative                      |          |
|     | error surface of covariance approxiamtion.                                                                   | 15       |
| 2.5 | (a) 75-term approximation of covariance surface ; $\sigma = 4.3$ . (b) 75-term relative                      |          |
|     | error surface of covariance approxiamtion.                                                                   | 15       |
| 2.6 | Procedure of Multinormal-Cholesky                                                                            | 20       |
| 2.7 | Procedure of Cholesky-Decomposition                                                                          | 21       |
| 3.1 | Compact thermal model of the early design stage for stochastic heat sources                                  | 22       |
| 3.2 | Stochastic thermal simulation flowchart                                                                      | 25       |
| 3.3 | Procedure of gate power projection                                                                           | 28       |
| 3.4 | Procedure of sub power projection                                                                            | 30       |
| 4.1 | The floorplan of our test circuit                                                                            | 32       |
| 4.2 | (a) The nominal power distribution at the top surface of die, (b) The mean power                             |          |
|     | distribution at the top surface of die                                                                       | 34       |
| 4.3 | (a) The 3D nominal temperature distribution at the top surface of die, (b) The                               |          |
|     | 3D mean temperature distribution at the top surface of die                                                   | 35       |
| 4.4 | (a) The 2D nominal temperature distribution at the top surface of die, (b) The                               |          |
|     | 2D mean temperature distribution at the top surface of die                                                   | 35       |
| 4.5 | (a) The 3D standard deviation temperature distribution at the top surface of die                             |          |
|     | with considering spatial correlation, (b) The 3D standard deviation temperature                              |          |
|     | distribution at the top surface of die without considering spatial correlation.                              | 37       |
| 4.6 | (a) The 2D standard deviation temperature distribution at the top surface of die                             |          |
|     | with considering spatial correlation, (b) The 2D standard deviation temperature                              | 07       |
| 47  | distribution at the top surface of die without considering spatial correlation.                              | 37       |
| 4./ | The mean value plus three standard deviation of temperature distribution The temperature $f_{1}$ and $f_{2}$ | 39<br>40 |
| 4.8 | The temperature profile at $y = 5 \ (mm)$ .                                                                  | 40       |

# **List of Tables**

| 4.1 | Accuracy and Efficiency | Compared to Monte Carlo Method |  | 33 |
|-----|-------------------------|--------------------------------|--|----|
|-----|-------------------------|--------------------------------|--|----|

# Chapter 1

## Introduction

#### **1.1 Introduction**

Because of the drastic increase in power consumption of integrated circuits, thermal issues have become the important concerns in VLSI manufacturing. The high temperature distribution and thermal gradients have substantial impacts on timing, performance, power, and reliability. The elevated temperature is mainly caused by huge power consumption. Furthermore, the leakage power is expected to increase drastically and become the dominated part of total power consumption. The subthreshold leakage and gate tunneling leakage are the major components of leakage in advanced CMOS technology. The lower quantity of transistor threshold voltage results in an exponential increase in subthreshold leakage current. To control the short channel effect and to enhance transistor driving strength, the thinner oxide thickness causes the considerable gate tunneling leakage current.

An important concern of VLSI design and manufacturing in nanometer technology is the process variation. As the technology scaling, the decreased controllability of processes has resulted in the substantial variations of circuit performance. Generally, process variations can be classified into die-to-die variation and within-die variation. The die-to-die variation is mainly caused by the thermal gradients, equipment properties, wafer polishing, and wafer placement. The die-to-die variation varies slowly and behaves smoothly at the large scale chip. Thus, the die-to-die variation can be averaged over the die and be incorporated into mean value [1]. The within-die variation is generally caused by pattern planarization in chemical mechanical polishing, and lithography effects [2]. The within-die variation affects the same type device at different location within a chip differently and exhibits spatial correlation. The spatial correla-

tion within die means that the devices close to each other have more similar behavior than those which are located far away. It has been shown that the within-die variations are the most important parts of all system variations which influence circuit performance [3]. Noted that a 10% variation in oxide thickness causes a 15X difference in gate tunneling current for the 100nm BPTM process technology [13] (Fig. 1.1). Moreover, the considerable variations in chip level leakage power is expected as high as 20X in the literature [24] and the related fluctuation in temperature distribution is considerable. It is worth to note that the phenomenon of variations on leakage powers is more aggravated beyond the 65nm technology. The worst case deterministic simulation can result in immoderate guard-banding, and causes low performance [23]. Furthermore, the underestimation in temperature and power consumption of circuits can lead to unnecessary low yield. These undesirable phenomena bring about the statistical thermal simulation being essential, especially for the leakage power dominated technology.



Fig. 1.1: Leakage variations [13]

### **1.2 Motivation**

The existing thermal simulating techniques can be classified into two categories, numerical methods and analytical methods. The numerical based thermal simulators convert heat transfer equations to equivalent RC networks and several efficient approaches have been developed [16, 17, 18]. Analytical methods which avoid performing directly the volume meshing of entire substrate apply closed-forms to represent the temperature distribution, such as the Green's function based method [19], and the Generalized Integral Transforms (GIT) based method [14]. However, all these works view the power dissipation of chip to be deterministic, and result in the optimistic estimate of temperature distribution. Those optimistic simulation results can gravely decrease the manufactured yield. Therefore, reliable and robust thermal simulator must involve process variations in physical parameters and leakage power consumption.

In this work, we propose a stochastic thermal simulation procedure and consider withindie process variations for the leakage power dominated design. With the help of Karhunen-Loeve expansion [9], we transform the random processes of physical parameters such as channel length and oxide thickness with known spatial correlation to a set of uncorrelated random variables. After transforming parameters, we employ the polynomial chaos scheme and stochastic Galerkin procedure to convert the stochastic heat transfer equations to a set of deterministic problems. The formulas of temperature distribution are determined by applying an efficient deterministic thermal solver [14] to deal with the set of deterministic heat transfer equations. In this work, we are going to evaluate the means and variances of full-chip temperature distribution and demonstrate the huge difference of simulation results between considering spatial correlation and ignoring spatial correlation.

## **1.3 Our Contributions**

In this work, we propose a stochastic thermal simulator and consider within-die process variations for leakage power dominated design. By using Karhunen-Loeve expansion, we transform the random processes of physical parameters (channel length and oxide thickness) with spatial correlation to a set of uncorrelated random variables. After transforming parameters, we employ the polynomial chaos scheme and stochastic Galerkin procedure to convert the stochastic thermal problem to a set of deterministic problems. Then, applying an efficient deterministic thermal solver [14] to obtain the final solutions. We are going to evaluate the mean value and variance of full-chip temperature distribution and demonstrate the huge difference of simulation results between considering spatial correlation and ignoring spatial correlation. To the author's best knowledge, this is the first stochastic thermal simulator of full chip with considering within-die process variations in the nano-meter technology.

## **1.4 Organization of the Thesis**

The rest of this thesis is organized as the follows. First, the detailed parameters modeling in this work, polynomial chaos, Karhunen-Loeve expansion, and Monte Carlo technique are introduced in chapter 2. The problem formulation, simulation flowchart and stochastic Galerkin procedure utilized in this work are addressed in chapter 3. Finally, the experimental results and conclusion are given in chapter 4 and 5, respectively.

# Chapter 2

# **Preliminaries**

In this chapter, we first introduce the physical design flow. Then, the parameter modeling is presented in chapter 2.2. After that, the polynomial chaos for random space is shown in chapter 2.3. Finally, we introduce the statistical leakage power modeling in chapter 2.4 and Monte Carlo technique in chapter 2.5.

## 2.1 Physical Design Flow

The physical design flow of a circuit shown in Fig. 2.1 is the phase that precedes the fabrication of a circuit. In most general terms, physical design refers to all synthesis steps succeeding logic design and preceding fabrication. These include logic partitioning, floorplanning, placement, routing , compaction, extraction, and verification. Floorplanning is an essential design step when a hierarchical/building design methodology is used. Floorplanning helps designers define the layout hierarchy, estimate the overall required area, determine the aspect ratio for each module. It is closely related to placement. For thermal-driven floorplanning, there are a greater deal of flexibilities in mitigating thermal problems, but also large quantities of uncertainties with regard to the accurate thermal profile. The placement stage of physical design flow is the process of arranging the circuit components on a layout surface. The general purpose of the thermal-aware placement methodology is to minimize the maximal temperature gradient over the chip and get the uniform temperature distribution. Moreover, the thermal-driven methodologies consider the more detailed thermal model with interconnects at the routing stage of physical design. However, the literature [21] pointed out that the thermal-driven methodologies making efforts at the early stage of physical design flow can gain more benefits. Thus, in this work, we will

propose a thermal simulator which is suitable to the the early stage of physical design.



Fig. 2.1: Physical design flow

### 2.2 Parameter Modeling

Process parameter spatial correlation has received increased attention recently. The consideration of within-die spatial correlation causes the increasing number of RVs that we deal with and the computation costs. A conventional technique partitions the layout plane into several grids, assumes perfect correlation for all random variables (channel length, oxide thickness) in the same grid cell, and computes a correlation matrix for those grid cells [25]. The number of correlated random variables can be further reduced by applying principle component analysis (PCA) [26] [27]. In PCA, linear variable transformations are used for the largest data variance, or principle components. However, the nature of PCA often limits its capability in modeling high dimensional parameter variations for performance modleing. An alternative formulation to tackle with the correlated parameters is the Karhunen-Loeve expansion. A random process can be expanded in terms of a denumerable set of orthogonal random variables with deterministic functions which are related to the corresponding covariance kernel. The detailed parameters modeling and Karhunen-Loeve expansion are presented in the following subsection.

#### 2.2.1 Karhunen-Loeve Expansion

In the presence of process variations, the physical parameters such as channel length, and oxide thickness can be modeled as random processes with given spatial covariance functions. Since the values of physical parameters are bounded above and below, we can assume that the random process  $\alpha(\mathbf{x}, \vartheta)$  of each parameter under consideration is a second-order stochastic process, where  $\vartheta \in \Omega$ , and  $\mathbf{x} = (x, y) \in D_p$ . Here,  $\Omega$  is the set of manufacturing outcomes for a specific physical parameter, and  $D_p$  is the domain in x- and y- directions of chip.

DEFINITION 2.1 [6] A second-order random variable  $\alpha(\vartheta)$  is one satisfying  $E[|\alpha(\vartheta)|^2] < \infty$ . A second order stochastic process  $\alpha(\mathbf{x}, \vartheta)$  is a family of second-order random variables.

DEFINITION 2.2 [6] A second-order process  $\alpha(\vartheta)$  is continuous in quadratic mean (q.m.) if  $E[|\alpha(\mathbf{x} + \mathbf{h}, \vartheta) - \alpha(\mathbf{x}, \vartheta)|^2] \rightarrow 0$  as  $||\mathbf{h}|| \rightarrow 0$  for all  $\mathbf{x} \in D_p$ .

where  $\|\cdot\|$  is the Euclidean norm. Due to the absence of the explicit form of  $\alpha(\mathbf{x}, \vartheta)$ , we utilize the following theorem to guarantee its q.m. continuity.

THEOREM 2.1 [6] A second-order process  $\alpha(\mathbf{x}, \vartheta)$  is continuous in q.m. at  $\mathbf{x} \in D_p$  if and

only if, its covariance function  $C(\mathbf{x}_1, \mathbf{x}_2)$  is continuous at  $(\mathbf{x}, \mathbf{x})$ .

DEFINITION 2.3 [6] A second order q.m. continuous process  $\alpha(\mathbf{x}, \vartheta)$  on a closed interval  $D_p$  has an orthogonal decomposition

$$\alpha(\mathbf{x},\vartheta) = \tilde{\alpha}(\mathbf{x}) + \sum_{i=1}^{\infty} \sqrt{\lambda_i} f_i(\mathbf{x}) \zeta_i(\vartheta)$$
(2.1)

$$E[\zeta_m \zeta_n] = \delta_{mn} \; ; \; \int_D f_m(\mathbf{x}) f_n(\mathbf{x}) d\mathbf{x} = \delta_{mn}$$
(2.2)

where  $\tilde{\alpha}(\mathbf{x})$  is the mathematical expectation of the process  $\alpha(\mathbf{x}, \vartheta)$ , if, and only if, the  $\lambda_i$  are the eigenvalues and  $f_i(\mathbf{x})$  are the orthonormalized eigenfunctions of  $C(\mathbf{x}_1, \mathbf{x}_2)$ . Then the expansion converges in q.m. uniformly on  $D_p$ .

The Fourier-type series expansion form in equation (2.1) is the Karhunen-Loeve expansion (KLE) which is an optimal way of representing a random process based on the spectral decomposition of the given covariance kernel. The expansion converts a random process into a model with a minimum degree of freedom and minimizes the mean-square error of the finiteterm representation [9]. From the viewpoint of practice, the summation terms of equation (2.1) can be truncate at finite number  $N_{kl}$  which is determined by the decay trend of eignevalues to ensure the acceptable error. Notes that the decay trend is crucial, since the truncated number determines the computational efficiency and complexity of the work. General speaking, the smoother covariance function behaves the faster eigenvalues of KLE decay. The eigenvalues and eigenfunctions can be derived form the following Fredholm integral equation:

$$\int_{D_p} C(\mathbf{x}_1, \mathbf{x}_2) f_n(\mathbf{x}_2) d\mathbf{x}_2 = \lambda_n f_n(\mathbf{x}_1)$$
(2.3)

Form the definition of covariance function, it's with bounded, symmetric and positive definite property. The fact guarantees a number of properties for the eigenfunctions and the eigenvalues that are the solutions of equation (2.3).

- 1. The eigenfunction set  $f_i(\mathbf{x})$  is orthogonal and complete.
- 2. The eigenvalues are all positive real numbers.
- 3. There are at most countably infinite set of eigenvalues.

4. The covariance kernel admits of the following uniformly convergent expansion

$$C(\mathbf{x}_1, \mathbf{x}_2) = \sum_{k=1}^{\infty} \lambda_k f_k(\mathbf{x}_1) f_k(\mathbf{x}_2)$$
(2.4)

The physical parameter such as channel length with spatial correlation can be expanded to a Fourier-type series by Karhunen-Loeve expansion.

$$L(\mathbf{x},\theta) \simeq \tilde{L}(\mathbf{x}) + \sum_{i=1}^{N_{kl}} \sqrt{\lambda_i} f_i(\mathbf{x}) \zeta_i(\theta)$$
(2.5)

where  $\tilde{L}(\mathbf{x})$  is the mean value function of channel length,  $\lambda_i$  and  $f_i(\mathbf{x})$  are the eigenvalues and eigenfunctions corresponding to the given covariance function, respectively, and  $\theta \in \Omega_L$ . Here,  $\Omega_L$  is the set of manufacturing outcomes for the channel length. The  $\{\zeta_i\}$  is a set of orthonormal random variables with zero mean and unit variance. The system random process can be assumed to be a Gaussian process [30], then the  $\{\zeta_i\}$  is a set of standard Gaussian random variables. It can be shown that the Karhunen-Loeve expansion transforms the random process with spatial correlation to a set of uncorrelated orthonormal random variables and greatly reduces the dimension of random variables. In the same way, the oxide thickness random process can be transformed into a set of standard Gaussian random variables  $\{\varsigma_i\}$ . Indeed, we assume that the oxide thickness random process is independent of channel length, so the two random variables sets ( $\{\zeta_i\}, \{\varsigma_i\}$ ) are independent. For notation, we arrange  $\{\zeta_i\}$  and  $\{\varsigma_i\}$  as  $\{\xi_i\} = \{\zeta_i, \varsigma_j\}$  and  $\{\xi_i\}$  is used as the set of system random variables to expand bases for the random space. In the following subsection, we will introduce the adoptive covariance function for physical parameter random processes, and derive the eigenvalues and eigenfunctions of Karhunen-Loeve expansion for the corresponding covariance function.

#### 2.2.2 Spatial Correlation Modeling

Recently, the spatial correlation of within-die variations has been seriously taken into account of the VLSI verification flow, and a precise spatial correlation function is necessary for catching the manufacturing information. To extract the features of process variations for modeling and constructing the covariance kernel is mainly based on the measured data, and several robust techniques have been proposed to build the valid spatial covariance function [4] [5] with having the positive semidefinite property [8]. In this paper, we adopt the spatial covariance function introduced in  $[4] [6]^1$ .

$$C(\mathbf{x_1}, \mathbf{x_2}) = \sigma^2 e^{-\frac{|x_1 - x_2|}{\eta_x}} e^{-\frac{|y_1 - y_2|}{\eta_y}}$$
(2.6)

where  $\mathbf{x_1} = (x_1, y_1)$ ,  $\mathbf{x_2} = (x_2, y_2)$ , and  $\eta_x$  and  $\eta_y$  are the correlation lengthes in the *x*- and *y*- directions, respectively. The term  $\sigma$  indicates the standard deviation of the random process. The covariance kernel is defined in the rectangular domain  $D_p$ . In general, the correlation of parameters for two identical devices drops down as their distance increases. The literature [7] further pointed that the correlation approaches zero as the distance nears half of chip dimension for two logical gates. This observation means that the ratio of correlation length for different chip sizes is constant for different design benchmark.

For the multi-dimensional problem, if we assume that the given covariance function is separable, the solutions of equation (2.3) can be derived independently for  $x_1$  and  $y_1$  directions. For the covariance kernel in this work, the eigenvalues and eigenfunctions of one dimension problem can be expressed as the following [9]:

$$\lambda_{n,x} = \frac{2\eta_x \sigma^2}{\eta_x^2 \omega_{n,x}^2 + 1}$$
(2.7)

$$f_n(x) = a_{n,1}\cos(w_{n,x}x) + a_{n,2}\sin(w_{n,x}x)$$
(2.8)

$$a_{n,2} = \frac{1}{\sqrt{(\eta_x^2 w_{n,x}^2 + 1)L_x/2 + \eta_x}}$$
(2.9)

$$a_{n,1} = \eta_x w_{n,x} a_{n,2} \tag{2.10}$$

where  $w_{n,x}$  are the positive roots of the following characteristic equation

$$(\eta_x^2 w_x^2 - 1) \sin(w_x L_x) = 2\eta_x w_x \cos(w_x L_x)$$
(2.11)

the above characteristic equation can be obtained from combining the boundary condition and equation 2.3.

The multi-dimensional eigenvlaues and eigenfunctions can be combined from x and y directions.

$$\lambda_n = \frac{4\eta_x \eta_y \sigma^2}{[\eta_x^2 \omega_{i,x}^2 + 1] [\eta_y^2 \omega_{j,y}^2 + 1]}$$
(2.12)

$$f_n(x,y) = f_i(x)f_j(y)$$
 (2.13)

<sup>&</sup>lt;sup>1</sup>Although we choose this specific spatial covariance function in this work, our simulation flow can be applied to any valid spatial covariance function.

where  $\omega_{i,x}$  and  $\omega_{j,y}$  are the solutions of equation 2.11 for setting parameters  $(L_x, \eta_x)$  and  $(L_y, \eta_y)$ , respectively. The relationship between the indices (i, j) and index n leads to the eigenvalues  $\lambda_n$  form a decreasing series.

The detailed procedure of solving the eigenvalues and eigenfunctions of given covariance kernel in this work is presented. From the following Fredholm equation:

$$\int_{D} C(x_1, x_2, y_1, y_2) f(x_1, y_1) dx_1 dy_1 = \lambda f(x_2, y_2)$$
(2.14)

The above eigenvalues problem can be solved independently for x and y directions to obtain eigenvalues  $\lambda_x$  and  $\lambda_y$ , and eigenfunctions  $f_x(x)$  and  $f_y(y)$ .

$$\int_{D} \sigma^{2} e^{-\frac{|x_{1}-x_{2}|}{\eta_{x}}} e^{-\frac{|y_{1}-y_{2}|}{\eta_{y}}} f(x_{1},y_{1}) dx_{1} dy_{1} = \lambda f(x_{2},y_{2})$$
(2.15)

$$\int_{D} \sigma^{2} e^{-\frac{|x_{1}-x_{2}|}{\eta_{x}}} e^{-\frac{|y_{1}-y_{2}|}{\eta_{y}}} f_{x}(x_{1}) f_{y}(y_{1}) dx_{1} dy_{1} = \lambda_{x} \lambda_{y} f_{x}(x_{2}) f_{y}(y_{2})$$
(2.16)

For the separable multi-dimension problem, we consider the following one-dimensional eigenvalues problem and extend to the two-dimension soultions :

$$\int_{0}^{L_{x}} \sigma^{2} e^{-\frac{|x_{1}-x_{2}|}{\eta_{x}}} f_{x}(x_{1}) dx_{1} = \lambda_{x} f_{x}(x_{2})$$
(2.17)

Taking derivative of equation 2.17 with respect to  $x_2$  yields

$$\frac{-1}{\eta_x} \int_0^{x_2} e^{\frac{x_1 - x_2}{\eta_x}} f_x(x_1) dx_1 + \frac{1}{\eta_x} \int_{x_2}^{L_x} e^{\frac{x_2 - x_1}{\eta_x}} f_x(x_1) dx_1 = \frac{\lambda_x}{\sigma^2} f'_x(x_2)$$
(2.18)

Taking derivative again of equation 2.18 with respect to  $x_2$  gives

$$\frac{-1}{\eta_x} [f_x(x_2) - \frac{1}{\eta_x} \int_0^{x_2} e^{\frac{x_1 - x_2}{\eta_x}} f_x(x_1) dx_1] - \frac{1}{\eta_x} [f_x(x_2) + \int_{L_x}^{x_2} e^{\frac{x_2 - x_1}{\eta_x}} \phi_x(x_1) dx_1] = \frac{\lambda_x}{\sigma^2} f_x''(x_2)$$
(2.19)

$$\frac{-2}{\eta_x}f_x(x_2) + \frac{1}{\eta_x^2} \left[ \int_0^{x_2} e^{\frac{x_1 - x_2}{\eta_x}} f_x(x_1) dx_1 + \int_{x_2}^{L_x} e^{\frac{x_2 - x_1}{\eta_x}} f_x(x_1) dx_1 \right] = \frac{\lambda_x}{\sigma^2} f_x''(x_2)$$
(2.20)

$$\frac{\lambda^x}{\sigma^2} f_x''(x_2) = \frac{-2}{\eta_x} f_x(x_2) + \frac{1}{\eta_x^2} [\frac{\lambda_x}{\sigma^2} f_x(x_2)]$$
(2.21)

Moreover,

$$f_{x}''(x_{2}) + \frac{2\eta_{x}\sigma^{2} - \lambda_{x}}{\lambda_{x}\eta_{x}^{2}}f_{x}(x_{2}) = 0$$
(2.22)

The boundary condition associated with equation 2.22 can be determined from equation (2.17) by setting  $x_2 = 0$  and  $x_2 = L_x$ .

$$\eta_x f_x'(0) = f_x(0)$$
  
 $\eta_x f_x'(L_x) = -f_x(L_x)$ 

The general solution of equation 2.22 is

$$\phi_x(x) = c_1 \cos(w_x x) + c_2 \sin(w_x x)$$
(2.23)

$$w_x^2 = \frac{2\eta_x \sigma^2 - \lambda_x}{\lambda_x \eta_x^2} \tag{2.24}$$

Based on the boundary condition, we can obtain the following equations for determining coefficients  $c_1$  and  $c_2$ .

$$c_1 - \eta_x w_x c_2 = 0 \tag{2.25}$$

$$[-\eta_x w_x \sin(w_x L_x) + \cos(w_x L_x)]c_1 + [\eta_x w_x \cos(w_x L_x) + \sin(w_x L_x)]c_2 = 0$$
(2.26)

Limiting to nontrivial solutions of equation 2.26 yields an equations for  $w_x$ ,

$$(\eta_x^2 w_x^2 - 1) \sin(w_x L_x) = 2\eta_x w_x \cos(w_x L_x)$$
(2.27)

For given  $\eta_x$  and  $L_x$ , the equation 2.26 can be solved to get a series of (positive) $w_{i,x}$ , i = 1, 2, ...The eigenvalues corresponding to  $w_{i,x}$  can be determined as the following :

$$\lambda_{i,x} = \frac{2\eta_x \sigma^2}{\eta_x^2 w_{i,x}^2 + 1}$$
(2.28)

In fact, the different  $w_{i,x}$  gives the different coefficients  $c_{i,1}$  and  $c_{i,2}$  for eigenfuctions.

$$f_{i,x}(x) = c_{i,1}cos(w_{i,x}x) + c_{i,2}sin(w_{i,x}x)$$

The coefficients  $c_{i,1}$  and  $c_{i,2}$  can be determined by the condition that the eigenfunctions are normalized.

$$\int_{D_p} f_{i,x}^2(x) dx = 1$$
(2.29)

$$c_{i,2} = \frac{1}{\sqrt{(\eta_x^2 w_{i,x}^2 + 1)L_x/2 + \eta_x}}$$
(2.30)

$$c_{i,1} = \eta_x w_{i,x} c_{i,2} \tag{2.31}$$

Trends of the two dimensional eigenvalues  $\lambda_n$  of exponential kernel for the correlation lengths ( $\eta_x$  and  $\eta_y$ ) which are equal to 0.31 can be shown in Fig. 2.2.



Fig. 2.2: The eigenvalues decay rate

The exponential kernel  $C(\mathbf{x_1}, \mathbf{x_2}) = \sigma^2 e^{-\frac{|x_1-x_2|}{\eta_x}} e^{-\frac{|y_1-y_2|}{\eta_y}}$  can be expanded as its eigenfunctions and eigenvalues.

$$C(\mathbf{x}_1, \mathbf{x}_2) = \sum_{k=1}^{N_{kl}} \lambda_k f_k(\mathbf{x}_1) f_k(\mathbf{x}_2)$$
(2.32)

where  $N_{kl}$  is the truncated number for approximation. As shown in Fig. 2.3, it's the adoptive exponential covariance surface in this work. The approximated covariance surface and relative errors for truncated number  $N_{kl}$  which is equal to 25 are shown in Fig. 2.4. Moreover, the approximated covariance surface for truncated number  $N_{kl}$  which is equal to 75 are shown in Fig. 2.5.



Fig. 2.3: The exponential covariance kernel



Fig. 2.4: (a) 25-term approximation of covariance surface ;  $\sigma = 4.3$ . (b) 25-term relative error surface of covariance approxiamtion.



Fig. 2.5: (a) 75-term approximation of covariance surface ;  $\sigma = 4.3$ . (b) 75-term relative error surface of covariance approxiamtion.

#### **2.3** The Bases for Random Space : Polynomial Chaos

The generalized polynomial chaos, also called the Askey-Chaos, utilizes the orthogonal polynomials as the trial basis in the random space to expand the stochastic process. The original polynomial chaos which is termed as the Hermite chaos was first introduced by Wiener [9]. Ghanem and Spanos are the pioneers that employ the Hermite orthogonal polynomials in terms of Gaussian random variable to deal with various problems in mechanics [9]. The theorem of Cameron and Martin [15] guarantees that a general second-order random process  $u(\theta)$  can be represented in the following form:

$$u(\theta) = c_0 \Gamma_0 + \sum_{i_1=1}^{\infty} c_{i_1} \Gamma_1(\xi_{i_1}) + \sum_{i_1=1}^{\infty} \sum_{i_2=1}^{i_1} c_{i_1 i_2} \Gamma_2(\xi_{i_1}, \xi_{i_2}) + \sum_{i_1=1}^{\infty} \sum_{i_2=1}^{i_1} \sum_{i_3=1}^{i_2} c_{i_1 i_2 i_3} \Gamma_3(\xi_{i_1}, \xi_{i_2}, \xi_{i_3}) + \dots$$

where  $\Gamma_r(\xi_{i_1}, ..., \xi_{i_n})$  represents the polynomial chaos of order r in terms of the N-dimensional random variables  $\vec{\xi} = (\xi_{i_1}, ..., \xi_{i_n})$ . The polynomial chaos was so-called Hermite polynomial chaos for the Gaussian random variables. For the Hermite polynomials with multi-dimension  $\Gamma_r(\xi_{i_1}, ..., \xi_{i_N})$ , the general expression form can be obtained as

$$\Gamma_r(\xi_{i_1},...,\xi_{i_N}) = (-1)^r \frac{\partial^n}{\partial \xi_{i_1}...\xi_{i_n}} e^{-\frac{1}{2}\vec{\xi}^T\vec{\xi}}$$

The zero, first, and second-order Hermite polynomial chaos can be given by:

$$\Gamma_0 = 1; \ \Gamma_1(\xi_i) = \xi_i; \ \Gamma_2(\xi_i) = \xi_i \xi_j - \delta_{ij}$$

where  $\delta_{ij}$  is the Kronecker delta. For charity, the above general second-order random process  $u(\theta)$  can be expressed as more concise form

$$u(\theta) = \sum_{j=1}^{\infty} \hat{a}_j \Phi_j(\vec{\xi})$$
(2.33)

where there is a one-to-one mapping between the polynomial chaos  $\Gamma[.]$  and  $\Phi[.]$ , and also between the coefficients  $\hat{a}_j$  and  $c_{i_1...i_n}$ . The polynomial chaos of the same order with different argument list are orthogonal to each other, so are ones of the different order. For notation, the polynomial chaos satisfy the following orthogonality property:

$$<\Phi_i\Phi_j>=<\Phi_i^2>\delta_{ij}$$

where < . > denotes the inter product defined in the following:

$$< f(\vec{\xi})g(\vec{\xi}) > = \frac{1}{\sqrt{(2\pi)^n}} \int f(\vec{\xi})g(\vec{\xi})e^{-\frac{1}{2}\vec{\xi}^T\vec{\xi}}d\vec{\xi}$$

### 2.4 Statistical Leakage Power Modeling

In this section, we will introduce the empirical models for subthreshold and gate leakage currents with the uncertainty in physical parameters such as channel length and oxide thickness. Actually, the leakage current depends the input pattern and logic topology. We evaluate the average leakage based on HSPICE simulation for various types logic gate with considering input pattern. From the the HSPICE simulation results, we obtain the fitting constants of the empirical current models based on least square method. Moreover, the maximum errors of fitting model are no more than 2% in comparison with HSPICE simulation results.

#### 2.4.1 Gate Tunneling Leakage Current

According to quantum mechanics, there is a finite probability that carriers will tunnel through the gate oxide. The result is so-called that gate tunneling leakage current flows into the gate. The finite probability is exponential function of oxide thickness. The gate tunneling leakage current increases exponentially as gate oxide decreases. When the oxide thickness is thicker than 20Å, the gate tunneling leakage current is relatively small in comparison to other component leakage current such as sub-threshold leakage current. For oxide thickness thinner than 15-20Å, tunneling current becomes a important factor and may become comparable to subthreshold leakage current in advanced process. To put it briefly, the dependence of gate leakage current on oxide thickness is given by the following formula [32]:

$$I_{gate} = (A \cdot C)(W \cdot L)e^{-B \cdot \frac{T_{ox}}{V_{gs}}\alpha}$$

where  $A = q_3/8\pi h\phi_b$ ,  $B = 8\pi\sqrt{2m_{ox}}\phi_b^{3/2}/3hq$ ,  $C = (V_{gs}/T_{ox})^2$ ,  $\alpha$  is a parameter which is ranged from 1 to 0.1 depending on the voltage drop across the oxide, H is the Plancks constant,

and  $\phi_b$  is the barrier height for electronics/holes in the conduction/valance band. Noted that the parameter variations are in general around 10-20% [33]. Hence, we make use of a first-order Taylor expansion at the nominal value of parameter oxide thickness and utilize the following gate tunneling leakage current model derived in [28].

$$I_{gate} = a_0 e^{a_1 \Delta T_{ox}} \tag{2.34}$$

where  $a_0$ , and  $a_1$  are the fitting constant,  $\Delta T_{ox}$  indicates the fractional variations at nominal value of oxide thickness. We incorporate the current model with the physical parameter random process expanded by KLE and set the supply voltage equal to 1 volt, the stochastic gate tunneling leakage power may be expressed as:

$$p_g(\vec{\varsigma}) = \tilde{p}_g e^{a_1 \sum_{i=1}^{N'_{kl}} f'_i(x^*, y^*)\varsigma_i}$$
(2.35)

where  $\tilde{p}_g = a_0 e^{a_1 \tilde{T}_{ox}}$  denotes the deterministic nominal gate leakage power,  $\tilde{T}_{ox}$  is the nominal value of oxide thickness,  $N'_{kl}$  is the truncated number of KLE for oxide thickness,  $f'_i(x^*, y^*)$  is the eigenfunction combined with eigenvalue term of oxide thickness for some logic gate at position  $(x^*, y^*)$ . The gate power random process can be represent in terms of Hermite polynomials,

$$p_g(\vec{\varsigma}) = \sum_{i=1}^{\infty} \gamma_i \Phi_i(\vec{\xi})$$
(2.36)

$$\gamma_i = \frac{\langle p_g(\vec{\varsigma})\Phi_i(\vec{\xi}) \rangle}{\langle \Phi_i^2(\vec{\xi}) \rangle}$$
(2.37)

#### 2.4.2 Subthreshold Leakage Current

The subthreshold leakage current is defined as the conduction current between source and drain in an "off" state MOS transistor. We apply the following empirical model was introduced in [10] to describe subthreshold leakage current.

$$I_{sub} = b_0 e^{b_1 L + b_2 L^2} \tag{2.38}$$

where  $b_0$ ,  $b_1$ , and  $b_2$  are the fitting constant, L indicates channel length. Substituting the Karhunen-Loeve expansion form of channel length random process into equation (2.38), the

statistical subthreshold leakage power may be given by:

$$p_s(\vec{\zeta}) = \tilde{p}_s e^{(b_1 + 2b_2\tilde{L}) \sum_{i=1}^{N''_{kl}} f''_i(x^*, y^*)\zeta_i + b_2 q(x^*, y^*, \vec{\zeta})}$$
(2.39)

$$q(x^*, y^*, \vec{\zeta}) = \{\sum_{i=1}^{N_{kl}} f_i''(x^*, y^*)\zeta_i\}^2$$
(2.40)

where  $\tilde{p}_s$  denotes the nominal subthreshold leakage power,  $\tilde{L}$  is the nominal value of channel length,  $N_{kl}''$  is the truncated number of KLE for channel length,  $f_i''(x^*, y^*)$  is the eigenfunction combined with eigenvalue term of channel length for some logic gate at position  $(x^*, y^*)$ . Expanding the subthreshold power random process as Hermite polynomials expansion,

$$p_s(\vec{\zeta}) = \sum_{i=1}^{\infty} \alpha_i \Phi_i(\vec{\xi})$$
(2.41)

$$\alpha_i = \frac{\langle p_s(\vec{\zeta})\Phi_i(\vec{\xi}) \rangle}{\langle \Phi_i^2(\vec{\xi}) \rangle}$$
(2.42)

The key point is that how to obtain the coefficients  $\{\gamma_i\}$  and  $\{\alpha_i\}$ . The computation of the the coefficients will be introduced in chapter 3.3.

Actually, the subthreshold and gate leakage strongly depends on the input pattern and circuits topology. We evaluate the leakage currents with considering input pattern to obtain the average leakage current from HSPICE simulation based on TSMC 65nm technology model [29]. The fitting constants  $a_0$ ,  $a_1$ ,  $b_0$ ,  $b_1$ ,  $b_2$  are obtained from the least square fitting method with maximum errors no more than 2%.

Noted that the total number of polynomial chaos depends on the value  $(N'_{kl} + N''_{kl})$ , dimension of random variables set  $\{\xi_i\}$  truncated from KLE. In fact, the set of polynomial chaos can be reduced to a new one, because the projection value of gate and subthreshold leakage power upon the polynomial basis which is function of  $\vec{\varsigma}$  and  $\vec{\zeta}$  will be equal to zero. For example,

$$\langle p_s(\vec{\zeta})\zeta_i\varsigma_j \rangle = \langle p_s(\vec{\zeta})\zeta_i \rangle \langle \varsigma_j \rangle = 0$$

$$\langle p_g(\vec{\varsigma})\zeta_i\varsigma_j \rangle = \langle p_g(\vec{\varsigma})\varsigma_j \rangle \langle \zeta_i \rangle = 0$$

$$(2.43)$$

Thus, polynomial basis of the new set of polynomial chaos is function of either  $\vec{\zeta}$  or  $\vec{\zeta}$ . The total number of new polynomial chaos set may be given by,

$$N_{pc} = 1 + \sum_{s=1}^{p} \prod_{r=0}^{s-1} (N'_{kl} + r) + \sum_{s=1}^{p} \prod_{r=0}^{s-1} (N''_{kl} + r)$$
(2.44)

where p is the order of polynomial chaos.

## 2.5 Monte Carlo Technique

Numerical methods that make use of random variables are known as Monte Carlo methods. This will serve as a benchmark against which all modeling and analysis techniques will be tested. In this work, we perform the Monte Carlo method to simulate the golden solutions for stochastic thermal analysis of full chip with considering within-die spatial correlation under process variations. An important key point is that how to generate multinormal distribution random variables. First, the within-die spatial correlation of parameters are modeled by partitioning the chip into N grid cells. Moreover, we assume that perfect correlations among the devices in the same grid cell, high correlations among those in close grid cells and low or zero correlations in far-away grid cells. Noted that the dimension of random variables set is N. Then, we construct the covariance matrix  $\Sigma$  with dimension N by N based on the given covariance kernel. From the covariance matrix  $\Sigma$ , we generate the multinormal distribution random variables by applying Cholesky factorization and the detail procedures are given in Fig. 2.6 and 2.7 [34].

| $\mathbf{A}$ | gorithm Multinormal-Cholesky                                                                                          |
|--------------|-----------------------------------------------------------------------------------------------------------------------|
| In           | <b>put:</b> Dimension N, covariance matrix $\Sigma$ .                                                                 |
| 0            | <b>utput:</b> Multinormal distributed vector $\vec{X}$ with mean 0 and covariance $\Sigma$ .                          |
| 1            | Begin                                                                                                                 |
| 2            | Compute Cholesky factor L of $\Sigma$ by Algorithm Cholesky-Decomposition in Fig. 2.7                                 |
| 3            | Generate vector $\vec{\epsilon} = (\epsilon_1, \epsilon_2,, \epsilon_N)^T$ of N independent standard normal variates. |
| 4            | $ec{X} \leftarrow \mathbf{L} ec{\epsilon}$                                                                            |
| 5            | Return vector $\vec{X}$                                                                                               |

6 **End** 

Fig. 2.6: Procedure of Multinormal-Cholesky

| Algorithm Cholesky-Decomposition                                                                           |  |  |  |
|------------------------------------------------------------------------------------------------------------|--|--|--|
| <b>Input:</b> Positive definite $N \times N$ matrix $\Sigma = \sigma_{ij}$ .                               |  |  |  |
| <b>Output:</b> Lower triangular matrix $\mathbf{L} = l_{ij}$ such that $\mathbf{L}\mathbf{L}^{T} = \Sigma$ |  |  |  |
| 1 Begin                                                                                                    |  |  |  |
| 2 for $i = 1$ to N do                                                                                      |  |  |  |
| 3 $l_{i1} \leftarrow \sigma_{i1} / \sqrt{\sigma_{i1}}$                                                     |  |  |  |
| 4 for $i = 2$ to $N$ do                                                                                    |  |  |  |
| 5 <b>for</b> $j = 2$ to $i - 1$ <b>do</b>                                                                  |  |  |  |
| $6 	 l_{i1} \leftarrow (\sigma_{ij} - \sum_{k=1}^{j-1} l_{ik} l_{jk})/l_{jj}$                              |  |  |  |
| 7 $x \leftarrow \sigma_{ii} - \sum_{j=1}^{i-1} l_{ij}^2$                                                   |  |  |  |
| 8 if $x \ge 0$ then                                                                                        |  |  |  |
| 9 $l_{ii} \leftarrow \sqrt{x}$                                                                             |  |  |  |
| 10 else                                                                                                    |  |  |  |
| 11 <b>abort</b> ( $\Sigma$ not positive definite)                                                          |  |  |  |
| 12 $l_{ij} \leftarrow \sqrt{x}$ for $1 \le i \le j \le N$                                                  |  |  |  |
| 13 Return matrix $\mathbf{L} = (l_{ij})$                                                                   |  |  |  |
| 14 End                                                                                                     |  |  |  |

Fig. 2.7: Procedure of Cholesky-Decomposition

Using the technique, samples of the required random variables to perform Monte Carlo analysis can be generated. For most purposes, variations in VLSI design are assumed to be Gaussian. Consequently, while analyzing intra-die variations, we need to generate samples of multinormal random variables.

## Chapter 3

# **Stochastic Thermal Simulation Methodology**

### **3.1** Stochastic Thermal Simulation Problem Formulation

The silicon die consumes dynamic power and leakage power, and is the main source of heat generation. Heat of the silicon die can be transfered to the ambient by two heat flow paths. The first primary heat flow path is through thermal interface material, heat spreader, and heat sink. The secondary heat flow path is through the interconnect layers, ceramic substrate, and printed-circuit board. The typical compact thermal model for the early stage VLSI design flow is shown in Fig. 3.1.



Fig. 3.1: Compact thermal model of the early design stage for stochastic heat sources.

Generally, the dynamic power is insensitive to process variations and can be assumed to be

deterministic [31]. However, the leakage power dissipation of ICs is not deterministic any more for sub-65nm technology. As the CMOS technology continuously scales down, the existing fluctuations in physical parameters such as channel length and oxide thickness result in the leakage power consumption with uncertainty. Moreover, the leakage power has became the major contributor of the total power consumption for VLSI in today's technology. Thus, the thermal simulation in leakage power dominated technology must combine into statistics. By combining the boundary condition for compact thermal model and stochastic power dissipation process, the stochastic heat transfer equation with boundary conditions is given as [20]

$$\nabla \cdot (\kappa(\mathbf{r})\nabla T(\mathbf{r}, t, \theta, \varpi)) = \sigma(\mathbf{r}) \frac{\partial T(\mathbf{r}, t, \theta, \varpi)}{\partial t} ; \mathbf{r} \in D$$
(3.1)

$$\kappa(\mathbf{r})\frac{\partial T(\mathbf{r}, t, \theta, \varpi)}{\partial n_{b_s}} + h_{b_s}T(\mathbf{r}, t, \theta, \varpi) = f_{b_s}(\mathbf{r})$$
(3.2)

where  $\mathbf{r} = (x, y, z)$  is defined in the system domain  $D = \{(0, L_x) \times (0, L_y) \times (-L_z, 0)\}, L_x$ and  $L_y$  are the lateral sizes of die,  $L_z$  is the thickness of die,  $\kappa(\mathbf{r})$  is the thermal conductivity  $(W/m \cdot \circ C)$  of die,  $\sigma(\mathbf{r})$  is the product of the material density and specific heat  $(J/m^3 \cdot \circ C)$  of die,  $\nabla$  is the diverge operator,  $h_{b_s}$  is the heat-transfer coefficient on the boundary surface,  $b_s$ , of die,  $f_{b_s}(\mathbf{r})$  is the heat flux function on the boundary surface, and  $\partial/\partial n_{b_s}$  is the differentiation along the outward direction normal to the boundary surface.  $\theta$  and  $\varpi$  belong to the set of manufacturing outcomes for channel length  $\Omega_L$  and oxide thickness  $\Omega_{T_{ox}}$ , respectively.

From the observations in [21] [22], the heat transfer coefficients of primary path can be modeled as an effective heat transfer coefficient  $h_p$  by combining the effect of each component on the primary path. Hence, the detail information of interconnect layer is not available in the early physical design stage, the interconnect layer was modeled as an equivalent thermal resistance based on the material density of regular structure by [21] [22]. The heat transfer coefficients of secondary path can be simplified to be an equivalent heat transfer coefficient  $h_s$ by stacking the thermal resistance of each interconnect layer, I/O pads, and print circuit board. The boundary condition in vertical surface of chip in Fig. 3.1 can be set to be adiabatic because the area of vertical surface is exceedingly smaller than the area of horizontal surface and the thermal conductivity of air is much less than the values of primary and secondary heat transfer paths [19]. The heat sources generated from different sub-circuits can be attached on the top surface of die for modeling the boundary condition. Although the thermal properties of die,  $\kappa(\mathbf{r})$  and  $\sigma(\mathbf{r})$ , are position-dependent, the variations of these thermal parameters are usually not significant and can be treated as constants while performing the thermal-aware floorplanning and placement.

With the above description, the stochastic heat transfer equation can be rewritten as

$$\kappa \nabla^2 T(\mathbf{r}, t, \theta, \varpi) = \sigma \frac{\partial T(\mathbf{r}, t, \theta, \varpi)}{\partial t}; \mathbf{r} \in D$$
(3.3)

$$\frac{\partial T(\mathbf{r}, t, \theta, \varpi)}{\partial x} \bigg|_{x=0, L_x} = \frac{\partial T(\mathbf{r}, t, \theta, \varpi)}{\partial y} \bigg|_{y=0, L_y} = 0$$
(3.4)

$$\kappa \left. \frac{\partial T(\mathbf{r}, t, \theta, \varpi)}{\partial z} \right|_{z=-L_z} = h_p T(\mathbf{r}, t, \theta, \varpi)|_{z=-L_z}$$
(3.5)

$$\kappa \left. \frac{\partial T(\mathbf{r}, t, \theta, \varpi)}{\partial z} \right|_{z=0} = h_s T(\mathbf{r}, t, \theta, \varpi)|_{z=0}$$

$$+ p(\mathbf{r}, t, L(\mathbf{r}, \theta), T_{ox}(\mathbf{r}, \varpi))|_{z=0}$$
(3.6)

where  $p(\mathbf{r}, t, L(\mathbf{r}, \theta), T_{ox}(\mathbf{r}, \varpi))$  is the random process of total power dissipation and it consists of dynamic power  $p_d(\mathbf{r}, t)$ , subthreshold leakage power  $p_s(\mathbf{r}, t, L(\mathbf{r}))$ , and gate leakage power  $p_s(\mathbf{r}, t, T_{ox}(\mathbf{r}, \varpi))$ . The leakage power is greatly affected by physical parameters with uncertainties such as channel length and oxide thickness, and need to be treated as a random process. The detail illustration of total leakage power random process will be addressed in chapter 2.4.

#### **3.2** Stochastic Thermal Simulation Flowchart

The executing flow of this work can be summarized as Fig. 3.2. Given a spatial covariance function of technology parameter, we construct the eigenvalues and eigenfunctions of the covariance kernel. By applying the Karhunen-Loeve expansion method, the correlated physical parameters random processes (channel length, and oxide thickness) are transformed into a set of uncorrelated random variables based on these eigenvalues and eigenfunctions. With those normalized random variables, we build the polynomial chaoses to serve as polynomial bases for the space of random variables. According to the power consumption, we create the leakage current models for various type logic gates from HSPICE simulation based on TSMC 65nm technology. After the chip geometry, the package configuration, gate level placement, and dynamic power distribution being given, the compact thermal model of Fig. 3.1 in chapter 3.1 can

be built. Then, we employ the stochastic Galerkin projection method to convert the stochastic heat transfer equation to a set of deterministic heat transfer equations. The number of those deterministic heat transfer equations is equal to the total number of polynomial chaoses. Finally, an efficient GIT based analytical thermal simulator [14] is utilized to solve those deterministic heat transfer equations, and the mean value and variance of full-chip temperature distribution can be obtained.



Fig. 3.2: Stochastic thermal simulation flowchart

## 3.3 Stochastic Galerkin Procedure

After applying the Karhunen-Loeve expansion, the system random process with spatial correlation can be transformed into a set of orthonormal standard normal random variables without correlation. From the set of uncorrelated random variables  $\vec{\xi}$ , we construct the Hermite polynomial chaos to serve as the bases for random space. The above-mentioned temperature random process  $T(\mathbf{r}, t, \vec{\varsigma}, \vec{\zeta})$  can be reformed and be expanded as the following expansion form by using the Hermite polynomial chaos expansion.

$$T(\mathbf{r}, t, \vec{\xi}) \simeq \sum_{i=0}^{N_{pc}} \hat{T}_i(\mathbf{r}, t) \Phi_i(\vec{\xi})$$
(3.7)

Substituting the expansion form equation (3.7) into equation (3.3), the residual can be expressed as:

$$R(\mathbf{r},t,\vec{\xi}) \equiv \kappa \nabla^2 \sum_{i=0}^{N_{pc}} \hat{T}_i(\mathbf{r},t) \Phi(\vec{\xi}) - \sigma \frac{\partial}{\partial t} \sum_{i=0}^{N_{pc}} \hat{T}_i(\mathbf{r},t) \Phi(\vec{\xi})$$
(3.8)

Utilizing the stochastic Galerkin principle which enforces the residual to be orthogonal to each of the basis functions,

$$< R(\mathbf{r}, t, \vec{\xi}) \Phi_k(\vec{\xi}) >= 0 \qquad (k = 0, 1, ... N_{pc})$$
 (3.9)

The orthogonality relation results in a set of deterministic equations with dimension  $N_{pc}$ .

$$\kappa \nabla^2 \hat{T}_k(\mathbf{r}, t) = \sigma \frac{\partial \hat{T}_k(\mathbf{r}, t)}{\partial t} \qquad (k = 0, 1, \dots N_{pc})$$
(3.10)

Employing similar Galerkin projection procedure on the system boundary conditions equation (3.4)-(3.5), the resulting transformed equations can be given by,

$$\frac{\partial \hat{T}_k(\mathbf{r}, t)}{\partial x} \bigg|_{x=0, L_x} = \frac{\partial \hat{T}_k(\mathbf{r}, t)}{\partial y} \bigg|_{y=0, L_y} = 0$$
(3.11)

$$\kappa \left. \frac{\partial \hat{T}_k(\mathbf{r}, t)}{\partial z} \right|_{z=-L_z} = h_p \hat{T}_k(\mathbf{r}, t)|_{z=-L_z}$$
(3.12)

Substituting the expansion form equation (3.7) into equation (3.6), the residual can be given by:

$$R'(\mathbf{r},t,\vec{\xi}) \equiv \frac{\partial}{\partial z} \sum_{i=0}^{N_{pc}} \hat{T}_i(\mathbf{r},t) \Phi(\vec{\xi}) - h_s \sum_{i=0}^{N_{pc}} \hat{T}_i(\mathbf{r},t) \Phi(\vec{\xi}) - p(\mathbf{r},t,\vec{\xi})$$
(3.13)

Applying the stochastic Galerkin projection procedure to the residual  $R'(\mathbf{r}, t, \vec{\xi})$ ,

$$\frac{\partial}{\partial z}\hat{T}_{k}(\mathbf{r},t) = h_{s}\hat{T}_{k}(\mathbf{r},t) + \frac{\langle p(\mathbf{r},t,\vec{\xi})\Phi_{k}(\vec{\xi})\rangle}{\langle \Phi_{k}^{2}(\vec{\xi})\rangle}; (k = 1, 2, ...N_{pc})$$
(3.14)

Once the right side second term of "=" for equation (3.14) be determined, the deterministic heat transfer equations can be formulated. Now, focusing on the leakage power projection term,

$$< p(x, y, t, \vec{\xi}) \Phi_{k}(\vec{\xi}) > = p_{d}(x, y, t) < \Phi_{k}(\vec{\xi}) >$$

$$+ H_{g}(x, y, t) < p_{g}(x, y, \varsigma) \Phi_{k}(\vec{\xi}) >$$

$$+ H_{s}(x, y, t) < p_{s}(x, y, \zeta) \Phi_{k}(\vec{\xi}) >$$
(3.15)

where  $H_g(x, y, t)$  and  $H_s(x, y, t)$  are the function of position and switching activity for gate tunneling and subthreshold leakage power, respectively.

Here, considering the gate tunneling power projection term,

$$< p_g(x, y, \vec{\varsigma}) \Phi_k(\vec{\xi}) > = \tilde{p}_g(x, y) < e^{a_1(x, y) \sum_{i=1}^{N_{kl}} f'_i(x, y)\varsigma_i} \Phi_k(\vec{\xi}) >$$
(3.16)

where  $\tilde{p}_g(x, y) = a_0(x, y)e^{a_1(x,y)\tilde{T}_{ox}(x,y)}$  is the deterministic nominal gate leakage power,  $f'_i(x, y)$  is the eigenfunction combined eigenvalue term for oxide thickness,  $a_0(x, y)$  and  $a_1(x, y)$  are the fitting constants of gate leakage power for different logic gate located different position. We take an example for a reference position  $(x^*, y^*)$ , equation (3.16) can be rewritten as

$$< p_g(x^*, y^*, \vec{\varsigma}) \Phi_k(\vec{\xi}) > = \tilde{p}_g^* < e^{a_1^* \sum_{i=1}^{N_{kl}} f_i^{'*\varsigma_i}} \Phi_k(\vec{\xi}) >$$
(3.17)

If the polynomial chaos  $\Phi_k(\vec{\xi})$  is function of  $\vec{\zeta}$ , the value of equation 3.17 is zero. on the contrary, let us consider an example of  $\Phi_k(\vec{\xi})$  is a first order polynomial and function of  $\vec{\zeta}$ . Noted that normal random variables set  $X = \{X_1, X_2, ..., X_n\}$  has the following property,

$$< e^{\sum_{j=1}^{n} \beta_j X_j} X_k > = \beta_k \prod_{i=1}^{n} e^{\frac{\beta_i^2}{2}} \quad \forall k \in 1, 2, ..., n$$
 (3.18)

The computation of equation (3.17) can be calculated based on equation (3.18) for first order polynomial, and based on the following equation (3.19) for second order polynomial.

$$< e^{\sum_{j=1}^{n} \beta_j X_j} X_k^2 > = (\beta_k^2 + 1) \prod_{i=1}^{n} e^{\frac{\beta_i^2}{2}} \quad \forall k \in 1, 2, ..., n$$
 (3.19)

The more detailed computation procedure of equation (3.17) can be shown in Fig. 3.3.

| Algorithm Gate Power Projection Procedure                                                                    |  |  |  |
|--------------------------------------------------------------------------------------------------------------|--|--|--|
| <b>Input:</b> The constants $a_1^*$ , $\{f_i^{\prime*}\}$ , and polynomial $\Phi_k(\vec{\xi})$               |  |  |  |
| <b>Output:</b> Return the value of $< e^{a_1^* \sum_{i=1}^{N'_{kl}} f'^*_i \varsigma_i} \Phi_k(\vec{\xi}) >$ |  |  |  |
| 1 Begin                                                                                                      |  |  |  |
| 2 <b>if</b> the polynomial $\Phi_k(\vec{\xi})$ contains $\zeta_i$                                            |  |  |  |
| 3 Return 0                                                                                                   |  |  |  |
| 4 endif                                                                                                      |  |  |  |
| 5 <b>do</b> $C_g^* = \prod_{i=1}^{N_{kl}'} e^{\frac{(a_1^* f_i^{'*})^2}{2}}$                                 |  |  |  |
| 6 <b>if</b> the order of $\Phi_k(\vec{\xi}) = 1$                                                             |  |  |  |
| 7 Return $f_k^{\prime *} C_a^*$                                                                              |  |  |  |
| 8 endif                                                                                                      |  |  |  |
| 9 elseif the order of $\Phi_k(\vec{\xi}) = 2$                                                                |  |  |  |
| 10 Return $[(f_k'^*)^2 + 1]C_q^*$                                                                            |  |  |  |
| 11 endif                                                                                                     |  |  |  |
| 12 End                                                                                                       |  |  |  |

Fig. 3.3: Procedure of gate power projection

Now, considering the subthreshold leakage power projection term,

$$< p_s(x, y, \vec{\zeta}) \Phi_k(\vec{\xi}) >= \tilde{p}_s(x, y) < e^{B(x, y) \sum_{i=1}^{N_{kl}} f_i''(x, y)\zeta_i + b_2(x, y)q(x, y, \vec{\zeta})} \Phi_k(\vec{\xi}) >$$
(3.20)

where,

$$B(x,y) = b_1(x,y) + 2b_2(x,y)\tilde{L}(x,y)$$
(3.21)

$$q(x, y, \vec{\zeta}) = \{\sum_{i=1}^{N_{kl}} f_i''(x, y)\zeta_i\}^2$$
(3.22)

where  $\tilde{p}_s(x,y) = b_0(x,y)e^{b_1(x,y)\tilde{L}(x,y)+b_2(x,y)\tilde{L}^2(x,y)}$  denotes the deterministic nominal subthreshold leakage power,  $f_i''(x,y)$  is the eigenfunction combined eigenvalue term for channel length,  $b_0(x,y)$ ,  $b_1(x,y)$ , and  $b_2(x,y)$  are the fitting constants of subthreshold leakage power. If the polynomial chaos  $\Phi_k(\vec{\xi})$  is the function of  $\vec{\zeta}$ , the value of equation (3.20) is zero. The quadratic form  $q(x^*, y^*, \vec{\zeta})$  for reference position  $(x^*, y^*)$  can be expressed as  $\vec{\zeta}^T A \vec{\zeta}$ .

The quadratic form  $\vec{\zeta}^T A \vec{\zeta}$  can be reduced into its standard form  $\vec{\nu}^T D \vec{\nu}$ ,  $\vec{\nu} = [\nu_1, \nu_2, ..., \nu_{N_{kl}}]^T$ . The standard form of  $q(x^*, y^*, \vec{\zeta})$  is determined once the eigenvalues of A are known, the transformation between  $\vec{\zeta}$  and  $\vec{\nu}$  is given by  $\vec{\zeta} = Q\vec{\nu}$ . The real symmetric matrix A have the eigenvalues  $\lambda_1^A, \lambda_2^A, ..., \lambda_{N_{kl}}^A$ , and let Q be an orthogonal matrix that diagonalizes A, so that  $Q^T A Q = D$ , where D is a diagonal matrix with the eigenvalues of A as the elements on its leading diagonal. After the eigen-decomposition transformation, equation (3.20) can be rewritten as the following for reference point:

$$< p_{s}(x^{*}, y^{*}, \vec{\zeta}) \Phi_{k}(\vec{\xi}) >= \tilde{p}_{s}^{*} < e^{B^{*} \sum_{i=1}^{N_{kl}} C_{i}\nu_{i} + b_{2}^{*} \sum_{i=1}^{N_{kl}} \lambda_{i}^{A}\nu_{i}^{2}} \Phi_{k}'(\vec{\nu}) >$$
(3.23)

where  $C_i = \sum_{j=1}^{N_{kl}} Q_{ij} f_j^{''*}$ , the indices *i* and *j* are the row and column index of the matrix *Q*, respectively. The computation of equation (3.23) for polynomial chaos  $\Phi'_k(\vec{\nu})$  which is constant value can be based on the following property.

$$< e^{\sum_{j=1}^{n} \alpha_j X_j^2 + \beta_j X_j} > = \prod_{i=1}^{n} \frac{e^{\frac{\beta_j^2}{2 - 4\alpha_j}}}{(1 - 2\alpha_j)^{\frac{1}{2}}}$$
 (3.24)

For the first order polynomial chaos  $\Phi_k^{'}(\vec{\nu})$ ,

$$< e^{\sum_{j=1}^{n} \alpha_j X_j^2 + \beta_j X_j} X_k > = \frac{\beta_k}{1 - 2\alpha_k} \prod_{i=1}^{n} \frac{e^{\frac{\beta_j^2}{2 - 4\alpha_j}}}{(1 - 2\alpha_j)^{\frac{1}{2}}}; \ \forall k \in 1, 2, ..., n$$
(3.25)

Moreover, for the second order,

$$< e^{\sum_{j=1}^{n} \alpha_j X_j^2 + \beta_j X_j} X_k^2 > = \frac{\beta_k^2 - 2\alpha_k + 1}{(1 - 2\alpha_k)^2} \prod_{i=1}^{n} \frac{e^{\frac{\beta_j^2}{2 - 4\alpha_j}}}{(1 - 2\alpha_j)^{\frac{1}{2}}}; \ \forall k \in 1, 2, ..., n$$
(3.26)

The computation of equation (3.23) can be easily derived based on equation (3.24)-(3.26) for different order polynomial chaos over the design system domain.

Noted that the eigen-decomposition transformation can be pre-calculated to deal with different logic gates placement. Because the eigenfunctions f''(x, y) depends on the covariance kernel for physical parameters random processes rather than design placement.

These equations, equation (3.10), equation (3.11), equation (3.12), and equation (3.14) form a set of deterministic heat transfer equations, and its solutions can be formulated based on Algorithm Sub Power Projection Procedure **Input:** The constants  $b_1^*$ ,  $b_2^*$ ,  $\{f_i''^*\}$ ,  $\tilde{L}^*$ , and polynomial  $\Phi_k(\vec{\xi})$ **Output:** Return the value of  $e^{B^* \sum_{i=1}^{N'_{kl}} f''_i \cdot \zeta_i + b_*^* q^*(\vec{\zeta})} \Phi_k(\vec{\xi}) >$ where  $B^* = b_1^* + 2b_2^* \tilde{L}^*$ ,  $q^*(\vec{\zeta}) = \{\sum_{i=1}^{N'_{kl}} f_i''^* \zeta_i\}^2$ Begin 1 if the polynomial  $\Phi_k(\vec{\xi})$  contains  $\varsigma_i$ Return 0 2 3 4 Endif **do** Transform the quadratic form  $q^*(\vec{\zeta}) = \vec{\zeta}^T A \vec{\zeta}$  to standard form  $\vec{\nu}^T D \vec{\nu}$ , 5  $\vec{\nu} = [\nu_1, ..., \nu_{N''_{kl}}], Q^T A Q = D, \text{linear transform } \vec{\zeta} = Q \vec{\nu},$ obtain the eigenvalues  $\{\lambda_i^A\}$  of A, $D_s^* = \prod_i^{N''_{kl}} \frac{e^{\frac{2-4b_2^*\lambda_i^A}{(B^*C_i)^2}}}{(1-2b_2^*\lambda_i^A)^{0.5}}, C_i = \sum_{j=1}^{N''_{kl}} Q_{ij} f_j''^*.$ if the order of  $\Phi'_k(\vec{\nu}) = 1$ Return  $\frac{B^*C_k}{1-2b_2^*\lambda_k^A} D_s^*$ endif 6 7 8 9 10 11 endif elseif the order of  $\Phi'_k(\vec{\nu}) = 2$ Return  $\frac{(B^*C_k)^2 - 2b_2^*\lambda_k^k + 1}{(1-2b_2^*\lambda_k^k)^2}D_s^*$ 12 13 endif 14 15 End

Fig. 3.4: Procedure of sub power projection

several existing techniques. In this work, we apply the analytic technique (Generalized Integral Transforms) [14] to serve as the deterministic solver. Once the set of coefficients  $\{\hat{T}_k(\mathbf{r},t)\}$  be obtained, mean value and variance of the temperature distribution can be solved.

The mean value and variance of temperature distribution can be expressed as:

$$E\{T(\mathbf{r},t,\vec{\xi})\} = \hat{T}_0(\mathbf{r},t)$$
(3.27)

$$Var\{T(\mathbf{r}, t, \vec{\xi})\} = \sum_{i=1}^{N_{pc}} \hat{T}_i^2(\mathbf{r}, t) < \Phi_i^2(\vec{\xi}) >$$
(3.28)

# Chapter 4

# **Experimental Results**

The proposed stochastic thermal simulator is implemented as a tool in C++ on HPxw9300 workstation with 16GB memory. The simulation results are obtained on the our tested placement benchmark. We create a leakage power gate level cell library based on HSPICE simulation on TSMC 65nm technology model. The nominal value of oxide thickness is set to 1.4nm and the  $3\sigma$  value of parameters variations for channel length L and oxide thickness  $T_{ox}$  are set to 20% of the nominal parameter values. The ratios of correlation length to chip size for x-dir  $(\eta_x/L_x)$  and y-dir  $(\eta_y/L_y)$  were set to 0.31 that means the correlation between two devices are located half of chip dimension away in either direction is 0.2. The generated tested placement benchmark with about four millions gate counts from the floorplanning shown in Fig. 4.1 is based on the cell library at 65nm technology. The experimental results can be summarized as the following:



Fig. 4.1: The floorplan of our test circuit

#### • Accuracy and Efficiency

25

30

701

991

The accuracy and efficiency of proposed stochastic thermal simulator can be shown in Table 4.1. We demonstrate the accuracy and efficiency of this work in comparison with Monte Carlo Method with 100000 samples. We found that 100000 samples is the reasonable number of the Monte Carlo simulation. Actually, our proposed method leads to about 1% of errors in both mean and standard deviations for choosing that  $N_{kl}$  is equal to 75 and the order of polynomial chaos is equal to 1 and only takes four minutes. Noted that  $N'_{kl}$  and  $N''_{kl}$  are set equal to  $N_{kl}$ . The errors of mean value is strongly depended on  $N_{kl}$  rather than the order of PC. The errors of standard deviation relys on not only  $N_{kl}$  but only the order of PC.

| $N_{kl}$ | $N_{pc}$ | PC    | $E\{T\}$  | $\sigma\{T\}$ | Run     | Speedup |
|----------|----------|-------|-----------|---------------|---------|---------|
|          |          | Order | MaxErr(%) | MaxErr(%)     | Time(s) | (X)     |
| 25       | 51       | 1     | 4.74      | 9.4           | 83.83   | 1960    |
| 50       | 101      | 1     | 2.52      | 3.3           | 165.79  | 990     |
| 75       | 151      | 1     | 1.72      | 1.09          | 247.4   | 662     |
| 20       | 461      | 2     | 4.87      | 7.68          | 949.25  | 264     |

6.43

1.41

1339.58

2039.07

143

100

4.2

3.67

Table 4.1: Accuracy and Efficiency Compared to Monte Carlo Method

#### • Compared to the Deterministic Thermal Simulation

 $\mathbf{2}$ 

 $\overline{2}$ 

As shown in Fig. 4.3 and Fig. 4.4, there is 18% difference between the deterministic nominal power consideration and stochastic power consideration in the average temperature in our tested circuit. The deterministic simulation that underestimated the temperature profile and hottest value of temperature can not offer designers a robust solution. The cost function of traditional thermal-aware floorplanning, placement, and optimization methodology is often avoiding the hot spot and smoothing the thermal profile. However, the deterministic cost and objective function is not reliable enough in deep sub-micron technology. The stochastic thermal simulator conducts designers to not only minimize the mean value of temperature to avoid high temperature failure but also reduce the variance of temperature to lower thermal gradients. In addition, solutions of stochastic thermal simulator support designers to develop circuits with more tolerance of manufactured process variations.



Fig. 4.2: (a) The nominal power distribution at the top surface of die, (b) The mean power distribution at the top surface of die.



Fig. 4.3: (a) The 3D nominal temperature distribution at the top surface of die, (b) The 3D mean temperature distribution at the top surface of die.



Fig. 4.4: (a) The 2D nominal temperature distribution at the top surface of die, (b) The 2D mean temperature distribution at the top surface of die.

#### • Considering Spatial Correlation Compared to Neglecting Spatial Correlation

We demonstrated the difference between considering spatial correlation and ignoring spatial correlation based on Monte Carlo simulation in our tested circuit. From the simulation results, we observe that the mean value of temperature of method with considering spatial correlation is close to the method without considering spatial correlation. However, the variance of temperature profile between the two methods is different entirely. As shown in Fig. 4.5, the results reveal that the variance of temperature distribution is larger in the method which considers spatial correlation. Moreover, the variance value of surrounding the region with larger variance value is also larger. This shows that the variance of temperature profile behaves with circumfluent phenomenon and spatial correlation. This is because the leakage powers of neighbor region are less correlated when spatial correlation is ignored. For considering spatial correlation, the leakage powers of neighbor region are with positive interactions. From simulation solutions of the tested circuit in Fig. 4.5(a) and Fig. 4.5(b), the method with considering spatial correlation 3X - 4Xtimes the variance of the one without considering spatial correlation. The thermal simulator without considering spatial correlation will undervalue the thermal gradients. Large temperature gradients may not only reduce lifetime of chip but also cause thermal stress to crack circuits. [20].



Fig. 4.5: (a) The 3D standard deviation temperature distribution at the top surface of die with considering spatial correlation, (b) The 3D standard deviation temperature distribution at the top surface of die without considering spatial correlation.



Fig. 4.6: (a)The 2D standard deviation temperature distribution at the top surface of die with considering spatial correlation, (b)The 2D standard deviation temperature distribution at the top surface of die without considering spatial correlation.

#### • Temperature Yield Estimation

The yield loss will worsen in future technologies due to the continued significance of leakage powers and increasing process variations. Another trouble observation is that increased variations on leakage powers not only cause a larger spread of temperature distribution but also higher average temperature distribution. It is worth to note that most current thermal simulation approaches do not consider process variations and are unaware of their impact on yield. These deterministic thermal simulation approaches result in yield loss due to increased susceptibility to process variations. The evaluating on thermal yield can be formulated by the following Generalized Chebyshev Inequality [35].

**COROLLARY** 4.1 If  $E[X] = \mu$ ,  $Var(X) = \sigma^2$ , then for a > 0

$$P\{X \ge \mu + a\} \le \frac{\sigma^2}{\sigma^2 + a^2} \tag{4.1}$$

$$P\{X \le \mu - a\} \le \frac{\sigma^2}{\sigma^2 + a^2} \tag{4.2}$$

Using the Generalized Chebyshev Inequality, we can obtain the tighter bounds of temperature distribution. The probability of temperature distribution under the given temperature constraint can be estimated by using the generalized inequality. This gives the designers a guideline and solutions to deal with yield issues for manufactured process variations. Furthermore, these solutions can serve as a stochastic thermal solver which is incorporated into other CAD design methodologies such as timing and reliability analysis methods. It can be shown in Fig. 4.7 that the probability of temperature distribution under upper bound which is mean value plus three standard deviation is larger than 90%.



Fig. 4.7: The mean value plus three standard deviation of temperature distribution.

#### • Hot Spot Observations

Generally speaking, the general purpose of thermal-aware placement and floorplanning is to minimize the maximal temperature gradient over the chip and to avoid the hot spot occurrence. The deterministic thermal simulators offer the explicit hot spot location to CAD designer. However, the variations in leakage power result in the on-chip temperature distribution with uncertainties. It's difficulty to indicate the hot spot location under process variations. In our tested circuit, we observe that the hot spot location can not merely be determined by traditional deterministic thermal solver. As shown in Fig. 4.4(a), the hot spot location is at the point ( $x = 0 \ (mm), y = 5 \ (mm)$ ) based on deterministic thermal simulator. However, it can be shown in Fig. 4.8 that all the points of the line at  $y = 5 \ (mm)$  have probability to become hot spot location. Therefore, the traditional deterministic thermal simulators are unaware of the precise hot spot location.

It's crucial to precisely indicate the hot spot location for designers under manufactured process variations. Our future work is developing a robust scheme to determine hot spot distribution.



Fig. 4.8: The temperature profile at  $y = 5 \ (mm)$ .

# Chapter 5

# Conclusion

In this work we have proposed a stochastic thermal simulation procedure to estimate the statistical temperature distribution in the presence of within-die process variations. Our experimental results show that the proposed method has small variation errors and high efficiency compared to Monte Carlo simulation. The simulation results indicate that simulation without considering spatial correlation may underestimate existing critical thermal gradient. We further point out that the traditional deterministic thermal simulator will lose information of thermal profile in manufactured process variations and guide designer to a optimistic way. The stochastic thermal simulator we proposed offers a robust estimation of temperature distribution and serves as a necessary machine for thermal-aware methodology with considering within-die process variations.

# **Bibliography**

- Brian E. Stine, Duane S. Boning, and James E. Chung, "Analysis and Decomposition of Spatial Variation in Integrated Circuit Processes and Devices," *IEEE Trans. on Semiconductor Manufacturing*, 10(1), Feb., 1997, pp.24-41.
- [2] D. Okumu Ouma, Duane S. Boning, James E. Chung, William G. Easter, Vivek Saxena, Sudhanshu Misra, and Annette Crevasse, "Characterization and Modeling of Oxide Chemical Mechanical Polishing Using Planarization Length and Pattern Density Concepts," *IEEE Trans. on Semiconductor Manufacturing*, 15(2), 2002.
- [3] M. Orshansky, L. Milor, P. Chen, K. Keutzer, and C. Hu, "Impact of spatial chip gate length variability on theperformance of high-speed digital circuits," *IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems*, 2002, pp.544-553
- [4] J. Xiong, V. Zolotov, and L. He, "Robust Extraction of Spatial Correlation," *Proceedings of the ACM ISPD*, 2006.
- [5] F. Liu, "A General Framework for Spatial Correlation Modeling in VLSI Design," *Proceedings of the ACM DAC*, 2007.
- [6] S. Bhardwaj, S. Vrudhula, P. Ghanta, and Y. Cao, "Modeling of Intra-die Process Variations for Accurate Analysis and Optimization of Nanoscale Circuits," *Proceedings of the ACM DAC*, 2006.
- [7] P. Friedberg, Y. Cao, J. Cain, R. Wang, J. Rabaey, and C. Spanos, "Modeling Within-Die Spatial Correlation Effects for Process-Design Co-Optimization," *Proceedings of the ISQED*, 2005.
- [8] R. Hogg, and E. Tanis, "Probability and Statistical Inference," Prentice Hall, 2001

- [9] R. G. Ghanem, and P. D. Spanos, "Stochastic Finite Elements: A Spectral Approach," Springer-Verlag, 1991.
- [10] R. Rao, A. Srivastava, E. David Blaauw, and D. Sylvester, "Statistical Analysis of Subthreshold Leakage Current for VLSI Circuits," *IEEE Trans. on Very Large Scale Integration Systems*, vol. 12, NO. 2, February 2004.
- [11] W. Huang, M. R. Stan, K. Skadron, K. Sankaranarayanan, S. Ghoshyz, and S. Velusamyz, "Compact Thermal Modeling for Temperature-Aware Design," *Proceedings of the ACM DAC*, 2004.
- [12] W. Huang, S. Ghoshyz, S. Velusamyz, K. Sankaranarayanan, K. Skadron, and M. R. Stan,
   "HotSpot: A Compact Thermal Modeling Methodology for Early-Stage VLSI Design," *IEEE Trans. on Very Large Scale Integration Systems*, vol. 14, May 2006.
- [13] http://www-device.eecs.berkeley.edu/ ptm/
- [14] P. Y. Huang, C. K. Lin, Y. M. Lee, "Full-Chip Thermal Analysis for the Early Design Stage via Generalized Integral Transforms," *Proceedings of the ACM ASPDAC*, January 2008.
- [15] R. H. Cameron, W. T. Martin, "The orthogonal development of nonlinear functionals in series of Fourier-Hermite functionals," Ann. of Math., 1947.
- [16] T. Y. Wang and C. C. P. Chen, "Thermal-ADI: A Linear-Time Chip-Level Thermal Simulation Algorithm Based on Alternating-Direction Implicit (ADI) Method," *IEEE Trans. on Very Large Scale Integration Systems*, vol. 11, no. 4, pp. 691-700, August 2003.
- [17] T. Y. Wang and C. C. P. Chen, "SPICE-Compatible Thermal Simulation with Lumped Circuit Modeling for Thermal Reliability Analysis Based on Model Reduction." *Proceedings* of the ACM ISQED, 2004.
- [18] P. Li, L. T. Pileggi, M. Asheghi, and R. Chandra, "IC Thermal Simulation and Modeling via Efficient Multigrid-Based Approaches," *IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems*, vol. 25, no. 9, pp. 319-26, September 2006.

- [19] Y. Zhan, and S. S. Sapatnekar, "High Efficiency Green Function-Based Thermal Simulation Algorithms," *IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems*, vol. 26, no. 9, September 2007.
- [20] J. L. Tsai, C. C. P. Chen, G. Chen, B. Goplen, H. Qian, Y. Zhan, S. M. Kang, M. D. F. Wong and S. S. Sapatnekar, "Temperature-Aware Placement for SOCs," *Proceedings of the IEEE*, vol. 94, no. 8, August, 2006.
- [21] W. Huang, M. R. Stany, K. Skadronz, K. Sankaranarayananz, S. Ghoshyz, and S. Velusamyz, "Compact Thermal Modeling for Temperature-Aware Design," *Proceedings of DAC*, pp. 878-883, June 2004.
- [22] T. -Y. Wang and C. C. -P. Chen, "Thermal-ADI: A Linear-Time Chip-Level Thermal Simulation Algorithm Based on Alternating-Direction Implicit (ADI) Method," *IEEE Transactions on Very Large Scale Integration Systems*, vol. 11, no. 4, pp. 691-700, August 2003.
- [23] A. Agarwal, K. Kang and K. Roy, "Accurate Estimation and Modeling of Total Chip Leakage Considering Inter- & Intra-Die Process Variations," *IEEE Transactions on Computeraided design*, pp. 736-741, 2005.
- [24] S. Borkar, T. Karnik, S. Narendra, J. Tschanz, A. Keshavarzi, and V. De, "Parameter variations and impact on circuits and microarchitecture", *Proceedings of DAC*, pp.338, 2003.
- [25] A. Agawal, D. Blaauw, V. Zolotov "Statistical Timing Analysis for Intra-Die Process Variations with Spatial Correlations, *Proceedings of ICCAD*", pp.900-907, 2003.
- [26] Jianfeng Luo, Subarna Sinha, Qing Su, Jamil Kawa, and Charles Chiang "An IC Manufacturing Yield Model Considering Intra-Die Variations" *Proceedings of DAC*, pp.749-754, 2006.
- [27] G. Seber, Multivariate Observations, Wiley Series, 1984.
- [28] S. Mukhopadhyay, and K. Roy "Modeling and Estimation of Total Leakage Current in Nano-scaled CMOS Devices Considering the Effect of parameter Variation" *Proceedings* of ISLPED, pp.172-175, 2003.

- [29] H. Chang, and Sapatnekar, S.S. "Full-chip analysis of leakage power under process variations, including spatial correlations" *Proceedings of DAC*, pp.523-528, 2005.
- [30] Bao Liu "Gate Level Statistical Simulation Based on Parameterized Models for Process and Signal Variations" *Proceedings of ISQED*, pp. 257-262, 2007.
- [31] Bao Liu "Statistical Analysis and Optimization for VLSI:Timing and Power", Springer, 2004.
- [32] Kyung Ki Kim, et al, "Accurate Macro-modeling for Leakage Current for IDDQ Test", *IMTC*, 2007.
- [33] Nassif, "Delay Variability: Source, Impacts and Trends", ISSCC, pp. 368, Feb. 2000,
- [34] Wolfgang Hormann, Josef Leydold, and Gerhard Derflinger, "Automatic Nonuniform Random Variate Generation", Springer, 2003.
- [35] Athanasios Papoulis, "Probability, Random Variables, and Stochastic Processes", McGRAW-HILL, 1991.