# 行政院國家科學委員會專題研究計畫 期中進度報告

子計畫九:適用於晶片上電力傳輸分析之階層模組簡化技術 (1/3)

計畫類別:整合型計畫

計畫編號: NSC92-2220-E-009-032-

執行期間: 92年11月01日至93年07月31日

執行單位: 國立交通大學電信工程學系

<u>計畫主持人:</u> 李育民 <u>共同主持人:</u> 周景揚

計畫參與人員:林義琅、陳逸宏、周桓宇、邱震軒、黃至鴻

報告類型: 完整報告

報告附件: 出席國際會議研究心得報告及發表論文

處理方式: 本計畫可公開查詢

中 華 民 國 93 年 5 月 31 日

# 行政院國家科學委員會補助專題研究計畫期中進度報告

低功率系統之設計及自動化

子計畫九:適用於晶片上電力傳輸分析之階層模組簡化技術(1/3)

計畫類別: 個別型計畫 整合型計畫

計畫編號: NSC 92 - 2220 - E - 009 - 032

執行期間:92年 11月 1日至 93年 7月 31日

計畫主持人:李育民 共同主持人:周景揚

計畫參與人員:林義琅、陳逸宏、周桓宇、邱震軒、黃至鴻

成果報告類型(依經費核定清單規定繳交): 精簡報告 完整報告

本成果報告包括以下應繳交之附件:

赴國外出差或研習心得報告一份

赴大陸地區出差或研習心得報告一份

出席國際學術會議心得報告及發表之論文各一份

國際合作研究計畫國外研究報告書一份

處理方式:除產學合作研究計畫、提升產業技術及人才培育研究計畫、

列管計畫及下列情形者外,得立即公開查詢

涉及專利或其他智慧財產權, 一年 二年後可公開查詢

執行單位:國立交通大學電信工程系

中 華 民 國 九十三 年 五 月 三十一 日

#### 低功率系統之設計及自動化

子計畫九:適用於晶片上電力傳輸分析之階層模組簡化技術(1/3)

Power Delivery Network Analysis with

Hierarchical Model Order Reduction Techniques

計畫編號: NSC 92 - 2220 - E - 009 - 032

執行期間:92年 11月 1日至 93年 7月 31日

計畫主持人:李育民

共同主持人:周景揚

### 一、中文摘要

隨著超大型積體電路(VLSI)複雜度的增加,以及晶片元件特徵尺寸(feature size)的持續縮減,導線的數量將超過上億個區段,因此電力傳輸設計與分析成為極具挑戰性的工作,不良的電力傳輸設計將會降低電路的效能、雜訊邊際,及可靠度。

為確認電力傳輸的設計品質,必須使用暫態模擬去分析電力傳輸的波動。同時,由於電力傳輸網路的分析矩陣很大,傳統的電路模擬工具如 SPICE/HSPICE 無法有效的分析這類系統。因此,發展一個有效的電力傳輸分析工具是非常重要的。前期的研究中,我們考慮電感(自感及互感)的效應並結合了階層式(hierarchical)分析與模組簡化(model order reduction)技術,發展出一個計算負載與電流源數目無關的快速電力傳輸分析工具。然而由於我們忽略不同分割電路區塊間的互感效應,如此將引進人為的誤差。本計劃的第一年我們將引入聯結不同子電路的邊界以估計其偶合所造成的效應,並繼續研讀及發展出更具前瞻性的模組簡化技術以加快分析速度。

關鍵詞:電力傳輸, 階層式分析, 模組簡化

#### 二、英文摘要

The increase in the complexity of VLSI (Very Large Scale Integrated) chips, and the decrease in the feature size of chips demand large metal resources for the power delivery network. The number of wire segments will be over one billion in nanometer designs. This causes the designing and verifying of the power delivery network to become a challenging task. The inferior design of the power distribution network can degrade the circuit performance, noise margin, and reliability.

To ensure the design quality of power delivery, extensive transient power grid simulations need to be performed to analyze the power delivery fluctuation. However, due to the large size of power delivery network, the traditional circuit simulators, such as SPICE/HSPICE, do not perform well and often take days to complete the full simulation and need many gigabytes of memory space. Hence, in order to facilitate the design of large scale power grids, it is crucial to develop an efficient transient simulation engine for the power delivery network analysis. In our previous research, we utilized the hierarchical analysis and model order reduction techniques to

develop a hierarchical passivity preserved interconnect macro modeling engine for the power delivery network. The computation load of this method is independent of the number of sources. This engine not only considers the self inductances, but also includes the effect of mutual inductances. However, this method neglects any electric coupling between different circuit blocks when it performs the hierarchical decomposition. Therefore, it might reduce its accuracy. In the first year, the goal of this project is to consider the coupling at the boundaries of different sub-circuits to approximate the coupling effects, and study and develop more advanced model order reduction methods to speed up the analysis.

Keywords: Power/Ground Network, Power Delivery Network, Hierarchical Analysis, Model Order Reduction

## 三、研究計畫之背景及目的

With the UDSM (Ultra Deep Sub-Micron) technology, several features of today's chips (higher operating frequencies, larger number of transistors, smaller feature size and lower power supply voltage) have pushed the power delivery noise analysis onto the designers' list of high priority concerns [1~4]. Basically, the power delivery noise consists of IR drop, Ldi/dt drop and resonance fluctuations. The IR drop has been widely discussed and extensively studied in the literatures [5~8]. Due to the roaring clock frequency, increasing current consumption, and even the clock gating feature, Ldi/dt noise is quickly emerging as another power fluctuation concern [6]. Power delivery noise causing the power voltage to deviate from the ideal value can severely degrade the performance and even make the gate function erroneously. Therefore, the extensive analysis of RLC/RLKC (resistance, inductance, mutual inductance, and capacitance) power delivery system is required to ensure them to meet the target performance and reliability goals.

Generally speaking, one of the major difficulties for the power delivery analysis is size explosion. Tens of millions of devices and parasitics are required to be modeled and simulated over a long time period. However, it is computationally expensive to simultaneously simulate all transistors with the power delivery structure. To enhance the simulation speed, it has been proposed to decouple the power delivery structure simulation and transistors' simulation [6]. First, the current profiles of transistors can be estimated by the current extraction methods. After that, the power delivery network can be modeled as a suitable RLC/RLKC circuit attached by current sources. In this way, the simulation can be effectively done since there are fewer elements in the circuit, and a RLC/RLKC circuit can be simulated with one LU-decomposition. However, due to the large size of linear circuit, traditional circuit simulators, such as SPICE/HSPICE, do not perform well and often take days to complete the full simulation and need many gigabytes of memory space. For this reason, the hierarchical simulation technique has been applied by [6] to

speed up the power delivery network simulation.

The MOR (Model Order Reduction) technique is another efficient way which can be utilized to speed up the circuit analysis [8], and has been widely studied and improved over the last decade [5, 9~11]. Starting from AWE (Asymptotic Waveform Evaluation) [9] to PRIMA (Passive Reduced-order Interconnect Macromodeling Algorithm) [11], MOR techniques have been successfully extended to consider the inductance effects with reasonable accuracy. Later, an extended Krylov subspace method, EKS (Extended Krylov Subspace) [5], has been developed to simulate large scale power delivery circuits with many PWL (Piece Wise Linear) current sources. To resolve the source waveform modeling issues, EKS need to perform the moment shifting procedure to recover the proper moments. In [12], we proposed an improved EKS (IEKS) method such that it did not need to perform moment shifting for the source waveform modeling, and established a novel hierarchical power delivery macro-modeling methodology which integrated the multiple-port Norton equivalent theorem with the MOR algorithm to generate compact and accurate models and achieved significant runtime improvement. We not only considered the self inductance, but also included the effect of mutual inductance in each sub-circuit. The computation load of this method is independent of the number of sources.

In the first year of this project, we will continue to study and develop more advanced and reliable MOR methods to speed up the analysis. Since our previous method neglected the electric coupling between different circuit blocks when it performed the hierarchical decomposition, it might introduce extra errors during the analysis. We will include the coupling at the boundaries of different sub-circuits to approximate the coupling effects.

#### 四、研究方法

The power delivery network can be modeled as a RLC/RLKC circuit attached by many current sources as illustrated in Figure 1. For simplicity, we do not show the mutual inductances in Figure 1. This RLC/RLKC circuit model can be represented as a set of MNA (Modified Nodal Analysis) equations,

$$\mathbf{G}x(t) + \mathbf{C} \mathcal{L}(t) = \mathbf{B}u(t), \tag{1}$$

where x(t) represents the vector of MNA variables consisting of nodal voltages, inductor source currents, and voltage source currents, u(t) denotes the vector of port voltage sources and internal current sources. **G** is the conductance matrix, **C** is the susceptance matrix, and **B** is the input selector matrix mapping the sources to the internal states. Equation (1) can be transformed to the s-domain by the Laplace transformation,

$$\mathbf{G}\mathbf{x}(s) + s\mathbf{C}\mathbf{x}(s) = \mathbf{B}\mathbf{u}(s). \tag{2}$$

The goal of model order reduction techniques is to generate an analytic model which is a compact description of the original circuit by matching its moments or poles. First, both sides of Equation (2) can be expanded to their Taylor series around zero frequency,

$$(\mathbf{G}+s\mathbf{C})(m_0+m_1s^1+m_2s^2+ ) = \mathbf{B}(u_0+u_1s^1+u_2s^2+ ), \tag{3}$$

where  $m_i$  and  $u_i$  are the  $i_{th}$  moment of  $\mathbf{x}(s)$  and  $\mathbf{u}(s)$  respectively. After that, each unknown moment  $m_i$  can be represented by the source moments  $(u_i)$  as the following iterative equations,

$$\mathbf{G}m_0 = \mathbf{B}u_0 \tag{4}$$

$$\mathbf{G}m_i + \mathbf{C}m_{i-1} = \mathbf{B}u_i \tag{5}$$

Then, a basis matrix V is constructed by  $m_0$ ,  $m_1$ , ...,  $m_{K-1}$ , where K is the order of the circuit after reduction. Finally, the congruent transformation is used to transfer the original circuit to the reduced circuit. We combined the hierarchical analysis with the above procedure to speed up the analysis of power delivery system in our previous work. Please refer to [12] for the detail discussion.

In the following two subsections, first, a possible approach is proposed to consider the coupling at the boundaries of different sub-circuits during the hierarchical analysis. After that, a more reliable moment matching methods by using the multipoint expansion technique will be proposed to perform the model order reduction for each sub-circuit.



On-Chip resistance and inductance

Figure 1 Modeling of Power Delivery Networks

Gates current

#### A. Hierarchical Analysis with Considering the Mutual Inductances

Since the order of RLC/RLKC power grid model is very high, the hierarchical analysis utilizes the technique of divide-and-conquer to efficiently and quickly analyze the power delivery noise. In this subsection, we will apply the FM (*Fiduccia Mattheyses*) algorithm [13] to divide the original system into two blocks by minimizing the effect of mutual inductances between these two blocks. We can also extend the proposed methods to partition a circuit into multiple blocks.

Given a RLC/RLKC power gird model, we first transfer this model into a graph G(V, N), where N is the set of nets, V is the set of wire segments, and the number of wire segments is C(|V|=C). Each wire segments c consists of resistance, capacitance, self inductance, and mutual inductance, or current source, or voltage source. Then, we apply the FM algorithm to divide the set V into two subsets  $V_1$ ,  $V_2$  and optimize the cost. In order to consider the effect of mutual inductances between different blocks, we need to modify the gain function g(i) in [13] of each wire segment  $c_i$ .

Let F(i) and T(i) be the From\_block and To\_block of the  $i_{th}$  cell  $c_i$  respectively,  $1 \le i \le C$ . The gain g(i) resulting from the movement of the  $i_{th}$  cell  $c_i$  from block F(i) to block T(i) is defined as,

$$g(i) = \alpha * GN(i) + \beta * GM(i), \tag{6}$$

where the GN(i) is the number of cut nets reduced after moving  $c_i$  to To\_block, GM(i) is the effective ratio of mutual inductances decreased after moving  $c_i$  to To\_block. The  $\alpha \in [0,1]$  is the weighting factor relative to the cut net,  $\beta \in [0,1]$  is the weighting factor relative to the cut mutual inductance, and  $\alpha + \beta = 1$ . With adjusting the values of  $\alpha$ , and  $\beta$ , we can make a trade-off between the size of cut set and the total effective ratio of mutual inductances between different blocks.

The definitions of GN(i), and GM(i) are GN(i) = FS(i) - TE(i).

FS(i) = the number of nets connected to  $c_i$  and not connected to any other wire segment in the From\_block F(i) of  $c_i$ .

TE(i) = the number of nets that are connected to  $c_i$  and not crossing the cut. GM(i) = TM(i) - IM(i).

TM(i) = the sum of the ratio of mutual inductances between From\_block and To\_block to the self inductance of  $c_i$ .

IM(i) = the sum of the ratio of mutual inductances inside the From\_block to the self inductance of  $c_i$ .

After partitioning, a suitable model order reduction is applied to each sub-circuit. But before we start to reduce the order of each block, the mutual inductances between any two blocks need to be eliminated. A method of diagonal compensation of matrix entries is applied to eliminate those mutual inductances. That is, for a matrix, the off-diagonal entries of a matrix may be ignored if they are relatively small to their corresponding diagonal terms. We can view that there is a weak connection between them and add the off-diagonal entries to the diagonal without sacrificing too much accuracy.

Since the goal of our partition algorithm is to minimize the weighted sum of cut size, and the ratios of mutual inductances between any two blocks, we can apply the concept of diagonal

compensation method of matrix entries to add the mutual inductances to their relative self inductances across the interfaces of different blocks. The window search concept is applied to speed up the procedure of diagonal compensation. We consider the wire segment which the distance between it and the ports is less than the window size, and add their cross-block mutual inductances to the corresponding wire segments.

Finally, we can continue our hierarchical analysis of each partition blocks without the cross-block mutual inductances since we have already minimized them and added them into their relative wire segments.

# **B.** Multipoint Expansion Technique

In our previous research, we expanded the input sources and the output variables with Taylor series at s=0, and matched their moments with the source moments. This would be adequately accurate when there are only slow changing input sources, and for the low frequency components of the circuit response. In order to have better accuracy for higher frequencies, we need to consider the moments at different frequencies. We can expand the Taylor series at different frequencies, match the moments, and map the original system to a simplified system. Instead of calculating a set of high order moments at a single expanded frequency, we calculate many low order moments at different expanded frequencies. By this way, we can have more reliable numerical characteristic and more accurate result. Let's expand the Taylor series at  $s=s_0$ , where  $s_0$  is the desired frequency. By setting  $z=s-s_0$ , and plugging this into Equation (3), we have

$$[(\mathbf{G} + s_0 \mathbf{C}) + z \mathbf{C}](\overline{m_0} + \overline{m_1} z^1 + \overline{m_2} z^2 + ) = \mathbf{B}(\overline{u_0} + \overline{u_1} z^1 + \overline{u_2} z^2 + ).$$
 (7)

With the similar procedure presented in [12], we can also prove that the  $-1_{st}$  and  $-2_{nd}$  order moments are all zeros for any finite time PWL source waveform expanded at any arbitrary  $s_0$ . Therefore, we do not need to perform moment shifting for the source waveform modeling as well as [12]. Furthermore, the procedure of source moment calculating shown in [12] can also be applied to Equation (4) since they have the same representation. The recursive relationship between adjacent moments can be derived as

$$(\mathbf{G} + s_0 \mathbf{C}) \overline{m}_0 = \mathbf{B} \overline{u}_0 \tag{8}$$

$$(\mathbf{G} + s_0 \mathbf{C}) \overline{m}_i + \mathbf{C} \overline{m}_{i-1} = \mathbf{B} \overline{u}_i \tag{9}$$

Using the similar way as shown in [5], the above recursive moment matching procedure can be described as the following algorithm.

------

**Algorithm**: **IEKS Moment Matching Algorithm**( $(s_p, K_p), p=0,1,2,...,L-1$ )

/\* ( $s_p$ ,  $K_p$ ): the pair of expanded point  $s_p$ , and its target expanding order  $K_p$  \*/ N = O(number of bases)

**for** *p*=0:*L*-1

$$b_0 = \mathbf{B} u_0, \ \overline{m}_0 = [\mathbf{G} + s_p \mathbf{C}]^{-1} b_0, \ \alpha_0 = 1/|\overline{m}_0|, \ \hat{\gamma}_0 = \alpha_0 \overline{m}_0$$

N++:

**for** i = 1:  $K_p$ -1

$$\gamma_{i+N} = (\mathbf{G} + s_{p}\mathbf{C})^{-1} \left\{ \prod_{n=0}^{i-1} \alpha_{n} \mathbf{B} \mathbf{u}_{i}^{-} - \mathbf{C} \alpha_{i-1} \gamma_{i-1+N} \right\}$$

$$\bar{\gamma}_{i+N} = \gamma_{i+N} - \sum_{n=0}^{i-1+N} h_{i+N,n} \hat{\gamma}_{n}, \text{ where } h_{i+N,n} = \hat{\gamma}_{n} \gamma_{i+N}$$

$$\mathbf{if} \quad |\bar{\gamma}_{i+N}| < error\_tolerance, \mathbf{break};$$

$$\mathbf{else}$$

$$\alpha_{i} = 1/|\bar{\gamma}_{i+N}|, \quad \hat{\gamma}_{i+N} = \alpha_{i} \bar{\gamma}_{i+N}$$

$$N++;$$

$$\mathbf{endif}$$

$$\mathbf{endfor}$$

$$\mathbf{endfor}$$

## 五、結論與討論

In this report, we have utilized the technique of multipoint expansion to match the moments at several desired frequencies. We believe that this method will be more reliable and more accurate. Currently, we are implementing this method via C++ language, and investigating how to choose the suitable values of expanded frequencies (desired frequencies).

We also have proposed a procedure of hierarchical analysis of power delivery system with considering the mutual inductances. One another possible approach can be done in the following way. It is possible to consider the coupling after those reduced circuits have been built since we know the projection vectors and we can use superposition method to handle this issue. We can add the boundary couplings (ports to ports) back to approximate the coupling effects when we integrate those reduced blocks. We will realize the above ideas as soon as possible.

#### 六、成果

[1] Yu-Min Lee, Yahong Cao, Tsung-Hao Chen, Janet Wang, and, and Charlie Chung-Ping Chen, "HiPRIME: Hierarchical and Passivity Preserved Interconnect Macro-modeling Engine for RLKC Power Delivery", submitted to *IEEE Transactions on Computer-Aided-Design of Integrated Circuits and Systems*.

- [1] Howard H. Chen, and David D. Ling, "Power Supply Noise Analysis Methodology for Deep-submicron VLSI Chip Design", *Design Automation Conference*, pp. 638-643, 1997.
- [2] Michael K Gowan and Larry L Biro and Daniel B Jackson, "Power Considerations in the Design of the Alpha 21264 Microprocessor", *Design Automation Conference*, pp. 726-731, 1998.
- [3] Adhijit Dharchoudhury, Rajendran Panda, David Blaauw, and Ravi Vaidyanathan, "Design and Analysis of Power Distribution Networks in *PowerPC*<sup>TM</sup> Microprocessors", *Design Automation Conference*, pp. 738-743, 1998.
- [4] Yi-Min Jiang, and Kwang-Ting Cheng, "Analysis of Performance Impact Caused by Power Supply Noise in Deep Submicron Devices", *Design Automation Conference*, pp. 760-765, 1999.
- [5] Janet M. Wang, and Tuyen V. Nguyen, "Extended Krylov Method for Reduced Order Analysis of Linear Circuits with Multiple Sources", *Design Automation Conference*, pp. 247-252, 2000.
- [6] Min Zhao, Rajendran V. Panda, Sachin S. Sapatnekar, Tim Edwardsand, Rajat Chaudhry, and David Blaauw, "Hierarchical Analysis of Power Distribution Networks", *Design Automation Conference*, pp. 150-155, 2000.
- [7] Sani R. Nassif and Joseph N. Kozhaya, "Fast Power Grid Analysis", *Design Automation Conference*, pp. 156-161, 2000.
- [8] Mustafa Celik, Lawrence Pillage and Altan Odabasioglu, "IC Interconnect Analysis", Kluwer Academic Publishers, 2002.
- [9] L. Pillage and R. A. Rohrer, "Asymptotic Waveform Evaluation for Timing Analysis", *IEEE Transactions on Computer-Aided- Design of Integrated Circuits and Systems*, vol. 9, no. 4, April, pp. 352-366, 1990.
- [10] P. Feldman and R. W. Freund, "Efficient Linear Circuit Analysis by Pade Approximation via the Lanczos Process", *IEEE Transactions on Computer-Aided- Design of Integrated Circuits and Systems*, vol. 14, no. 5, May, pp. 639-649, 1995.
- [11] A. Odabasioglu, M. Celik, and L. T. Pillage, "PRIMA: Passive Reduced-order Interconnect Macromodeling Algorithm", *Proc. of IEEE/ACM International Conference on Computer-Aided Design*, 1997.
- [12] Yahong Cao, Yu-Min Lee, Tsung-Hao Chen, and Charlie Chung-Ping Chen, "HiPRIME: Hierarchical and Passivity Reserved Interconnect Macro-modeling Engine for RLKC Power Delivery", *Design Automation Conference*, pp. 379-384, 2002.
- [13] Sadiq M Sait, and Habib Youssef, "VLSI Physical Design Automation: Theory and Practice", World Scientific Publishing Co. Pte. Ltd., 1999.