# Correspondence # An Orthogonal Simulated Annealing Algorithm for Large Floorplanning Problems Shinn-Ying Ho, Shinn-Jang Ho, Yi-Kuang Lin, and William Cheng-Chung Chu Abstract—The conventional simulated annealing with some random generation mechanism using the sequence-pair topological representation in block placement and floorplanning is effective for a very small number of modules (40–50). This paper proposes an orthogonal simulated annealing algorithm (OSA) with an efficient generation mechanism (EGM) for solving large floorplanning problems. EGM samples a small number of representative floorplans and then efficiently derives a high-performance floorplan by using a systematic reasoning method for the next move of OSA based on orthogonal experimental design. Furthermore, an improved swap operation is proposed which cooperates with EGM to make OSA efficient. Excellent experimental results using the Microelectronics Center of North Carolina and the Gigascale Sysems Research Center benchmarks show that OSA performs better than existing methods for large floorplanning problems. Index Terms—Floorplanning, optimization, orthogonal experimental design, simulated annealing, very large scale integration (VLSI). #### I. Introduction It is known that the floorplanning problem with a given set of M rectangular modules, is the core of problems for VLSI layout design, which is NP-hard [1]. The structure that represents the geometric relation for a floorplan will affect the basic operation to the structure and determine the inherent complexity to the approaches using it [2]. Recently, researchers have proposed several effective representations for nonslicing floorplans, such as sequence pair (SP) [1], O tree [2], and B\* tree [3]. Murata etal. [1] introduced the SP representation which has been recognized as an elegant structure to describe a placement of modules. The SP is easily utilized as a coding scheme of a stochastic algorithm such as simulated annealing (SA) [4]. Murata etal. [5] proposed an adaptation procedure implemented in the SP-based SA to solve the problem of VLSI/printed circuit board (PCB) placement with obstacles. SA has been the best known floorplanner so far [6]. All block-placement algorithms that are based on SP use SA where the generation and evaluation of a large number of SPs are required [7]. For all Microelectronics Center of North Carolina (MCNC) benchmark block-placement problems [8], the algorithm FAST-SP [7] using SA with the fast SP evaluation can obtain the best results ever reported in the literature. However, the conventional SA with some random generation mechanism by swapping one pair of two modules at a time to alter SP is effective for a very small number of modules (40–50) [9]. In this paper, we propose an orthogonal simulated annealing algorithm (OSA) with an efficient generation mechanism (EGM) for solving large floorplanning problems. EGM generates a small Manuscript received November 26, 2002; revised August 16, 2003. - S.-Y. Ho is with the Department of Biological Science and Technology and Institute of Bioinformatics, Nationsal Chiao Tung University, Hsin-Chu, Taiwan 300, R.O.C. - S.-J. Ho is with the Department of Automation Engineering, National Huwei University of Science and Technology, Huwei, Yunlin, Taiwan 632, R.O.C. - Y.-K. Lin is with the Department of Information Engineering and Computer Science, Feng Chia University, Taichung, Taiwan 407, R.O.C. - W. C.-C. Chu is with the Department of Computer Science and Information Engineering, TungHai University, Taichung, Taiwan 407, R.O.C. Digital Object Identifier 10.1109/TVLSI.2004.831464 number of sampling floorplans to be analyzed by performing swap operations on some combinations of N(>1) pairs of modules, and then systematically obtains a candidate floorplan for the next move by swapping the promising ones of the N pairs at a time based on orthogonal experimental design (OED) [10]. Furthermore, an improved swap operation is proposed which cooperates with EGM to make OSA efficient. Excellent experimental results using MCNC and Gigascale Systems Research Center (GSRC) benchmarks [11] show that OSA performs better than existing methods for large floorplanning problems, compared with those in [7] and [11]. The rest of the paper is organized as follows. Section II gives the proposed OSA for solving large floorplanning problems. Experimental results of OSA are given in Section III. Finally, Section IV concludes this paper. #### II. ORTHOGONAL SIMULATED ANNEALING ALGORITHM # A. Weakness of Existing SA-Based Floorplanning A sequence pair (SP) is an ordered pair of $\Gamma_+$ and $\Gamma_-$ , where each of $\Gamma_+$ and $\Gamma_-$ is a sequence of the M module names. Murata et al. applied one of the following three pair interchanges in a perturbation step of SA [1]. 1) Swap 1: two module names in $\Gamma_+$ ; 2) Swap 2: two module names both in $\Gamma_+$ and $\Gamma_-$ ; and 3) Swap 3: the width and the height of a module, where the last one is for orientation optimization. Since the existing perturbation step swapping only one pair of two modules changes only a small portion of SP [1], [5], [7], the probability is relatively small in further improving performance when the value of M is large and the floorplan tends to be compact. In other words, it would easily result in premature convergence. However, if promising ones of the N pairs are simultaneously swapped in a perturbation operation, the probability of obtaining an improved configuration can be significantly increased. EGM aims to economically determine the promising ones of the N pairs. # B. Concepts of the Used OED An efficient way to study the effects of several factors (variables) simultaneously is to use OED based on orthogonal array (OA) and factor analysis. OA is a matrix of numbers arranged in rows and columns where each row represents the levels of factors in each experiment, and each column represents a specific factor that can be changed from each experiment. Factor analysis can evaluate the effects of individual factors on the evaluation function, rank the most effective factors, and determine the best level for each factor such that the objective function is optimized. OED uses well planned and controlled experiments in which certain factors are systematically set and modified, and then main effect of factors on the response (objective function) can be observed. Therefore, OED using OA and factor analysis is regarded as a systematic reasoning method. OED can also be incorporated in the recombination step of genetic algorithms for efficiently solving intractable optimization problems comprising lots of design parameters [10]. ## C. OA and Factor Analysis for EGM The two-level OA used in EGM is described as follows. Let there be N factors with two levels for each factor. The number of total experiments is $2^N$ for the popular "one-factor-at-once" study. To use an OA of N factors with two levels, we obtain an integer $n=2^{\lceil \log_2(N+1) \rceil}$ , build a two-level OA $L_n(2^{n-1})$ with n rows and n-1 columns, use the first N columns, and ignore the other n-1-N columns [12]. OA can reduce the number of experiments for factor analysis. The number of OA experiments required to analyze all individual factors is only n, where $N+1 \le n \le 2N$ . An algorithm of constructing various OAs can be found in [12]. After proper tabulation of experimental results, the summarized data are analyzed using factor analysis to determine the relative effects of levels of various factors as follows. Let $f_t$ denote an objective function value of the combination corresponding to the experiment t, where $t = 1, \ldots, n$ . Define the main effect of factor j with level k as $S_{jk}$ where $j = 1, \ldots, N$ and k = 1, 2 $$S_{jk} = \sum_{t=1}^{n} f_t \cdot F_t \tag{1}$$ where $F_t = 1$ if the level of factor j of experiment t is k; otherwise, $F_t = 0$ . Considering the case that the objective function is to be minimized, the level 1 of factor j makes a better contribution to the objective function when $S_{j1} < S_{j2}$ . If $S_{j1} = S_{j2}$ , levels 1 and 2 have the same contribution. After the better one of two levels of each factor is determined, an efficient combination consisting of all factors with the better levels can be easily derived. ## D. Efficient Generation Mechanism EGM considers N pairs of modules for some $\mathit{SWAP}$ operation such as $\mathit{Swap1}$ , $\mathit{Swap2}$ , and $\mathit{Swap3}$ on the SP of a current solution S simultaneously. The merit of EGM is that the ability of OED is incorporated in EGM to economically identify the promising ones of the N pairs and then obtain a good candidate solution Q for the next move using at most $n = 2^{\lceil \log_2(N+1) \rceil}$ SP evaluations. The procedure of EGM( $\mathit{SWAP}$ , N) is described as follows. - Step 1) Select N pairs of modules from a current solution S. One pair of modules is treated as a factor. - Step 2) Use the first N columns of an OA $L_n(2^{n-1})$ where $n=2^{\lceil \log_2(N+1) \rceil}$ . Let level 1 (level 2) of a factor represent that the SWAP of the corresponding pair is disabled (enabled). - Step 3) Compute $f_t$ of the generated sampling floorplans corresponding to the experiment t, where $t = 2, \ldots, n$ . The floorplan corresponding to the experiment 1 is the solution S. - Step 4) Compute the main effect $S_{jk}$ where $j=1,\ldots,N$ , and k=1,2. - Step 5) Determine the better one of two levels of each factor based on the main effect. - Step 6) The derived combination consists of all factors with the better levels. The floorplan R is obtained from the derived combination by performing the enabled SWAP operations. - Step 7) Q is the best one of the n generated floorplans including R. R may be the same as one of the n-1 sampling floorplans. Note that there is a large probability that Q=R. EGM(SWAP, N) with N=1 means that the function of OED is disabled and the operation is degenerated to a conventional swap operation using one pair of modules. The proper value of N is proportional to the size M of the floorplanning problems. To efficiently use all columns of OA $L_n(2^{n-1})$ , N is generally specified as $2^{\alpha}-1$ where $\alpha$ is an integer. In the study, the used value $\alpha=\lceil\log_{10}M\rceil$ . # E. Improved Swap2 Operation We present an improved Swap2 operation, called Swap2/os as follows. Let the pair–interchange of two randomly selected module names in $\Gamma_+$ and $\Gamma_-$ with swap of their orientations. Furthermore, the selected modules are limited to modules which are similar in appearance. Let constants $w_i$ and $h_i$ be the width and height of module i, respectively. Define the dissimilarity measure as $D_{ij} = |h_i - h_j| + |w_i - w_j|$ . A TABLE I PERFORMANCE COMPARISONS OF VARIOUS ALGORITHMS | Benchmark | [11] | [7] | OSA | | |--------------|------------|------------|-----------------------------|----------| | ( <i>M</i> ) | area (mm²) | area (mm²) | best/avg (mm <sup>2</sup> ) | Time (s) | | Apte (9) | 47.31 | 46.92 | 47.08 / 47.30 | 2.1 | | Xerox (10) | 20.05 | 19.8 | 19.80 / 20.01 | 2.4 | | Hp (11) | 9.14 | 8.95 | 8.95 / 9.11 | 2.8 | | ami33 (33) | 1.2 | 1.2 | 1.17 / 1.18 | 8.8 | | ami49 (49) | 37.19 | 36.46 | 35.88 / 36.32 | 43.0 | | n100 (100) | 0.1985 | NA | 0.1819 / 0.1824 | 223.7 | | n200 (200) | 0.1943 | NA | 0.1805 / 0.1808 | 1066.2 | | n300 (300) | 0.3047 | NA | 0.2812 / 0.2822 | 2212.3 | table is prepared for EGM, which records the set $M_i$ containing the best K candidate modules with the smallest values of $D_{ij}$ for each module i. The other module can be selected from only $M_i$ if module i has been selected. The operation Swap2/os is the same as Swap2 with the additional orientation inheritance and scale similarity constraint. #### F. OSA for Floorplanning The floorplanning problem is to arrange a given set of rectangular modules in the plane to obtain minimal values for all objectives, such as chip area, wire length, timing, congestion, power, etc. The general single-objective function f of this minimization problem uses a weighted sum approach. For example, the function f for minimizing chip area and wire length can be defined as $c_1 \cdot \operatorname{area} + c_2 \cdot \operatorname{wirelengt} h$ , where the weights $c_1$ and $c_2$ are for area and wire length, respectively, and the two terms area and wirelength are normalized [2], [6]. The outline of the proposed OSA for block placement and floorplanning is described as follows. - Step 1) Randomly generate a sequence pair as an initial solution S and compute f(S). Initialize the temperature $T = T_0$ , the number of trials per temperature $N_T = N_0$ , and cooling rate C. I = 0. - Step 2) Perturb S to Q using one of the three types of swap operations in turn: 1) Swap I; 2) EGM(Swap 2/os, N) where $\alpha = \lceil \log_{10} M \rceil$ , $N = 2^{\alpha} 1$ , and $K = 5\alpha$ ; and 3) Swap 3. - Step 3) Accept Q with probability $\min\{1, \exp((f(S) f(Q))/T)\}.$ - Step 4) I = I + 1. If $I < \lceil N_T \rceil$ , go to Step 2. - Step 5) $T = C \cdot T$ and $N_T = C \cdot N_T$ . I = 0. - Step 6) If the frozen condition is met, stop the algorithm. Otherwise, go to Step 2. ### III. EXPERIMENTAL RESULTS High performance of OSA is evaluated by comparing performances with the best results reported in FAST-SP [7] and website [11] using MCNC and GSRC benchmarks. The parameters of OSA are as follows: $T_0=10^8,\,N_0=200,\,{\rm and}\,C=0.99.$ The stopping condition is to use $2\times10^6$ SP evaluations ( $N_{\rm eval}$ ). The evaluation of SP uses a simple and efficient $O(M^2)$ algorithm introduced in [7]. The average performance is obtained from 20 independent runs. OSA is implemented using C++ language on a PC with Intel P3-733. For comparison purposes, the objective function f is to minimize chip area only in this study, since only area minimization results were reported for B\* tree [3] and FAST-SP [7]. Table I shows that the best and average performances of various benchmarks using OSA. The best results by area for other methods are directly taken from [7] and [11]. The results of n100, n200, and n300 for FAST-SP are not available from [7]. The best and average performances of OSA are better than the best performances of existing Fig. 1. Performances of OSA with various values of N for benchmark n300. The areas for $N=1,\,3,\,{\rm and}\,\,7$ are 0.2824, 0.2807, and 0.2796 ${\rm mm}^2,\,{\rm respectively}.$ methods for floorplanning problems $(M\geq 33).$ Especially, the improvement is significant for large floorplanning problems $(M\geq 100)$ with 7%–8% area improvements. Note that all the obtained floorplans using OSA are compact with small dead areas. It means that OSA can obtain satisfactory and encouraging results. To evaluate the ability to handle large floorplanning problems by adaptively specifying a proper value of N, Fig. 1 shows a typical result of OSA with N=1, 3, and 7 for n300 (M=300) using $N_{\rm eval}=8\times10^7$ . The minimal areas for N=1, 3 and 7 are 0.2824, 0.2807, and 0.2796 ${\rm mm}^2$ , respectively. The result of OSA is much better than the best result 0.3047 ${\rm mm}^2$ in [11]. The result reveals that OSA is efficient for solving large floorplanning problems. ### IV. CONCLUSION This paper proposes an OSA with a novel efficient generation mechanism embedded in a perturbation step for efficiently solving large floorplanning problems. Excellent experimental results using MCNC and GSRC benchmarks show that OSA performs better than existing methods. We believe that OSA can be widely used to improve the performance of SA-based methods for large floorplanning problems of minimizing chip area, wire length, timing, congestion, power, etc. ## REFERENCES - [1] H. Murata, K. Fujiyoshi, S. Nakatake, and Y. Kajitani, "VLSI module placement based on rectangle-packing by the sequence pair," *IEEE Trans. Computer-Aided Design*, vol. 15, pp. 1518–1524, Dec. 1996. - [2] P.-N. Guo, T. Takahashi, C.-K. Cheng, and T. Yoshimura, "Floorplanning using a tree representation," *IEEE Trans. Computer-Aided Design*, vol. 20, pp. 281–289, Feb. 2001. - [3] Y. C. Chang, Y. W. Chang, G. M. Wu, and S. W. Wu, "B\*-trees: A new representation for nonslicing floorplans," in *Proc. Design Automation Conf.*, June 2000, pp. 458–463. - [4] D. F. Wong, H. W. Leong, and C. L. Liu, Simulated Annealing for VLSI Design. Norwell, MA: Kluwer, 1988. - [5] H. Murata, K. Fujiyoshi, S. Nakatake, and Y. Kajitani, "VLSI/PCB placement with obstacles based on sequence pair," *IEEE Trans. Computer-Aided Design*, vol. 17, pp. 61–68, Jan. 1998. - [6] A. Ranjan, K. Bazargan, S. Ogrenci, and M. Sarrafzadeh, "Fast floorplanning for effective prediction and construction," *IEEE Trans. VLSI Syst.*, vol. 9, pp. 341–351, Apr. 2001. - [7] X. Tang, R. Tian, and D. F. Wong, "Fast evaluation of sequence pair in block placement by longest common subsequence computation," *IEEE Trans. Computer-Aided Design*, vol. 20, pp. 1406–1413, Dec. 2001. - [8] Benchmark suite of North Carolina State. [Online] Available: http://www.cbl.ncsu.edu/cbl\_docs/lys92.html - [9] D. F. Wong and C. L. Liu, "Floorplan design for VLSI circuits," Algorithmica, vol. 4, pp. 263–291, 1989. - [10] H.-L. Huang and S.-Y. Ho, "Mesh optimization for surface approximation using an efficient coarse-to-fine evolutionary algorithm," *Pattern Recognition*, vol. 36, no. 4, pp. 868–884, April 2003. - [11] J. Rabaey. Gigascale systems research center. [Online] Available: http://www.cse.ucsc.edu/research/surf/GSRC/progress.html - [12] Y. W. Leung and Y. Wang, "An orthogonal genetic algorithm with quantization for global numerical optimization," *IEEE Trans. Evol. Comput.*, vol. 5, pp. 41–53, Feb. 2001. # A Hybrid Current/Voltage Mode On-Chip Signaling Scheme With Adaptive Bandwidth Capability Rizwan Bashirullah, Wentai Liu, Ralph Cavin, and Dale Edwards Abstract—This brief describes an adaptive bandwidth bus architecture based on hybrid current/voltage mode repeaters for long global RC interconnect static busses that achieves high-data rates while minimizing the static power dissipation associated with current-mode (CM) signaling. An experimental adaptive bandwidth bus test chip fabricated in AMI 1.6- $\mu$ m Bulk CMOS indicates a reduction in power dissipation of approximately 62% over CM sensing and an increase in maximum data rate of 40% over voltage-mode signaling. ${\it Index Terms} {\bf --Adaptive, bus, current-mode (CM), on-chip interconnects, repeater.}$ # I. INTRODUCTION Repeaters have been traditionally used for optimizing wire delays [1]. However, as operating frequencies continue to increase, repeaters will be required to improve the signaling bandwidth of interconnects for the purpose of achieving high-speed on-chip data transmission [2]. A dominant limitation of interconnects at higher frequencies is the signal dispersive effects due to lossy conductors. One possible approach to extend the bandwidth of lossy interconnects is via an alternative repeater insertion technique based on current-mode (CM) signaling [3]. CM signaling uses low-impedance receive-end signal sensing to fundamentally enhance the interconnection bandwidth and hence increase the maximum attainable data rates in addition to reducing the wire delays. The key to CM signal transporting is in the reduction of RC time constants that results from sensing signals with low impedance nodes [3]–[5]. In addition, it should be noted here that although line inductance is assumed to be negligible because of high-conductor attenuation (i.e., $R \gg \omega L$ ), current sensing techniques have been shown to be effective for driving long interconnects in the presence of on-chip Manuscript received March 4, 2003; revised February 16, 2004. This work was supported in part by the National Science Foundation and Semiconductor Research Corporation through Advanced Micro Devices under Award Contract 983,001. R. Bashirullah was with the Department of Electrical Engineering, North Carolina State University, Raleigh NC 27695-7914 USA. He is now with the Department of Electrical and Computer Engineering, University of Florida, Gainsville, FL 32611 USA. W. Liu is with the School of Engineering, University of California at Santa Cruz, Santa Cruz, CA 95064-1007 USA. R. Cavin and D. Edwards are with Semiconductor Research Corporation, Research Triangle Park, NC 27709 USA. Digital Object Identifier 10.1109/TVLSI.2004.831481