# National Chiao Tung University

Low-Cost Test Methodologies for Modern SoC/SiP Designs

Student: Chia-Yi Lin

Advisor: Prof. Hung-Ming Chen

May 2010

# Low-Cost Test Methodologies for Modern SoC/SiP Designs

Student: Chia-Yi Lin

Advisor: Prof. Hung-Ming Chen

Department of Electronics Engineering and Institute of Electronics

Submitted to the Department of Electronics Engineering and Institute of Electronics, College of Electrical and Computer Engineering, National Chiao Tung University, in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Electronics Engineering

May 2010

Hsinchu, Taiwan, Republic of China



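## Low-Cost Test Methodologies for Modern SoC/SiP Designs

Student: Chia-Yi Lin

Advisor: Prof. Hung-Ming Chen

#### National Chiao Tung University

Department of Electronics Engineering and Institute of Electronics

#### Chinese Abstract

In modern system-on-chip (SoC) designs, as systems grow more complex, test cost accounts for an ever larger share of chip manufacturing cost, so reducing test cost through an effective test strategy has become an important issue. During chip testing, excessive power consumption often burns out chips and lowers yield. We therefore examine test power and the large volume of test data, and improve the methods for handling them. In addition, thanks to improvements in SoC and system-in-package (SiP) technology, the cost of system-level chips has gradually fallen to a range that most people can accept, …

In Chapter 1, we organize the problems encountered recently and the related research results, and briefly describe some earlier test methods. At the end of Chapter 1, we outline the structure of the whole dissertation. In Chapters 2 and 3, we describe decoder-based schemes and the corresponding methodologies, which build on selective test data compression. When test data are shifted into the chip, these schemes avoid having a large number of bits switch from 0 to 1 or from 1 to 0 at the same moment, which would make the chip dissipate a large amount of switching power. Through computation, we minimize signal changes on the scan chain to achieve low-power testing. In Chapter 4, we discuss a general multi-dimensional test scheme that reduces test power, test data volume, and test time at the cost of only a small amount of extra chip area. Furthermore, as technology advances, system-in-package technology has become widespread. In Chapter 5, we examine SiP technology further and propose test schemes applied to interconnect testing in SiP. Finally, in Chapter 6, we summarize the contributions of this dissertation and some future challenges.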



## Low-Cost Test Methodologies for Modern SoC/SiP Designs

Student: Chia-Yi Lin

Advisor: Hung-Ming Chen

Department of Electronics Engineering National Chiao Tung University

#### Abstract

In modern SoC designs, test strategy has become one of the most important issues because of rising test cost. Among the contributors to test cost, we focus on large test power dissipation and large test data volume. In this dissertation, we propose test schemes that reduce test power, test data volume, and test time in order to lower test cost. In addition, advances in SoC/SiP technology have made these technologies increasingly popular, so we also provide some low-cost SiP test solutions.

We explain the basic idea of chip testing and review related work in Chapter 1. In Chapters 2 and 3, we describe decoder-based schemes and methodologies built on selective test pattern compression. These schemes reduce shift-in power considerably by keeping switching signals from passing through long scan chains. Compared with previous work, our test schemes achieve relatively small test power and test data volume. In Chapter 4, we propose an adaptive multi-dimensional scan-control scheme that achieves low test power, small test data volume, and short test time with small area overhead. The extra scan-control chains let us access each sub-scan-chain easily and efficiently, and because these chains are very simple, the area overhead is very small. Since System-in-Package (SiP) technology has become a feasible way to integrate multiple chips, in Chapter 5 we propose test schemes that efficiently test the RAM interconnections in a SiP. The proposed test scheme is very fast. Finally, we conclude this dissertation and list some future work in Chapter 6.

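# Acknowledgements

I offer my most sincere thanks to my advisor, Dr. Hung-Ming Chen, who gave me a great deal of valuable advice and room to work freely during my dissertation research, so that I could make a modest contribution to the testing field and come into contact with research in many different areas.

I also thank my family, whose support let me complete this degree without worries. During this research I also received help from many colleagues, assistants, and senior and junior schoolmates, and I offer them my sincerest thanks as well.

The completion of this dissertation also owes thanks to the oral defense committee members, Dr. 李昆忠, Dr. 張世杰, Dr. 吳文慶, Dr. 黃俊郎, Dr. 趙家佐, Dr. 溫宏斌, and Dr. 陳朢矜, who took time from their busy schedules to help and offered many valuable suggestions that made this dissertation more complete.

There are too many people to thank. My heartfelt thanks go to everyone who assisted and helped me over these years. Thank you.

Chia-Yi Lin

National Chiao Tung University, Hsinchu

May 2010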

# Contents

| Section | Title |
|---------|-------|
| **1** | **Introduction** |
| 1.1 | Test strategies for test power and volume reduction |
| 1.1.1 | Multiple scan chain techniques |
| 1.1.2 | Encoding techniques |
| 1.1.3 | Restructuring, decoder based and hybrid techniques |
| 1.2 | System-in-Package (SiP) test |
| 1.2.1 | SiP interconnect test |
| 1.3 | Overview and organization of the dissertation |
| **2** | **Selective Pattern-Compression Scheme Using the Variable Input Fixed Output (VIFO) Encoding** |
| 2.1 | Introduction |
| 2.2 | VIFO scheme optimization method |
| 2.2.1 | Pattern selection stage |
| 2.2.2 | Pattern compression stage |
| 2.2.3 | Power optimization stage |
| 2.3 | Experimental results |
| 2.4 | Summary |
| **3** | **Selective Pattern-Compression Scheme Using the Fixed Input Variable Output (FIVO) Encoding** |
| 3.1 | Introduction |
| 3.2 | FIVO scheme optimization method and implementation |
| 3.2.1 | Pattern compression stage |
| 3.2.2 | Power optimization stage |
| 3.3 | Implementing FIVO |
| 3.4 | Experimental results |
| 3.5 | Analysis and discussion |
| 3.5.1 | FIVO scheme analysis |
| 3.5.2 | Discussion |
| 3.6 | Summary |
| **4** | **An Adaptive Multi-Dimensional Scan-Control Scheme for Low Test Cost** |
| 4.1 | Introduction |
| 4.2 | Proposed architecture overview |
| 4.2.1 | Two test modes of the proposed scheme |
| 4.3 | Two-dimensional scan shift control optimization methodology |
| 4.3.1 | Test data encoding and control code embedding |
| 4.3.2 | Design of sub-scan-chain data design |
| 4.3.3 | Scan-in data encoding |
| 4.3.4 | Implementation flow |
| 4.4 | Experimental results |
| 4.4.1 | The results on test power, volume and time |
| 4.4.2 | The scan design realization |
| 4.5 | Discussion |
| 4.5.1 | 3-D architecture |
| 4.5.2 | 1-D vs. 2-D vs. 3-D multiple scan chains in area overhead |
| 4.5.3 | Adaptive multi-dimensional scan control scheme overhead analysis |
| 4.6 | Summary |
| **5** | **Low Cost SiP Interconnect Test** |
| 5.1 | Introduction |
| 5.2 | Proposed test circuit design and diagnosis methods |
| 5.2.1 | Test strategy for SiP interconnects |
| 5.2.2 | Diagnosis of interconnects |
| 5.3 | The test scheme for multi-port RAM |
| 5.4 | The mathematic form of LFSR |
| 5.5 | Discussions |
| 5.5.1 | Walking-0 solution |
| 5.5.2 | Overhead |
| 5.6 | Case study for fault detection and analysis |
| 5.6.1 | The data lines stuck-at-0 |
| 5.6.2 | The address lines stuck-at-1 |
| 5.6.3 | The data and address line short |
| 5.6.4 | The wire-OR fault |
| 5.6.5 | The wire-Dominate fault |
| 5.6.6 | Diagnosis guideline |
| 5.7 | Summary |
| **6** | **Conclusion** |
| 6.1 | Our contributions |
| 6.2 | Future works |
| 6.2.1 | The 3D design challenges |
| 6.2.2 | The heterogeneous design challenges |
| | **Bibliography** |

# List of Figures

| Figure | Caption |
|--------|---------|
| 1.1 | A simple example to demonstrate the number of transitions in single scan chain and multiple scan chain architecture. |
| 2.1 | Proposed test schemes and ATE (automatic test equipment) diagram. |
| 2.2 | The compressed scan unit and decoders structure in our fixed output scan chain architecture. The number of scan cells inside compressed scan unit depends on the optimization stage shown in Section 2.2. |
| 2.3 | Pattern selection stage. After we get the partitioned test data set, we use X bit omit ratio to determine which test pattern should be put into CSC data set and which test pattern should be put into NSC data set. |
| 2.4 | Pattern selection stage example. This case uses 0.3 as X-bit omit ratio to select the test pattern. (a) is original test pattern set which has 17 test patterns. (b) is selected patterns for NSC. (c) is selected patterns for CSC. |
| 2.5 | Pattern compression. |
| 2.6 | VIFO scheme case in pattern compression and power optimization stages. We use a 4-bit encoding example to illustrate this approach. This figure shows the results of the merging procedure and all Xs are replaced by 0 after the merging procedure. (a) is the compressed test pattern. (b) shows the index number of each segment. (c) shows the new test data results and the mapping code comes from (b). (d) shows one test pattern transformation from original test data to new test data. |
| 2.7 | Experimental results for a 4-bit VIFO scheme using one scan chain architecture with different omit ratios. New total test size is normalized to original test size. Power consumption is normalized to the power of all X bits filled with 0. This shows that the scheme provides both power and volume reduction. |
| 3.1 | Our FIVO scan chain architecture including compressed scan units. A compressed scan unit is provided with n-bit decoder (n=3 in this illustration), which maps to several number of scan cells in normal scan chain. The last scan unit may not be equal to n in some cases. |
| 3.2 | FIVO scheme merge procedure. |
| 3.3 | FIVO scheme case in pattern compression and power optimization stage. We use a 3-bit encoding example to illustrate our approach. This table shows the results after the merging process and all Xs are replaced by 0 after the merging process. (a) is the compressed test pattern of each segment. (b) shows the index number of each segment. (c) shows the new test data results and the mapping codes come from (b). (d) shows one test pattern transformation from original test data to new test data. |
| 3.4 | Circuit design of the FIVO scan chain architecture. |
| 3.5 | Waveforms of the control signal and decoders. |
| 3.6 | A 3-bit FIVO scheme experimental results using one scan chain architecture with different omit ratio. Power consumption is normalized to power of all X bits filled with 0. |
| 3.7 | Compression rate analysis for FIVO scheme. (a) provides the compression effect analysis with the 3-bit scheme and the test data size is 100x100 bits. (b) provides more results in different test data size in the 3-bit scheme. (c) provides the analysis results with 4-bit schemes. From the results of (b), (c), and (d), we can perceive that the 4-bit scheme provide better results than the 3-bit scheme at the same X-bit rate test data. |
| 4.1 | The proposed adaptive multiple scan chain architecture with 2-D 4x4 scan shift control chains. |
| 4.2 | The design diagram of the proposed test architecture with routing connection details. |
| 4.3 | Circuit design of the proposed architecture. (a) is part of the sub-scan-chain design. (b) presents the details for one scan control chain. |
| 4.4 | Regular scan mode waveform. In regular scan mode, the test patterns are shifted segment-by-segment. |
| 4.5 | Skipping scan mode waveform. The test patterns in skipping mode contain signal control codes which skip the segments with all X bits. In this figure, the 15th cycle and the 16th cycle skip two segments of sub-scan-chains, which can reduce the test cost. |
| 4.6 | An encoding example of control signal. In this example, we consider scan-in and scan-out data simultaneously because we need to check a bit in this sub-scan-chain which is not X bit. The original test data size is 42 bits in this example. By applying our encoding method, the scan-in test data is shrunk to 32 bits. |
| 4.7 | Pseudocode of control code embedding. |
| 4.8 | The proposed test design flow. In order to integrate the proposed scheme design into the traditional design flow, we insert an extra stage after the scan chain synthesis. |
| 4.9 | The test power (a), volume (b), and time (c) results of b17 circuit by applying different number of flip-flops in control 1 and 2. The x and y axes are horizontal. They represent the number of flip-flops in control 1 and control 2. The z axis is vertical, which represents the normalized value. This shows that different number of flip-flops in scan control 1 and 2 can get different results in test power, data volume, and time. And the prediction numbers from equations (4.1), (4.2), and (4.3) are reasonable and provide good results. |
| 4.10 | The routing result image of b17 circuit. The white lines are the scan-in paths. |
| 4.11 | The test scheme for stacked 3-D IC. |
| 4.12 | The overhead analysis of adaptive multi-dimensional scheme. The horizontal axis is the number of dimension. The vertical axis is the overhead. We use a 10000 scan flip-flops design as example to demonstrate the results. |
| 5.1 | The proposed test architecture. The LFSR+Analyzer block integrated LFSR and Analyzer together. The SR+1 block has one more data flip-flop (DFF) than traditional shift register to generate one more clock cycle to fill test pattern at address 0. The detailed circuit design is shown in Figure 5.2. |
| 5.2 | The circuit design of the proposed test architecture. This example uses 4 bits data and address lines. |
| 5.3 | The circuit design of the LFSR+2+Analyzer test scheme for multi-port RAM. |
| 5.4 | The mathematic form of the LFSR and the 8 bits LFSR example. |
| 5.5 | The 3-D X-ray picture of the wire bonding short example. |



# List of Tables

| Table | Caption |
|-------|---------|
| 1.1 | The switching number in different kind of filling methods. |
| 2.1 | VIFO scheme experimental results in test pattern reduction and power reduction using Mintest set (ISCAS89 benchmarks). |
| 3.1 | Our FIVO scheme experimental results in test pattern reduction and power reduction using Mintest set (ISCAS89 benchmarks). |
| 3.2 | Our FIVO scheme result in single scan chain using Mintest set. #V means total number of test patterns. Most of test patterns are shifted from CSC (#VC) but most of power dissipation is provided from NSC (NSCP). The power saving in CSC is very significant. |
| 3.3 | Experimental results on test data volume using Mintest set. |
| 3.4 | Experimental results on scan-in power comparison with [1]. |
| 3.5 | Run time analysis of each circuit in 3-bit, 4-bit, and 5-bit FIVO scheme. The unit in this table is second. |
| 4.1 | The experimental results show that the test power, test data volume, and test time rate are reduced by applying small number of extra flip-flops in this scheme. Compared with traditional single scan chain and fill all X bits with 0's, the proposed scheme provides low test power, small test data volume, and short test time, especially the large circuit (b17 or b22). |
| 4.2 | The implementation overhead results of ISCAS'99 b17 circuit. |
| 4.3 | The overhead comparison with conventional one-dimensional multiple scan shift scheme. The proposed 2-D and 3-D scheme have significant reduction in the number of scan control flip-flops. |
| 5.1 | The DFFs values of LFSR+Analyzer block. This LFSR can generate 15 patterns. |
| 5.2 | The walking-1 behavior of SR+1 circuit. |
| 5.3 | The DFFs values in the proposed scheme for RAM test. The 4-bit address example needs 10 cycles. |
| 5.4 | The DFFs values of the LFSR+2+Analyzer test scheme. A 4 bits address example needs 7 cycles to test the multi-port RAM. |
| 5.5 | The DFFs values of the walking-0 patterns. |
| 5.6 | The overhead and requirements of the proposed test schemes. |
| 5.7 | The DFFs values of the fault diagnosis example, stuck-at-0. |
| 5.8 | The DFFs values of the fault diagnosis example, stuck-at-1. |
| 5.9 | The truth table of the wire-AND and wire-OR faults. |
| 5.10 | The DFFs values of the fault diagnosis example, wire-AND. |
| 5.11 | The DFFs values of the fault diagnosis example, wire-OR. |
| 5.12 | The DFFs values of the fault diagnosis example, wire-Dominate. |
| 5.13 | The symptom and possible fault. |

# Chapter 1 Introduction

Chip design has evolved for over half a century. Even when a design implements the required function, the fabricated chips may not work as expected. The goal of chip testing is to identify the chips that were fabricated correctly. To save power and extend the operating time of electronic devices, digital circuit designers focus on low-power design structures. However, large test power may damage a low-power circuit, especially at nanometer scale. Instantaneous high current flow causes electromigration, which can degrade chip yield and chip reliability. Both effects add to the cost of chip fabrication.

Test researchers have proposed many ideas to reduce test power, either by manipulating test patterns or by introducing new test schemes. Compression algorithms, in particular, reduce the total test data volume, which in turn shortens chip test time. Since test cost is charged by test time, a shorter test costs less. In fact, many companies try to apply tests with shorter test time and, if possible, with fewer pins [2].

In the nanometer era, millions or billions of gates are fabricated on a small die, and a single chip integrates many functions to build a system. Stacked integration processes are being developed to integrate even more functions in one package. With this third dimension of chip design technology, test engineers will encounter many challenges that must be solved in the near future. In the following sections, we briefly describe the preliminary work related to this dissertation and then present its organization.

## 1.1 Test strategies for test power and volume reduction

Chip design and fabrication technology are developing quickly, which has led to a dramatic growth in the number of scan cells used for DFT. As a result, large-scale chip designs require large test data volume and long scan chains. To reduce the test data volume, researchers are actively pursuing new solutions for test data compression. On the other hand, the large number of scan cells may generate a huge number of switching activities in test mode, many of which would never occur in normal mode [3, 4]. The peak and average power resulting from this extraordinary switching activity in test mode may cause chip malfunction and decrease reliability [4, 5], which implies additional yield loss and cost. Unfortunately, an effective test data compression scheme is not necessarily an effective power minimization scheme.

In fact, test power, test data volume, and test time are all important issues in the testing field. When the chip is in test mode, some test patterns activate all the blocks inside the chip, which causes large power consumption and may damage the chip during test [6]. Another issue is test data volume. The memory capacity of ATE (Automatic Test Equipment) is limited, so a large test data volume forces the test company to upgrade the ATE, which is very costly. Finally, if the test time is reduced, the cost of the chip test process is reduced as well.

Ravikumar [7] lists several power reduction techniques, such as scan cell reordering, scan chain segmentation, and scan chain disabling. The authors of [8] apply the Minimum Transition Fill (MT-fill) technique to achieve low test power; this method minimizes the transitions in one scan chain. The approach in [9] takes the successive test pattern into account when filling the X bits of a test pattern, which can also reduce test power. Table 1.1 shows the number of transitions produced by different methods of filling the don't-care (X) bits in a test pattern. In addition, [10, 11, 12] apply reordering techniques to reduce test power by minimizing the number of transitions among the test data. Below we describe some categories of test cost reduction techniques.

| Method                | Pattern   | Transitions |
|-----------------------|-----------|-------------|
| Original test pattern | 0X1XX100X |             |
| MT-fill               | 001111000 | 2           |
| Fill 0                | 001001000 | 4           |
| Fill 1                | 011111001 | 3           |

Table 1.1: The switching number in different kind of filling methods.
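The entries of Table 1.1 can be reproduced with a short sketch. It assumes that a transition is any pair of adjacent bits that differ, and that MT-fill copies the nearest specified bit on the left into each X (leading X bits take the first specified bit). The function names are illustrative and are not taken from [8].

```python
def count_transitions(bits: str) -> int:
    """Count adjacent bit positions whose values differ."""
    return sum(a != b for a, b in zip(bits, bits[1:]))


def fill_constant(pattern: str, value: str) -> str:
    """Replace every X with the same constant ('0' or '1')."""
    return pattern.replace("X", value)


def mt_fill(pattern: str) -> str:
    """Minimum-transition fill (assumed rule): each X repeats the last
    specified bit to its left; leading Xs take the first specified bit."""
    last = next((b for b in pattern if b != "X"), "0")
    filled = []
    for b in pattern:
        if b != "X":
            last = b
        filled.append(last)
    return "".join(filled)


pattern = "0X1XX100X"
for name, filled in [
    ("MT-fill", mt_fill(pattern)),
    ("Fill 0", fill_constant(pattern, "0")),
    ("Fill 1", fill_constant(pattern, "1")),
]:
    print(name, filled, count_transitions(filled))
```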

#### 1.1.1 Multiple scan chain techniques

The multiple scan chain technique is a kind of scan chain segmentation, and it can also be implemented as a scan chain disabling technique. In a multiple scan chain architecture, each sub-scan-chain is shorter than the original single scan chain, which removes many unnecessary switching activities during pattern shift-in. For example, assume the initial value of the scan flip-flops inside the chip is 0 and we need to shift the test pattern 00100010 into the chip. With a single scan chain, the total transition count is 18. If we instead use two sub-scan-chains and shift the two test patterns 0010 and 0010 into them, the total number of transitions is 10. Figure 1.1 shows the details of the calculation. [13], [14], and [15] use this kind of technique to reduce test power. Moreover, [11, 16, 17, 18, 19] use the multiple-scan-chain technique to reduce test power or test time. The architecture in [11] can shut down or enable a partial range of the sub-scan-chains, which reduces both peak power and total power consumption.
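The counts of 18 and 10 can be checked with a small simulation. This is a minimal sketch that assumes a transition is one flip-flop toggle in one shift cycle, and that the scan-in pin feeds the leftmost cell, so the rightmost bit of the pattern is shifted in first. Under these assumptions it reproduces both numbers quoted above.

```python
def shift_in_transitions(pattern: str) -> int:
    """Shift `pattern` into an all-zero scan chain and count cell toggles."""
    chain = ["0"] * len(pattern)
    toggles = 0
    # The bit destined for the last cell enters the chain first.
    for bit in reversed(pattern):
        shifted = [bit] + chain[:-1]
        toggles += sum(old != new for old, new in zip(chain, shifted))
        chain = shifted
    assert "".join(chain) == pattern  # the chain now holds the pattern
    return toggles


single_chain = shift_in_transitions("00100010")
two_chains = sum(shift_in_transitions(p) for p in ("0010", "0010"))
print(single_chain, two_chains)  # 18 10
```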

#### 1.1.2 Encoding techniques

Figure 1.1: A simple example to demonstrate the number of transitions in single scan chain and multiple scan chain architectures.

Test pattern compression is one of the solutions to reduce the large test data volume. Various encoding techniques have been applied in this field. Huffman encoding, Golomb encoding, and nine-code encoding have been proposed, and these techniques provide different compression results [1, 20]. Statistical and Huffman encoding are used in [21, 22, 23] to reduce test volume and test application time. [21] uses 4-bit blocks and encodes test patterns into Huffman codes by calculating the pattern frequency. [22] uses run-length encoding (technique in [24]) together with Huffman encoding to compress data. The basis of run-length encoding lies in encoding a long data string by counting the number of repeated characters or patterns. The nine-code encoding technique, which uses 9 code words to encode data, is applied in [20] to compress test data.
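As an illustration of the run-length idea only, the sketch below collapses a bit string into (symbol, run length) pairs. It is a generic textbook form and is not the specific coding used in [22] or [24]; the function name is hypothetical.

```python
from itertools import groupby


def run_length_encode(bits: str) -> list[tuple[str, int]]:
    """Collapse each run of identical symbols into (symbol, run length)."""
    return [(symbol, len(list(run))) for symbol, run in groupby(bits)]


print(run_length_encode("0000011000"))  # [('0', 5), ('1', 2), ('0', 3)]
```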

With the help of encoding technique, researchers in [25] apply Golomb coding to shrink the test data volume and transitions of the test pattern. Linear feedback shift register (LFSR) based technique is also a popular testing architecture. For example, LFSR reseeding approach [6] utilizes the random-fill property for X bits to achieve high encoding efficiency while introducing many switching activities. Since LFSR reseeding technique provides many transitions, [26] uses reseeding and mask technique to achieve low test power. Therefore, to develop a test compression scheme targeting both high compression ratio and large test power reduction is an important goal in the test area.

By using extra registers and control signals, the embedded test pattern generator [27] can generate low power test patterns [26, 28]. In order to disable the unnecessary switching activities in a multiple-scan-chain scheme, [14] only deals with the necessary scan-in and scan-out data. Furthermore, [29] tries to minimize the distance between scan-in data and scan-out data. Some methods use structure-based and memory-based schemes to encode test data. In a reconfigurable switch structure, [30] applies fewer bits to the reconfigurable switch block.

#### 1.1.3 Restructuring, decoder based and hybrid techniques

For test time reduction, researchers in [31] apply a bypass technique to test the multiple modules in one chip and reduce scan-in time. The scan-in data pass through a mux to the determined segment, which saves many scan-in cycles. Researchers also modify the test structure and manipulate the test pattern to reduce scan-in data [19, 30]. In addition, researchers in [32], [33], and [34] use a memory array concept to fill the test pattern into the scan flip-flops. This concept can reduce test power, test data volume, and test time at the same time.

[35] provides an inverter-interconnect based decompression network to decode the test data. The broadcast approach in [36, 37] uses a broadcaster to distribute few control bits and generates a large number of bits to internal scan chains. The Multi-layer Data Copy (MDC) scheme's method in [38] duplicates the test data inside the chip. [39, 40] use dictionary based methods with memory to encode test data. [17, 41] use decoders to decode the pre-encoded test data, providing reduction on power and test data volume. [28, 14] manipulate the embedded deterministic testing (EDT) structure and patterns to achieve low power test.

For unified solutions, [20, 25] use a finite state machine (FSM) mechanism and various kinds of data compression techniques to reduce scan power, test data volume, and test time. [1] modifies run-length coding and combines it with resource partitioning to achieve low power and small test data volume. [25] also incorporates Golomb coding (technique in [42]) to deal with the power and volume issues.

## 1.2 System-in-Package (SiP) test

In order to reduce design cost, various applications are integrated in a chip or a package [43]. Wireless modules, nonvolatile memory, SRAM, and dynamic random access memory (DRAM) modules may be integrated in a personal digital assistant (PDA) or a cellular phone. System-on-Chip (SoC) designs put different modules in one chip, whereas SiP uses package technology to put different chips in one small package. Since SiP offers smaller size, higher performance, and lower power [44], it is widely used for system realization. It is also a low cost solution for heterogeneous system integration. However, there is no standard for SiP testing. The test often combines functional test at the system level with defect-based test at the die and interconnect level [45].

#### 1.2.1 SiP interconnect test

As SiP technology evolves, multiple-chip integration is becoming more and more popular. Testing each module and the chip-to-chip interconnections has emerged as an important issue [46, 47] in SiP. IEEE 1149.1 [48] and Embedded Core Test [49] provide good guidelines for SiP/SoC testing. However, RAM, which is a very common module in SiP, often has no standard for SiP test. Dilip [47] and Jong [50] developed interconnect tests for memory. By using the special characteristics of RAM, we can use its input and output pins to perform the interconnect test.

## 1.3 Overview and organization of the dissertation

In this dissertation, Chapter 1 introduces the test power, test data volume and test time issues, and reviews the previous works in this area. For the recent technology, we also survey the new test challenges in SoC, SiP and 3D designs. In the following chapters, we describe the proposed solutions for these issues.

In Chapter 2, we discuss the different kinds of decoder-based test schemes and focus on the scheme with variable input pins and fixed output pins (VIFO). With its specific encoding method, this test scheme provides good test data compression. Based on the same methodology, we focus on the test scheme with fixed input pins and variable output pins (FIVO) decoders in Chapter 3. The compression analysis of this scheme is also provided in that chapter.

To address large-scale chip designs, we present a generic multi-dimensional scan-control scheme in Chapter 4. This scheme can also be implemented for stacked-die test. With a few control logic gates and a 2- or 3-dimensional test scheme, we can test large-scale chip systems. On the other hand, with more SiP and possible 3D designs, the related test techniques are essential, and memory plays a key role among them. However, a bare memory die often does not provide a standard test circuit for system integration. We therefore provide, in Chapter 5, an interconnect test scheme for RAM that is integrated in a large system. With this scheme, we can test the interconnects between the system and the memory dies more efficiently. Finally, the conclusions and future trends of this dissertation are provided in Chapter 6.

# Chapter 2

# Selective Pattern-Compression Scheme Using the Variable Input Fixed Output (VIFO) Encoding

## 2.1 Introduction

In order to deal with the shift-in power and test data volume problems in scan designs, decoder-based schemes have been proposed [22, 23]. Here we classify decoders by whether their input (I) and output (O) are fixed (F) or variable (V). There are 4 choices: FIFO, FIVO, VIFO, and VIVO. The FIFO scheme is less flexible than VIFO on the input side, which may leave some unnecessary input bits on the decoders. Because this reduces the compression efficiency, and because the scheme needs some constraints to reduce the implementation complexity, we do not implement the FIFO scheme. We implement the FIVO scheme, which is a pseudo-VIVO scheme <sup>1</sup>. After the preliminary analysis, we choose the fixed output (VIFO) and fixed input (FIVO) schemes to be implemented in this dissertation.

The two proposed schemes, VIFO and FIVO, obtain good results in test power and test volume reduction. Our methodology consists of three steps:

<sup>1</sup> In fact, we define the maximum fan-out as a constraint in the FIVO scheme. If the number of output bits is larger than the defined maximum fan-out, we reduce the number of input bits to meet the constraint.



Figure 2.1: Proposed test schemes and ATE (automatic test equipment) diagram.



Figure 2.2: The compressed scan unit and decoders structure in our fixed output scan chain architecture. The number of scan cells inside compressed scan unit depends on the optimization stage shown in Section 2.2.

define the solution scheme, deal with the complex and simple patterns separately, and refine the final solution. With this methodology, the two schemes behave differently. The compressed scan units of the VIFO scheme have a fixed number of output bits but a variable number of input bits. The compressed scan units of the FIVO scheme have a fixed number of input bits but a variable number of output bits. In our implementation, both schemes reduce a significant amount of test data volume. The two schemes are presented in this chapter and in Chapter 3.

The VIFO scheme uses compression techniques to reduce test data and power. The test concept diagram is shown in Figure 2.1. The optimization flow targets both low power and small test data volume. The proposed selective scan chain architecture is shown in Figure 2.2. We separate the test patterns into two groups: the first group is used for the compressed scan chain (CSC), and its shift-in patterns are in compressed form; the second group is used for the normal scan chain (NSC), and its shift-in patterns are not compressed. In this architecture, the CSC uses the first group of test patterns, which has more X bits to manipulate. The X bit ratio of a pattern is the fraction of X bits in a single pattern length (SPL). In this work, we use the X bit omit ratio as a threshold to separate the test patterns: if the X bit ratio of a test pattern is smaller than the X bit omit ratio, the pattern belongs to the NSC group. Otherwise the test pattern belongs to the CSC group.

Compressed patterns need special decoders to decode them. The fixed-output-bit version is shown in Figure 2.2. The decoders have a fixed number of output bits. Figure 2.2 shows the structure in which 1 unit drives 4 scan cells, so each compressed scan unit handles four output bits. The number of scan cells inside the compressed scan unit depends on the optimization stage described in Section 2.2.

## 2.2 VIFO scheme optimization method

This section introduces the methodology for optimizing test data and power in the VIFO scheme. The methodology consists of a scan chain partition and three stages. The scan chain partition divides the original single scan chain into multiple scan chains. The test patterns in the same partition are then processed by the following three stages. The first stage is pattern selection: it sets the X bit omit ratio in order to select the patterns for the CSC. The second stage is pattern compression: it merges the test patterns in the same segment of the test sets. The third stage is power optimization: it uses a shorter pattern length and applies greedy search to find the code with the smaller power consumption in the CSC, segment by segment.

Figure 2.3: Pattern selection stage. After we get the partitioned test data set, we use the X bit omit ratio to determine which test patterns are put into the CSC data set and which are put into the NSC data set.

#### 2.2.1 Pattern selection stage

This stage separates the test patterns into two groups by the X bit omit ratio. If a test pattern's X bit ratio is smaller than the given X bit omit ratio, the pattern belongs to the NSC group; otherwise it belongs to the CSC group, as shown in Figure 2.3. $SPL_{new}$ is the CSC test data length, which is determined by the compressed scan units (Figure 2.2). The total original test size in equation (2.1) and the total new test size in equation (2.2) are used to calculate the test data volume in Chapters 2 and 3.

$$Original\_test\_size_{total} = Pattern\_number_{org} \times SPL_{org} \qquad (2.1)$$

$$New\_test\_size_{total} = NSC\_pattern\_number \times SPL_{org} + CSC\_pattern\_number \times SPL_{new} \qquad (2.2)$$



Figure 2.4: Pattern selection stage example. This case uses 0.3 as X-bit omit ratio to select the test pattern. (a) is original test pattern set which has 17 test patterns. (b) is selected patterns for NSC. (c) is selected patterns for CSC.

The example is shown in Figure 2.4. There are 17 patterns in the original test pattern set (Figure 2.4 (a)). If we set the X bit omit ratio as 0.3, we can get 3 patterns in the NSC test data set (Figure 2.4 (b)) and 14 patterns in the CSC test data set (Figure 2.4 (c)).
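The selection rule and equations (2.1) and (2.2) can be sketched as follows. The four 8-bit patterns and the value of `spl_new` are made up for illustration and do not come from Figure 2.4; the function names are hypothetical as well.

```python
def x_ratio(pattern: str) -> float:
    """Fraction of don't-care (X) bits in one test pattern."""
    return pattern.count("X") / len(pattern)


def select_patterns(patterns, omit_ratio):
    """Patterns with an X bit ratio below the omit ratio go to the NSC
    group; all others go to the CSC group."""
    nsc = [p for p in patterns if x_ratio(p) < omit_ratio]
    csc = [p for p in patterns if x_ratio(p) >= omit_ratio]
    return nsc, csc


def original_test_size(patterns, spl_org):
    return len(patterns) * spl_org                  # equation (2.1)


def new_test_size(nsc, csc, spl_org, spl_new):
    return len(nsc) * spl_org + len(csc) * spl_new  # equation (2.2)


# Hypothetical data: four 8-bit patterns, CSC patterns compressed to 3 bits.
patterns = ["0101X011", "0XXX1XXX", "XXXXXXX1", "110X0X10"]
nsc, csc = select_patterns(patterns, omit_ratio=0.3)
print(len(nsc), len(csc))                                  # 2 2
print(original_test_size(patterns, spl_org=8))             # 32
print(new_test_size(nsc, csc, spl_org=8, spl_new=3))       # 22
```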

#### 2.2.2 Pattern compression stage

When the test patterns for the CSC are determined, we use these patterns in the compression stage. In this stage, we attempt to make the coding size as small as possible. To explain the process, we use 4 bits as the fixed output size. If the original test data length is 21 bits, we divide it into 6 segments. The first step is to compress the test patterns from the first segment to the sixth segment. The next step merges the patterns in each segment. For instance, this step merges patterns XX11 and 1XX1 into pattern 1X11. Figure 2.5 shows a simple example that uses two test patterns to explain the merge process. The first segments of these two test patterns cannot be merged, but the second segments can be merged into 1011. When the merging procedure is done in each column, we get the results in Figure 2.6 (a) by replacing all of the Xs with 0.

|                       | Seg. 1 | Seg. 2 | Seg. 3 | Seg. 4 | Seg. 5 | Seg. 6 |
|-----------------------|--------|--------|--------|--------|--------|--------|
| Test pattern 1        | X000   | 1XXX   | X000   | 1XXX   | XXXX   | X      |
| Test pattern 2        | X001   | 1011   | 00X0   | 01XX   | XXXX   | 1      |
| Merged in each column | X000   | 1011   | 0000   | 1XXX   | XXXX   | 1      |
|                       | X001   |        |        | 01XX   |        |        |

Figure 2.5: Pattern compression.
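The column-wise merging described above can be sketched as follows. The sketch assumes a simple first-fit rule: each segment is merged into the first compatible code found so far, otherwise it starts a new code. The order-dependence of this rule is an assumption of the sketch, not a detail taken from the dissertation.

```python
def compatible(a: str, b: str) -> bool:
    """Two segments can merge if every position agrees or one side is X."""
    return all(x == y or "X" in (x, y) for x, y in zip(a, b))


def merge(a: str, b: str) -> str:
    """Keep the specified bit wherever either segment has one."""
    return "".join(y if x == "X" else x for x, y in zip(a, b))


def split_segments(pattern: str, width: int = 4) -> list[str]:
    return [pattern[i:i + width] for i in range(0, len(pattern), width)]


def compress_columns(patterns, width: int = 4) -> list[list[str]]:
    """Merge the segments of all patterns column by column (first fit)."""
    result = []
    for column in zip(*(split_segments(p, width) for p in patterns)):
        codes = []
        for segment in column:
            for i, code in enumerate(codes):
                if compatible(code, segment):
                    codes[i] = merge(code, segment)
                    break
            else:
                codes.append(segment)  # no compatible code: new entry
        result.append(codes)
    return result


two_patterns = [
    "X000 1XXX X000 1XXX XXXX X".replace(" ", ""),
    "X001 1011 00X0 01XX XXXX 1".replace(" ", ""),
]
print(merge("XX11", "1XX1"))           # 1X11
print(compress_columns(two_patterns))
```

The length of each inner list is the number of compression results of that segment.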

We define TNC as the total number of compression results in a segment. If $TNC \leq 2$, the number of cells in the compressed scan unit (CSU) is set to 1, which means that 1 input bit can select among all of the test patterns at the output. If $2 < TNC \leq 4$, the number of cells in the CSU is set to 2, and if $4 < TNC \leq 8$, it is set to 3. When the TNC is bigger than 8, we do not encode the segment. Figure 2.6 (a) shows the compression result for the patterns of the CSC test data set in Figure 2.4 (c); all Xs in the compressed data are replaced by 0 after the compression process, and each compressed pattern is given a number as an index. After the compression procedure, the first and second segments of the compressed test data in Figure 2.6 (a) each need 2 bits to encode the whole segment. The third segment needs 3 bits since the TNC in this segment is 8.
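The rule for the number of CSU cells amounts to taking the smallest number of bits that can distinguish TNC results, with a cutoff at 8. A minimal sketch, where the function name is hypothetical and the 3-cell case for TNC between 5 and 8 is inferred from the remark that a segment with a TNC of 8 needs 3 bits:

```python
def csu_cells(tnc: int):
    """Number of cells in the compressed scan unit for a segment with
    `tnc` compression results; None means the segment is not encoded."""
    if tnc > 8:
        return None
    # Smallest k with 2**k >= tnc, and at least one cell.
    return max(1, (tnc - 1).bit_length())


print([csu_cells(t) for t in (1, 2, 3, 4, 5, 8, 9)])
# [1, 1, 2, 2, 3, 3, None]
```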

#### 2.2.3 Power optimization stage

After the pattern compression stage, the power optimization method minimizes the shift-in power in the CSC. We set the bit before the first bit of each pattern to 0 and apply a greedy search to find the coding with the lower power. This stage maps the new encoded data to the compressed pattern index. We use the example in Figure 2.6 (b) to explain this mapping. The first segment of Figure 2.6 (b) has 3 different compressed codes. The mapping code and index number can be found in Figure 2.6 (a). By finding all of the possible codes from the first segment to the last segment, we get the final encoding result for the CSC, which is shown in Figure 2.6 (c). This stage finds the mapping with the smallest transition count. With 3 bits there are 8 code words, so the number of possible assignments is 8! = 40320. With 2 bits there are 4 code words, so the number of possible assignments is 4! = 24. The optimization method tries all the permutations from the first segment to the last segment and picks the encoding with the minimal switching power. The encoding result becomes the new encoded data.

Figure 2.6: VIFO scheme case in the pattern compression and power optimization stages. We use a 4-bit encoding example to illustrate this approach. This figure shows the results of the merging procedure, and all Xs are replaced by 0 after the merging procedure. (a) is the compressed test pattern. (b) shows the index number of each segment. (c) shows the new test data results, where the mapping code comes from (b). (d) shows one test pattern transformation from original test data to new test data.

The first pattern in Figure 2.6 (b) shows that 030311 is the index number of the test pattern. This pattern maps to test result 000010100000110001010 and new test pattern 01000101010. The first 4 bits of the test result are 0000. They come from the mapping number 0 in Figure 2.6 (b) and the first segment pattern 0000 in Figure 2.6 (a). The 5th to 8th bits of the test result are 1010. They come from the mapping number 3 in the second column of Figure 2.6 (b) and the second column pattern 1010 in Figure 2.6 (a). In this way, we get the test result 000010100000110001010. The new test result for the CSC, which comes from the power optimization process, is shown in Figure 2.6 (c). Each row in Figure 2.6 (c) will be shifted into the CSC. Figure 2.6 (d) traces one pattern through the aforementioned stages.
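A minimal sketch of the segment-by-segment search is given below. It makes several assumptions that the text does not fix: the cost is the number of adjacent-bit transitions in each encoded row, the bit before each row is 0, and each segment is optimized exhaustively given the choice already made for the previous segment. The function names and the small index columns are illustrative only.

```python
from itertools import permutations, product


def transitions(bits: str, prev: str = "0") -> int:
    """Adjacent-bit transitions in `bits`, including the step from `prev`."""
    seq = prev + bits
    return sum(a != b for a, b in zip(seq, seq[1:]))


def assign_codes(index_columns, widths):
    """Greedy, segment-by-segment code assignment.

    index_columns[s][r] is the compressed-pattern index of row r in
    segment s; widths[s] is the number of code bits of segment s.
    Each segment needs at most 2**widths[s] distinct indices.
    """
    prev_bits = ["0"] * len(index_columns[0])  # bit before each row is 0
    mappings, total = [], 0
    for column, width in zip(index_columns, widths):
        codes = ["".join(b) for b in product("01", repeat=width)]
        indices = sorted(set(column))
        best_cost, best_map = None, None
        # Try every assignment of distinct code words to the indices.
        for perm in permutations(codes, len(indices)):
            mapping = dict(zip(indices, perm))
            cost = sum(transitions(mapping[i], p)
                       for i, p in zip(column, prev_bits))
            if best_cost is None or cost < best_cost:
                best_cost, best_map = cost, mapping
        mappings.append(best_map)
        total += best_cost
        prev_bits = [best_map[i][-1] for i in column]
    return mappings, total


mappings, cost = assign_codes([[2, 2, 1, 0]], widths=[2])
print(mappings[0][2], cost)  # 00 2
```

In this toy case the most frequent index receives the transition-free code 00, which is the behavior the optimization stage aims for.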

## 2.3 Experimental results

We implemented the program in C and compiled it with Dev-C++ 4.9.9.2. The operating system is Windows XP SP3 running on an Intel(R) Pentium(R) 2.4 GHz CPU with 512 MB RAM. We use the 4-bit output scan unit shown in Figure 2.2 as our experimental architecture. The experimental results are shown in Table 2.1 and Figure 2.7. Table 2.1 shows that different circuits need different X bit omit ratios to obtain the lowest test power. The calculation of the total power consumption includes the shift-in power in the CSC and the NSC and the transfer power from the CSC and NSC. Using 55% as the X bit omit ratio for benchmark s38584, we obtain 55% of the power (compared with filling all X bits with 0) and 55% of the test data volume.

In Figure 2.7, we observe that both the power reduction and the data volume reduction are 45% at an X bit omit ratio of 55%. All of the results in Table 2.1 and Figure 2.7 are normalized to the case in which all X bits are filled with 0. This technique achieves a small test data size for each circuit and also reduces power in these experiments. The area overhead proportion is the area of our decoders (Decoder Area) relative to the area of the original circuit; its calculation for Figure 2.7 is shown in equation (2.3). We can see that the area overhead is around 10% to 20% in Figure 2.7.

$$Area\_Overhead\_Proportion = \frac{Decoder\_Area}{Original\_Circuit\_Area\_without\_Scan\_Chain} \qquad (2.3)$$

Table 2.1: VIFO scheme experimental results in test pattern reduction and power reduction using the Mintest set (ISCAS89 benchmarks).

| Circuit | Orig. SPL | Orig. vol. | New test vol. | New vol. rate | Orig. power | New shift-in power | New shift-in power rate |
|---------|-----------|------------|---------------|---------------|-------------|--------------------|-------------------------|
| s5378   | 214       | 23754      | 13401         | 0.56          | 396508      | 252392             | 0.64                    |
| s9234   | 247       | 39273      | 23535         | 0.60          | 647634      | 377319             | 0.58                    |
| s13207  | 700       | 165200     | 62843         | 0.38          | 1898650     | 1005255            | 0.53                    |
| s15850  | 611       | 76986      | 36836         | 0.48          | 1754458     | 995196             | 0.57                    |
| s35932  | 1763      | 28208      | 15778         | 0.56          | 653390      | 268537             | 0.41                    |
| s38417  | 1664      | 164736     | 88968         | 0.54          | 11723227    | 6306253            | 0.54                    |
| s38584  | 1464      | 199104     | 101520        | 0.51          | 11733372    | 7938510            | 0.68                    |

Figure 2.7: Experimental results for a 4-bit VIFO scheme using one scan chain architecture with different omit ratios. New total test size is normalized to original test size. Power consumption is normalized to the power of all X bits filled with 0. This shows that the scheme provides both power and volume reduction.

## 2.4 Summary

The proposed VIFO scheme with related encoding techniques can reduce test power consumption and test data volume. The experimental results show that the area overhead is around 10% to 20%. We plan to find some method to shrink the overhead to a smaller percentage. In fact, we can use more bits on the output side to enhance the reduction rate of the test data volume with this scheme but the design complexity would be higher.

The decoders of the VIFO scheme have a fixed number of bits at the output, which might restrict the compression efficiency of this scheme. Using the same methodology, we present the FIVO scheme in the next chapter. The decoders of the FIVO scheme have a flexible number of output pins.



# Chapter 3

# Selective Pattern-Compression Scheme Using the Fixed Input Variable Output (FIVO) Encoding

## 3.1 Introduction

In the previous chapter, we used the VIFO encoding to achieve low power and small test data volume with some area overhead. This chapter describes the FIVO encoding for manipulating the test data. The experimental results show that it achieves a smaller test data volume than the VIFO scheme and a smaller area overhead as well.

The fixed input variable output (FIVO) scheme has a fixed number of input bits. That is, the decoders of this scheme have a fixed number of input bits, but the number of output bits of each decoder is variable. For example, Figure 3.1 shows a 3-bit decoder structure with the original scan chain and the compressed scan chain. It provides $2^3$ code conditions. Each condition provides its decoding result to the normal scan cells as test data. The number of scan cells driven by each decoder depends on the optimization stage shown in Section 3.2.



Figure 3.1: Our FIVO scan chain architecture including compressed scan units. A compressed scan unit is provided with an n-bit decoder (n=3 in this illustration), which maps to a number of scan cells in the normal scan chain. The width of the last scan unit may not be equal to n in some cases.

## 3.2 FIVO scheme optimization method and implementation

This section introduces the FIVO scheme optimization methodology, which is similar to that of the VIFO scheme. If we apply the multiple scan chain technique, the methodology consists of a scan chain partition and three stages, and the different partitions go through these three stages independently. Otherwise it consists of the three stages only. The three stages are pattern selection, pattern compression, and power optimization. Pattern selection is the same as in the previous scheme (Section 2.2.1). The techniques for pattern compression and power optimization are different, so we focus on these two stages in this section. As before, all of the test patterns in the same partition go through these three stages.

The first stage is pattern selection: it sets the X bit omit ratio in order to select the patterns for the CSC. The second stage is pattern compression: it merges variable-length pieces of the test patterns in the same segment of the test sets. The third stage is power optimization: it uses a shorter pattern length and applies greedy search to find the code with the smallest power consumption in the CSC, segment by segment.


#### 3.2.1 Pattern compression stage



Since pattern selection method of the FIVO scheme is the same as the VIFO scheme, we introduce the pattern compression stage in this subsection. This stage contains merging, extending, and maximizing steps as follows.

We use an example with n = 3 to illustrate the procedure in this stage. The first step compresses the test patterns in 3-bit segments, starting from the first bit of the CSC test pattern set. Each 3-bit decoder provides 8 different codes, and each code represents one compressed pattern that is merged from the original test data. For instance, this step merges patterns X11 and XX1 into pattern X11. Moreover, XX11, X111, and 0111 are merged into code 0111 in the same segment. If the total number of compression results (TCRS) is smaller than 8, the segment is extended to 4 bits. The segment keeps growing until TCRS is as large as possible while $TCRS \leq 8$, and the results are then encoded with 3 bits in the CSC. The details are shown in Figure 3.2, where the test patterns come from Figure 2.4 (c). We first try to merge 3 bits and find that 1 bit is enough to encode the results. Merging 4 bits requires 2 bits to encode, and merging 5 bits also requires 2 bits. Finally, we can merge 11 bits, and the results need 3 bits to encode.

Next, we encode the following segment with another 3 bits. At the end of the pattern, the last decoder may have only 2 bits or 1 bit. If the number of compression results is 3 or 4, the results are encoded with 2 bits. If the number of compression results is 1 or 2, the results are encoded with 1 bit. We also restrict the maximum number of outputs of one decoder to 256. If the number of outputs of a decoder reaches 256, we finish processing this segment, and the number of input bits for this segment may be fewer than 3. Finally, the same procedure can use a 4-bit or 5-bit decoder, which provides 16 or 32 different codes.

To be more specific, a realistic case is provided here. This example has 14 test patterns in the CSC. After the merging step, the number of compressed patterns is equal to or smaller than the original number of patterns. Figure 3.3 (a) shows the compression results, with all Xs replaced by 0 after the compression. Each compressed pattern is given a number as an index. After the compression procedure, the first segment of the compressed test data in Figure 3.3 (a) uses 3 bits (6 codes) to encode 11 bits of data.
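The segment-extension step can be sketched as follows. This is a hedged illustration only: it uses a first-fit merge, grows each segment one bit at a time while the number of merged results still fits into $2^n$ codes, and caps the segment width at `max_out`. The function names and the toy patterns are hypothetical, and the toy run uses n = 1 to keep the output small.

```python
def compatible(a: str, b: str) -> bool:
    return all(x == y or "X" in (x, y) for x, y in zip(a, b))


def merge(a: str, b: str) -> str:
    return "".join(y if x == "X" else x for x, y in zip(a, b))


def merged_codes(slices):
    """First-fit merge of pattern slices into distinct codes (assumed rule)."""
    codes = []
    for s in slices:
        for i, code in enumerate(codes):
            if compatible(code, s):
                codes[i] = merge(code, s)
                break
        else:
            codes.append(s)
    return codes


def fivo_segments(patterns, n=3, max_out=256):
    """Split the CSC patterns into variable-width segments, each of which
    has at most 2**n merged results and at most max_out output bits."""
    length, limit = len(patterns[0]), 2 ** n
    segments, start = [], 0
    while start < length:
        width = 1
        # Extend the segment while one more bit still fits into 2**n codes.
        while (start + width < length and width < max_out and
               len(merged_codes([p[start:start + width + 1]
                                 for p in patterns])) <= limit):
            width += 1
        codes = merged_codes([p[start:start + width] for p in patterns])
        segments.append((start, width, codes))
        start += width
    return segments


toy = ["0X1", "1X0", "X11", "000"]
print(fivo_segments(toy, n=1))
# [(0, 1, ['0', '1']), (1, 2, ['11', '00'])]
```

Each tuple gives the start position, the output width, and the merged codes of one decoder.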

#### 3.2.2 Power optimization stage

In order to minimize the shift-in power with n-bit encoding, the greedy search method is applied in each segment to find the coding with the lower power. Assuming the initial state of the scan chain is 0, the search obtains the optimal solution for n = 3 and a heuristically good solution for n > 3 after the pattern selection and pattern compression stages. This stage maps the new encoded data to the compressed patterns. The first column of Figure 3.3 (a) needs 6 different compressed codes to map. In Figure 3.3, the test result 00001000000 has index 0 in (a), and index 0 maps to the new test pattern 110, as shown in the first row and first column of (b) and (c). The CSC shift-in data will be 110, and the decoder will deliver 00001000000 to the NSC.

In this stage, we try to find the mapping with the smallest transition count. With 3 bits there are 8 code words, so the number of possible assignments is 8! = 40320. The optimization method tries all the permutations from the first segment to the last segment and picks the encoding with the minimal switching power. The encoding results become the new encoded data. Since the number of permutations for 4-bit and 5-bit encoded data is very large, the computation time of the optimization stage becomes very long; in that case, optimizing only part of the encoded data is recommended.

| Panel (d)        | First segment | Second segment |
|------------------|---------------|----------------|
| Test data in NSC | 00001000000   | 0100000001     |
| Mapping code     | 0             | 4              |
| Test data in CSC | 110           | 001            |

Figure 3.3: FIVO scheme case in the pattern compression and power optimization stages. We use a 3-bit encoding example to illustrate our approach. This table shows the results after the merging process, and all Xs are replaced by 0 after the merging process. (a) is the compressed test pattern of each segment. (b) shows the index number of each segment. (c) shows the new test data results, where the mapping codes come from (b). (d) shows one test pattern transformation from original test data to new test data.

The first row in Figure 3.3 (b) shows that 04 is the index number of the test pattern. This pattern maps to test result 000010000000100000001 and new test pattern 110001. The test result can be looked up in Figure 3.3 (a) by index. In this example, index code 0 in the first column of Figure 3.3 (a) maps to 00001000000 and to 110 in Figure 3.3 (c). Index code 4 in the second column of Figure 3.3 (a) maps to 0100000001 and to 001 in Figure 3.3 (c). We can also see that every index code 0 in the first column of Figure 3.3 (b) maps to 110.

Figure 3.4: Circuit design of the FIVO scan chain architecture.

With the mapping method, we can get the CSC shift-in codes. Each row in Figure 3.3 (c) will be shifted into the CSC. Figure 3.3 (c) is the power optimization result codes. We can observe that the compressed patterns are encoded to fewer bits of data. Figure 3.3 (d) shows a new test pattern (on the bottom), which is generated from original test pattern, to the compressed data mapping index, then to the compressed data in the CSC.

## 3.3 Implementing FIVO

Figure 3.4 shows the circuit schematic of the FIVO scheme. The function of this scheme is described in four steps. The first step is compressed pattern scan-in. The compressed data are shifted in from CSI (Compressed Scan-In) while CCKE (Compress Clock-Enable) is enabled. Once the compressed data are ready, CCKE is disabled. The second step is decoding. Both DSE (Decode Scan-Enable) and NSE (Normal Scan-Enable) are enabled, and the decoders fill the decoded test pattern into the normal scan chain from the DSI (Decode Scan-In) in this cycle. The third step is capture.



Figure 3.5: Waveforms of the control signal and decoders.

The test response is loaded through the data (D) pin of each flip-flop in the capture cycle. The fourth step is scan-out. After the normal scan chain captures the test response, NSE is enabled, and a series of 0s is shifted in from NSI (Normal Scan-In) to push the test response to SO (Scan-Out). While the scan-out operation is performed, CCKE can be enabled to shift in another compressed test pattern, and compressed data are scanned out from CSO (Compressed Scan-Out).

Figure 3.5 shows the waveform behavior during the scan test. Compressed patterns are shifted from CSI pin into the compressed scan unit. When compressed patterns are ready, the DSE and NSE are raised to high to execute the decoding action. With the help of control signals, most of the unnecessary switching activities are disabled.

The area overhead is inevitable in this scheme. We implement the decoder with the ISCAS89 circuits. The decoders have 3 types, 3-to-N, 2-to-N, and 1-to-N. The compressed scan unit contains a decoder and flip-flops. The actual experimental results are shown in Table 3.2.



Figure 3.6: Experimental results of the 3-bit FIVO scheme using a single scan chain architecture with different omit ratios. Power consumption is normalized to the power obtained when all X bits are filled with 0.

## 3.4 Experimental results

We implement the programs in C and compile them with gcc version 3.4.5. The programs run on a server with an Intel(R) Xeon(R) 5160 3.00 GHz CPU and 32 GB of memory. In our experiments with the FIVO scheme, different test data sets and circuits reach the lowest power and the smallest test data volume at different X-bit omit ratios. Figure 3.6 shows the FIVO scheme results for circuit s38584. Users can tune the X-bit omit ratio to obtain either the best power or the best compression. For s38584, the FIVO scheme reaches its lowest power, about 49%, at a 30% X-bit omit ratio, using 130 of the 136 patterns in the CSC. The smallest test data volume is about 31% at a 65% X-bit omit ratio, using 116 of the 136 patterns in the CSC. Table 3.1 shows the best test data volume obtained for each benchmark circuit.

Since we add extra CSC scan chains to the original scan chain, the calculation of total power consumption includes power in CSC, NSC, and transfer power from CSC to NSC. Table 3.2 shows single scan chain results of this work with circuit name, number of flip-flops (#DFFs), power of all X filled with 0 (Fill 0), proposed scheme

Table 3.1: Our FIVO scheme experimental results in test pattern reduction and power reduction using Mintest set (ISCAS89 benchmarks).

| Circuit | Orig. SPL | Orig. vol. | New test vol. | New vol. rate | Orig. power | New shift-in power | New shift-in power rate |
|---------|-----------|------------|---------------|---------------|-------------|--------------------|-------------------------|
| s5378   | 214                  | 23754  | 8830     | 0.37     | 396508   | 228202       | 0.58         |
| s9234   | 247                  | 39273  | 17607    | 0.45     | 647634   | 370290       | 0.57         |
| s13207  | 700                  | 165200 | 32048    | 0.19     | 1898650  | 929472       | 0.49         |
| s15850  | 611                  | 76986  | 22266    | 0.29     | 1754458  | 861100       | 0.49         |
| s35932  | 1763                 | 28208  | 10798    | 0.38     | 653390   | 246093       | 0.38         |
| s38417  | 1664                 | 164736 | 62179    | 0.38     | 11723227 | 5315026      | 0.45         |
| s38584  | 1464                 | 199104 | 61064    | 0.31     | 11733372 | 6859668      | 0.58         |

Table 3.2: Our FIVO scheme results for a single scan chain using the Mintest set. #V is the total number of test patterns. Most of the test patterns are shifted in through the CSC (#VC), but most of the power dissipation comes from the NSC (NSCP). The power saving in the CSC is very significant.

| Power consumption results and extra overhead |     |       |          |              |      |     |         |        |            |              |
|---------|-----|-------|----------|--------------|------|-----|---------|--------|------------|--------------|
| Circuit | #V  | #DFFs | Fill 0   | FIVO (3-bit) | Peak | #VC | NSCP    | TP     | Max Fanout | Decoder Area |
| s5378   | 111 | 214   | 396508   | 228202       | 185  | 91  | 177037  | 15265  | 56         | 0.10         |
| s9234   | 159 | 247   | 647634   | 370290       | 194  | 138 | 195507  | 24810  | 27         | 0.13         |
| s13207  | 236 | 700   | 1898650    | 1305352  | 1492   | 211           | 1120611    | 136442 | 102    | 0.06    |
| s15850  | 126 | 611   | 1754458    | 861100   | 443    | 114           | 622774     | 57128  | 23     | 0.12    |
| s35932  | 16  | 1763  | 653390     | 246093   | 1441   | 10            | 233616     | 11578  | 256    | 0.03    |
| s38417  | 99  | 1664  | 11723227   | 5315026  | 1358   | 91            | 2772612    | 102532 | 46     | 0.14    |
| s38584  | 136 | 1464  | 11733372   | 6859668  | 1187   | 116           | 5745994    | 136489 | 35     | 0.09    |

power (FIVO (3-bit)), peak power (Peak), transfer power (TP), maximum fan-out number (Max Fanout), and the area overhead of our decoders as a proportion of the original circuit (Decoder Area). The area overhead calculation in Figure 3.6 is shown in equation (2.3). TP here denotes the switching activity caused by filling the test pattern from the decoders into the normal scan chain.
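As a rough cross-check of this power breakdown, the CSC share can be backed out of Table 3.2. The sketch below assumes that the FIVO (3-bit) column is the total of CSC power, NSCP, and TP, as the text on the total power calculation suggests; the function name `csc_power` is ours.

```python
# (circuit, FIVO 3-bit total, NSCP, TP), copied from Table 3.2.
ROWS = [
    ("s5378", 228202, 177037, 15265),
    ("s9234", 370290, 195507, 24810),
    ("s13207", 1305352, 1120611, 136442),
    ("s15850", 861100, 622774, 57128),
    ("s35932", 246093, 233616, 11578),
    ("s38417", 5315026, 2772612, 102532),
    ("s38584", 6859668, 5745994, 136489),
]


def csc_power(total: int, nscp: int, tp: int) -> int:
    """CSC shift power, assuming total = CSC + NSCP + TP (our assumption)."""
    return total - nscp - tp


for name, total, nscp, tp in ROWS:
    csc = csc_power(total, nscp, tp)
    print(f"{name}: CSC={csc}, NSC share={nscp / total:.2f}")
```

Under this assumption the NSC term accounts for more than half of the total in every circuit, which is consistent with the caption of Table 3.2.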

Table 3.3 shows the Mintest set compression results. Although previous works [51, 40, 52] achieve smaller volume than ours in some circuits, our 5-bit FIVO scheme still provides good compression results in these circuits. Table 3.4 shows the average power comparison with the previous work proposed in [1]. The average power  $(P_{av})$  is calculated as the total switching power divided by the original number of patterns

| Circuit | Mintest set | FDR[51] | ARL[1] | MDC[38] | Dictionary[40] | SDI[52] | FIVO (3-bit) | FIVO (4-bit) | FIVO (5-bit) |
|---------|-------------|---------|--------|---------|----------------|---------|--------------|--------------|--------------|
| s5378   | 23754       | 12346   | 11694  | 10416   | 6345           | X       | 8830         | 7388         | 4962         |
| s9234   | 39273       | 22152   | 21612  | 17794   | 11498          | X       | 17607        | 13676        | 10288        |
| s13207  | 165200  | 30880   | 32648  | 15596   | 8517           | 9708    | 26151   | 20448   | 9992    |
| s15850  | 76986   | 2600    | 26306  | 22384   | 13873          | 10726   | 22266   | 17199   | 10074   |
| s38417  | 164736  | 93466   | 64976  | 62914   | 62939          | 36864   | 62179   | 44952   | 27093   |
| s38584  | 199104  | 77812   | 77372  | 57428   | 53287          | 27555   | 61064   | 46215   | 27667   |

Table 3.3: Experimental results on test data volume using Mintest set

and the switching power is calculated with the weighted transition metric (WTM) from [8].  $P_{av_0}$  is the power when Xs are mapped to 0s in [1], and  $P_{av_{min}}$  is the power when Xs are mapped to minimize the WTM in [1]. As the results show, our 3-bit FIVO scheme  $(P_{av_{3-bit}})$  provides smaller shift-in power in this comparison.

Table 3.4: Experimental results on scan-in power comparison with [1]

| Circuit | $P_{av_0}$ in [1] | $P_{av_{min}}$ in [1] | $P_{av_{3-bit}}$ |
|---------|-------------------|-----------------------|------------------|
| s5378   | 3336              | 2435                  | 2056             |
| s9234   | 5692              | 3466                  | 2329             |
| s13207  | 12416             | 7703                  | 5532             |
| s15850  | 20742             | 13381                 | 6835             |
| s38417  | 172665            | 112198                | 53688            |
| s38584  | 136634            | 88298                 | 50439            |
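For readers unfamiliar with the metric, the following sketch shows the commonly used form of the WTM for one scan-in vector and the averaging used for $P_{av}$. The bit-ordering convention (the first character is shifted in first) and the rounding up are assumptions on our side; rounding up happens to reproduce the $P_{av_{3-bit}}$ column when the FIVO (3-bit) totals and #V values of Table 3.2 are used.

```python
import math


def wtm(vector: str) -> int:
    """Weighted transition metric of a fully specified scan-in vector.

    Assumes vector[0] is shifted in first, so a transition between
    positions i and i+1 (0-based) is weighted by len(vector) - 1 - i.
    """
    n = len(vector)
    return sum(n - 1 - i for i in range(n - 1) if vector[i] != vector[i + 1])


def average_power(total_switching: int, num_patterns: int) -> int:
    """Total switching power divided by the original number of patterns."""
    return math.ceil(total_switching / num_patterns)


print(wtm("0100"))                 # transitions weighted 3 and 2
print(average_power(228202, 111))  # s5378 totals from Table 3.2
```

A transition near the start of the vector travels through more scan cells, which is why it receives a larger weight.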

## 3.5 Analysis and discussion

#### 3.5.1 FIVO scheme analysis

To further examine the compression efficiency of the proposed scheme, we provide three analysis scenarios on the FIVO scheme. First, we randomly generate data with different X-bit rates and use the proposed 3-bit scheme to compress them. The test data have 100 patterns, and the pattern length is 100 bits. In this way, we can observe the compression behavior of the proposed scheme. Second, the randomly generated data have different pattern lengths and different numbers of patterns. We can see that the 3-bit scheme has different compression results on these test data.



Figure 3.7: Compression rate analysis for the FIVO scheme. (a) shows the test data format. (b) provides the compression effect analysis with the 3-bit scheme for a test data size of 100x100 bits. (c) provides more results for different test data sizes with the 3-bit scheme. (d) provides the analysis results with the 4-bit scheme. From the results of (b), (c), and (d), we can perceive that the 4-bit scheme provides better results than the 3-bit scheme on test data with the same X-bit rate.

Third, we apply the data in the first and the second analyses to the 4-bit scheme and report the results to see the different compression behavior between these two schemes.

We generate test patterns with different X-bit rates: X bits are placed at random so that each test pattern meets the target X-bit rate, and the remaining bits are randomly filled with 0 or 1. The test data format is shown in Figure 3.7 (a). In this analysis, we use 100 test patterns and the pattern length is 100 bits. Figure 3.7 (b) shows the compression behavior of the 3-bit scheme. As we can see, the curve drops at the 50% X-bit rate, which means that the test volume starts to reduce.
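The data generation for this analysis can be sketched as follows. This is an illustration under our own assumptions (an exact X count per pattern, a fixed seed for repeatability, and the function names), not the program used for the experiments.

```python
import random


def random_pattern(length: int, x_rate: float, rng: random.Random) -> str:
    """One test pattern with exactly round(length * x_rate) X bits."""
    n_x = round(length * x_rate)
    x_positions = set(rng.sample(range(length), n_x))
    return "".join(
        "X" if i in x_positions else rng.choice("01") for i in range(length)
    )


def random_test_set(num_patterns=100, length=100, x_rate=0.5, seed=0):
    """A num_patterns x length test set at the given X-bit rate."""
    rng = random.Random(seed)
    return [random_pattern(length, x_rate, rng) for _ in range(num_patterns)]


test_set = random_test_set(num_patterns=100, length=100, x_rate=0.5)
print(test_set[0])
```

Sweeping `x_rate` over a range of values and compressing each generated set gives one curve of the kind plotted in Figure 3.7.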

The second analysis scenario extends the previous one to different pattern lengths and different numbers of patterns. In addition to different X-bit rates in the test patterns, we also use different pattern lengths and different numbers of test patterns in this analysis. Figure 3.7 (c) shows the compression behavior of the 3-bit scheme. The curve drops after 35% and drops rapidly until 85%. Different pattern lengths show a similar compression effect in our analysis result.

Similar to the second analysis, the third scenario uses the same data set, but we use the 4-bit scheme to analyze the compression behavior. Figure 3.7 (d) shows the compression behavior of the 4-bit scheme. Because the 4-bit scheme has more coding space than the 3-bit scheme, the curves shift to the left of the previous graph. The curves drop from 15% when the pattern length equals 100 and the number of patterns equals 100. The result of a 50x100 pattern set shows that the curves drop slowly after the X-bit ratio of 80%.

From the results in Figure 3.7, we observe that if we use fewer input bits in this scheme, the number of codes is smaller and the compression performance increases rapidly at high X-bit ratios. If we use more input bits in this scheme, we have more codes to use and the curves drop at a smaller X-bit ratio for the same number of test patterns. We can predict that if the number of test patterns is large, the 4-bit scheme would have better compression efficiency than the 3-bit scheme. Due to the distribution of the X bits in the test patterns, we have a chance to reduce the test data volume. If the number of bits after coding is the same as the number of bits before coding, we do not benefit from these kinds of schemes. However, we can still benefit by selectively choosing the test patterns for the CSC set. The distributions of 1s and 0s in the test patterns are not random but depend on the design. When the number of test patterns is large, we can predict that we can still compress the test data volume with these kinds of schemes.

Table 3.5 shows the run time in seconds. We check 40320 (8!) cases in the FIVO (3-bit) scheme. However, the numbers of permutations for 4-bit and 5-bit encoded data are very large, so we only optimize some of the encoded data at the optimization stage to

| Circuit     | FIVO (3-bit) | FIVO (4-bit) | FIVO (5-bit) |
|-------------|--------------|--------------|--------------|
| s5378       | 21.568       | 2.484        | 3.646        |
| s9234       | 42.838       | 2.902        | 4.876        |
| s13207      | 129.891      | 54.222       | 8.358        |
| s15850      | 59.781       | 9.949        | 21.017       |
| s35932      | 0.984        | 1.167        | 1.011        |
| s38417      | 128.085      | 10.861       | 19.881       |
| s38584      | 171.475      | 21.558       | 46.074       |

Table 3.5: Run time analysis of each circuit with the 3-bit, 4-bit, and 5-bit FIVO schemes. All times are in seconds.

achieve low power results. That is why the FIVO (4-bit) and FIVO (5-bit) schemes have shorter run times than the FIVO (3-bit) scheme.
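To see why exhaustive optimization is only practical for the 3-bit scheme, the size of the search space can be computed directly. The sketch assumes, as the 8! figure suggests, that every one of the $2^n$ codewords of an $n$-bit scheme is assigned to a distinct entry; the function name is ours.

```python
import math


def code_assignments(code_bits: int) -> int:
    """Number of ways to assign 2**code_bits codewords to as many entries."""
    return math.factorial(2 ** code_bits)


for bits in (3, 4, 5):
    print(f"{bits}-bit scheme: {code_assignments(bits)} permutations")
```

The count grows from 40320 for 3 bits to 16! (about $2.1 \times 10^{13}$) for 4 bits and 32! for 5 bits, so only a partial search is feasible beyond 3 bits.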

#### 3.5.2 Discussion

To achieve low power and test data compression with the extra CSC, we need to add some control circuits. Each scan cell in the NSC has three inputs and two 2-to-1 MUXes per flip-flop. The first input comes from the circuit, the second from the normal scan chain, and the third from the decoder of the CSC. Table 3.2 also shows the extra area overhead. Although these kinds of schemes need extra circuit overhead, they prevent many switching activities in the CSC scan-in step.

As Figure 3.7 shows, the 4-bit FIVO scheme has better compression efficiency than the 3-bit scheme. We can predict that the decoder area overhead of the 4-bit scheme might be slightly higher than that of the 3-bit one. Because the decoders store some information about the test patterns [53], we can shift in fewer bits and reconstruct all the needed bits through the decoder. However, if the test patterns have very few X bits, the compression efficiency will not be very high. Another factor that can affect the compression efficiency is the distribution of X bits. If the X bits appear in the same segment in each test pattern, the compression efficiency of this scheme improves. Otherwise, it decreases. Test time is another issue that we need to pay attention to. Although the shift-in length is reduced in this work, the shift-out length is the same. The total test time stays the same unless we compress the scan-out data by adding a multiple-input signature register (MISR) to shorten the shifted-out signature.

The routing overhead is inevitable in each test scheme. The decoders can be placed near their scan cells to reduce the wirelength. We also recommend implementing these schemes on a stable version of the design. When the design is stable, we can use the test patterns to find a proper X-bit omit ratio by trading off power, test data volume, and area overhead. If there are minor changes in the design, the test patterns can be applied through the NSC.

We presented the VIFO scheme in the previous chapter and the FIVO scheme in this chapter. The comparison of these two schemes is shown in Figures 2.7 and 3.6. The results consist of shift-in power, test data volume, and area overhead. Although the compression efficiency of the FIVO scheme is better than that of the VIFO scheme, the fixed-bit segmentation method of the VIFO scheme is more intuitive.

## 3.6 Summary

In this chapter, we describe the FIVO scheme. Compared with previous works, the FIVO scheme provides relatively smaller test data size. In addition, we implement the decoders of the FIVO scheme and observe that we can find a feasible X-bit omit ratio as a tradeoff between volume, power, and area overhead. We also analyze the compression efficiency of the FIVO scheme in different situations by using randomly generated data. The FIVO scheme achieves a high compression rate at high X-bit rates. Finally, we discuss some limitations in Section 3.5.2.

# Chapter 4

# An Adaptive Multi-Dimensional Scan-Control Scheme for Low Test Cost

## 4.1 Introduction

In this chapter, we further propose an adaptive multi-dimensional scan shift control concept for adaptive multiple scan chain design. The adaptive multiple scan chain test scheme provides very low scan power by skipping many switching activities of a long scan chain. Based on the two-dimensional scan shift control, we can achieve low test power with a simple structure and small overhead. We can further extend the scheme to a generic N-dimensional test scheme. The proposed scheme skips many unnecessary don't-care (X) patterns to reduce the test data volume and test time. Experimental results show that the proposed 2-D scheme achieves significant reductions in shift power, test volume, and test time.

We organize this chapter as follows. Section 4.2 presents the proposed scan architecture and the concept of our methodology. Section 4.3 presents the test data manipulation details and implementation flow. Section 4.4 shows the experimental results with ISCAS'89 and ITC'99 circuits. Section 4.5 discusses some issues in multi-dimensional schemes. Section 4.6 summarizes this chapter.



Figure 4.1: The proposed adaptive multiple scan chain architecture with 2-D 4x4 scan shift control chains.

## 4.2 Proposed architecture overview

The proposed scheme uses the memory block concept to design multiple scan chains. By using two-dimensional scan shift control as a location indicator, each sub-scan-chain can operate independently in scan-in mode. Figure 4.1 shows the proposed two-dimensional scheme. Combined with the proposed methodology, this test scheme can achieve low test power, small test data volume, and short test time with very little area overhead. Scan control 1 and 2 indicate the sub-scan-chain location. When the location of the sub-scan-chain is determined, the specific test pattern is shifted in from the scan input. While the scan-in operation is performed, the scan-out data are shifted out from the scan output.

Since we need the scan-out data to check the correctness of the fabricated chip, we consider the scan-in test stimulus and the scan-out patterns simultaneously when we encode these test patterns. Because there are many X-bit sequences inside the test patterns, we can use scan control to skip the corresponding scan-in and scan-out operations. With the help of scan control, the proposed scheme saves test power, test data volume, and test time simultaneously.



Figure 4.2: The design diagram of the proposed test architecture with routing connection details.

Figure 4.2 shows the design details of the proposed scheme. Scan control 1 applies control signals to the control circuit of each sub-scan-chain in each column. Scan control 2 provides the row bank control signal. Figure 4.3 shows the gate-level design of the control circuit. Scan control 1i in Figure 4.3 (a) connects to the flip-flop signal 1i in Figure 4.3 (b). Scan control 2 uses the same principle to connect to the control signals. From Figures 4.2 and 4.3, we can perceive that the scan input signals are masked if the control signal does not enable the sub-scan-chain.

#### 4.2.1 Two test modes of the proposed scheme

This scheme has two scan modes: regular scan mode and skipping scan mode. The regular scan mode does not have compression effect. With the skipping scan mode, we can reduce the test data size. The regular scan mode shifts the test pattern from the first sub-scan-chain to the last sub-scan-chain in order. The skipping scan mode shifts the required sub-scan-chain patterns only. Figure 4.4 shows the waveform of the behaviors of the control signals in regular scan mode. Figure 4.5 shows the



Figure 4.3: Circuit design of the proposed architecture. (a) is part of the sub-scan-chain design. (b) presents the details for one scan control chain.



Figure 4.4: Regular scan mode waveform. In regular scan mode, the test patterns are shifted segment-by-segment.

waveform of the behaviors of the control signals in skipping mode.<sup>1</sup> We also reduce many unnecessary shifting operations in skipping scan mode.

#### The working behavior of the regular scan mode

The first cycle of the regular scan mode and skipping scan mode is the same. Both of the first flip-flop values in the scan control 1 and 2 are reset to 1 to indicate the scan-in data location. From Figure 4.4, test data are shifted into the sub-scan-chain from the scan input during the next 6 cycles and the data inside the sub-scan-chain flip-flops are shifted out from the scan output. The scan control 1 shifts the value

 $<sup>^{1}</sup>$ We need to add 2 bits of control data to deliver the test patterns to the correct sub-scan-chains.



Figure 4.5: Skipping scan mode waveform. The test patterns in skipping mode contain signal control codes which skip the segments with all X bits. In this figure, the 15th cycle and the 16th cycle skip two segments of sub-scan-chains, which can reduce the test cost.

1 from the first flip-flop to the second flip-flop to select the next sub-scan-chain at the 8th cycle. At the 29th cycle, the first flip-flop value of scan control 1 is set to 1 again and the first flip-flop value of scan control 2 is shifted to the second flip-flop. These regular operations shift the test patterns into each sub-scan-chain.
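The cycle numbers quoted for the regular scan mode can be reproduced with a small timing model. The sketch assumes the 4x4 arrangement of Figure 4.1, six shift cycles per sub-scan-chain, and one control cycle before each sub-scan-chain; the one-cycle control cost is our reading of the cycle numbers in the text, and the function name is hypothetical.

```python
def regular_mode_control_cycles(n_c1: int, n_c2: int, l_seg: int) -> dict:
    """Map (row, column) of each sub-scan-chain to the cycle that selects it.

    Assumes one control cycle followed by l_seg shift cycles per
    sub-scan-chain, visiting columns first and then rows.
    """
    cycles = {}
    cycle = 1
    for row in range(n_c2):        # position held in scan control 2
        for col in range(n_c1):    # position held in scan control 1
            cycles[(row, col)] = cycle
            cycle += 1 + l_seg
    return cycles


schedule = regular_mode_control_cycles(n_c1=4, n_c2=4, l_seg=6)
print(schedule[(0, 1)], schedule[(1, 0)])
```

With these assumptions the second sub-scan-chain is selected at the 8th cycle and the first sub-scan-chain of the second row at the 29th cycle, matching the description above.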

#### The working behavior of the skipping scan mode

The skipping mode operation is shown in Figure 4.5. The first flip-flop values of scan control 1 and 2 are reset to 1. The operations of the 15th and 16th cycles shift the flip-flop value of scan control 1 from the second flip-flop to the fourth flip-flop. Due to the five successive skipping control codes, this scheme skips 5 sub-scan-chains from the 23rd cycle to the 27th cycle. At the 23rd cycle, the first flip-flop of scan control 1 is set to 1 and the flip-flop value of scan control 2 is shifted from the first flip-flop to the second flip-flop. At the 24th, 25th, and 26th cycles, the flip-flop value of scan control 1 is shifted from the first flip-flop to the fourth flip-flop. At the 27th cycle, the first flip-flop of scan control 1 is set to 1 and the flip-flop value of scan control 2 is shifted from the second flip-flop to the third flip-flop. With these operations, the proposed architecture can skip a lot of unnecessary scan-in data.

## 4.3 Two-dimensional scan shift control optimization methodology

This section introduces the optimization methodology on test volume, test time, and test power for the proposed two-dimensional scheme. The methodology consists of scan control data definition, sub-scan-chain data segmentation, and scan-in data encoding. First, we define the codes to be added to the new test patterns. Second, we propose a guideline to determine the length of each sub-scan-chain. Third, a heuristic encoding method is applied to reduce the test data volume. Finally, we recommend an implementation flow to realize this test scheme.

#### 4.3.1 Test data encoding and control code embedding

We define a 2-bit control signal coding method for the proposed scheme. The control signal codes are defined as follows:

- Code 00: Regular scan signal
- Code 01: Skipping one segment
- Code 11: Skipping multiple segments

Figure 4.6 shows an encoding example that adds extra control signals to the test patterns. The original test data size is 42 bits and the encoded test data size is 32 bits in this example. Both the scan-in and scan-out data in the first and second segments are all X bits. The first test pattern is the scan-in test pattern, and the second is the scan-out test pattern. The first test pattern pushes the second test pattern to the output.

The first two bits of each segment are the control code. In the first test pattern, because of the successive skipping operation, the first skipping code is 11 and the second skipping code is 01. We fill the sixth segment with 0s to push out the result in that segment, because it contains a 0 that we need to observe. However, not all of

| Original scan-in data and scan-out mask data |        |          |        |          |          |        |
|-----------|------------|------------|-----------|-----------|----------|--------|
| XXXXXX    | XXXXXX     | XX1XXX     | XXXXXX    | X0X1XX    | XXXXXX   | XXXXXX |
| XXXXXX    | XXXXXX     | XXXXXX     | XXXXXX    | XXXXX1    | X0XXXX   | XXXXXX |
| Encoded scan-in data |        |          |        |          |          |        |
| 11        | 01         | 00001111   | 01        | 00001111  | 00000000 | 01     |

Figure 4.6: An encoding example of control signal. In this example, we consider scan-in and scan-out data simultaneously because we need to check a bit in this sub-scan-chain which is not an X bit. The original test data size is 42 bits in this example. By applying our encoding method, the scan-in test data is shrunk to 32 bits.

the test patterns need to be encoded. If the encoded test pattern length is longer than the original test pattern length, the encoding is not necessary.

#### 4.3.2 Sub-scan-chain data design

The numbers of flip-flops in scan control 1  $(L_{SCN1})$  and scan control 2  $(L_{SCN2})$  are estimated from the total scan chain length  $(L_{TSCL})$  in equations (4.1) and (4.2). Designers can use these values as a reference and choose nearby values for the implementation. The length of each sub-scan-chain  $(L_{seg})$  is calculated by equation (4.3).

$$L_{SCN1} \approx \sqrt[3]{L_{TSCL}} \tag{4.1}$$

$$L_{SCN2} \approx \sqrt[3]{L_{TSCL}} \tag{4.2}$$

$$L_{seg} \approx (L_{TSCL} \div L_{SCN1}) \div L_{SCN2} \tag{4.3}$$

The length of the sub-scan-chain segment is affected by the lengths of scan control 1 and 2. If  $L_{SCN1}$  and  $L_{SCN2}$  are long,  $L_{seg}$  is short and we have a better chance of getting a test pattern segment with all X bits. However, the area overhead will be higher due to the longer  $L_{SCN1}$  and  $L_{SCN2}$ . These tradeoffs should be considered when implementing the test architecture.
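The sizing guideline can be tried out numerically. The sketch below rounds the cube root to the nearest integer and rounds the segment length up; both rounding choices are assumptions, although rounding up reproduces the $L_{seg}$ values listed in Table 4.1. The function names are ours. When both control chains have the same length, dividing by their product is the same as dividing by $L_{SCN1}$ twice.

```python
import math


def estimate_scan_control(l_tscl: int) -> tuple:
    """Suggested flip-flop counts for scan control 1 and 2 (cube-root rule)."""
    n = round(l_tscl ** (1 / 3))
    return n, n


def segment_length(l_tscl: int, l_scn1: int, l_scn2: int) -> int:
    """Sub-scan-chain length for a given pair of scan control lengths."""
    return math.ceil(l_tscl / (l_scn1 * l_scn2))


l_tscl = 1426  # pattern length of s38584 in Table 4.1
print(estimate_scan_control(l_tscl))
for n in (7, 8, 9, 10, 11):
    print(n, segment_length(l_tscl, n, n))
```

For a scan chain of 1426 flip-flops the rule suggests about 11 flip-flops per control chain, and sweeping from 7 to 11 shows how quickly the segment length shrinks as the control chains grow.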

#### 4.3.3 Scan-in data encoding

Now that the control codes are defined and the approximate lengths of scan control 1 and 2 are determined, we need to apply the control codes to the test patterns. A pattern is assigned to the skipping scan mode if its encoded form is shorter than the original one. Figure 4.7 shows the pseudocode of the control code embedding procedure.

>     Regular scan mode patterns ← NULL
>     Skipping scan mode patterns ← NULL
>     FOR each test pattern TP_i
>         FOR each segment SEG_j of TP_i
>             IF all the data inside the segment SEG_j are "X"
>                 Replace the segment with control code 01
>                 IF control code of SEG_(j-1) is 01
>                     Change the control code of SEG_(j-1) to 11
>                 END IF
>             ELSE
>                 Add control code 00 to the segment SEG_j
>             END IF
>         END LOOP
>         IF encoded length of TP_i > original length of TP_i
>             Regular scan mode patterns ← TP_i
>         ELSE
>             Skipping scan mode patterns ← TP_i
>         END IF
>     END LOOP

Figure 4.7: Pseudocode of control code embedding.
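A runnable sketch of the embedding procedure is given below for illustration. It treats a segment as skippable only when both the scan-in data and the scan-out mask are all X, as discussed in Section 4.3.1. The X-fill rule in `fill_x` (repeat the previous bit, starting from 0) is a placeholder of ours, so the data bits may differ from those drawn in Figure 4.6; all function names are hypothetical.

```python
REGULAR, SKIP_ONE, SKIP_MORE = "00", "01", "11"


def split_segments(bits: str, l_seg: int) -> list:
    return [bits[i:i + l_seg] for i in range(0, len(bits), l_seg)]


def fill_x(segment: str) -> str:
    """Placeholder X-fill: each X repeats the previous bit, starting from 0."""
    out, last = [], "0"
    for bit in segment:
        last = last if bit == "X" else bit
        out.append(last)
    return "".join(out)


def encode_pattern(scan_in: str, scan_out_mask: str, l_seg: int) -> list:
    """Return one encoded field per segment: a control code plus optional data."""
    fields = []
    for seg_in, seg_out in zip(split_segments(scan_in, l_seg),
                               split_segments(scan_out_mask, l_seg)):
        if set(seg_in + seg_out) == {"X"}:
            # Successive skips: the earlier code becomes 11, the last stays 01.
            if fields and fields[-1] == SKIP_ONE:
                fields[-1] = SKIP_MORE
            fields.append(SKIP_ONE)
        else:
            fields.append(REGULAR + fill_x(seg_in))
    return fields


def choose_mode(scan_in: str, fields: list) -> str:
    """Regular mode if encoding made the pattern longer, else skipping mode."""
    encoded_len = sum(len(f) for f in fields)
    return "regular" if encoded_len > len(scan_in) else "skipping"


scan_in = "XXXXXX" * 2 + "XX1XXX" + "XXXXXX" + "X0X1XX" + "XXXXXX" * 2
mask = "XXXXXX" * 4 + "XXXXX1" + "X0XXXX" + "XXXXXX"
fields = encode_pattern(scan_in, mask, l_seg=6)
print(fields, sum(len(f) for f in fields), choose_mode(scan_in, fields))
```

On the 42-bit example of Figure 4.6 this produces the same control code sequence and the same 32-bit encoded length, so the pattern is assigned to the skipping scan mode.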

#### 4.3.4 Implementation flow

To integrate the proposed test scheme into the traditional design flow, we add an extra step to the flow. The extra step inserts the two-dimensional scan control circuits after the scan chain synthesis stage. The proposed design flow is shown in Figure 4.8. The extra circuits consist of wires, flip-flops, and combinational logic gates.



Figure 4.8: The proposed test design flow. In order to integrate the proposed scheme design into the traditional design flow, we insert an extra stage after the scan chain synthesis.

## 4.4 Experimental results

The experimental results on ISCAS'89 and ITC'99 benchmark circuits are provided in this section. The test patterns are generated by Synopsys TetraMAX [54]. The power estimation method is WTM and the implementation technology is UMC's 0.18 $\mu$m cell library. In order to get the power comparison results, we use the single scan chain test patterns and fill the X bits with 0's to normalize our results.

#### 4.4.1 The results on test power, volume and time

The results, including test power, test volume, and test time, are shown in Table 4.1. The second column is the total number of test patterns  $(N_{ptn})$. The third column is the pattern length  $(L_{ptn})$. The fourth, fifth, and sixth columns are the number of flip-flops in scan control 1  $(N_{c1})$, the number of flip-flops in scan control 2  $(N_{c2})$, and the sub-scan-chain length  $(L_{seg})$.  $T_{new}$,  $V_{new}$, and  $P_{new}$  are the new test time, new test data volume, and new test power of the proposed scheme, respectively. New test time and new test data volume are normalized to the single scan chain design.  $TCycle_{new}(cycle)$,  $Vol_{new}(bit)$, and  $Pwr_{new}(sw)$  are the total test cycles, the new test data volume, and the switching activities during shift-in and shift-out. New test power is normalized to the power obtained by filling all X's with 0's.

The original test power is calculated from the scan-in and scan-out power. Because the proposed scheme contains extra control scan chains, its power includes the switching activities of each sub-scan-chain and of the extra control scan chains. Table 4.1 shows the test power consumption  $(P_{new})$  for each circuit. For example, the b17 circuit has 11 flip-flops in scan control 1, 11 flip-flops in scan control 2, and 12 scan flip-flops in each sub-scan-chain. Compared with the traditional single scan chain design, the power consumption, test data volume, and test time with our test scheme are 1.1% (98.9% reduction), 46.5% (53.5% reduction), and 54.7% (45.3% reduction), respectively.
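The normalized volume and time columns of Table 4.1 can be recomputed from the raw counts. The single-scan-chain baselines used below are assumptions: $N_{ptn} \times L_{ptn}$ bits for volume and $(N_{ptn}+1) \times L_{ptn}$ cycles for time (one extra shift to unload the last response). They are not stated in the text, but they reproduce the listed percentages for the rows checked here.

```python
def volume_rate(vol_new: int, n_ptn: int, l_ptn: int) -> float:
    """New test data volume as a percentage of the single-chain volume."""
    return 100.0 * vol_new / (n_ptn * l_ptn)


def time_rate(tcycle_new: int, n_ptn: int, l_ptn: int) -> float:
    """New test time as a percentage of the assumed single-chain test time."""
    return 100.0 * tcycle_new / ((n_ptn + 1) * l_ptn)


# First s38584 row of Table 4.1: N_ptn = 125, L_ptn = 1426, 7 x 7 scan control.
print(round(volume_rate(170940, 125, 1426), 1))
print(round(time_rate(176815, 125, 1426), 1))
```

The same two functions applied to the first s38417 row (337 patterns of 1636 bits) give the listed 32.8% and 35.6%.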

Table 4.1 shows the results only for equal values of  $L_{SCN1}$  (the length of scan control 1) and  $L_{SCN2}$  (the length of scan control 2). We further calculate the test power, test data volume, and test time of circuit b17 for different numbers of scan control flip-flops and present the results in Figure 4.9 (a), Figure 4.9 (b), and Figure 4.9 (c). Due to the various distributions of the X bits in the test patterns, different numbers of scan control flip-flops achieve different reductions in test power, test data volume, and test time.

From Table 4.1 and Figure 4.9 (a), we can obtain a small power consumption and a small test data volume, but the test time may not be the smallest. Although a large number of scan control flip-flops may reduce the test data volume, the test time increases due to the longer scan control time. In fact, Figure 4.9 (b) shows that we get the smallest test data volume with 10 flip-flops in scan control 1 and 9 flip-flops in scan control 2. The total number of extra scan flip-flops is 19 and the test data volume is 45.5%. Moreover, Figure 4.9 (c) shows the shortest test time. With 10 flip-flops in scan control 1 and 9 flip-flops in scan control 2, the total test time is

Table 4.1: The experimental results show that the test power, test data volume, and test time are reduced by adding a small number of extra flip-flops in this scheme. Compared with a traditional single scan chain with all X bits filled with 0's, the proposed scheme provides low test power, small test data volume, and short test time, especially for the large circuits (b17 or b22).

| Circuit | $N_{ptn}$ | $L_{ptn}$ | $N_{c1}$ | $N_{c2}$ | $L_{seg}$ | $TCycle_{new}$ | $T_{new}(\%)$ | $Vol_{new}$ | $V_{new}(\%)$ | $Pwr_{new}$ | $P_{new}(\%)$ |
|---------|-----------|-----------|----------|----------|-----------|----------------|---------------|-------------|---------------|-------------|---------------|
| s38584  | 125       | 1426      | 7        | 7        | 30        | 176815         | 98.4          | 170940      | 95.9          | 310788      | 2.2           |
|         |           |           | 8        | 8        | 23        | 175504         | 97.7          | 167629      | 94.0          | 239380      | 1.7           |
|         |           |           | 9        | 9        | 18        | 175573         | 97.7          | 165698      | 93.0          | 187584      | 1.4           |
|         |           |           | 10       | 10       | 15        | 173833         | 96.8          | 161958      | 90.9          | 154458      | 1.1           |
|         |           |           | 11       | 11       | 12        | 173980         | 96.8          | 159230      | 89.3          | 128460      | 1.0           |
| s38417  | 337       | 1636      | 7        | 7        | 34        | 197004         | 35.6          | 180828      | 32.8          | 308037      | 2.2           |
|         |           |           | 8        | 8        | 26        | 200370         | 36.2          | 179476      | 32.6          | 234601      | 1.7           |
|         |           |           | 9        | 9        | 21        | 207436         | 37.5          | 181487      | 32.9          | 191813      | 1.4           |
|         |           |           | 10       | 10       | 17        | 220272         | 39.8          | 187920      | 34.1          | 155887      | 1.2           |
|         |           |           | 11       | 11       | 14        | 233678         | 42.3          | 194586      | 35.3          | 135869      | 1.0           |
| s35932  | 24        | 1728      | 8        | 8        | 27        | 34625          | 80.2          | 33065       | 79.7          | 82900       | 1.7           |
|         |           |           | 9        | 9        | 22        | 34352          | 79.6          | 32480       | 78.3          | 69450       | 1.4           |
|         |           |           | 10       | 10       | 18        | 34374          | 79.6          | 32046       | 77.3          | 56962       | 1.2           |
|         |           |           | 11       | 11       | 15        | 34894          | 80.8          | 32134       | 77.5          | 47698       | 1.0           |
|         |           |           | 12       | 12       | 12        | 35580          | 82.4          | 32100       | 77.4          | 40102       | 0.9           |
| s15850  | 141       | 534       | 4        | 4        | 34        | 71763          | 94.7          | 69648       | 92.5          | 83207       | 6.8           |
|         |           |           | 5        | 5        | 22        | 71616          | 94.5          | 68232       | 90.6          | 51563       | 4.5           |
|         |           |           | 6        | 6        | 15        | 71523          | 94.3          | 66588       | 88.4          | 35491       | 3.2           |
|         |           |           | 7        | 7        | 11        | 72796          | 96.0          | 66028       | 87.7          | 26043       | 2.4           |
|         |           |           | 8        | 8        | 9         | 74316          | 98.0          | 65997       | 87.7          | 21679       | 2.1           |
| s13207  | 147       | 638       | 4        | 4        | 40        | 82675          | 87.6          | 80470       | 85.8          | 116005      | 6.4           |
|         |           |           | 5        | 5        | 26        | 79186          | 83.9          | 75658       | 80.7          | 77529       | 4.3           |
|         |           |           | 6        | 6        | 18        | 73447          | 77.8          | 68302       | 72.8          | 50673       | 3.0           |
|         |           |           | 7        | 7        | 14        | 71547          | 75.8          | 64932       | 69.2          | 38213       | 2.4           |
|         |           |           | 8        | 8        | 10        | 71157          | 75.4          | 61896       | 66.0          | 29985       | 1.9           |
| b17     | 707       | 1415      | 7        | 7        | 29        | 572073         | 57.1          | 538137      | 53.8          | 679904      | 2.3           |
|         |           |           | 8        | 8        | 23        | 534450         | 53.4          | 491323      | 49.1          | 524170      | 1.9           |
|         |           |           | 9        | 9        | 18        | 538858         | 53.8          | 483712      | 48.4          | 413066      | 1.5           |
|         |           |           | 10       | 10       | 15        | 538118         | 53.7          | 471660      | 47.1          | 388532      | 1.3           |
|         |           |           | 11       | 11       | 12        | 548098         | 54.7          | 465379      | 46.5          | 297902      | 1.1           |
| b22     | 510       | 735       | 5        | 5        | 30        | 312340         | 83.2          | 30010       | 80.1          | 595370      | 4.4           |
|         |           |           | 6        | 6        | 21        | 285200         | 75.9          | 266840      | 71.2          | 408164      | 3.1           |
|         |           |           | 7        | 7        | 15        | 289577         | 77.1          | 26407       | 70.4          | 297170      | 2.3           |
|         |           |           | 8        | 8        | 12        | 290554         | 77.4          | 259444      | 69.2          | 243230      | 1.9           |
|         |           |           | 9        | 9        | 10        | 299668         | 79.8          | 262438      | 70.0          | 205050      | 1.7           |





Figure 4.9: The test power (a), volume (b), and time (c) results of the b17 circuit for different numbers of flip-flops in control 1 and control 2. The horizontal x and y axes represent the number of flip-flops in control 1 and control 2. The vertical z axis represents the normalized value. Different numbers of flip-flops in scan control 1 and 2 lead to different results in test power, data volume, and time, and the predictions from equations (4.1), (4.2), and (4.3) are reasonable and provide good results.

51.6%.
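As a cross-check on the percentage columns of Table 4.1, the short sketch below recomputes $V_{new}(\%)$ under the assumption that the baseline volume of the single scan chain design is $N_{ptn} \times L_{ptn}$ bits. This baseline formula is inferred from the tabulated numbers and is not stated in this section; the function name is illustrative only.

```python
def volume_ratio_percent(n_ptn, l_ptn, vol_new):
    """Test data volume of the new scheme as a percentage of the baseline.

    Assumption (not taken from the text): the baseline is one full-length
    scan load per pattern, i.e. n_ptn * l_ptn bits.
    """
    baseline = n_ptn * l_ptn
    return round(100.0 * vol_new / baseline, 1)


# Rows copied from Table 4.1: (circuit, N_ptn, L_ptn, Vol_new)
rows = [
    ("s38584", 125, 1426, 170940),
    ("s13207", 147, 638, 80470),
    ("b17", 707, 1415, 538137),
]
for name, n_ptn, l_ptn, vol_new in rows:
    print(name, volume_ratio_percent(n_ptn, l_ptn, vol_new))
```

Under this assumption the three rows evaluate to 95.9, 85.8, and 53.8, which agrees with the corresponding $V_{new}(\%)$ entries.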

#### 4.4.2 The scan design realization

We use Cadence SOC Encounter [55] as the placement and routing tool to implement the ISCAS'99 b17 circuit. The aspect ratio is set to 1 and the core utilization to 0.9. With 8 flip-flops in scan control 1 and 8 flip-flops in scan control 2, the total wire length after detailed routing is 849181 $\mu m$. Compared with the traditional single scan chain design, this is about 1% extra routing overhead. The area and wire length results are given in Table 4.2. The first column shows the results of the traditional single scan chain design (Original), the second column shows the results of our test scheme (Our scheme), and the third column shows the overhead. The core area overhead, cell area overhead, and routing overhead are all around 1%. The routing result of the b17 circuit is shown in Figure 4.10.

## 4.5 Discussion





From the viewpoint of shift power, the multiple scan chain technique removes many unnecessary switching activities, so the 1-D, 2-D, and 3-D based scan designs will have

|                            | Original      | Our scheme      | Overhead (%)     |
|----------------------------|---------------|-----------------|------------------|
| Core area ($\mu m^2$)      | 357057.994    | 360752.641      | 1.035            |
| Cell area ($\mu m^2$)      | 321350.198    | 324676.598      | 1.035            |
| Total wire length ($\mu m$) | 840521.100    | 849181.420      | 1.030            |

Table 4.2: The implementation overhead results of ISCAS'99 b17 circuit.



Figure 4.10: The routing result image of b17 circuit. The white lines are the scan-in paths.

similar shift power. From the viewpoint of test data volume and test time, the 3-D architecture needs one more bit than the 2-D architecture, which causes a small overhead in both.

With the 3rd dimension of scan control, the original 2-D scheme is independent in each stacked IC. We can use the 3rd dimension scan control to test each stacked IC independently. If we have the same sub-scan-chain number in each stacked IC,



Figure 4.11: The test scheme for stacked 3-D IC.

we can save the data volume that would otherwise be needed to record the differences between the stacked ICs. However, if the sub-scan-chain numbers differ between the stacked ICs, we need extra data to record the differences.

### 4.5.2 1-D vs. 2-D vs. 3-D multiple scan chains in area overhead

Table 4.3 compares the proposed scheme with the conventional one-dimensional scan control multiple scan chain scheme. Although one-dimensional scan control multiple scan chain schemes also provide small shift-in power, the proposed two-dimensional scan control scheme needs fewer control flip-flops than the one-dimensional scan control scheme.

To provide a fair comparison, the results in Table 4.3 use the same sub-scan-chain length $(L_{seg})$ for all multiple scan chain schemes. The fourth column is the total number of extra flip-flops in scan control 1 of the conventional one-dimensional multiple scan chain scheme (#FFs CMSS). The fifth column is the total number of extra flip-flops in scan control 1 and 2 of the proposed 2-D test scheme (#FFs 2-D). The last column of Table 4.3 shows the extra flip-flops in scan control 1, 2, and 3 of the 3-D test scheme (#FFs 3-D). Although a higher-dimensional scheme needs fewer flip-flops, it may lead to more complex routing. The tradeoff between the number of dimensions and the routing overhead should be considered when implementing the test scheme.

### 4.5.3 Adaptive multi-dimensional scan control scheme overhead analysis

We have discussed the 2-D and 3-D schemes. We can generalize the scan control length to $L_{scn}$ ($L_{scn1}$, $L_{scn2}$, $L_{scn3}$, ..., $L_{scni}$) as in equation (4.4), under the assumption that each scan control dimension has the same number of flip-flops. The general overhead equation of the $m$-dimensional scheme is shown in equation (4.5). In order

| Circuit | #FFs | $L_{seg}$ | #FFs CMSS | #FFs 2-D | #FFs 3-D |
|---------|------|-----------|-----------|----------|----------|
| s13207  | 638  | 14        | 46        | 14       | 11       |
| s15850  | 534  | 11        | 49        | 14       | 12       |
| s35932  | 1728 | 15        | 116       | 22       | 15       |
| s38417  | 1636 | 17        | 97        | 20       | 14       |
| s38584  | 1426 | 15        | 96        | 20       | 14       |
| b17     | 1415 | 15        | 95        | 20       | 14       |
| b22     | 735  | 12        | 62        | 16       | 12       |

Table 4.3: The overhead comparison with conventional one-dimensional multiple scan shift scheme. The proposed 2-D and 3-D scheme have significant reduction in the number of scan control flip-flops.
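The flip-flop counts in Table 4.3 can be reproduced with a small counting sketch. It assumes that the number of sub-scan-chains is $\lceil \#FFs / L_{seg} \rceil$, that each sub-scan-chain is selected by one flip-flop per control dimension, and that the dimensions are kept as balanced as possible. These assumptions and all function names are illustrative and are not taken from the scheme description itself.

```python
import math


def ceil_div(a, b):
    """Integer ceiling of a / b."""
    return -(-a // b)


def integer_root(n, m):
    """Largest integer b with b**m <= n (guards against float rounding)."""
    b = int(round(n ** (1.0 / m)))
    while b ** m > n:
        b -= 1
    while (b + 1) ** m <= n:
        b += 1
    return b


def control_flip_flops(n_chains, dims):
    """Fewest control flip-flops, spread over `dims` balanced dimensions,
    whose selection combinations cover n_chains sub-scan-chains."""
    base = integer_root(n_chains, dims)
    for bumped in range(dims + 1):
        # `bumped` dimensions get one extra flip-flop.
        sizes = [base + 1] * bumped + [base] * (dims - bumped)
        if math.prod(sizes) >= n_chains:
            return sum(sizes)
    raise AssertionError("unreachable: (base + 1)**dims always covers n_chains")


def overhead_row(n_ffs, l_seg):
    """Return (#FFs CMSS, #FFs 2-D, #FFs 3-D) for one circuit."""
    n_chains = ceil_div(n_ffs, l_seg)
    return tuple(control_flip_flops(n_chains, d) for d in (1, 2, 3))


print(overhead_row(1415, 15))  # b17
```

For b17 (1415 scan flip-flops, $L_{seg} = 15$) the sketch yields 95, 20, and 14 control flip-flops, matching the last three columns of Table 4.3.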





Figure 4.12: The overhead analysis of the adaptive multi-dimensional scheme. The horizontal axis is the number of dimensions. The vertical axis is the overhead. We use a design with 10000 scan flip-flops as an example to demonstrate the results.

to get an integer value of $L_{scn}$, we can take the ceiling of the value ($\lceil L_{scn} \rceil$), but the length of each scan control dimension should then be adjusted at the same time. Figure 4.12 shows $L_{scn}$ and the number of extra flip-flops of the generic scheme. From Figure 4.12, we can see that the number of extra flip-flops decreases very quickly from 1-D to 2-D, while the flip-flop overhead from 3-D to 6-D is almost the same. However, the control signal routing overhead is not considered in this analysis. In modern designs, we recommend 2-D or 3-D for implementing this test scheme because a high dimension may cause heavy routing overhead.

$$L_{scn} = \sqrt[m+1]{L_{TSCL}} \tag{4.4}$$

$$\text{Flip-flops} = m \times L_{scn} \tag{4.5}$$
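Equations (4.4) and (4.5) can be evaluated directly for the 10000 scan flip-flop example. The sketch below is a minimal illustration that assumes $L_{TSCL}$ is the total scan chain length and leaves $L_{scn}$ unrounded; the function name is hypothetical.

```python
def scan_control_overhead(total_scan_length, m):
    """Evaluate equations (4.4) and (4.5) for an m-dimensional scheme.

    Returns (L_scn, extra_flip_flops) without rounding to integers.
    """
    l_scn = total_scan_length ** (1.0 / (m + 1))  # equation (4.4)
    extra_flip_flops = m * l_scn                  # equation (4.5)
    return l_scn, extra_flip_flops


for m in range(1, 7):
    l_scn, ffs = scan_control_overhead(10000, m)
    print(f"{m}-D: L_scn = {l_scn:.2f}, extra flip-flops = {ffs:.1f}")
```

For 10000 scan flip-flops this gives about 100 extra flip-flops for 1-D, 43 for 2-D, 30 for 3-D, and 22 for 6-D, which is consistent with the steep drop from 1-D to 2-D and the nearly flat overhead from 3-D to 6-D.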

## 4.6 Summary

In this chapter, we have shown that the adaptive multi-dimensional scan shift control test scheme simultaneously reduces test power, test data volume, and test time with small area overhead. The results show that the power reduction of the 2-D test scheme is significant in each benchmark circuit. The test data volume and test time are also improved in each circuit. The area overhead of the 2-D test scheme is lower than that of the conventional one-dimensional scan control multiple scan chain scheme, but the scheme still has about 1% more area and routing overhead than the traditional single scan chain design. The improvement is especially significant for large circuits, which indicates that the proposed scheme can scale to large circuit designs. We further extend this concept from 2-D to 3-D and compare the overhead of these test schemes. For a design with 10000 DFFs, we recommend the 2-D or 3-D test scheme because a high dimension of controls may cause routing overhead.

The design concepts of this chapter and of the decoder based scheme in Chapter 3 are different. The decoder based scheme can reduce the test data size further, but its area overhead is higher. The test scheme in this chapter has smaller area overhead but achieves less test data reduction.



# Chapter 5 Low Cost SiP Interconnect Test

## 5.1 Introduction

Because SiP is a practical approach to system integration, it is essential to develop interconnect test and diagnosis methods for SiP. Low cost interconnect test schemes for system-in-package (SiP) designs are presented in this chapter. The proposed SiP interconnect test schemes focus on the major faults (stuck-at and short) in SiP interconnects. For single-port RAM, the test scheme needs only one more cycle than the number of address lines for the write phase, and the same number for the read phase. For multi-port RAM, the whole test needs only 3 more cycles than the number of address lines. The proposed test schemes achieve short test time and provide diagnosis ability in interconnect test for modern designs with ASIC and RAM.

We present the circuit design and analysis methods of the test scheme in Section 5.2. Section 5.3 presents the test scheme for multi-port RAM. Section 5.4 shows different design methods and the mathematical form of the LFSR design. Section 5.5 discusses the walking-0 solution and the special requirement of the second test scheme. Section 5.6 shows case studies for different faults. Section 5.7 is the summary.



Figure 5.1: The proposed test architecture. The LFSR+Analyzer block integrates the LFSR and the Analyzer. The SR+1 block has one more data flip-flop (DFF) than a traditional shift register, which provides one more clock cycle to fill the test pattern at address 0. The detailed circuit design is shown in Figure 5.2.

## 5.2 Proposed test circuit design and diagnosis methods

Figure 5.1 illustrates the interconnection between an ASIC and a RAM in a typical SiP. The example uses a 4-bit address and a 4-bit data width. We define the write enable and output enable signals as active high (1) and inactive low (0). We illustrate the proposed scheme with the following notations:

- DIi: A data-in wire identified by a unique number, i
- DOi: A data-out wire identified by a unique number, i
- Ai: An address wire identified by a unique number, i
- WE: Write enable wire
- OE: Output enable wire

Table 5.1: The DFFs values of LFSR+Analyzer block. This LFSR can generate 15 patterns.

| Cycle | DFF0 | DFF1 | DFF2 | DFF3 |
|-------|------|------|------|------|
| 1     | 0    | 1    | 0    | 1    |
| 2     | 1    | 0    | 1    | 0    |
| 3     | 1    | 1    | 0    | 1    |
| 4     | 1    | 1    | 1    | 0    |
| 5     | 1    | 1    | 1    | 1    |
| 6     | 0    | 1    | 1    | 1    |
| 7     | 0    | 0    | 1    | 1    |
| 8     | 0    | 0    | 0    | 1    |
| 9     | 1    | 0    | 0    | 0    |
| 10    | 0    | 1    | 0    | 0    |
| 11    | 0    | 0    | 1    | 0    |
| 12    | 1    | 0    | 0    | 1    |
| 13    | 1    | 1    | 0    | 0    |
| 14    | 0    | 1    | 1    | 0    |
| 15    | 1    | 0    | 1    | 1    |

Table 5.2: The walking-1 behavior of SR+1 circuit.

| Cycle | DFF0 | DFF1 | DFF2 | DFF3 | DFF4 |
|-------|------|------|------|------|------|
| 1     | 1    | 0    | 0    | 0    | 0    |
| 2     | 0    | 1    | 0    | 0    | 0    |
| 3     | 0    | 0    | 1    | 0    | 0    |
| 4     | 0    | 0    | 0    | 1    | 0    |
| 5     | 0    | 0    | 0    | 0    | 1    |

Figure 5.2 shows the circuit design of the proposed test scheme. In this example, we initialize the LFSR+Analyzer block (Figure 5.2 (a)) with the value 0101 from DFF0 to DFF3. The LFSR then generates a sequence of test patterns in DFF0 to DFF3. Table 5.1 lists the DFF values of the LFSR+Analyzer block over 15 cycles. Only the first 5 cycles are used to test the interconnects. During these 5 cycles, the SR+1 block (Figure 5.2 (b)) generates specific address values. The 5 cycles of walking-1 <sup>1</sup> values are shown in Table 5.2. In order to test address 0, a delayed walking-1 value is used in the SR+1 block.

<sup>1</sup> When we put a 1 into the SR+1 block and shift it, the patterns look as if the value 1 is walking.



Figure 5.2: The circuit design of the proposed test architecture. This example uses 4 bits data and address lines.
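The pattern sequence of Table 5.1 can be reproduced with a few lines of code. The sketch below assumes a shift-right register in which the new DFF0 value is DFF2 XOR DFF3; this feedback position is inferred from the values in Table 5.1 and is not read from the circuit in Figure 5.2, and the function name is illustrative.

```python
def lfsr_states(seed=(0, 1, 0, 1), cycles=15):
    """Return the DFF0..DFF3 contents for each cycle as bit strings.

    Assumed structure: DFF0 <- DFF2 XOR DFF3, DFF(i+1) <- DFF(i).
    """
    state = list(seed)
    states = []
    for _ in range(cycles):
        states.append("".join(str(bit) for bit in state))
        feedback = state[2] ^ state[3]
        state = [feedback] + state[:-1]  # shift right, insert feedback at DFF0
    return states


for cycle, value in enumerate(lfsr_states(), start=1):
    print(cycle, value)
```

Starting from 0101, the first five states are 0101, 1010, 1101, 1110, and 1111, and all 15 non-zero states appear before the sequence repeats.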

#### 5.2.1 Test strategy for SiP interconnects

The test method consists of four steps.

1. First, we reset the test scheme. Each DFF takes its initial value.
2. The next 5 successive cycles are write cycles. The different test patterns generated by the LFSR are written to different addresses if the address lines and data-in lines are not faulty.
3. The third step is to reset the test scheme again. The DFFs return to their initial values.

| Cycle | SR+1 (DFF0 to DFF4) | LFSR+Analyzer (DFF0 to DFF3) | WE | OE |
|-------|---------------------|------------------------------|----|----|
| 1     | 10000               | 0101                         | 1  | 0  |
| 2     | 01000               | 1010                         | 1  | 0  |
| 3     | 00100               | 1101                         | 1  | 0  |
| 4     | 00010               | 1110                         | 1  | 0  |
| 5     | 00001               | 1111                         | 1  | 0  |
| 6     | 10000               | 0101                         | 0  | 1  |
| 7     | 01000               | 1010                         | 0  | 1  |
| 8     | 00100               | 1101                         | 0  | 1  |
| 9     | 00010               | 1110                         | 0  | 1  |
| 10    | 00001               | 1111                         | 0  | 1  |

Table 5.3: The DFFs values in the proposed scheme for RAM test. The 4-bit address example needs 10 cycles.

4. The fourth step reads, in 5 successive cycles, the addresses written in the previous 5 write cycles. During these 5 cycles, we check the value of the Sig signal in Figure 5.2 (a) to verify the correctness of the interconnects.

The test values are shown in Table 5.3. In cycles 1 to 5, the write-operation is active. In cycles 6 to 10, the read-operation is active. If the interconnects are not faulty, the value of Sig (Figure 5.2 (a)) will be 0 during cycles 6 to 10.

#### 5.2.2 Diagnosis of interconnects

In Table 5.3, the test pattern sequence starts from the initial pattern 0101, and the next pattern is 1010. With these two test patterns, we can test the short and stuck-at faults in the data-in and data-out lines.

The address line interconnects are not easy to test with one or two patterns. In our test example, we successively write test patterns to different addresses by enabling one address line at a time. The write phase and the read phase each take one more cycle than the number of address lines. If an address line is stuck-at 0, address 0 will be overwritten. If an address line is stuck-at 1, the value in the corresponding address will be overwritten. In order to catch a line with a short fault, we can observe the values

| Cycle | SR+3 (DFF0 to DFF6) | LFSR+2+Analyzer (DFF0 to DFF5) | WE | OE |
|-------|---------------------|--------------------------------|----|----|
| 1     | 1000000             | 010111                         | 1  | 0  |
| 2     | 0100000             | 101011                         | 1  | 0  |
| 3     | 0010000             | 110101                         | 1  | 1  |
| 4     | 0001000             | 111010                         | 1  | 1  |
| 5     | 0000100             | 111101                         | 1  | 1  |
| 6     | 1000010             | 011110                         | 0  | 1  |
| 7     | 0100001             | 001111                         | 0  | 1  |

Table 5.4: The DFF values of the LFSR+2+Analyzer test scheme. A 4-bit address example needs 7 cycles to test the multi-port RAM.

which have been written to the address. Since two or more test patterns are written to the same address, the latter test pattern will appear in that address. The Sig signal in cycles 6 to 10 shows the test results. If it is 1, the interconnects are faulty.
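The write and read phases and the overwrite effect of address faults can be illustrated with a small behavioral model. The sketch below is only an illustration under several assumptions: the RAM is modeled as a dictionary, Sig is taken to be 1 whenever the read data differs from the regenerated LFSR pattern, address lines A0 to A3 are driven by DFF1 to DFF4 of SR+1, and the LFSR feedback (DFF0 receives DFF2 XOR DFF3) is inferred from Table 5.1. All function and parameter names are hypothetical.

```python
def lfsr_patterns(seed, count):
    """Assumed LFSR: DFF0 <- DFF2 XOR DFF3, the other DFFs shift right."""
    state = list(seed)
    patterns = []
    for _ in range(count):
        patterns.append(tuple(state))
        state = [state[2] ^ state[3]] + state[:3]
    return patterns


def with_fault(bits, fault):
    """Force one line to a fixed value; fault is (line_index, value) or None."""
    if fault is None:
        return bits
    line, value = fault
    return tuple(value if i == line else b for i, b in enumerate(bits))


def run_interconnect_test(n_addr=4, addr_stuck=None, dout_stuck=None):
    """Return the Sig values of the read phase (cycles 6 to 10 for n_addr=4)."""
    patterns = lfsr_patterns((0, 1, 0, 1), n_addr + 1)
    # Delayed walking-1: address 0 first, then one address line high per cycle.
    addresses = [tuple(0 for _ in range(n_addr))]
    addresses += [tuple(int(i == k) for i in range(n_addr)) for k in range(n_addr)]

    ram = {}
    for addr, data in zip(addresses, patterns):      # write phase
        ram[with_fault(addr, addr_stuck)] = data
    sig = []
    for addr, expected in zip(addresses, patterns):  # read phase
        read = with_fault(ram[with_fault(addr, addr_stuck)], dout_stuck)
        sig.append(int(read != expected))
    return sig


print(run_interconnect_test())                    # fault-free
print(run_interconnect_test(addr_stuck=(0, 1)))   # A0 stuck-at-1
print(run_interconnect_test(dout_stuck=(0, 0)))   # DO0 stuck-at-0
```

In this model a fault-free run keeps Sig at 0 in all five read cycles. A stuck-at fault on A0 raises Sig only in the first read cycle, for either polarity, because the second written pattern overwrites the first one. A stuck-at-0 fault on DO0 raises Sig in every read cycle whose expected pattern has a 1 in bit 0.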

## 5.3 The test scheme for multi-port RAM

Some multi-port RAMs can read and write in the same cycle. In this section, we introduce a second test scheme that speeds up the test for multi-port RAM by using extra DFFs. For $n$ address lines, this scheme needs $n+3$ test cycles instead of $2(n+1)$ (Table 5.6).

If the multi-port RAM has one port to read and one port to write in the same clock cycle, we can apply the second test scheme in Figure 5.3. We use two extra DFFs at the end of the LFSR to keep the values of previous cycles. The SR+3 block in Figure 5.3 (b) provides the actual address for the read operation, and the control signals also need to adapt to the two delay cycles. Designers can add more extra DFFs to keep more test patterns, in which case the shift register and control circuits also need to be modified.

Table 5.4 shows the test values of the pipeline based test scheme. In cycles 1 and 2, only the write operation is active, and test patterns are written to addresses 0 and 1. In cycles 3 to 5, the read and write operations are both active. In cycles 3 to 7, we can check the Sig signal in the test scheme to determine the condition of the interconnects.


## 5.4 The mathematical form of LFSR

Many previous works discuss LFSR designs. The circuit in Figure 5.4 (a) can be expressed as $X^4 + X + 1$ [56, 57]. If we add two extra DFFs as in Figure 5.3 (a), DFF1 to DFF4 keep the values of the previous cycle. Similarly, DFF2 to DFF5 keep the values of two cycles earlier. The cost is the extra DFFs. Figure 5.4 (b) is the reversed form of the 4-bit LFSR. Unlike Figure 5.4 (a), the XOR logic feeds its value directly into the next DFFs. Modern designs may have an 8-bit or 16-bit bus width, and designers can use different LFSR circuits to implement the proposed test scheme. Figure 5.4 (c) shows an 8-bit LFSR example. Its mathematical form is $X^8 + X^6 + X^5 + X + 1$.



Figure 5.4: The mathematical form of the LFSR and the 8-bit LFSR example.

## 5.5 Discussions



#### 5.5.1 Walking-0 solution

By changing the SR+1 patterns from walking-1 to walking-0, we get the test patterns in Table 5.5. The diagnosis method differs from that of the walking-1 patterns. If an address line is stuck-at 0, the value in the specific address will be overwritten. If

| Cycle | SR+1 (DFF0 to DFF4) | LFSR+Analyzer (DFF0 to DFF3) | WE | OE |
|-------|---------------------|------------------------------|----|----|
| 1     | 01111               | 0101                         | 1  | 0  |
| 2     | 10111               | 1010                         | 1  | 0  |
| 3     | 11011               | 1101                         | 1  | 0  |
| 4     | 11101               | 1110                         | 1  | 0  |
| 5     | 11110               | 1111                         | 1  | 0  |
| 6     | 01111               | 0101                         | 0  | 1  |
| 7     | 10111               | 1010                         | 0  | 1  |
| 8     | 11011               | 1101                         | 0  | 1  |
| 9     | 11101               | 1110                         | 0  | 1  |
| 10    | 11110               | 1111                         | 0  | 1  |

Table 5.5: The DFFs values of the walking-0 patterns.

the address lines are stuck-at 1, we always point to address 1111, and address 1111 will be overwritten. A short fault can be observed by checking the value written in the specific address: since two or more test patterns are written to the same address, the latter test pattern will appear in that address. The Sig signal in cycles 6 to 10 also indicates the correctness of the test results. In the extended test scheme (Figure 5.3) for multi-port RAM, we can simply invert 0 to 1 and 1 to 0 to change the walking-1 patterns into walking-0 patterns in the SR+3 circuit.
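The relation between the walking-1 and walking-0 address patterns is a bitwise inversion, as the following minimal sketch shows for the 5-bit SR+1 block. The function names are illustrative and not part of the proposed circuit.

```python
def walking_one(width):
    """Walking-1 patterns: a single 1 shifted through `width` flip-flops."""
    return ["".join("1" if i == k else "0" for i in range(width))
            for k in range(width)]


def walking_zero(width):
    """Walking-0 patterns obtained by inverting every bit of walking-1."""
    return ["".join("0" if bit == "1" else "1" for bit in pattern)
            for pattern in walking_one(width)]


print(walking_one(5))
print(walking_zero(5))
```

For a width of 5 the two lists correspond to the SR+1 columns of Table 5.3 and Table 5.5, respectively.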

#### 5.5.2 Overhead

Previous works [58, 50, 59] based on boundary scan test need to shift in the test patterns. The proposed BIST-like test methods provide a quick test scenario. However, no test algorithm claims that it can provide complete test and diagnosis of the interconnects with only a few test patterns.

Table 5.6 shows the overhead and requirements of the proposed test schemes. Symbol n is the number of address lines and m is the number of data lines. Cycles is the number of test cycles and DFFs is the number of data flip-flops used in each test scheme. Sig cycle gives the cycles in which the Sig signal needs to be checked, and RAM type gives the type of RAM that the scheme requires.

|           | LFSR+Analyzer  | LFSR+2+Analyzer          |
|-----------|----------------|--------------------------|
| Cycles    | $2(n+1)$       | $n+3$                    |
| DFFs      | $2m+2$         | $2m+8$                   |
| Sig cycle | $n+1$ to $2n$  | $3$ to $n+3$             |
| RAM type  | 1 port         | 2 ports (read and write) |

Table 5.6: The overhead and requirements of the proposed test schemes.



Figure 5.5: The 3-D X-ray picture of the wire bonding short example.

## 5.6 Case study for fault detection and analysis

Figure 5.5 shows two bonding wires that are shorted. Such a fault may happen in data lines, address lines, or control lines. If the control lines are faulty, the fault is easy to detect because most of the data operations will be wrong. We use Figure 5.1 and the LFSR+Analyzer scheme as an example and provide 5 cases to illustrate the various conditions.

#### 5.6.1 The data lines stuck-at-0

This scenario assumes that the data line (DO0) is stuck-at-0. The test pattern set is walking-1 in SR+1. The test results are shown in Table 5.7. From the Sig signal, we can observe that the interconnects must be faulty. We always get 0 at the data

| Cycle | SR+1 (DFF0 to DFF4) | LFSR+Analyzer (DFF0 to DFF3) | WE | OE | Sig |
|-------|---------------------|------------------------------|----|----|-----|
| 1     | 10000               | 0101                         | 1  | 0  | 1   |
| 2     | 01000               | 1010                         | 1  | 0  | 1   |
| 3     | 00100               | 1101                         | 1  | 0  | 1   |
| 4     | 00010               | 1110                         | 1  | 0  | 1   |
| 5     | 00001               | 1111                         | 1  | 0  | 1   |
| 6     | 10000               | 0101                         | 0  | 1  | 0   |
| 7     | 01000               | 1010(0010)                   | 0  | 1  | 1   |
| 8     | 00100               | 1101(0101)                   | 0  | 1  | 1   |
| 9     | 00010               | 1110(0110)                   | 0  | 1  | 1   |
| 10    | 00001               | 1111(0111)                   | 0  | 1  | 1   |

Table 5.7: The DFFs values of the fault diagnosis example, stuck-at-0

bit D0 instead of the expected values in cycles 6 to 10. We can therefore determine that DI0 or DO0 has a stuck-at-0 fault. However, if the interconnects contain multiple faults, the diagnosis is more complex and we may need more test patterns for the analysis.

#### 5.6.2 The address lines stuck-at-1

This scenario assumes that the address line (A0) is stuck-at-1. We use the walking-1 test pattern set in SR+1. The test results are shown in Table 5.8. From the Sig signal, we can observe that the interconnects must be faulty. In cycles 6 and 7, we read the data 1010 twice. Because the later write overwrites the earlier data, we can determine that address line A0 is faulty. Although we know that A0 is faulty, we cannot determine whether it is stuck-at-0 or stuck-at-1.

#### 5.6.3 The data and address line short

We have demonstrated the stuck-at-0/1 faults in the previous examples. A wire short may cause a wire-AND, wire-OR, or wire-Dominate fault. With a wire-AND (wire-OR) fault, both shorted wires carry the AND (OR) of their input values. With a wire-Dominate fault, both wires carry the value of the dominating wire. Table 5.9 shows the truth table of these kinds of faults.

Table 5.10 shows the scenario of the address line (A1) and data line (DO1) having

| Cycle | SR+1 (DFF0 to DFF4) | LFSR+Analyzer (DFF0 to DFF3) | WE | OE | Sig |
|-------|---------------------|------------------------------|----|----|-----|
| 1     | 10000(11000)        | 0101                         | 1  | 0  | 1   |
| 2     | 01000               | 1010                         | 1  | 0  | 1   |
| 3     | 00100(01100)        | 1101                         | 1  | 0  | 1   |
| 4     | 00010(01010)        | 1110                         | 1  | 0  | 1   |
| 5     | 00001(01001)        | 1111                         | 1  | 0  | 1   |
| 6     | 10000(11000)        | 0101(1010)                   | 0  | 1  | 1   |
| 7     | 01000               | 1010                         | 0  | 1  | 0   |
| 8     | 00100(01100)        | 1101                         | 0  | 1  | 0   |
| 9     | 00010(01010)        | 1110                         | 0  | 1  | 0   |
| 10    | 00001(01001)        | 1111                         | 0  | 1  | 0   |

 Table 5.8: The DFFs values of the fault diagnosis example, stuck-at-1




Table 5.9: The truth table of the wire-AND and wire-OR faults

|                                      | Input data |      | Output data |       |
|--------------------------------------|------------|------|-------------|-------|
|                                      | In 1       | In 2 | Out 1       | Out 2 |
| wire-AND fault                       | 0          | 0    | 0           | 0     |
| wire-AND fault                       | 0          | 1    | 0           | 0     |
| wire-AND fault                       | 1          | 0    | 0           | 0     |
| wire-AND fault                       | 1          | 1    | 1           | 1     |
| wire-OR fault                        | 0          | 0    | 0           | 0     |
| wire-OR fault                        | 0          | 1    | 1           | 1     |
| wire-OR fault                        | 1          | 0    | 1           | 1     |
| wire-OR fault                        | 1          | 1    | 1           | 1     |
| wire-Dominate fault (In 1 dominates) | 0          | 0    | 0           | 0     |
| wire-Dominate fault (In 1 dominates) | 1          | 0    | 1           | 1     |
| wire-Dominate fault (In 1 dominates) | 0          | 1    | 0           | 0     |
| wire-Dominate fault (In 1 dominates) | 1          | 1    | 1           | 1     |

|       | SR+1         | LFSR+Analyzer | Control |    |     |
|-------|--------------|---------------|----|-------|-----|
| Cycle | DFF0 to DFF4 | DFF0 to DFF3  | WE | OE    | Sig |
| 1     | 10000        | 0101(0001)    | 1  | 0     | 1   |
| 2     | 01000        | 1010          | 1  | 0     | 1   |
| 3     | 00100        | 1101          | 1  | 0     | 1   |
| 4     | 00010        | 1110(1010)    | 1  | 0     | 1   |
| 5     | 00001        | 1111(1011)    | 1  | 0     | 1   |
| 6     | 10000        | 0101(0001)    | 0  | 1     | 1   |
| 7     | 01000        | 1010          | 0  | 1     | 0   |
| 8     | 00100        | 1101          | 0  | 1     | 0   |
| 9     | 00010        | 1110(1010)    | 0  | 1     | 1   |
| 10    | 00001        | 1111(1011)    | 0  | 1     | 1   |

Table 5.10: The DFFs values of the fault diagnosis example, wire-AND

a wire-AND fault. In cycles 1, 4, and 5, the data line (DO1) is corrupted, so we read the wrong data in cycles 6, 9, and 10.


#### 5.6.4 The wire-OR fault

Similarly, Table 5.11 shows the symptoms when the address line (A1) and data line (DO1) have a wire-OR fault. The test results show that the Sig signal detects the fault. In cycles 6 and 8, we read the data 1101 twice because the wire-OR fault causes address line A1 to malfunction: in cycles 1 and 3, the test patterns are written to the same address. However, if address line A1 had a stuck-at-1 fault instead, we would get the same Sig and data results in these 10 cycles.

#### 5.6.5 The wire-Dominate fault

Because the wires have different driving strengths, one of two shorted wires may dominate the other, which appears as a wire-Dominate fault. Table 5.12 shows the symptoms when the address line (A1) and data line (DO1) have a wire-Dominate fault and A1 is the dominant wire. The test data patterns are changed by A1 in cycles 1, 4, and 5, and we detect the fault in cycles 6, 9, and 10.

|       | SR+1         | LFSR+Analyzer   | Control |    |     |
|-------|--------------|-----------------|----|-------|-----|
| Cycle | DFF0 to DFF4 | DFF0 to DFF3    | WE | OE    | Sig |
| 1     | 10000(10100) | 0101            | 1  | 0     | 1   |
| 2     | 01000        | 1010            | 1  | 0     | 1   |
| 3     | 00100        | 1101            | 1  | 0     | 1   |
| 4     | 00010(00110) | 1110            | 1  | 0     | 1   |
| 5     | 00001(00101) | 1111            | 1  | 0     | 1   |
| 6     | 10000(10100) | 0101(1101)      | 0  | 1     | 1   |
| 7     | 01000        | 1010            | 0  | 1     | 0   |
| 8     | 00100        | 1101            | 0  | 1     | 0   |
| 9     | 00010(00110) | 1110            | 0  | 1     | 0   |
| 10    | 00001(00101) | 1111            | 0  | 1     | 0   |

Table 5.11: The DFFs values of the fault diagnosis example, wire-OR



Table 5.12: The DFFs values of the fault diagnosis example, wire-Dominate

|       | SR+1         | LFSR+Analyzer | Control |    |     |
|-------|--------------|---------------|----|-------|-----|
| Cycle | DFF0 to DFF4 | DFF0 to DFF3  | WE | OE    | Sig |
| 1     | 10000        | 0101(0001)    | 1  | 0     | 1   |
| 2     | 01000        | 1010          | 1  | 0     | 1   |
| 3     | 00100        | 1101          | 1  | 0     | 1   |
| 4     | 00010        | 1110(1010)    | 1  | 0     | 1   |
| 5     | 00001        | 1111(1011)    | 1  | 0     | 1   |
| 6     | 10000        | 0101(0001)    | 0  | 1     | 1   |
| 7     | 01000        | 1010          | 0  | 1     | 0   |
| 8     | 00100        | 1101          | 0  | 1     | 0   |
| 9     | 00010        | 1110(1010)    | 0  | 1     | 1   |
| 10    | 00001        | 1111(1011)    | 0  | 1     | 1   |

| Symptom                                | Possible fault                                       |
|----------------------------------------|------------------------------------------------------|
| One of the data bits is always 0 or 1. | A data line has a stuck-at-0 or stuck-at-1 fault.    |
| Two test patterns are the same.        | An address line is faulty.                           |
|                                        | An address line has a stuck-at-0 or stuck-at-1 fault. |

Table 5.13: The symptom and possible fault

#### 5.6.6 Diagnosis guideline

The discussion above has demonstrated the possible faults. The basic analysis approaches are summarized in Table 5.13. Although multiple faults may occur in the factory, Table 5.13 can serve as a guideline for diagnosing them. For example, the wire-OR fault results in Table 5.11 look like an address stuck-at fault if we analyze only the test patterns on the data lines. By further analyzing the address line data, we can find that A1 is faulty. We can also use a microscope to obtain a picture like Figure 5.5 and inspect the physical connections.
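The guideline in Table 5.13 can be sketched as a simple classifier over the read-back data. This is our own heuristic for illustration, not a procedure given in the thesis: the function name `diagnose` is hypothetical, patterns are bit strings as printed in the tables, and a data-line fault is reported only when a single stuck bit explains every observed word. Otherwise a repeated pattern is attributed to an address line.

```python
def diagnose(expected, observed):
    """Return candidate faults for a read-back sequence, following Table 5.13."""
    if observed == expected:
        return []
    candidates = []
    width = len(expected[0])
    # Symptom 1: one data bit is always 0 or always 1.
    for position in range(width):
        for value in "01":
            forced = [w[:position] + value + w[position + 1:] for w in expected]
            if forced == observed:
                candidates.append(
                    f"data line at bit position {position} stuck-at-{value}"
                )
    # Symptom 2: two read-back patterns are the same.
    if not candidates and len(set(observed)) < len(set(expected)):
        candidates.append("address line faulty")
    return candidates


expected = ["0101", "1010", "1101", "1110", "1111"]
print(diagnose(expected, ["1010", "1010", "1101", "1110", "1111"]))
print(diagnose(expected, ["0101", "0010", "0101", "0110", "0111"]))
```

The first call uses the read-back values of Table 5.8 and reports an address line fault. The second forces the first bit to 0 in every word and reports a data-line stuck-at-0 candidate. As the text notes, with multiple faults neither rule may apply cleanly, and more test patterns are needed.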

# 5.7 Summary

In this chapter, we propose two SiP interconnect test schemes that integrate an LFSR and an Analyzer. Our approaches use a BIST-like method. Compared with boundary-scan-based approaches, they do not need many shift cycles to apply test patterns, so the proposed schemes can test the interconnects in an SiP efficiently. By using the integrated LFSR, Analyzer, and extra DFFs, the test scheme for multi-port RAM uses very few test cycles. We also list the extra overhead of the proposed scheme and describe the multi-port RAM requirements for implementing the extended test scheme.

# Chapter 6 Conclusion

In this dissertation, we propose test schemes with related encoding methodologies to reduce test power consumption and shrink test data volume in Chapter 2 and Chapter 3. In Chapter 4, we propose a multi-dimensional scan shift control test scheme to reduce test power, test data volume, and test time simultaneously. For modern SiP technology, we also propose fast test schemes for SiP interconnect test in Chapter 5. The results show that these schemes can deal with the most important test issues: test power, test data volume, and test time reduction. However, the extra area overhead is inevitable, so we also analyze the trade-off between these costs. To provide more functions in a small form-factor system, many researchers have focused on the third dimension as a possible solution in recent years. Below we describe our contributions and future work.

# 6.1 Our contributions

Over several years of research, we have surveyed the literature on test power, test data volume, and test time issues. The contributions of this dissertation are summarized as follows:

- We have presented decoder-based compression concepts. To reduce the number of input pins, we choose the VIFO and FIVO decoder structures to compress the test data. Because these test patterns contain a large number of X bits, we use selective pattern compression methodologies to compress the test data. The decoders only have to deal with the few care bits inside these test patterns, so the proposed VIFO and FIVO test schemes achieve low power and a high compression rate.

- Based on the multiple scan chain concept for lower power, we develop a generic multi-dimensional scan shift control test scheme to achieve low test power, small test data volume and short test time. With the multi-dimensional scan control structure and the related encoding method, the area overhead is only around 1%.
- SiP technology is becoming more and more popular. Most electronic products need memory to operate, but memory providers often do not provide test features, so system integration engineers have difficulty testing the interconnects between the system and the memory blocks. We propose a fast test scheme for the interconnect test between ASIC and RAM. With the proposed test scheme, the chip vendor can enhance the SiP product yield.

# 6.2 Future works

Since this dissertation focuses on reducing test power, test data volume, and test time with new test schemes, we do not consider setup time and hold time issues. Setup time is the period before the active clock edge during which the flip-flop input must not change, so that the flip-flop does not latch a wrong value. Hold time is the period after the active clock edge during which the flip-flop input must remain stable. A scan chain consists of many flip-flops connected directly one after another, with little delay between them, which may cause hold time problems. The setup time and hold time issues could be a further research topic in scan design.

The third dimension provides a possible way forward for the semiconductor industry. Foundry companies provide through-silicon via (TSV) solutions, and packaging companies provide advanced package solutions. However, TSV technology still has many challenges to be solved. In addition, most of the low-cost and technology-ready solutions today are package-based, and they provide a way to achieve heterogeneous integration.

Although reliability is not a new topic, bio-based systems, especially embedded medical systems, need to pay particular attention to it. Because such a system is implanted inside the human body, its robustness is very important: any operation to replace it increases the risk to the patient's life.

#### 6.2.1 The 3D design challenges

With the advancement of semiconductor fabrication technology, device scaling will soon reach its physical limits. Traditional 2-dimensional chip design cannot fit more devices on a fixed-size chip, so researchers are trying to use the third dimension to implement more devices in a fixed area. The 3-dimensional integrated circuit (3D-IC) has become a buzzword in recent years [60, 61].

We list some of the future 3D challenges below:

- Testability issues of 3D designs may emerge in the near future.
- We may need a new methodology to enhance the fault coverage of the 3D design.
- The heat problem induced by test power will become even more difficult to handle in 3D design.

#### 6.2.2 The heterogeneous design challenges

The 3D-IC technology reveals the future of the semiconductor industry; however, it also impacts chip design and test methodology. Low cost [62] and heterogeneous integration [63, 64] are important characteristics of 3D system design. Some digital 3D test schemes have been developed in [65], but testing integrated digital, mixed-signal, and other heterogeneous devices is still a challenge.

Some of the heterogeneous design challenges are shown below:

- We may need to pay more attention to reliability issues in bio-based systems when we develop test schemes.
- We need new ideas to integrate the various test methods in a heterogeneous system.
- The test methodology for heterogeneous systems still requires more research effort.

# Bibliography

- [1] A. Chandra and K. Chakrabarty. A unified approach to reduce soc test data volume, scan power and testing time. *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, 22(3):352–362, March 2003.
- [2] Mentor Graphics. Silicon Test & Yield Analysis Whitepaper. Combining Low Pin Count Test with Scan Compression Dramatically Reduces Test Interface and Cost, http://www.mentor.com/products/siliconyield/techpubs/combining-low-pin-count-test-with-scan-compressiondramatically-reduces-test-interface-and-cost-54922, 2010.
- [3] A. Crouch. Design-for-Test for Digital IC's and Embedded Core Systems. Prentice Hall, 1999.
- [4] B. Koenemann. Lfsr-coded test patterns for scan designs. In *Proceedings VLSI Test Symposium*, pages 82–92, 2002.
- [5] N. Nicolici and B. M. Al-Hashimi. Power-Constrained Testing of VLSI Circuits. Kluwer Academic Publishers, 2003.
- [6] P. Girard. Survey of low-power testing of vlsi circuit. In *Proceedings European Test Conference*, pages 237–242, 1991.
- [7] C. P. Ravikumar, M. Hirech, and X. Wen. Test strategies for low power devices. In *Proceedings Design, Automation and Test in Europe*, pages 728–733, 2008.

- [8] R. Sankaralingam, R. R. Oruganti, and N. A. Touba. Static compaction techniques to control scan vector power dissipation. In *Proceedings VLSI Test Symposium*, pages 35–40, 2000.
- [9] J. Li, Q. Xu, Y. Hu, and X. Li. On reducing both shift and capture power for scan-based testing. In *Proceedings Asia and South Pacific Design Automation Conference*, pages 653–658, 2008.
- [10] K. J. Lee and J. J. Chen. Reducing test application time and power dissipation for scan-based testing via multiple clock disabling. In *Proceedings Asian Test Symposium*, pages 338–343, 2002.
- [11] I. Lee, J. H. Jeong, and T. Ambler. Two efficient methods to reduce power and testing time. In *Proceedings International Symposium on Low Power Electronics and Design*, pages 167–172, 2005.
- [12] L. C. Hsu and H. M. Chen. On optimizing scan testing power and routing cost in scan chain design. In *Proceedings International Symposium on Quality Electronic Design*, pages 451–456, 2006.
- [13] I. Lee, Y. M. Hur, and T. Ambler. The efficient multiple scan chain architecture reducing power dissipation and test time. In *Proceedings Asian Test Symposium*, pages 94–97, 2004.
- [14] M. Elm, H. J. Wunderlich, M. E. Imhof, C. G. Zoellin, J. Leenstra, and N. Maeding. Scan chain clustering for test power reduction. In *Proceedings Design Automation Conference*, pages 828–833, 2008.
- [15] K. J. Lee, J. J. Chen, and C. H. Huang. Using a single input to support multiple scan chains. In *Proceedings International Conference on Computer-Aided Design*, pages 74–78, 1998.

- [16] K. J. Lee, S. J. Hsu, and C. M. Ho. Test power reduction with multiple capture orders. In *Proceedings Asian Test Symposium*, pages 26–31, 2004.
- [17] Y. Shi, N. Togawa, S. Kimura, M. Yanagisawa, and T. Ohtsuki. Low power test compression technique for designs with multiple scan chains. In *Proceedings Asian Test Symposium*, pages 386–389, 2005.
- [18] L. Whetsel. Adapting scan architectures for low power operation. In *Proceedings International Test Conference*, pages 863–872, 2000.
- [19] C. Y. Lin and H. M. Chen. A selective pattern-compression scheme for power and test-data reduction. In *Proceedings International Conference on Computer-Aided Design*, pages 520–525, 2007.
- [20] M. Nourani, M. Tehranipour, and K. Chakrabarty. Nine-coded compression technique with application to reduced pin-count testing and flexible on-chip decompression. In *Proceedings Design, Automation and Test in Europe*, pages 1284–1289, 2004.
- [21] S. Kajihara, K. Taniguchi, I. Pomeranz, and S. M. Reddy. Test data compression using don't-care identification and statistical encoding. In *Proceedings Asian Test Symposium*, pages 67–72, 2002.
- [22] M. Nourani and M. H. Tehranipour. Rl-huffman encoding for test compression and power reduction in scan applications. *ACM Transactions on Design Automation of Electronic Systems*, 10(1):91–115, January 2005.
- [23] A. Jas, J. G. Dastidar, M. E. Ng, and N. A. Touba. An efficient test vector compression scheme using selective huffman coding. *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, 22(6):797–806, June 2003.

- [24] S. W. Golomb. Run-length encoding. *IEEE Trans. Inform. Theory*, IT(12):399–401, December 1966.
- [25] A. Chandra and K. Chakrabarty. Low-power scan testing and test data compression for system-on-a-chip. *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, 21(5):597–604, May 2002.
- [26] J. Lee and N. A. Touba. Lfsr-reseeding scheme achieving low-power dissipation during test. *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, 26(2):396–401, February 2007.
- [27] J. Rajski, J. Tyszer, M. Kassab, and N. Mukherjee. Embedded deterministic test. *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, 23(5):776–792, 2004.
- [28] G. Mrugalski, J. Rajski, D. Czysz, and J. Tyszer. New test data decompressor for low power applications. In *Proceedings Design Automation Conference*, pages 539–544, 2007.
- [29] J. Li, Q. Xu, Y. Hu, and X. Li. On reducing both shift and capture power for scan-based testing. In *Proceedings Asia and South Pacific Design Automation Conference*, pages 653–658, 2008.
- [30] H. Tang, S. M. Reddy, and I. Pomeranz. On reducing test data volume and test application time for multiple scan chain designs. In *Proceedings International Test Conference*, pages 1079–1088, 2003.
- [31] J. Aerts and E. J. Marinissen. Scan chain design for test time reduction in core-based ics. In *Proceedings International Test Conference*, pages 448–457, 1998.
- [32] H. Ando. Testing vlsi with random access scan. In *Proceedings Digest of Computer Society International Conference*, pages 50–52, 1980.

- [33] D. H. Baik and K. K. Saluja. Progressive random access scan: A simultaneous solution to test power, test data volume and test time. In *Proceedings International Test Conference*, pages 1–10, 2005.
- [34] Y. Hu, X. Fu, X. Fan, and H. Fujiwara. Localized random access scan: towards low area and routing overhead. In *Proceedings Asia and South Pacific Design Automation Conference*, pages 565–570, 2008.
- [35] A. Orailoglu, W. Rao, and G. Su. Frugal linear network-based test decompression for drastic test cost reduction. In *Proceedings International Conference on Computer-Aided Design*, pages 721–725, 2004.
- [36] L. T. Wang, K. S. Abdel-Hafez, S. Wu, X. Wen, H. Furukawa, F. S. Hsu, S. H. Lin, and S. W. Tsai. Virtualscan: A new compressed scan technology for test cost reduction. In *Proceedings International Test Conference*, pages 916–925, 2004.
- [37] A. R. Pandey and J. H. Patel. An incremental algorithm for test generation in illinois scan architecture based designs. In *Proceedings Design, Automation and Test in Europe*, pages 368–375, 2002.
- [38] S. P. Lin, C. L. Lee, J. E. Chen, J. J. Chen, K. L. Luo, and W. C. Wu. A multilayer data copy test data compression scheme for reducing shifting-in power for multiple scan design. *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, 15(7):767–776, July 2007.
- [39] C. S. Tautermann, A. Wurtenberger, and S. Hellebrand. Data compression for multiple scan using dictionaries with corrections. In *Proceedings International Test Conference*, pages 926–935, 2004.
- [40] L. Li and K. Chakrabarty. Test data compression using dictionaries with fixed-length indices. In *Proceedings VLSI Test Symposium*, pages 219–224, 2003.

- [41] Y. Shi, N. Togawa, S. Kimura, M. Yanagisawa, and T. Ohtsuki. An efficient multiscan-based test compression technique for test cost reduction. In *Proceedings Design Automation Conference*, pages 653–658, 2006.
- [42] G. Seroussi and M. J. Weinberger. On adaptive strategies for an extended family of golomb-type code. In *Data Compression Conference*, pages 131–140, 1997.
- [43] W. Koh. System in package (sip) technology applications. In *Proceedings Electronic Packaging Technology, 2005 6th International Conference*, pages 61–66, 2005.
- [44] Y. H. Song, S. G. Kim, K. J. Rhee, D. S. Cho, and T. S. Kim. The reliability issues on asic memory integration by sip (system-in-package) technology. In *Proceedings IEEE International SOC Conference*, pages 7–10, 2003.
- [45] L. T. Wang, C. E. Stroud, and N. A. Touba. System On Chip Test Architectures. Morgan Kaufmann Publishers, 2007.
- [46] D. Appello, P. Bernardi, M. Grosso, and M. S. Reorda. System-in-package testing: Problems and solution. *IEEE Design and Test of Computers*, 23(3):203–211, May-June 2006.
- [47] B. Dilip. Testing interconnections to static rams. *IEEE Design & Test*, 8(2):63–71, April 1991.
- [48] P. B. Geiger and S. Butkovich. Boundary-scan adoption an industry snapshot with emphasis on the semiconductor industry. In *Proceedings International Test Conference*, pages 1–10, 2009.
- [49] IEEE. *IEEE std 1500 Standard for Embedded Core Test*. http://grouper.ieee.org/groups/1500, 2010.

- [50] F. de Jong and J. L. W. Adriaan. Memory interconnection test at board level. In *Proceedings International Test Conference*, pages 328–337, 1992.
- [51] A. Chandra and K. Chakrabarty. Frequency-directed run-length (fdr) codes with application to system-on-a-chip test data compression. In *Proceedings VLSI Test Symposium*, pages 42–47, 2001.
- [52] K. J. Balakrishnan and N. A. Touba. Improving linear test data compression. *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, 14(11):1227–1237, November 2006.
- [53] K. J. Balakrishnan and N. A. Touba. Relationship between entropy and test data compression. *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, 26(2):386–395, February 2007.
- [54] Synopsys. TetraMax.
- [55] Cadence. SOC Encounter.



- [56] F. J. MacWilliams and N. J. A. Sloane. Pseudo-random sequences and arrays. *Proceedings of the IEEE*, 64(12):1715–1729, December 1976.
- [57] Sybille Hellebrand, Steffen Tarnick, Janusz Rajski, and Bernard Courtois. Generation of vector patterns through reseeding of multiple-polynomial linear feedback shift registers. In *Proceedings International Test Conference*, pages 120–129, 1992.
- [58] E. Sarkany and W. Hart. Minimal set of patterns to test ram components. In *Proceedings International Test Conference*, pages 759–764, 1987.
- [59] J. Zhao, F. J. Meyer, and F. Lombardi. Maximal diagnosis of interconnects of random access memories. In *Proceedings VLSI Test Symposium*, pages 378–383, 1999.

- [60] E. J. Marinissen and Y. Zorian. Testing 3d chips containing through-silicon vias. In *Proceedings International Test Conference*, pages 1–11, 2009.
- [61] C. G. Hwang. New paradigms in the silicon industry. In IEDM 1 Overview of Wafer-Level 3D ICs, 2006.
- [62] M. Topper et al. Low cost wafer-level 3-d integration without tsv. In *Proceedings Components and Technology Conference*, pages 339–344, 2009.
- [63] I. O'Connor et al. Heterogeneous systems on chip and systems in package. In *Proceedings Design, Automation and Test in Europe*, pages 737–742, 2007.
- [64] D. Sparks, S. Massoud-Ansari, and N. Najafi. Reliable vacuum packaging using nanogetters and glass frit bonding. *Reliability, Testing and Characterization of MEMS/MOEMS III, SPIE*, 5343:70–78, 2004.
- [65] K. Sasidhar, L. Alkalai, and A. Chatterjee. Testing nasa's 3d-stack mcm space flight computer. *IEEE Design and Test of Computers*, 15(3):44–55, July 1998.
- [66] N. Jarwala and C. W. Yau. A new framework for analyzing test generation and diagnosis algorithms for wiring interconnects. In *Proceedings International Test Conference*, pages 63–70, 1989.
- [67] S. Park. A new complete diagnosis patterns for wiring interconnects. In *Proceedings Design Automation Conference*, pages 203–208, 1996.
- [68] Z. Barzilai, D. Coppersmith, and A. Rosenberg. Exhaustive generation of bit patterns with applications to vlsi self-testing. *IEEE Transactions on Computers*, 32(2):190–194, February 1983.
- [69] D. T. Tang and C. L. Chen. Logic test pattern generation using linear codes. *IEEE Transactions on Computers*, C-33(9):845–850, September 1984.

- [70] F. Siavoshi. Wtpga: A novel weighted test pattern generation approach for vlsi built-in self-test. In *Proceedings International Test Conference*, pages 256–262, 1988.
- [71] B. Chess and T. Larrabee. Bridge fault simulation strategies for cmos integrated circuits. In *Proceedings Design Automation Conference*, pages 458–462, 1993.

