# Single-Ended Subthreshold SRAM With Asymmetrical Write/Read-Assist

Ming-Hsien Tu, Jihi-Yu Lin, Ming-Chien Tsai, Shyh-Jye Jou, Senior Member, IEEE, and Ching-Te Chuang, Fellow, IEEE

Abstract—In this paper, an asymmetrical Write-assist cell virtual ground biasing scheme and a positive feedback sensing keeper scheme are proposed to improve the read static noise margin (RSNM), write margin (WM), and operation speed of a single-ended read/write 8 T SRAM cell. A 4 Kbit SRAM test chip is implemented in 90 nm CMOS technology. The test chip measurement results show that at 0.2 V $V_{\rm DD}$, an operation frequency of 6.0 MHz can be achieved with power consumption of 10.4 $\mu$W.

Index Terms—Low power, low voltage, single-ended SRAM.

#### I. INTRODUCTION

SRAM occupies most of the SoC area and dominates the system performance and power. Especially in bioelectronics and other emerging applications, such as implanted medical instruments and wireless body sensing networks, low-supply-voltage and low-power SRAMs are required to extend system operation time with limited energy resources. Therefore, low-voltage and low-power subthreshold SRAM circuit designs have become increasingly important [1]–[7].

Although subthreshold logic circuits are becoming popular in ultralow-power applications, designing robust subthreshold SRAM is extremely challenging because of low supply voltage and increasing device variability [1], [8]–[10] in sub-100 nm CMOS devices. SRAM is more vulnerable to process variations and threshold voltage $(V_{\rm T})$ mismatch than logic circuits, since minimum or sub-ground-rule size devices are used in the cell and there is no "averaging" effect as in the logic data paths. As device size continuously scales down, the increasing leakage current, systematic process variations, and local random variation lead to a large spread in Read SNM and cause destructive Read errors at the tail of the distribution.
Also, Write errors can occur due to the difficulty in maintaining the device strength ratio in the subthreshold region.

Recently, various 6 T, 7 T, 8 T, 10 T, and 11 T SRAM cells have been proposed for subthreshold SRAM applications [2]–[7], [10], [13], [15], [24]. The major approach for improving the Read SNM is to decouple the cell storage nodes from the bit-line Read current, thus making the Read SNM equal to the Hold SNM. To improve Write Margin (WM) and Write performance, higher supply voltage for the Write access transistors and/or Write-assist circuits are used at the cost of extra supply voltage and extra control circuits.

Manuscript received April 05, 2010; revised July 01, 2010; accepted July 07, 2010. Date of publication November 11, 2010; date of current version December 15, 2010. The work is supported in part by Ministry of Economic Affairs (MOEA), R.O.C., under Project 98-EC-17-A-01-S1-124 and in part by National Science Council (NSC), R.O.C., under Project NSC95-2220-E-009-021. This paper was recommended by Associate Editor A. Marshall. The authors are with the Electronics Engineering Department and Institute of Electronics, National Chiao Tung University, Hsinchu, 300 Taiwan, R.O.C. (minghsien.ee95g@nctu.edu.tw). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TCSI.2010.2071690

Fig. 1. The drawback of traditional 5 T single-ended SRAM cell.

Single-ended SRAM (or single bit-line SRAM) has become a common approach to reduce leakage and switching power of the bit-line (BL) [4], [16]–[19]. A bit-line usually has a very heavy capacitive loading. Every time a Read/Write operation is performed, the switching of the bit-line consumes significant power. The single-ended scheme cuts the active power for bit-line switching in half. Fig. 1 shows the traditional 5 T single-ended SRAM cell. A single-ended cell has only one bit-line to carry out the Write operation.
To Write "0," the NMOS access transistor can offer strong pull-down strength, while a successful Write "1" operation has to overcome the sink current of the cell pull-down NMOS M4, and flip the left inverter. Once the left inverter changes its state, the feedback will help the cross-coupled inverters to change the storage data. In single-ended scheme, however, the Q node voltage would suffer $V_T$ loss through NMOS access transistor (M5). In addition, the Write "1" signal will be weakened further by the voltage dividing effect between M4 and M5. The Write "1" SNM of the single-ended bit-line structure is severely degraded especially at low supply voltage, as shown in Fig. 2. Notice that since there is no access transistor for the left inverter, the voltage transfer curve (VTC) for the left inverter is essentially the Standby (Retention) mode VTC. On the other hand, since the bit-line is held at "1" (Write "1"), as opposed to being pulled down to GND in 6 T cell, the VTC for the right inverter is essentially the Read mode VTC. Also notice that while Write "1" is successful (negative WSNM) at $V_{\rm DD}=1.0~\rm V$, it fails (positive WSNM) when $V_{\rm DD}$ is reduced to 0.6 V. This is the main restriction of single-ended SRAM structure.

Fig. 2. Write "1" SNM of single-ended bit-line at $V_{\rm DD}$ = 0.6 V and 1 V.

In this paper, we propose Write/Read-assist techniques for an 8 T single-ended SRAM for subthreshold operation [7]. To enhance the Write "1" capability and improve the WSNM and Write speed, an asymmetrical floating cell virtual ground (VGND) scheme is proposed. For Read operation, besides using a read buffer to isolate the cell storage nodes from the bit-line, a positive feedback sensing keeper is proposed to eliminate the requirement of bit-line precharge operation and enhance the Read performance.
Experimental results for a 4 Kbit SRAM in 90 nm CMOS technology show that the proposed scheme operates successfully down to 0.2 V with 6 MHz operation frequency at power consumption of 10.4 $\mu$W.

The operation concepts of the proposed Write/Read-assist schemes are described in Section II-A. A bit-line positive feedback sensing keeper scheme to improve the Read margin and speed is discussed in Section II-B. Both techniques are implemented in a 256\*16 bit SRAM test chip in 90 nm CMOS process. The simulation results are elaborated in Section III. Section IV illustrates the measurement results of the test chip. The conclusion is given in Section V.

#### II. PROPOSED WRITE/READ-ASSIST TECHNIQUES

A novel asymmetrical Write-assist cell virtual ground (VGND) biasing scheme is proposed to improve the RSNM, WM, and operation speed of single-ended 8 T SRAM. As shown in Fig. 3, this technique is based on symmetrical cross-coupled inverters (M1-M4) and the isolation of Read and Write paths in single-ended 8 T SRAM cell. The separation of Read/Write paths is one of the common approaches to mitigate noise disturbance. Here, the Read isolation port (M6 and M7) eliminates the Read disturbance. As such, the static noise margin in Read mode is the same as that in Standby mode, thus facilitating low voltage operation. The Read and Write operations share the same BL to reduce active BL switching power.

Fig. 3. Single-ended 8 T SRAM array and cell structure with asymmetrical cell VGND biasing.

Fig. 4. Operation timing diagram.

Fig. 4 depicts the control-signal setting in Hold, Read, and Write operations. In Hold operation, Read Word-Line (RWL) and Write Word-Line (WWL) are set to "Low," and virtual ground enable (VGND\_EN) is set to "High" to tie virtual ground (VGND) to "GND." In Read operation, WWL and VGND\_EN remain at "Low" and "High," respectively, and RWL turns on to "High" to charge or discharge BL according to cell data stored at node Q.
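The control-signal settings of Fig. 4 can be summarized as a small lookup table. The sketch below is illustrative only: the names `MODES` and `vgnd_state` are hypothetical, and the Write row uses the setting stated in the sentence that follows (VGND\_EN pulsed "Low" to float VGND, WWL "High," RWL "Low").

```python
# Control-signal levels per operating mode, as described for Fig. 4.
# 1 = "High", 0 = "Low". All names here are illustrative, not from the paper.
MODES = {
    "hold":  {"RWL": 0, "WWL": 0, "VGND_EN": 1},
    "read":  {"RWL": 1, "WWL": 0, "VGND_EN": 1},
    "write": {"RWL": 0, "WWL": 1, "VGND_EN": 0},  # VGND_EN is pulsed Low
}


def vgnd_state(mode: str) -> str:
    """VGND is tied to GND while VGND_EN is High and floats while it is Low."""
    return "GND" if MODES[mode]["VGND_EN"] else "floating"


for mode, sig in MODES.items():
    print(f"{mode:5s} RWL={sig['RWL']} WWL={sig['WWL']} "
          f"VGND_EN={sig['VGND_EN']} VGND={vgnd_state(mode)}")
```

Only the Write mode floats VGND; in Hold and Read the cell ground is held at GND.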
Finally, in Write operation, VGND\_EN is switched to "Low" with a pulse to float VGND, and WWL and RWL are set to "High" and "Low," respectively.

### A. Write Operation of Asymmetrical Write-Assist Cell

Several techniques have been proposed to mitigate the Write "1" problem of single bit-line structure. Examples include asymmetrical SRAM cells with asymmetrical cell transistor sizing [22] [Fig. 5(a)] or multi-$V_{\rm T}$ access transistors [14], and virtual floating cell VDD/GND [15], [20], [21] [Fig. 5(b)]. The stable states of a cross-coupled inverter latch can be viewed as state A and state C in the voltage transfer curve [Fig. 6(a)]. A successful Write operation using a single bit-line has to move state A to state C. However, the voltage barrier (from state A to state B) is too high for a single bit-line to overcome.

Fig. 5. (a) Asymmetrical cell transistor sizing. (b) 1 T equalizer (M6) insertion in the cell and virtual floating cell ground.

Fig. 6. (a) The voltage transfer curve of traditional cell latch. (b) The effect of the 1 T equalizer insertion in the cell.

Fig. 7. (a) The voltage transfer curve of asymmetrical transistor sizing. (b) The lower voltage barrier effect.

The asymmetrical cell structure can improve the SNM without degrading Read/Write operation. The asymmetrical cell transistor sizing and multi-V<sub>T</sub> are used to lower the voltage barrier between state A and state B. The voltage transfer curve will be asymmetrical as shown in Fig. 7(a). The voltage barrier of Write "1" is reduced due to the asymmetrical SNM [Fig. 7(b)], making it easier to move from state A to state C. Although the Write "0" voltage barrier remains the same, the NMOS access transistor can offer large enough pull-down force to flip the storage data. The asymmetrical SRAM cell structure also has a better Read SNM due to higher voltage barrier for the Read port.
Nevertheless, the Hold SNM of the asymmetrical SRAM cell is sacrificed and the multi-V<sub>T</sub> technique will increase the process complexity, cell area, and cost. The asymmetrical sizing technique is still limited to about 0.7 V supply voltage, because the driving capability of bit-line is reduced at lower supply voltage.

The 1 T equalizer transistor (M6) shown in Fig. 5(b) can move state A to near state B [Fig. 6(b)]. The cross-coupled inverters become unstable, thus facilitating the Write operation. When the cross-coupled inverters stay (temporarily) at meta-stable state B, they do not turn off completely. There are large static currents in the cross-coupled inverters, resulting in large power consumption. A virtual ground floating structure (Fig. 8) can be used to minimize the power consumption. The virtual floating ground of cells cuts off the dc paths to limit the power consumption. Also, the voltage dividing effect (between M5 and M4) is reduced by fully floating the cell VGND during Write operation. However, floating cell VGND reduces V<sub>GS</sub> across M2 due to the rising VGND. Furthermore, the threshold voltage of M2 is raised by the body effect due to its negative V<sub>BS</sub>. Thus, the trip voltage of the left inverter increases, impeding the start of positive feedback to flip the cell storage data. Finally, the timing of virtual floating ground and the 1 T equalizer transistor has to be very accurate. If the virtual floating ground is removed too early, the blocking of static short-circuit current will not function well. On the contrary, if the virtual floating ground is removed too late, the Write operation may fail. These drawbacks limit the yield due to serious parameter variations at low-voltage operation.

Fig. 8. Fully floating cell VGND [1].

In our asymmetrical cell VGND biasing scheme (Fig. 3), only the source node of M4 is floating during Write operation.
The scheme mitigates the M4/M5 voltage dividing effect without increasing the threshold voltage of M2 and the trip voltage of the left inverter. Therefore, it is easier to turn on M2 to start the feedback process to flip the cell storage data. Notice that with the M4 source node floating, the VTC of the right inverter becomes essentially a straight line at $V_{\rm DD}$ as shown in Fig. 9. This implies that the right inverter loses the capability of pulling down. With the proposed asymmetrical Write-assist cell virtual ground biasing scheme, the WSNM is widened significantly, even at low $V_{\rm DD}$, as shown in Fig. 9. The WSNM can be seen to be almost 0.5 $V_{\rm DD}$ and is even substantially larger than that in a traditional 6 T cell with differential bit-line structure.

In Write "0" mode, the access transistor M5 offers a strong "0" pull-down signal to flip the storage data. The Q node is pulled down to change the state of the left inverter. Although the latch feedback loop is broken due to floating of the M4 source node, the Write "0" operation is not affected.

Fig. 9. Improved WSNM for writing "1" at $V_{\rm DD} = 0.5$ V with proposed scheme.

Fig. 10. Pertinent waveforms during Write "1" operation.

The Write-ability of the cell under row-based VGND and column-based VGND array architecture has also been assessed. It is found that for row-based VGND array architecture, a row-based VGND is connected to discharged BLs through the path of M4 and M5 of Write "0" cells. For a mostly Write "0" pattern, the row-based VGND would essentially be held at "0." Therefore, the Write "1" operation of a row-based VGND array architecture could fail with mostly Write "0" patterns (e.g., at 0.9 V $V_{\rm DD}$ with 16 cells per row). On the other hand, with column-based VGND during Write operation, although the hold SNM of the half-selected cells at an active column would be degraded, a column-based VGND is really floated. As shown in Fig.
10, the Write operation of column-based VGND array architecture can be successfully performed down to $V_{\rm DD}=0.5$ V at SNFP $-40^{\circ}\rm C$ (Write "1" worst corner). Besides, a Write replica tracing circuit could be used to optimize the word-line and VGND pulse widths to minimize the impacts on half-selected cells at an active column.

Another commonly used metric to characterize Write-ability is the Write "1" Margin (WM1). The Write "1" Margin for a differential BL SRAM cell is defined as the highest voltage level of the low-going bit-line that can flip the cell. The Write "1" Margin for a single BL SRAM cell is defined as the lowest voltage level of the high-going bit-line that can flip the cell. Fig. 11 shows the Write "1" Margin distribution generated from the 50 000 $(4\sigma)$ Monte Carlo simulations at 0.5 V $V_{\rm DD}$ and 25°C. Both the Write "1" worst case waveforms and the Write "1" Margin distribution indicate the proposed asymmetrical cell VGND biasing scheme can tolerate severe process variation at low voltage operation.

Fig. 11. Write "1" Margin of 50 000 Monte Carlo simulation $(4\sigma)$ at SNFP, $-40^{\circ}\mathrm{C}$ (worst corner). The Write "1" active cell is in worst case that there are 31 half-selected cells storing "1" in the same column.

Fig. 12. Write "1" Margin versus supply voltage.

The asymmetrical cell VGND biasing scheme has a larger Write Margin than the traditional single bit-line cell, as shown in Fig. 12. The Write Margin of the 6 T SRAM cell is marginally better than the proposed scheme. This is due to the push-pull configuration with dual bit-lines to assist in flipping data during Write operation. As can be seen, the proposed scheme offers over 2X improvement in Write Margin compared with the traditional single bit-line structure, and almost the same Write Margin as the dual bit-line 6 T cell at low supply voltage. The Write Margin is about 35% of the supply voltage in subthreshold region (0.2 to 0.4 V).
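The single bit-line definition above (the lowest level of the high-going bit-line that still flips the cell) maps naturally onto a bisection search. The following is a minimal sketch under stated assumptions: `cell_flips` is a hypothetical predicate standing in for a circuit simulation, flipping is assumed to be monotonic in bit-line voltage, and the 0.3 V threshold in the toy predicate is a made-up value, not a result from this work.

```python
from typing import Callable


def lowest_flipping_level(cell_flips: Callable[[float], bool],
                          vdd: float, tol: float = 1e-4) -> float:
    """Bisect for the lowest bit-line level in [0, vdd] that flips the cell.

    Assumes cell_flips is monotonic (False below some level, True at and
    above it) and that the cell flips with the bit-line at full vdd.
    """
    if not cell_flips(vdd):
        raise ValueError("cell does not flip even at full VDD")
    lo, hi = 0.0, vdd  # invariant: cell_flips(hi) is True
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cell_flips(mid):
            hi = mid
        else:
            lo = mid
    return hi


def toy_cell(v_bl: float) -> bool:
    """Toy stand-in for a simulator; the 0.3 V threshold is hypothetical."""
    return v_bl >= 0.3


level = lowest_flipping_level(toy_cell, vdd=0.5)
print(f"lowest flipping bit-line level: about {level:.3f} V")
```

In practice the predicate would wrap a transient simulation of the cell at a given corner and temperature, and the search would be repeated per Monte Carlo sample to build a distribution like the one in Fig. 11.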
The proposed circuit also achieves better Write time (or Time-to-Write) compared with the case without Write-assist as shown in Fig. 13. The Write operation can be successfully performed down to 200 mV. These characteristics render the cell more useful over a wide operating supply voltage range.

In SRAM design, bit-interleaving architecture is often combined with an error correction code (ECC) circuit to reduce soft error rate. Notice that, similar to previously reported subthreshold SRAM cells in [2], [3], [5], [20], [23], and [24], the proposed scheme still has a Write half-select disturb problem in bit-interleaving architecture. However, in this work, row-based WWL, column-based VGND, and BL precharge operation can be used to eliminate the Write half-select disturb problem and form a bit-interleaving architecture. The design target of this work is low power dissipation. In an interleaving architecture, half-selected cells at an active row will perform Read operation and dissipate extra power during Read or Write operation. Therefore, non-bit-interleaving architecture or Byte Writing architecture would be used not only to best exploit the improved RSNM and WSNM of the proposed low-voltage low-power SRAM cell but also to reduce power dissipation.

Fig. 13. Cell "Write time" (Time-to-Write) versus supply voltage.

Fig. 14. Read operation of the proposed Asymmetrical-floating-VGND 8 T cell: (a) Read "1," and (b) Read "0."

Fig. 15. RSNM versus supply voltage.

1) Read Operation With Isolation Buffer: The RSNM of the traditional 6 T cell degrades rapidly with supply voltage scaling [1] due to the voltage dividing effect between the access transistor and the pull-down transistor. The proposed 8 T cell uses a CMOS inverter (M6/M7) for Read-out. In Read operation, the RWL signal turns on pass-transistor M8, and the stored data is transferred through the M6/M7 inverter and M8 to the bit-line as indicated in Fig. 14.
Due to the isolation of cell storage node from the Read current path, the RSNM is significantly better than the traditional 6 T cell, as shown in Fig. 15. The RSNM is almost 50% of the supply voltage in subthreshold region (0.2 to 0.4 V).

Fig. 16. Positive feedback sensing keeper.

The Read operation is executed by the static M6/M7 inverter buffer, so power dissipation can be reduced by not precharging the bit-lines [19]. The circuit consumes AC power only when the Read-out data changes. Moreover, the static Read-out buffer of the 8 T cell also reduces the impact of process variation and enhances reliability.

Traditionally, bit-lines are precharged to "High" [2], [3]. During Read operation, the precharge transistor and the half-selected cells along the selected bit-line pair form a configuration similar to a multi-input dynamic NOR gate. The dynamic bit-line is sensitive to leakage due to process variations because the cell leakage will degrade the Read margin. The static Read-out inverter offers a larger Read margin to mitigate the effect of process variation.

If the design target is area efficiency rather than power efficiency and robustness, the simplified 7 T SRAM cell [Fig. 19(b)], obtained by removing M6 from the proposed 8 T SRAM cell, could be used to achieve high density, but a BL precharge circuit would then be needed for correct operation.

However, the Read-out data has to pass through M8. Thus, the Read-out data cannot charge the bit-line to full rail through the pass transistor M8 in Read "1" operation. The weak "1" bit-line signal may result in wrong Read judgment for the subsequent output stage. The power dissipation would also increase due to leakage current of the subsequent stage caused by the non-full-rail bit-line signal. By using the positive feedback sensing keeper introduced in the next section, the bit-line voltage swing and Read-out speed can be improved.

### B. Positive Feedback Sensing Keeper

Fig.
16 shows the proposed positive feedback sensing keeper to improve the Read "1" operation. A keeper circuit is a common technique for single bit-line schemes. The proposed positive feedback sensing keeper not only pulls the bit-line to full rail, but also improves the Read-out speed. In Standby mode, the SA\_En signal stays "High" and M2 turns on to send a "High" signal to turn off M3. With M3 off, the positive feedback loop is cut off, thus having no effect on the bit-line. In Read operation, SA\_En goes "Low" to turn on M1 to sense the bit-line signal. When the bit-line is "Low," the inverter turns off M3, so the positive feedback sensing keeper remains off and would not affect the pull-down of the bit-line by the cell. When the bit-line is "High," the high bit-line signal enables the positive feedback loop to pull up the bit-line to full rail through M3. To drive the output load, the Read-out buffer uses large size devices. The SA\_En signal resets to "High" at the end of every cycle.

Fig. 17. (a) The VTC of the 8 T SRAM cell with asymmetrical Write-assist cell VGND scheme. The voltage barrier diagram of (b) Hold and Read, (c) Write "0," and (d) Write "1" operations.

The butterfly curve and operation voltage barriers of the single-ended 8 T SRAM cell with the asymmetrical Write-assist cell VGND scheme are shown in Fig. 17. Due to the symmetrical cross-coupled inverters, the Hold SNM of the proposed structure is better than that of an asymmetrical cell [Fig. 5(a)]. In addition, the voltage barriers become dynamically optimized according to operations because of the asymmetrical Write-assist cell VGND scheme. The Hold and Read operations have large voltage barriers to retain data reliably by setting VGND to "0," as shown in Fig. 17(b). Fig. 17(c) shows that floating VGND does not affect the Write "0" voltage barrier from C to B. Finally, due to floating VGND, the voltage barrier from A to B is reduced significantly to facilitate Write "1."

#### III. IMPLEMENTATION AND SIMULATION RESULTS

The aforementioned techniques have been implemented in a 256\*16 bits SRAM test chip in 90 nm CMOS technology. The SRAM is designed for portable bioelectronics applications. To extend battery life, the SRAM test chip adopts a supply voltage of 0.2–0.6 V to achieve low power consumption. The 4 Kbits SRAM is composed of 8 banks (Fig. 18). Each bank consists of 32 words \* 16 bits. The global data bus can be used to communicate data with the banks. The unselected banks are kept in Standby mode, and only one bank will be active. The CLK gating for unselected banks also saves substantial power.

The layout of the proposed 8 T SRAM cell with VGND is shown in Fig. 19(a) and the 8 T SRAM cell area is $3.96~\mu\text{m} \times 0.8~\mu\text{m} = 3.168~\mu\text{m}^2$. The area-efficient simplified 7 T SRAM cell layout is shown in Fig. 19(b). The area of the simplified 7 T SRAM cell is $3.14~\mu\text{m} \times 0.8~\mu\text{m} = 2.512~\mu\text{m}^2$, which is almost equal to the area of conventional 8 T SRAM.

Fig. 18. System block diagram of 256\*16 bits SRAM.

Fig. 19. The cell layouts of (a) the proposed 8 T SRAM cell and (b) the simplified 7 T SRAM cell.

Fig. 20. Power comparison between conventional 6 T cell and proposed cell.

The proposed design offers 5.7X improvement in power dissipation compared with the traditional 6 T SRAM cell at 1 V supply voltage, and 17.3X improvement with the supply scaled to 0.6 V as shown in Fig. 20. At 0.6 V $V_{\rm DD}$ and 25°C, the average power dissipation increases significantly at the FF corner as shown in Fig. 21. This is due to the dramatic increase of leakage power for the Standby cells.

The target operation frequency is 6 MHz with 0.2–0.6 V supply voltage. From postlayout simulation results, the circuit can achieve 234 MHz and 6 MHz at 0.6 and 0.2 V, respectively. The asymmetrical Write-assist cell VGND biasing greatly improves Write operation, especially at low supply voltage.

Fig. 21.
Power/bit at various process corners at 0.6 V $V_{\rm DD}$ and 25°C.

Fig. 22. Layout of the 4 Kbits SRAM.

Fig. 23. Microphotograph of SRAM test chip.

The improved $V_{\rm MIN}$ and operation speed offer more flexible wide range applications, and enable continuing scaling of SoC supply voltage.

#### IV. MEASUREMENT RESULTS

A test chip of 4 Kbits SRAM (8 banks $\times$ 32 words $\times$ 16 bits) is implemented in 90 nm 1P9M SPRVT process. The core area is 168 $\mu$m \* 265 $\mu$m and the layout is shown in Fig. 22. The photo of a bonded die is shown in Fig. 23. Fig. 24 shows the measured waveforms captured by logic analyzer. A 6 MHz clock signal triggers Write/Read operation of the 32 words $\times$ 16 bits. The FFFF value on the data-out signals indicates that the 16-bit data is correct under Read "1" mode. The 0 value on data-out indicates the correct Read "0" mode. All cells of one bank (512 bits) work correctly with 600 mV supply voltage at 6 MHz.

Fig. 24. Measurement results using Logic Analyzer.

Fig. 25. The measured results of pertinent waveforms on Oscilloscope.

Fig. 26. Pertinent waveforms under 200 mV core VDD (300 mV Pad $V_{\rm DD}$).

Fig. 25 shows pertinent waveforms captured by oscilloscope. The scope shows the CLK signal, the WEN signal, and one bit of the Data-out signal. The CLK signal is 6 MHz, and the data rate is 6 M words per second. The average pulse width of Read "1" is 169.78 ns, and the average pulse width of Read "0" is 160.85 ns. The non-50% Read-out pulse is due to the delayed CLK causing longer active time of the keeper. The keeper would keep the "1" Read-out data, even though the next address is active. The core of the test chip can work down to 200 mV supply at about 6 MHz as shown in Fig. 26 with pad $V_{\rm DD}$ of 300 mV. Fig. 27 shows the measured average power versus supply voltage at 6 MHz operation frequency, and the summary of the measured results is listed in Table I.
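As a derived figure that the measurements do not state directly, the average energy per clock cycle follows from $E = P/f$. The sketch below applies this to the two measured core-power points reported in Table I (10.4 µW at 0.2 V and 89.4 µW at 0.6 V, both at 6 MHz). The per-bit figure additionally assumes that one 16-bit word is accessed every cycle, which is an assumption for illustration rather than a reported result; the function name is hypothetical.

```python
def energy_per_cycle_pj(power_uw: float, freq_mhz: float) -> float:
    """Average energy per clock cycle in pJ (uW divided by MHz gives pJ)."""
    return power_uw / freq_mhz


WORD_BITS = 16  # word width of the 256*16 array

# (VDD in volts, measured average core power in uW), both at 6 MHz
for vdd, power_uw in [(0.2, 10.4), (0.6, 89.4)]:
    e_cycle = energy_per_cycle_pj(power_uw, 6.0)
    print(f"VDD={vdd} V: {e_cycle:.2f} pJ/cycle, "
          f"{e_cycle / WORD_BITS:.3f} pJ/bit (assuming one word per cycle)")
```

Because the measured average power includes leakage (the Standby figure at 0.6 V in Table I is close to the active figure), these values describe total energy per cycle at 6 MHz, not switching energy alone.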
Fig. 27. Measured average power versus supply voltage at 6 MHz.

TABLE I. SUMMARY OF MEASURED RESULTS

| Item | Value |
|---|---|
| Capacity | 256\*16 bits SRAM (total 4 Kbits) |
| Process | 90 nm 1P9M SPRVT 1.0 V |
| Area | 168 µm \* 265 µm |
| **Performance** | |
| Average core power @ 0.6 V and 6 MHz | 89.4 µW |
| Minimum measured VDD in our fixed pattern @ 6 MHz | 200 mV |
| Standby mode @ 0.6 V | 82.8 µW |
| Average core power @ 0.2 V and 6 MHz | 10.4 µW |

TABLE II. COMPARISON OF RESULTS

| | 2007 JSSCC [23] | 2007 ISSCC [2] | 2007 ISSCC [24] | 2008 ISSCC [6] | This Work |
|---|---|---|---|---|---|
| Cell | 10T (2 WBL+1 RBL) | 8T (2 WBL+1 RBL) | 10T (2 WBL+1 RBL) | 10T (2 BL) | 8T (1 BL) |
| Process | 65 nm CMOS | 65 nm CMOS | 130 nm CMOS | 90 nm CMOS | 90 nm CMOS |
| Capacity | 256 kbit | 256 kbits | 480 kbits | 32 kbits | 4 kbit |
| Vmin | 380 mV | 350 mV | 200 mV | 160 mV | 200 mV<sup>1</sup> |
| | @ 400 mV | @ 350 mV | @ 200 mV | @ 160 mV | @ 200 mV |
| Frequency | 475 kHz | 25 kHz | 120 kHz | 500 Hz | 6 MHz |
| Power | 3.28 µW | 3.39 µW | non | 0.123 µW | 10.4 µW |
| Comment | Floating VDD | 50 mV word line boosting | 90% leakage current reduction | 80 mV word line boosting | Proposed Asy-floating 8T cell |

<sup>1</sup>Minimum measured V<sub>DD</sub> based on our fixed pattern.

Finally, the key performance of some recently reported subthreshold SRAM designs is listed in Table II for comparison. The $V_{\rm MIN}$ of our design is comparable to others. The distinct feature of the proposed 8 T single-ended SRAM with asymmetrical Write/Read-assist is its
operation speed of 6 MHz at 200 mV, which is much superior to the other works. #### V. CONCLUSION In this paper, a 256\*16 bits subthreshold SRAM, based on a single-ended 8 T cell with asymmetrical Write-assist virtual ground biasing scheme and positive feedback sensing keeper, was described. The asymmetrical Write-assist virtual ground biasing scheme improved the Write Margin (35% of supply voltage) and enhanced the Write performance. The isolated Read buffer with positive feedback sensing keeper enhanced the RSNM to 50% of the supply voltage, enabled the elimination of BL precharge operation and full-rail BL swing to minimize the power dissipation of BL sensing circuit. The postlayout simulation results showed the chip achieves 234 MHz and 6 MHz operation frequency at 600 mV and 200 mV supply voltage. The measured results verified that the chip achieved 6 MHz operation frequency at 200 mV $\rm V_{\rm DD}$ with power consumption of 10.4 $\mu \rm W$ . #### ACKNOWLEDGMENT The authors are grateful to National Chip Implement Center (CIC), Taiwan, and United Microelectronics Corporation (UMC), Taiwan, for technology support. #### REFERENCES - [1] B. H. Calhoun and A. P. Chandrakasan, "Static noise margin variation for sub-threshold SRAM in 65-nm CMOS," *IEEE J. Solid-State Circuits*, vol. 41, no. 7, pp. 1673–1679, Jul. 2006. - [2] B. H. Calhoun and A. P. Chandrakasan, "A 256-kb 65-nm sub-threshold SRAM design for ultra-low-voltage operation," *IEEE J. Solid-State Circuits*, vol. 42, no. 3, pp. 680–688, Mar. 2007. - [3] B. Zhai, S. Hanson, D. Blaauw, and D. Sylvester, "A variation-tolerant sub-200 mV 6-T subthreshold SRAM," *IEEE J. Solid-State Circuits*, vol. 43, no. 10, pp. 2338–2348, Oct. 2008. - [4] M. Sharifkhani and M. Sachdev, "An energy efficient 40 Kb SRAM module with extended read/write noise margin in 0.13 um CMOS," *IEEE J. Solid-State Circuits*, vol. 44, no. 2, pp. 620–630, Feb. 2009. - [5] T.-H. Kim, J. Liu, J. Keane, and C. H. 
Kim, "A 0.2 V, 480 kb subthreshold SRAM with 1 k cells per bitline for ultra-low-voltage computing," *IEEE J. Solid-State Circuits*, vol. 43, no. 2, pp. 518–529, Feb. 2008. - [6] I. J. Chang, J. J. Kim, S. P. Park, and K. Roy, "A 32 kb 10 T sub-threshold SRAM array with bit-interleaving and differential read scheme in 90 nm CMOS," in *ISSCC Dig. Tech. Papers*, Feb. 3–7, 2008, pp. 388–622. - [7] J. Y. Lin, M. H. Tu, M. C. Tsai, S. J. Jou, and C. T. Chuang, "Asymmetrical write-assist for single-ended SRAM operation," in *Proc. SOCC*, Sep. 9–11, 2009, pp. 101–104. - [8] S. T. Eid, M. Whately, and S. Krishnegowda, "A microcontroller-based PVT control system for a 65 nm 72 Mb synchronous SRAM," in *ISSCC Dig. Tech. Papers*, Feb. 7–11, 2010, pp. 184–185. - [9] M. F. Chang, S. M. Yang, and K. T. Chen, "Wide V<sub>DD</sub> embedded asynchronous SRAM with dual-mode self-timed technique for dynamic voltage systems," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 56, no. 8, pp. 1657–1667, Aug. 2009. - [10] S. C. Luo and L. Y. Chiou, "A sub-200-mV voltage-scalable SRAM with tolerance of access failure by self-activated bitline sensing," *IEEE Trans. Circuits Syst. II, Exp. Briefs*, vol. 57, no. 6, pp. 440–445, Jun. 2010 - [11] C. T. Chuang, S. Mukhopadhyay, J. J. Kim, K. Kim, and R. Rao, "High-performance SRAM in nanoscale CMOS: Design challenges and techniques," in *IEEE Int. Workshop Memory Technol.*, *Des. Testing*, Dec. 3–5, 2007, pp. 4–12. - [12] M. Yabuuch et al., "A 45 nm low-standby-power embedded SRAM with improved immunity against process and temperature variations," in ISSCC Dig. Tech. Papers, 2007, pp. 326–327. - [13] K. Takeda et al., "A read-static-noise-margin-free SRAM cell for low-Vdd and high-speed applications," in ISSCC Dig. Tech. Papers, 2005, pp. 478–479. - [14] L. Chang et al., "Stable SRAM cell design for the 32 nm node and beyond," in Symp. VLSI Technol. Dig. Tech. Papers, 2005, pp. 128–129. - [15] J. Chen, L. T. Clark, and T. 
Chen, "An ultra-low-power memory with a subthreshold power supply voltage," *IEEE J. Solid-State Circuits*, vol. 41, no. 10, pp. 2344–2353, Oct. 2006. - [16] R. F. Hobson, "A new single-ended SRAM cell with write-assist," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 15, no. 2, pp. 173–181, Feb. 2007. - [17] H. I. Yang, M. H. Chang, S. Y. Lai, H. F. Wang, and W. Hwang, "A low-power low-swing single-ended multi-port SRAM," in *Proc. VLSI-DAT*, 2007, pp. 28–31. - [18] A. Sil, S. Ghosh, and M. Bayoumi, "A novel 90 nm 8 T SRAM cell with enhanced stability," presented at the IEEE Int. Conf. Integr. Circuit Des. Technol. 2007 (ICICDT '07), Austin, TX. - [19] H. Noguchi, S. Okumura, Y. Iguchi, H. Fujiwara, Y. Morita, K. Nii, H. Kawaguchi, and M. Yoshimoto, "Which is the best dual-port SRAM in 45-nm process technology?—8 T, 10 T single end, and 10 T differential—," in *Proc. ICICDT*, 2007, pp. 55–58. - [20] T. Suzuki, H. Yamauchi, Y. Yamagami, K. Satomi, and H. Akamatsu, "A stable SRAM cell design against simultaneously R/W disturbed accesses," in *Symp. VLSI Circuits Dig. Tech. Papers*, 2006, pp. 11–12. - [21] R. E. Aly and M. A. Bayoumi, "Low-power cache design using 7 T SRAM cell," *IEEE Trans. Circuits Syst. II, Exp. Briefs*, vol. 54, no. 4, pp. 318–322, Apr. 2007. - [22] K. Kim, J. J. Kim, and C. T. Chuang, "Asymmetrical SRAM cells with enhanced read and write margins," in *Proc. VLSI-TSA*, 2007, pp. 162–163. - [23] N. Verma and A. P. Chandrakasan, "A 65 nm 8 T sub-Vt SRAM employing sense-amplifier redundancy," in *ISSCC Dig. Tech. Papers*, Feb. 11–15, 2007, pp. 328–606. - [24] T. H. Kim, J. Liu, J. Keane, and C. H. Kim, "A high-density subthreshold SRAM with data-independent bitline leakage and virtual ground replica scheme," in *ISSCC Dig. Tech. Papers*, Feb. 11–15, 2007, pp. 330–606. Ming-Hsien Tu received the B.S. and M.S. degrees in electrical engineering from National Central University, Chung-Li, Taiwan, in 2004 and 2006, respectively. 
He is currently working toward the Ph.D. degree at National Chiao Tung University, Hsinchu, Taiwan. His research interests include noise suppression design technologies, embedded measurement circuit design, and ultralow-power SRAM design.

**Jihi-Yu Lin** was born in Taichung, Taiwan. He received the B.S. degree in electrical engineering from National Central University, Chung-Li, Taiwan, in 2007 and the M.S. degree in electronics from National Chiao Tung University, Hsinchu, Taiwan, in 2009. His research interests are in the areas of low-power and low-voltage embedded memory circuit design.

**Ming-Chien Tsai** received the B.S. degree in electrical engineering from National Central University, Chung-Li, Taiwan, in 2008. He is currently working toward the M.S. degree at National Chiao Tung University, Hsinchu, Taiwan. His research interests include low-power digital circuit design, SRAM design, and monitoring structures for NBTI/PBTI degradation of nanoscale CMOS SRAM.

**Shyh-Jye Jou** (S'85–M'88–SM'99) received the B.S. degree in electrical engineering from National Cheng Kung University, Taiwan, in 1982, and the M.S. and Ph.D. degrees in electronics from National Chiao Tung University, Hsinchu, Taiwan, in 1984 and 1988, respectively. He was with the Electrical Engineering Department of National Central University, Chung-Li, Taiwan, from 1990 to 2004 and became a Professor in 1997. Since 2004, he has been a Professor in the Electronics Engineering Department of National Chiao Tung University and was the Chairman from 2006 to 2009. He was a Visiting Research Associate Professor in the Coordinated Science Laboratory at the University of Illinois, Urbana-Champaign, during the 1993–1994 academic year. In Summer 2001, he was a Visiting Research Consultant in the Communication Circuits and Systems Research Laboratory of Agere Systems. His research interests include design and analysis of high-speed, low-power mixed-signal integrated circuits, and communication integrated circuits and systems.
Prof. Jou has served on the technical program committees of ISCAS, CICC, A-SSCC, ICCD, ASPDAC, VLSI-DAT, and other international conferences.

**Ching-Te Chuang** (S'78–M'82–SM'91–F'94) received the B.S.E.E. degree from National Taiwan University, Taipei, Taiwan, in 1975 and the Ph.D. degree in electrical engineering from the University of California, Berkeley, in 1982. From 1977 to 1982 he was a Research Assistant in the Electronics Research Laboratory, University of California, Berkeley, working on bulk and surface acoustic wave devices. He joined the IBM T. J. Watson Research Center, Yorktown Heights, NY, in 1982. From 1982 to 1986, he worked on scaled bipolar devices, technology, and circuits. He studied the scaling properties of epitaxial Schottky barrier diodes, did pioneering work on the perimeter effects of advanced double-poly self-aligned bipolar transistors, and designed the first subnanosecond 5-Kb bipolar ECL SRAM. From 1986 to 1988, he was Manager of the Bipolar VLSI Design Group, working on low-power bipolar circuits, high-speed high-density bipolar SRAMs, multi-Gb/s fiber-optic data-link circuits, and scaling issues for bipolar/BiCMOS devices and circuits. Since 1988, he has managed the High Performance Circuit Group, investigating high-performance logic and memory circuits. Since 1993, his group has been primarily responsible for the circuit design of IBM's high-performance CMOS microprocessors for enterprise servers, PowerPC workstations, and game/media processors. Since 1996, he has been leading the efforts in evaluating and exploring scaled/emerging technologies, such as PD/SOI, UTB/SOI, strained-Si devices, hybrid orientation technology, and multigate/FinFET devices, for high-performance logic and SRAM applications. Since 1998, he has been responsible for the research VLSI technology circuit codesign strategy and execution. His group has also been very active and visible in leakage/variation/degradation tolerant circuit and SRAM design techniques.
He took early retirement from IBM to join National Chiao Tung University, Hsinchu, Taiwan, as a Chair Professor in the Department of Electronics Engineering in February 2008. He is currently the Director of the Intelligent Memory and SoC Laboratory at National Chiao Tung University. He has received the Outstanding Scholar Award from Taiwan's Foundation for the Advancement of Outstanding Scholarship for 2008 to 2013. He has authored or coauthored over 290 papers. He has authored many invited papers in international journals such as International Journal of High Speed Electronics, PROCEEDINGS OF THE IEEE, IEEE Circuits and Devices Magazine, and Microelectronics Journal. He holds 31 U.S. patents with another 11 pending.

Dr. Chuang served on the Device Technology Program Committee for IEDM in 1986 and 1987, and the Program Committee for the Symposium on VLSI Circuits from 1992 to 2006. He was the Publication/Publicity Chairman for the Symposium on VLSI Technology and the Symposium on VLSI Circuits in 1993 and 1994, and the Best Student Paper Award Sub-Committee Chairman for the Symposium on VLSI Circuits from 2004 to 2006. He has presented numerous plenary, invited, or tutorial papers/talks at international conferences such as International SOI Conf., DAC, VLSI-TSA, ISSCC Microprocessor Design Workshop, VLSI Circuit Symposium Short Course, ISQED, ICCAD, APMC, VLSI-DAT, ISCAS, MTDT, WSEAS, and VLSI Design/CAD Symposium. He was the corecipient of the Best Paper Award at the 2000 IEEE International SOI Conference. He has received 1 Outstanding Technical Achievement Award, 1 Research Division Outstanding Contribution Award, 5 Research Division Awards, and 12 Invention Achievement Awards from IBM.