Title: Variation-Tolerant Nanoscale CMOS SRAM Design
Authors: Lu, Chien-Yu (盧建宇)
Chuang, Ching-Te (莊景德)
Department of Electronics Engineering and Institute of Electronics
Keywords: variation-tolerant; subthreshold; memory; static random access; low power; integrated circuit design; VLSI
Issue Date: 2014
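Abstract: Low-power embedded memory capable of low-voltage operation has in recent years become a research and development focus for systems-on-chip, because it reduces the overall dynamic and static power of the system and enables use in portable handheld devices and low-power biomedical sensors. To further minimize the energy consumed per computation, and so make the most of limited battery life, the circuits can be operated in the subthreshold bias region, slightly below the device threshold voltage (Vt), so that each operation approaches the minimum circuit energy of conventional CMOS devices and an ultra-low-power system-on-chip becomes achievable. First, based on a study of the power characteristics of devices at subthreshold voltages, this thesis proposes a new ripple bit-line architecture and a ripple-initiated negative bit-line write circuit to realize a 72 Kb subthreshold low-power SRAM chip, lowering the minimum operating voltage for both read and write below the device threshold voltage. Compared with the conventional hierarchical bit-line architecture, the proposed design substantially improves operating speed and power at low voltage and reduces the metal-layer requirements. In test-chip measurements, evaluation and quantification against memory compiler specifications verified the power, performance, and high yield targeted at design time. In addition, we propose several new subthreshold variation-tolerant memory cells that improve on previously proposed variation-tolerant subthreshold cell designs and offer better performance. Combined with a synchronized dual write-assist circuit also proposed in this thesis, they realize a 72 Kb subthreshold SRAM with lower operating voltage, lower power, and higher speed. Furthermore, to address the large variation that commonly arises when driving heavy loads in subthreshold circuits, this thesis proposes a high-performance bootstrapped driver circuit that outperforms conventional driver circuits in both simulation and test-chip measurement. Beyond subthreshold low-power memories and circuits, this thesis also addresses the problems of low-voltage operation in conventional single-port and dual-port memory cells, proposing a number of variation-tolerant, low-power designs, including power gating and low-voltage large-signal sensing circuits. Test-chip measurements verify the reliability and stability of these designs, along with their improved minimum operating voltage and performance. Finally, a number of unpublished and ongoing research topics are briefly introduced in the appendix.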
Low-power embedded memory with a low minimum operating voltage (VMIN) is desired to reduce overall system power dissipation for portable and handheld devices and for ultra-low-power biomedical and wireless sensor applications, where minimizing energy per operation is crucial to extending battery life. Minimum-energy operation is reported to be achieved by operating circuits in the subthreshold region, slightly below the threshold voltage. First, for subthreshold VLSI, we present an energy-efficient bootstrapped CMOS driver that enhances switching speed when driving large RC loads in ultra-low-voltage CMOS VLSI. The proposed bootstrapped driver eliminates the leakage paths of the conventional bootstrapped driver so that the boosted nodes reach and maintain more positive and more negative boosted voltage levels, thus improving boosting efficiency and enhancing driver switching speed. Measured performance from test chips implemented in UMC 65 nm low-power CMOS technology (VTN≈VTP≈0.5 V) indicates that, compared with the conventional bootstrapped driver, the proposed driver improves rising delay by 37%-50% and falling delay by 25%-47% at 0.3 V for loads ranging from 0 to 24 mm of M6 metal line. Although designed and optimized for subthreshold ultra-low-voltage operation, the proposed bootstrapped driver is shown to be advantageous at higher, near-threshold supply voltages as well. When driving a 16 mm long M6 wire, it improves rising delay by 20%-52% and falling delay by 23%-43% for VDD ranging from 0.3 V to 0.5 V while consuming about 15% less average power than the conventional bootstrapped driver. Then, we present an ultra-low-power 72 Kb 9T Static Random Access Memory (SRAM) with a Ripple Bit-Line structure and Ripple-initiated Negative Bit-Line (NBL) Write-assist. The Ripple Bit-Line scheme provides over 40% Read access performance improvement for VDD below 0.4 V compared with the conventional Hierarchical Bit-Line (HBL) structure.
A variation-tolerant ripple-initiated NBL Write-assist scheme, in which the transient negative pulse is coupled only into the single selected Local Bit-Line segment, is employed to enhance the NBL boosting efficiency and reduce power consumption. The 72 Kb SRAM test chip has been fabricated in UMC 40 nm Low-Power CMOS technology. Error-free full functionality without redundancy is achieved from 1.5 V down to 0.33 V. The measured maximum operating frequency is 220 MHz (500 kHz) at 1.1 V (0.33 V) and 25 °C. The measured total power consumption is 3.94 μW at 0.33 V, 500 kHz, and 25 °C. Following up on this design, we present a two-port disturb-free 9T subthreshold SRAM cell with independent single-ended Read and Write Bit-Lines (RBL and WBL) and a cross-point data-aware Write structure. The cell provides robust variation tolerance for subthreshold applications and facilitates a bit-interleaving architecture for enhanced soft-error immunity. This design employs a variation-tolerant Line-Up Write-Assist scheme in which the area- and energy-efficient boosted Write Word-Line and the negative WBL are aligned in time, both being triggered by the same low-going Global WBL, to maximize the Write-ability enhancement. A 72 Kb SRAM test chip is implemented in UMC 40 nm Low-Power CMOS technology. Sixty-five dies are characterized with full memory compiler product qualification patterns. Full functionality is achieved for VDD ranging from 1.5 V down to 0.32 V without any redundancy. The measured maximum operating frequency is 260 MHz (450 kHz) at 1.1 V (0.32 V) and 25 °C. At 0.325 V and 25 °C, the chip operates at 600 kHz with 5.78 μW total power and 4.69 μW leakage power, a 2X frequency improvement over the 300 kHz of our previous 72 Kb 9T subthreshold SRAM design in the same 40LP technology. The energy efficiency (Power/Freq/IO) at 0.325 V and 25 °C is 0.267 pJ/bit, a 23.7% improvement over the 0.350 pJ/bit of our previous design.
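The energy-efficiency metric quoted above (Power/Freq/IO) can be checked with a few lines of arithmetic. The sketch below is illustrative only: the abstract does not state the IO width, so `ASSUMED_IO_WIDTH = 36` is a hypothetical value chosen because it approximately reproduces the reported 0.267 pJ/bit, and the function name is ours, not the thesis's.

```python
def energy_per_bit_pj(total_power_w, freq_hz, io_width):
    """Energy per accessed bit in pJ, computed as Power / Freq / IO."""
    return total_power_w / freq_hz / io_width * 1e12

# ASSUMPTION: the IO width is not given in the abstract; 36 is a
# hypothetical value that roughly matches the reported 0.267 pJ/bit.
ASSUMED_IO_WIDTH = 36

# Measured values quoted in the abstract (0.325 V, 25 degrees C).
computed_pj = energy_per_bit_pj(5.78e-6, 600e3, ASSUMED_IO_WIDTH)

# Relative improvement, using the two per-bit figures as reported.
reported_new_pj = 0.267
reported_prev_pj = 0.350
improvement_pct = (reported_prev_pj - reported_new_pj) / reported_prev_pj * 100

print(f"computed energy per bit: {computed_pj:.4f} pJ")
print(f"improvement over previous design: {improvement_pct:.1f}%")
```

Under the assumed IO width, the computed energy lands within about 0.001 pJ of the reported figure, and the improvement computed from the two reported per-bit values agrees with the stated 23.7%.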
We also present a novel subthreshold 9T SRAM cell with a row-based Word-Line and column-based data-aware Write Word-Lines. The decoupled Read port and cross-point Write structure provide a disturb-free cell and facilitate a bit-interleaving architecture. Compared with a previous cross-point Write 9T subthreshold SRAM cell reported in the literature, the proposed 9T SRAM cell offers comparable stability with improved Read performance and variation tolerance. Monte Carlo simulations based on UMC 40 nm Low-Power technology indicate that the BL access time improves by 15.35% to 17.37%, and the variation (σ of BL access time) improves by 5.12% to 9.22%, for VDD ranging from 0.3 V to 0.6 V. Based on a 72 Kb SRAM macro design in the UMC 40LP process, the proposed 9T cell achieves about 9% better chip access time at the SS corner for VDD ranging from 0.3 V to 0.45 V. In addition to the subthreshold designs, we present a 256 Kb 6T SRAM with threshold power-gating (TPG), a low-swing global read bit-line (GRBL), and charge-sharing write with cell Vtrip-tracking (VTP) and negative source-line (NVSL) write-assists. The TPG facilitates lower NAP-mode voltage and power and faster wake-up for the cell array, while the low-swing GRBL reduces the dynamic read power. A variation-tolerant charge-sharing write scheme, in which the floating "Low" global write bit-line (GWBL) is used to capacitively couple down the local bit-line, is combined with the cell Vtrip-tracking write-assist (VTP-WA) and the NVSL write-assist (NVSL-WA) to improve write-ability. The 256 Kb test chip is implemented in UMC 40 nm low-power CMOS technology. Error-free full functionality is achieved from 1.18 GHz at 1.5 V to 100 MHz at 0.65 V without redundancy. The TPG scheme reduces power by 70% (55%) at 1.5 V (0.5 V) in NAP mode. The low-swing GRBL reduces dynamic read power by 3.5% (8%) at 1.1 V (0.65 V). The VTP-WA and NVSL-WA improve the write VMIN by 50 mV (from 0.7 V to 0.65 V) and reduce the write bit failure rate by 2.75x at 0.65 V.
Owing to limited resources and time, a number of to-be-published and ongoing works are described only briefly, including several single-port and dual-port SRAM designs from 40 nm to 28 nm, further subthreshold SRAM designs, and an ultra-low-voltage 2-port register file.
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT079711669
http://hdl.handle.net/11536/76420
Appears in Collections: Thesis