Memory hierarchy design for using flash memory as main memory

Title:	將快閃記憶體作為主記憶體的記憶體階層設計 Memory hierarchy design for using flash memory as main memory
Authors:	Hsien-Jen Chang (張賢仁); Dr. Jen-Chung Lou (羅正忠); Institute of Electronics (電子研究所)
Keywords:	cache memory; memory hierarchy; flash memory
Issue Date:	1999
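Abstract:	Because of flash memory's excellent characteristics and its potential to cost less than DRAM, it was predicted years ago that flash memory could serve as main memory. In practice, however, this architecture has not been adopted, because neither its performance nor its price has reached the desired level. As technology advances, flash memory performance keeps improving. Building on this trend, this thesis examines the memory hierarchy design issues that arise when flash memory is used as main memory. The thesis first reviews existing technologies and methods, including current ways of using flash memory and existing memory hierarchy designs with their problems and countermeasures. The key to using flash memory as main memory lies in designing the control method of the cache memory above it. After analyzing the characteristics of flash memory, the thesis identifies "on a write, no space can be allocated for it in the cache above the flash memory" as the main problem to overcome. This problem stems from two major usage constraints of flash memory: an entire large block must be erased before writing, and write and erase times are too long. The thesis first discusses how to choose among the existing memory hierarchy design methods to suit these characteristics, then turns to new methods for overcoming the problem. For the long write time, it proposes parallel writing, with two schemes: every memory cell array can be written independently in parallel, or N parallel-writing memory banks. For designing a cache block size that matches the large erase block while staying moderate enough for efficiency, it proposes two methods: a cache with two block sizes, and a new architecture introduced in this thesis, the super block. The co-sub block is also proposed to help mitigate the growth in transfer time as blocks become larger. The thesis also introduces and discusses flash memory technology trends that address these problems, such as the new BBHE write method, which raises write speed and makes parallel writing more feasible. A new design that segments bit lines and word lines to shrink the erase block is introduced as well; for this technology the thesis points out its usage limitations and suggests directions for improvement. Following this discussion, the author designs a simulated system configuration and discusses the related design issues. The simulation results verify that the system does avoid the problem of "on a write, no space can be allocated for it in the cache above the flash memory". The thesis then discusses the accompanying software design consideration: avoiding unnecessary writes, that is, the problem of reclaiming and reusing memory regions. It proposes that software should prefer stack allocation for data memory, followed by heap allocation, with static allocation as the last resort. Besides proposing cache hardware design methods and software design recommendations for an architecture with flash memory as main memory, the thesis introduces the background knowledge related to this problem and, assuming flash memory is modified so that every memory cell array can be written independently in parallel, simulates and verifies the feasibility of the architecture.
Because of flash memory's excellent characteristics and low price potential, using it as main memory is not a novel idea. It has not been implemented in current designs, however, because flash memory is still expensive and some of its characteristics are not yet good enough. These disadvantages are expected to be overcome. This thesis discusses methodologies and architectures for using flash memory as part of main memory. After reviewing current memory hierarchy methodologies and current architectures that use flash memory, it discusses the challenges of using flash memory: a large erase unit and a long write time. First, the thesis discusses how the traditional methodologies work with this new architecture. After analyzing the behavior of programs and the memory subsystem, it finds that the write mix is low and that lower-level memory is not always busy, so a longer write time can be tolerated. The thesis proposes parallel writing to overcome the long write time.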
Parallel writing of the cells in the same row has been announced, and it takes 4 ms to write 512 cells, i.e. 7 ns/bit [12]. Writing arrays in parallel will further increase write throughput. The thesis proposes two parallel writing architectures: fully parallel writing arrays and multiple parallel writing banks. To fit the large erase unit while keeping a moderate block size that reduces the miss rate, the thesis proposes two architectures: dual block size and super block. It also proposes the co-sub block to help with the problem of ever larger block sizes. Under reasonable assumptions, the thesis simulates a memory hierarchy that uses flash memory as main memory. The unacceptable condition, a write miss with allocation failure in the upper-level memory, does not occur; in other words, the performance of this architecture is usable. The thesis then discusses software methods that assist the architecture. The topic is reducing unnecessary writes, that is, writes of dead data, and the method is to consider how data is allocated. Stack allocation is the most recommended, heap allocation is the second choice, and static allocation may need to be reduced. If the system runs special programs and the memory space is large enough, a memory hierarchy using flash memory as main memory is feasible. The thesis discusses several hardware methods and software considerations and verifies them in simulation.
Table of Contents:
English Abstract
Acknowledge
Contents
Table list
Figure list
Word illustration
Chapter One Motivation and Introduction
  1.1 Advantage of flash memory
    1.1.1 Excellent characteristics of flash memory
    1.1.2 Low cost potential of flash memory
  1.2 Consideration for applications of flash memory
    1.2.1 Various cell architecture for different application
    1.2.2 Methodologies for using flash memory as mass storage
    1.2.3 Using flash memory as system memory now
    1.2.4 An example of methodology for using flash memory: AMD's DMS (Data Management Software) with simultaneous read/write flash memory architecture
  1.3 The object of this thesis
Chapter Two Present memory hierarchy design
  2.1 Introduction of memory hierarchy design
    2.1.1 Motivation of memory hierarchy
    2.1.2 Why memory hierarchy design is useful?
    2.1.3 What is memory hierarchy design?
    2.1.4 Metrics of the memory hierarchy
  2.2 Methodologies of cache system and design parameters
    2.2.1 Basic specifications
    2.2.2 Strategies for write
    2.2.3 Other considerations
    2.2.4 Cache optimization summary
  2.3 Virtual memory - another memory hierarchy
Chapter Three Challenge and Strategies
  3.1 The position of flash memory in the memory hierarchy
  3.2 The special characteristics of flash memory
    3.2.1 Unbalance writing time and reading time - very long writing time
    3.2.2 Erase sector before writing - large block and very long erase time
  3.3 Write miss & allocate fail (in the cache for flash memory) and other concerns
    3.3.1 Two basic considerations for the properties of flash memory
    3.3.2 Write miss & allocate fail (in the cache for flash memory) - the major and induced concern
  3.4 Considerations of traditional cache methodologies
  3.5 Considerations for traditional performance improvement methodologies of "main memory"
    3.5.1 Traditional methodologies for improving performance of main memory
    3.5.2 Considerations for traditional performance improvement methodologies of "main memory" to fit the architecture of this thesis
    3.5.3 Reveal from this section
  3.6 Strategies for long writing time
    3.6.1 Time budget from lower level memory is not always busy
    3.6.2 Time budget from the low write instruction mix
    3.6.3 Feasibility of parallel writing
    3.6.4 Basic parallel writing strategy to solve the long writing time
    3.6.5 Strategy I: Fully parallel writing cell array
    3.6.6 Strategy II: Multiple parallel writing banks
    3.6.7 Summary of this section
  3.7 Strategies for that sector size is very large than traditional cache block size
    3.7.1 Disadvantage of large block size
    3.7.2 Constraints from characteristics of flash memory
    3.7.3 Strategy I: Dual block size
    3.7.4 Strategy II: Super block
    3.7.5 Co-sub block indicator
    3.7.6 Summary for this section
  3.8 Considerations about architecture of flash memory
    3.8.1 Choice Nor-type flash memory
    3.8.2 Length of word line and bit line
    3.8.3 Architecture of flash memory to reduce sector size
    3.8.4 About using DINOR flash memory
Chapter Four Simulation and implementation consideration
  4.1 System architecture for simulation
  4.2 Novel cache controller
  4.3 Traces for simulation
  4.4 Parameters for simulation
  4.5 Simulation result
    4.5.1 Result of SRAM cache simulation
    4.5.2 Simulation result of DRAM cache
Chapter Five Software optimization
  5.1 Traditional software optimization methods for traditional memory hierarchy design
    5.1.1 Target of traditional methods
    5.1.2 Visible compiler techniques for memory hierarchy
  5.2 New issue for flash memory: reduce unnecessary write
  5.3 Relative software knowledge
    5.3.1 Life time and scope
    5.3.2 Storage-allocation strategies
  5.4 Software guidelines for using flash memory as parts of main memory
    5.4.1 Guidelines for compiler technique
    5.4.2 Notion for run-time heap manager
    5.4.3 Guidelines for coding software
    5.4.4 Co-design H/W to elaborate above function
Chapter Six Conclusion
  6.1 Limitation of the architecture
    6.1.1 Limitation from device: endurance
    6.1.2 Limitation from architecture: memory should be large enough
  6.2 Possible application of the architecture of this thesis
  6.3 Future study
  6.4 Conclusion
Reference
Appendix 1 Main source code of SRAM cache
Appendix 2 Main source code of DRAM cache
URI:	http://140.113.39.130/cdrfb3/record/nctu/#NT880428053 http://hdl.handle.net/11536/65690
Appears in Collections:	Thesis
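
Illustrative note: the abstract argues that spreading slow flash writes across several independently writable banks keeps incoming writes from finding the memory busy. The toy model below is only a sketch of that idea and is not the simulator from the thesis appendices. The function name `count_stalled_writes`, the arrival pattern, and the time units are all hypothetical, chosen just to show how the number of delayed writes falls as banks are added.

```python
def count_stalled_writes(arrival_times, write_time, num_banks):
    """Count writes that arrive while every bank is still busy.

    Hypothetical toy model: each write occupies one bank for `write_time`
    time units, and a write that finds no idle bank waits for the bank
    that frees up earliest.
    """
    bank_free_at = [0.0] * num_banks  # time at which each bank becomes idle
    stalled = 0
    for t in arrival_times:
        # Choose the bank that becomes idle first.
        idx = min(range(num_banks), key=lambda i: bank_free_at[i])
        start = max(t, bank_free_at[idx])
        if start > t:
            stalled += 1  # no bank was idle when the write arrived
        bank_free_at[idx] = start + write_time
    return stalled


# Assumed workload: one write per time unit, each write taking 4 units.
arrivals = [float(i) for i in range(8)]
for banks in (1, 2, 4):
    print(banks, "bank(s):", count_stalled_writes(arrivals, 4.0, banks), "stalled")
```

In this assumed workload the stalls disappear once the number of banks reaches the ratio of write time to arrival interval, which is the intuition behind using the idle time budget of the lower-level memory.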