Title: | Instruction Cache Prefetching with Extended BTB (依据扩充式跳跃目标缓冲区的指令快取记忆体预先提取)
Authors: | Chi, Shyh-An (汲世安); Chung-Ping Chung (钟崇斌); Institute of Computer Science and Engineering
Keywords: | instruction cache prefetching; prediction-table-based prefetching
Issue Date: | 1996
Abstract: | As the speed gap between processor and memory grows, the penalty caused by instruction cache misses gets higher. Furthermore, modern microprocessors increase the instruction issue rate by employing techniques such as superscalar processing and superpipelining, so the performance degradation caused by instruction cache misses becomes even more critical. Modern VLSI technology makes it possible to allocate more die area to the on-chip instruction cache. However, if the processor runs at a clock rate of 100 MHz or higher, the on-chip instruction cache is limited to 4 to 16 KB and tends to have low associativity, which makes the cache miss rate a serious problem. In this thesis, we first study several existing approaches to reducing instruction cache misses. We then propose a new approach, called BIB (Branch Instruction Based) prefetching, in which prefetching is directed by the prediction made on branches. This method stores the prefetching information in an extended BTB (eBTB). When the eBTB recognizes a branch instruction, the prefetching information in the corresponding eBTB entry is used to prefetch; if the eBTB misses, the sequentially next line address is prefetched. To evaluate these instruction cache prefetching approaches, we establish a simple but realistic machine model for each. Six SPECint95 benchmarks are used to evaluate each approach. The major performance metric in this study is MCPI (cycles per instruction contributed by memory accesses). The simulation results show that the BIB prefetching method outperforms sequential prefetching by 7% and PBN prefetching by 17% on average.
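The abstract describes the BIB mechanism only at a high level. The following is a minimal C sketch, not taken from the thesis, of the lookup logic it implies: an eBTB entry extended with a prefetch line address, falling back to the sequentially next line on an eBTB miss. The entry layout, field names, table size, line size, and direct-mapped organization are all assumptions made purely for illustration.

```c
/* Hypothetical sketch of BIB (Branch Instruction Based) prefetching:
 * an eBTB entry carries the usual BTB fields plus a prefetch line
 * address. All structure and sizing below are illustrative assumptions. */
#include <stdint.h>
#include <stdbool.h>

#define EBTB_ENTRIES 256
#define LINE_SHIFT   5          /* assumed 32-byte instruction cache line */

typedef struct {
    bool     valid;
    uint32_t tag;               /* branch PC (full PC used as tag for simplicity) */
    uint32_t predicted_target;  /* conventional BTB field */
    uint32_t prefetch_line;     /* extension: line address to prefetch next */
    bool     predict_taken;     /* simple direction prediction bit */
} ebtb_entry_t;

static ebtb_entry_t ebtb[EBTB_ENTRIES];

/* Return the cache line address to prefetch for the current fetch PC. */
static uint32_t bib_prefetch_line(uint32_t pc)
{
    uint32_t idx = (pc >> 2) & (EBTB_ENTRIES - 1);
    const ebtb_entry_t *e = &ebtb[idx];

    if (e->valid && e->tag == pc) {
        /* eBTB hit: a branch is recognized at this PC, so use the
         * prefetching information stored in the matching entry. */
        return e->prefetch_line;
    }
    /* eBTB miss: fall back to sequential prefetching of the next line. */
    return (pc >> LINE_SHIFT) + 1;
}
```

Under these assumptions, the only addition relative to a conventional BTB is the per-entry prefetch field, so the prefetch decision needs no separate prediction table, and the eBTB-miss path degenerates to ordinary sequential prefetching, matching the behavior described in the abstract.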
URI: | http://140.113.39.130/cdrfb3/record/nctu/#NT850392039 http://hdl.handle.net/11536/61789 |
Appears in Collections: | Thesis