Title: Instruction Cache Prefetching with Extended BTB
Authors: Chi, Shyh-An (汲世安); Chung, Chung-Ping (钟崇斌)
Institute of Computer Science and Engineering
Keywords: instruction cache prefetching; prediction table based prefetching
Issue Date: 1996
Abstract:
As the speed gap between processor and memory grows, the penalty caused by instruction cache misses gets higher. Furthermore, modern microprocessors increase the instruction issue rate by employing techniques such as superscalar processing and superpipelining, so the performance degradation caused by instruction cache misses becomes even more significant. Modern VLSI technology makes it possible to devote more die area to an on-chip instruction cache. However, an on-chip instruction cache usually has low associativity, and its size is limited to 4 to 16 KB if the processor has a clock rate of 100 MHz or higher, which makes the cache miss rate a serious problem.

In this thesis, we first study several existing solutions for reducing instruction cache misses. We then propose a new approach, called the BIB (Branch Instruction Based) prefetching method, in which prefetching is directed by the prediction made on branches. This method stores the prefetching information in an extended BTB (eBTB). When the eBTB recognizes a branch instruction, the prefetching information in the corresponding eBTB entry is used to prefetch; if the eBTB misses, the sequentially next line address is prefetched instead.
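To make the prefetch decision concrete, the following is a minimal C sketch of the behaviour the abstract describes, not code from the thesis; the cache line size, the direct-mapped eBTB organization, and all identifiers (ebtb, prefetch_line, bib_prefetch) are assumptions made for illustration only.

/* Illustrative sketch of the BIB prefetch decision: on an eBTB hit the
 * stored prefetch information is used, on an eBTB miss the sequentially
 * next cache line is prefetched. Sizes and names are assumed, not taken
 * from the thesis. */
#include <stdint.h>
#include <stdio.h>
#include <stdbool.h>

#define LINE_SIZE    32u          /* assumed instruction cache line size */
#define EBTB_ENTRIES 64u          /* assumed direct-mapped eBTB size     */

typedef struct {
    bool     valid;
    uint32_t branch_pc;           /* branch address held in this entry            */
    uint32_t prefetch_line_addr;  /* prefetch information stored with the entry   */
} ebtb_entry;

static ebtb_entry ebtb[EBTB_ENTRIES];

/* Stand-in for the cache controller issuing a prefetch request. */
static void prefetch_line(uint32_t line_addr)
{
    printf("prefetch line 0x%08x\n", (unsigned)line_addr);
}

/* Decide what to prefetch while the line at fetch_pc is being fetched. */
static void bib_prefetch(uint32_t fetch_pc)
{
    uint32_t    line_addr = fetch_pc & ~(LINE_SIZE - 1u);
    ebtb_entry *e = &ebtb[(fetch_pc / LINE_SIZE) % EBTB_ENTRIES];

    if (e->valid && e->branch_pc == fetch_pc)
        prefetch_line(e->prefetch_line_addr);   /* eBTB hit: use stored info   */
    else
        prefetch_line(line_addr + LINE_SIZE);   /* eBTB miss: next line        */
}

int main(void)
{
    /* Pretend a branch at 0x1000 was recorded with predicted target line 0x2000. */
    ebtb[(0x1000u / LINE_SIZE) % EBTB_ENTRIES] =
        (ebtb_entry){ .valid = true, .branch_pc = 0x1000u,
                      .prefetch_line_addr = 0x2000u };

    bib_prefetch(0x1000u);   /* hit: prefetches 0x2000        */
    bib_prefetch(0x3000u);   /* miss: prefetches 0x3000 + 32  */
    return 0;
}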
To evaluate these instruction cache prefetching approaches, we establish a simple but realistic machine model for each of them. Six SPECint95 benchmarks are used to evaluate each approach in this study. The major performance metric is MCPI, the number of cycles per instruction contributed by memory accesses. The simulation results show that the BIB prefetching method outperforms sequential prefetching by 7% and PBN prefetching by 17% on average.
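For clarity, MCPI is simply the number of processor cycles spent on memory accesses divided by the number of instructions executed; the short C sketch below shows the calculation with made-up placeholder counts, not figures from the thesis.

#include <stdio.h>

/* MCPI = memory-access cycles / instructions executed.
 * The counts below are placeholders for illustration, not thesis results. */
int main(void)
{
    unsigned long long memory_access_cycles = 1500000ULL;
    unsigned long long instructions_executed = 10000000ULL;

    double mcpi = (double)memory_access_cycles / (double)instructions_executed;
    printf("MCPI = %.3f\n", mcpi);   /* prints 0.150 for these placeholder counts */
    return 0;
}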
URI: http://140.113.39.130/cdrfb3/record/nctu/#NT850392039
http://hdl.handle.net/11536/61789
Appears in Collections: Thesis