Dynamically Resizable Instruction Cache: An Energy-Efficient and High-Performance Deep-Submicron Instruction Cache
Document Type: Article
Increasing levels of on-chip integration have enabled steady improvements in modern microprocessor performance, but have also resulted in high energy dissipation. Deep-submicron CMOS designs maintain high transistor switching speeds by scaling down the supply voltage and proportionately reducing the transistor threshold voltage. Lowering the threshold voltage increases leakage energy dissipation due to an exponential increase in the leakage current flowing through a transistor even when the transistor is not switching. Estimates from the VLSI circuit community suggest a five-fold increase in leakage energy dissipation in every future generation. Modern microarchitectures aggravate the leakage energy problem by investing vast resources in on-chip cache hierarchies, because leakage energy grows with the number of transistors. While demand on cache hierarchies varies both within and across applications, modern caches are designed to meet the worst-case application demand, resulting in poor utilization of on-chip caches, which in turn leads to energy inefficiency. This paper explores an integrated architectural and circuit-level approach to reduce leakage energy dissipation in instruction caches (i-caches) while maintaining high performance. Using a simple adaptive scheme, we exploit the variability in application demand in a novel cache design, the Dynamically Resizable i-cache (DRI i-cache), by dynamically resizing the cache to the size required at any point in application execution. At the circuit level, the DRI i-cache employs a novel mechanism, called gated-Vdd, which effectively turns off the supply voltage to the SRAM cells in the DRI i-cache's unused sections, virtually eliminating leakage in these sections. Our adaptive scheme gives DRI i-caches tight control over the number of extra misses caused by resizing, enabling the DRI i-cache to contain both performance degradation and extra energy dissipation due to the increased number of accesses to lower cache levels.
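The adaptive resizing idea described above can be sketched as a simple interval-based policy: count misses over a fixed interval and compare the count against a miss-bound to decide whether to shrink or grow the cache. The sketch below is illustrative only; the function name, parameter values, and size limits are assumptions, not the paper's exact mechanism.

```python
# Illustrative sketch of an interval-based adaptive resizing policy
# in the spirit of the DRI i-cache. All names and constants here
# (miss_bound, min_size, max_size) are hypothetical.

def next_cache_size(current_size, misses, miss_bound,
                    min_size=1024, max_size=64 * 1024):
    """Return the cache size (in bytes) for the next interval.

    If the miss count over the last interval stayed under the
    miss-bound, halve the active cache (the disabled half would be
    powered down via a gated-Vdd style mechanism); otherwise double
    it, staying within [min_size, max_size].
    """
    if misses < miss_bound:
        return max(current_size // 2, min_size)
    return min(current_size * 2, max_size)

# A 64K cache seeing few misses shrinks; a small cache seeing many
# misses grows back, bounding the extra misses caused by resizing.
print(next_cache_size(64 * 1024, misses=10, miss_bound=100))   # 32768
print(next_cache_size(4 * 1024, misses=500, miss_bound=100))   # 8192
```

Comparing against a miss-bound rather than a miss rate keeps the hardware simple (a counter and a comparator) and gives direct control over the number of extra misses introduced per interval.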
Simulations using the SPEC95 benchmarks show that a 64K DRI i-cache reduces, on average, both the leakage energy-delay product and the average cache size by 62%, with less than 4% impact on execution time.