Prefetching techniques for STT-RAM based last-level cache in CMP systems
Prefetching is widely used in modern computer systems to mitigate the impact of long memory access latency, at the cost of extra memory and cache accesses. However, the efficacy of prefetching degrades significantly in memory hierarchies that use the emerging spin-transfer torque random access memory (STT-RAM) as the last-level cache (LLC), because of its long write access latency. In this work, we propose two orthogonal but complementary techniques to improve the prefetching efficacy of STT-RAM based LLC in chip multi-processor (CMP) systems, namely, request prioritization (RP) and hybrid local-global prefetch control (HLGPC). Simulation results show that combining these two techniques achieves a 6.5% to 11% system performance improvement and a 4.8% to 7.3% LLC energy saving in a quad-core system with a 2 MB to 8 MB STT-RAM based LLC, compared to a system with only basic prefetching. © 2014 IEEE.