Cache Equalizer: A Placement Mechanism for Chip Multiprocessor Distributed Shared Caches

Mohammad H. Hammoud, Sangyeun Cho, and Rami Melhem.

Proceedings of the 6th Int'l Conference on High Performance and Embedded Architectures and Compilers (HiPEAC), Heraklion, Crete, Greece, January 2011.

Abstract:

This paper describes Cache Equalizer (CE), a novel distributed cache management scheme for large-scale chip multiprocessors (CMPs). Our work is motivated by the large asymmetry in the usage of cache sets. CE decouples the physical locations of cache blocks from their addresses in order to reduce misses caused by destructive interference. Temporal pressure at the on-chip last-level cache is continuously collected at the granularity of groups of cache sets and periodically recorded at the memory controller to guide the placement process. An incoming block is then placed in the cache group that exhibits the minimum pressure. Simulation results using a full-system simulator demonstrate that CE achieves an average L2 miss rate reduction of 13.6% over a shared NUCA scheme, and a reduction of as much as 46.7%, for the benchmark programs we examined. Furthermore, our evaluations show that CE outperforms related cache designs.
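
The placement idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the class name `PressureTable`, its methods, the use of a simple event counter as the pressure metric, the epoch-based snapshot, and the lowest-index tie-break are all assumptions made for illustration. The abstract states only that pressure is collected per group, recorded periodically at the memory controller, and that an incoming block goes to the group with minimum pressure.

```python
from dataclasses import dataclass, field


@dataclass
class PressureTable:
    """Hypothetical per-group pressure record, as might be kept at the memory controller."""

    num_groups: int
    # Pressure accumulated during the current epoch (assumed metric: a plain event count).
    current: list = field(init=False)
    # Snapshot from the last completed epoch; placement decisions read this one.
    recorded: list = field(init=False)

    def __post_init__(self):
        self.current = [0] * self.num_groups
        self.recorded = [0] * self.num_groups

    def note_pressure(self, group: int, amount: int = 1) -> None:
        """Continuously collect pressure for one cache group."""
        self.current[group] += amount

    def end_epoch(self) -> None:
        """Periodically record the collected pressure and start a new epoch."""
        self.recorded = self.current
        self.current = [0] * self.num_groups

    def place_block(self) -> int:
        """Return the group with minimum recorded pressure (ties go to the lowest index)."""
        return min(range(self.num_groups), key=self.recorded.__getitem__)


# Usage: four groups with different pressure observed in one epoch.
table = PressureTable(num_groups=4)
for group, amount in [(0, 5), (1, 2), (2, 7), (3, 2)]:
    table.note_pressure(group, amount)
table.end_epoch()
chosen = table.place_block()  # group 1: tied with group 3 at pressure 2, lower index wins
```

Because placement here no longer depends on the block's address, a real design also needs a way to find the block again on later accesses; the sketch does not cover that.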