Accurately Modeling Superscalar Processor Performance with Reduced Trace

Kiyeon Lee and Sangyeun Cho.

Journal of Parallel and Distributed Computing, 73(4):509~521, April 2013.

Abstract:

Trace-driven simulation of out-of-order superscalar processors is far from straightforward. The dynamic nature of out-of-order superscalar processors combined with the static nature of traces can lead to large inaccuracies in the results when the traces contain only a subset of executed instructions for trace reduction. In this paper, we describe and comprehensively evaluate the pairwise dependent cache miss model (PDCM), a framework for fast and accurate trace-driven simulation of out-of-order superscalar processors. The model determines how to treat a cache miss with respect to other cache misses recorded in the trace by dynamically reconstructing the reorder buffer state during simulation and honoring the dependencies between the trace items. Our experimental results demonstrate that a PDCM-based simulator produces highly accurate simulation results (less than 3% error) with fast simulation speeds (62.5× on average) compared with an execution-driven simulator. Moreover, we observed that the proposed simulation method is capable of preserving a processor’s dynamic off-core memory access behavior and accurately predicting the relative performance change when a processor’s low-level memory hierarchy parameters are changed.