The SPEC benchmark suite consists of ten public-domain, non-trivial programs that are widely used to measure the performance of computer systems, particularly those in the Unix workstation market. These benchmarks were expressly chosen to represent real-world applications and were intended to be large enough to stress the computational and memory system resources of current-generation machines. The extent to which the SPECmark (the figure of merit obtained from running the SPEC benchmarks under certain specified conditions) accurately represents performance on real workloads is not well established; in particular, there is some question whether the benchmarks' memory referencing behavior (cache performance) is representative.

In this paper, we present measurements of miss ratios for the entire set of SPEC benchmarks for a variety of CPU cache configurations; this study extends earlier work that measured only the performance of the integer (C) SPEC benchmarks. We find that instruction cache miss ratios are generally very low, and that data cache miss ratios for the integer benchmarks are also quite low. Data cache miss ratios for the floating-point benchmarks are more in line with published measurements for real workloads. We believe that the discrepancy between the SPEC benchmark miss ratios and those observed elsewhere is partially due to the fact that the SPEC benchmarks are almost exclusively user-state CPU benchmarks, each run to completion as the single active user process. We therefore believe that SPECmark performance levels may not reflect system performance when there is multiprogramming, time-sharing, and/or significant operating system activity.
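
For readers unfamiliar with the metric, the miss ratio of a cache is the number of references that miss divided by the total number of references. The following is a minimal sketch of how such a ratio could be computed for one cache configuration from an address trace. It is not the simulator used in this study: the function name `miss_ratio`, its parameters, the LRU replacement policy, and the synthetic trace are all assumptions made for illustration.

```python
from collections import OrderedDict


def miss_ratio(addresses, cache_bytes, line_bytes, ways):
    """Simulate a set-associative cache with LRU replacement (an assumed
    policy) and return misses / references for the given address trace."""
    num_sets = cache_bytes // (line_bytes * ways)
    # One ordered dict per set; key order tracks recency (oldest first).
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    total = 0
    for addr in addresses:
        total += 1
        block = addr // line_bytes          # cache line containing addr
        lines = sets[block % num_sets]      # set this line maps to
        if block in lines:
            lines.move_to_end(block)        # hit: mark most recently used
        else:
            misses += 1
            if len(lines) >= ways:
                lines.popitem(last=False)   # evict least recently used
            lines[block] = None
    return misses / total if total else 0.0


# Hypothetical trace: two sequential passes over 1 KB, one 4-byte word at a time.
trace = list(range(0, 1024, 4)) * 2

# Sweep a few direct-mapped cache sizes with 16-byte lines.
for size in (512, 1024, 2048):
    print(size, miss_ratio(trace, cache_bytes=size, line_bytes=16, ways=1))
```

In this synthetic trace the 512-byte cache cannot hold the 1 KB working set, so the second pass misses as often as the first, while the larger caches miss only on the first pass. A real measurement would replace the trace with the recorded instruction or data references of a benchmark.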



