Vector processors have typically used vector registers, interleaved memory, and pipelined access to data to provide sufficient memory system performance. Caches have been used mainly for instructions, while data references are usually uncached, presumably partially because of the belief that there is insufficient data locality in vector workloads. In this study we use memory address traces from Cray X-MP and Ardent Titan machines to examine both reference locality and cache performance in a vector processing environment. Many of the Titan traces in particular are from real vectorized applications which reference large amounts of data. We have found that vector references contain somewhat less temporal locality, but large amounts of spatial locality compared to instruction and scalar references. Cache miss ratios are found to be comparable to those measured and published previously for various non-vectorized workloads. We provide analyses of trace behavior with regard to parameters of interest to cache designers. Calculations based on our measured miss ratios indicate that caches will improve average access times, which in turn can be expected to translate into significant improvements in machine performance.




Download Full History