Understanding the characteristics of physical I/O traffic is increasingly important as the performance gap between processors and disk-based storage continues to widen. Moreover, recent advances in technology, coupled with market demands, have led to several new and exciting developments in storage, including network storage, storage utilities, and intelligent self-optimizing storage. In this paper, we empirically examine the I/O traffic of a wide range of real PC and server workloads with the intent of understanding how well they will respond to these new storage developments. As part of our analysis, we compare our results with historical data and reexamine rules of thumb that have been widely used for designing computer systems. Our results show that there is a strong need to focus on improving I/O performance. We find that the I/O traffic is bursty and appears to exhibit self-similar characteristics. In addition, our analysis indicates that there is little cross-correlation in traffic volume among the server workloads, which suggests that aggregating these workloads will likely help to smooth out the traffic and enable more efficient utilization of resources. We also find considerable potential for harnessing "free" system resources for purposes such as automatic optimization of disk block layout. In general, the characteristics of the I/O traffic are relatively insensitive to the amount of upstream caching, and our qualitative results continue to apply when the upstream cache is increased in size.




