We introduce general methodologies for benchmarking the availability and maintainability of computer systems. Our methodologies are based on fault injection, which we use to purposefully compromise availability and to bring systems to a state where maintenance is required. Our availability benchmarks leverage existing performance benchmarks for workload generation and data collection, measure availability in terms of the variation in quality of service over time, and can produce results either as detail-rich graphical presentations or as distilled numerical summaries. Our maintainability benchmarks characterize several different axes of maintainability, including the time, impact, and learning curve associated with maintenance tasks, and rely on human experiments to capture the subtle interactions between system and administrator.
We demonstrate and evaluate our methodologies by applying them to measure the availability and maintainability of the software RAID systems shipped with Red Hat Linux 6.0, Solaris 7 for Intel Architectures, and Windows 2000 Server. We find that the availability benchmarks are powerful enough not only to quantify the impact of various failure conditions on the availability of these systems, but also to unearth their undocumented design philosophies with respect to transient errors and recovery policy. Similarly, the maintainability benchmarks draw clear distinctions between the systems on the time and learning curve metrics, and also identify key factors and design decisions that influence the maintainability of the three systems.