Decentralized systems require new benchmarks and new benchmarking techniques. We propose a general methodology for benchmarking the performability of one class of decentralized system: peer-to-peer applications built on top of distributed hash tables (DHTs). Furthermore, we argue that benchmarks for decentralized systems must be designed and implemented with a concern for scalability and robustness similar to that of the systems they are designed to benchmark, which implies a need for decentralized load generation, fault injection, and metric collection. These criteria lead us to propose a benchmark implementation that uses a DHT to publish the faultload description and to store collected metrics, and that uses a DHT-based relational query engine to analyze benchmark results. Finally, we argue that the fault injection and monitoring mechanisms required to run such benchmarks are reusable for online robustness testing, problem detection, and problem diagnosis, and that they should therefore be provided as infrastructure services.