This paper describes the design and implementation of SWORD, a scalable resource discovery service for wide-area distributed systems. SWORD locates a set of machines matching user-specified constraints on both static and dynamic node characteristics, including both single-node and inter-node characteristics. We explore a range of system architectures to determine the appropriate tradeoffs for building a scalable, highly available, and efficient resource discovery infrastructure. We describe: i) techniques for efficient handling of multi-attribute range queries that describe application resource requirements; ii) an integrated mechanism for scalably measuring and querying inter-node attributes without requiring O(n^2) time and space; iii) a mechanism for users to encode a restricted form of utility function indicating how the system should filter candidate nodes when more are available than the user needs, and an optimizer that performs this node selection based on per-node and inter-node characteristics; and iv) working prototypes of a variety of architectural alternatives -- running the gamut from centralized to fully distributed -- along with a detailed performance evaluation. SWORD is currently deployed as a continuously running service on PlanetLab. We find that SWORD offers good performance, scalability, and robustness in both an emulated environment and a real-world deployment.



