Decentralizing control among the agents that manage resources in very large distributed computer systems raises significant problems. Probably the most difficult is that agents must make good, fast, coordinated decisions based on uncertain and differing views of the global system state. Our thesis is that, despite such problems, effective decentralized control systems can be built on a set of seven design principles, which we describe. We also apply these principles to the problem of decentralized load balancing and provide results from trace-driven simulation experiments.

Our approach is knowledge-based: an agent uses heuristics and domain-specific knowledge about its own behavior and that of other agents to make good decisions. We present a powerful technique by which agents quantify the uncertainty of the information they hold and, based on these quantifications, make better decisions. Agents adapt their decision making to changing conditions by observing the system at infrequent (to minimize communication overhead) and opportune times, and by relying on their inference capabilities between observations. To minimize the occurrence of mutually conflicting decisions, we introduce a technique called SPACE/TIME Randomization, which provides implicit coordination of agents and requires minimal communication. The solutions we present combine extensions of decision-theoretic techniques with artificial intelligence techniques.
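
To make the two ideas above concrete, the following is a minimal illustrative sketch and not the algorithm developed in this work. It assumes that an agent discounts a host's reported spare capacity geometrically with the age of the report (one simple way to quantify uncertainty), and that agents avoid conflicting decisions by randomizing both which host they pick and when they act. All names, the decay model, and the parameters (`discounted_spare_capacity`, `choose_target`, `choose_delay`, `decay`, `max_delay`) are hypothetical.

```python
import random


def discounted_spare_capacity(reported_load, age, capacity=1.0, decay=0.5):
    """Spare capacity implied by a load report, shrunk as the report ages.

    Assumption: confidence in a report falls geometrically with its age,
    so a fresh report (age 0) is trusted fully and older ones less so.
    """
    spare = max(capacity - reported_load, 0.0)
    confidence = decay ** age
    return spare * confidence


def choose_target(reports, rng):
    """Pick a host at random, weighted by discounted spare capacity.

    reports maps host name -> (reported_load, age_of_report).
    Randomizing the choice makes it unlikely that many agents holding
    similar views all send work to the same apparently idle host.
    Returns None when no host appears to have spare capacity.
    """
    hosts = sorted(reports)
    weights = [discounted_spare_capacity(*reports[h]) for h in hosts]
    if sum(weights) == 0:
        return None
    return rng.choices(hosts, weights=weights, k=1)[0]


def choose_delay(max_delay, rng):
    """Pick a random wait before acting, so agents do not act in lockstep."""
    return rng.uniform(0.0, max_delay)


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make the demo repeatable
    reports = {"a": (0.2, 0), "b": (0.2, 3), "c": (0.9, 0)}
    print(choose_target(reports, rng), choose_delay(2.0, rng))
```

In this sketch, hosts `a` and `b` report the same load, but `b`'s report is older and therefore carries less weight. No messages are exchanged to coordinate the agents; the randomness alone spreads their choices across hosts and over time.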




