Hostname: page-component-586b7cd67f-dsjbd Total loading time: 0 Render date: 2024-11-22T15:54:53.473Z Has data issue: false hasContentIssue false

A unifying framework for iterative approximate best-response algorithms for distributed constraint optimization problems1

Published online by Cambridge University Press:  01 December 2011

Archie C. Chapman*
Affiliation:
School of Electronics and Computer Science, University of Southampton, Highfield, Southampton SO17 1BJ, UK; email: [email protected], [email protected], [email protected]
Alex Rogers*
Affiliation:
School of Electronics and Computer Science, University of Southampton, Highfield, Southampton SO17 1BJ, UK; email: [email protected], [email protected], [email protected]
Nicholas R. Jennings*
Affiliation:
School of Electronics and Computer Science, University of Southampton, Highfield, Southampton SO17 1BJ, UK; email: [email protected], [email protected], [email protected]
David S. Leslie*
Affiliation:
Department of Mathematics, University of Bristol, University Walk, Bristol BS8 1TW, UK; e-mail: [email protected]

Abstract

Distributed constraint optimization problems (DCOPs) are important in many areas of computer science and optimization. In a DCOP, each variable is controlled by one of many autonomous agents, who together have the joint goal of maximizing a global objective function. A wide variety of techniques have been explored to solve such problems, and here we focus on one of the main families, namely iterative approximate best-response algorithms used as local search algorithms for DCOPs. We define these algorithms as those in which, at each iteration, agents communicate only the states of the variables under their control to their neighbours on the constraint graph, and that reason about their next state based on the messages received from their neighbours. These algorithms include the distributed stochastic algorithm and stochastic coordination algorithms, the maximum-gain messaging algorithms, the families of fictitious play and adaptive play algorithms, and algorithms that use regret-based heuristics. This family of algorithms is commonly employed in real-world systems, as they can be used in domains where communication is difficult or costly, where it is appropriate to trade timeliness off against optimality, or where hardware limitations render complete or more computationally intensive algorithms unusable. However, until now, no overarching framework has existed for analyzing this broad family of algorithms, resulting in similar and overlapping work being published independently in several different literatures. The main contribution of this paper, then, is the development of a unified analytical framework for studying such algorithms. This framework is built on our insight that when formulated as non-cooperative games, DCOPs form a subset of the class of potential games. This result allows us to prove convergence properties of iterative approximate best-response algorithms developed in the computer science literature using game-theoretic methods (which also shows that such algorithms can also be applied to the more general problem of finding Nash equilibria in potential games), and, conversely, also allows us to show that many game-theoretic algorithms can be used to solve DCOPs. By so doing, our framework can assist system designers by making the pros and cons of, and the synergies between, the various iterative approximate best-response DCOP algorithm components clear.

Type
Articles
Copyright
Copyright © Cambridge University Press 2011

Access options

Get access to the full version of this content by using one of the access options below. (Log in options will check for institutional or personal access. Content may require purchase if you do not have access.)

References

Aji, S. M., McEliece, R. J. 2000. The generalized distributive law. IEEE Transactions on Information Theory 46, 325343.CrossRefGoogle Scholar
Anderson, S. P., de Palma, A., Thisse, J. 1992. Discrete Choice Theory of Product Differentiation. MIT Press.CrossRefGoogle Scholar
Apt, K. R. 2003. Principles of Constraint Programming. Cambridge University Press.CrossRefGoogle Scholar
Arshad, M., Silaghi, M. C. 2003. Distributed simulated annealing and comparison to DSA. In Proceedings of the 4th International Workshop on Distributed Constraint Reasoning (DCR–03), Acapulco, Mexico.Google Scholar
Arslan, G., Marden, J. R., Shamma, J. S. 2007. Autonomous vehicle-target assignment: a game theoretical formulation. ASME Journal of Dynamic Systems, Measurement and Control 129, 584596.CrossRefGoogle Scholar
Aumann, R. J. 1959. Acceptable points in general cooperative n-person games. In Contributions to the Theory of Games IV, Tucker, A. W. & Luce, R. D. (eds). Princeton University Press, 287324.Google Scholar
Benaïm, M., Hofbauer, J., Sorin, S. 2005. Stochastic approximation and differential inclusions. SIAM Journal of Control and Optimisation 44(1), 328348.CrossRefGoogle Scholar
Benaïm, M., Hofbauer, J., Sorin, S. 2006. Stochastic approximations and differential inclusions, part II: applications. Mathematics of Operations Research 31(4), 673695.CrossRefGoogle Scholar
Berger, U. 2007. Brown's original fictitious play. Journal of Economic Theory 135(1), 572578.CrossRefGoogle Scholar
Blum, A., Mansour, Y. 2007. From external to internal regret. Journal of Machine Learning Research 8, 13071324.Google Scholar
Bowring, E., Pearce, J., Portway, C., Jain, M., Tambe, M. 2008. On k–optimal distributed constraint optimization algorithms: new bounds and algorithms. In Proceedings of the 7th International Conference on Autonomous Agents and Multiagent Systems (AAMAS-08), Estoril, Portugal, 607–614.Google Scholar
Brown, G. W. 1951. Iterative solution of games by fictitious play. In Activity Analysis of Production and Allocation, Koopmans, T. C. (ed.). Wiley, 374376.Google Scholar
Chapman, A., Micillo, R. A., Kota, R., Jennings, N. 2009. Decentralised dynamic task allocation: a practical game-theoretic approach. In Proceedings of the 8th International Conference on Autonomous Agents and Multiagent Systems (AAMAS-09), 915922.Google Scholar
Cooper, M., de Givry, S., Schiex, T. 2007. Optimal soft arc consistency. In Proceedings of the 20th Internation Joint Conference on Artificial Intelligence (IJCAI-07), 6873.Google Scholar
Crawford, V. P. 1995. Adaptive dynamics in coordination games. Econometrica 63, 103143.CrossRefGoogle Scholar
Farinelli, A., Rogers, A., Petcu, A., Jennings, N. R. 2008. Decentralised coordination of low-power embedded devices using the max–sum algorithm. In Proceedings of the 7th International Conference on Autonomous Agents and Multiagent Systems (AAMAS-08), 639–646.Google Scholar
Fitzpatrick, S., Meertens, L. 2003. Distributed coordination through anarchic optimization. In: Distributed Sensor Networks: A Multiagent Perspective, Lesser, V., Ortiz Jr, C. L. & Tambe, M. (eds), Kluwer Academic Publishers, 257295.CrossRefGoogle Scholar
Fudenberg, D., Kreps, D. 1993. Learning mixed equilibria. Games and Economic Behavior 5, 320367.CrossRefGoogle Scholar
Fudenberg, D., Levine, D. K. 1995. Consistency and cautious fictitious play. Journal of Economic Dynamics and Control 19, 10651089.CrossRefGoogle Scholar
Fudenberg, D., Levine, D. K. 1998. The Theory of Learning in Games. MIT Press.Google Scholar
Gerkey, B. P., Mataric, M. J. 2002. Sold!: auction methods for multirobot coordination. IEEE Transactions on Robotics and Automation 18(5), 758768.CrossRefGoogle Scholar
Grimmett, G., Stirzaker, D. 2001. Probability and Random Processes, 3rd edn, Oxford University Press.CrossRefGoogle Scholar
Hart, S., Mas-Colell, A. 2000. A simple adaptive procedure leading to correlated equilibrium. Econometrica 68, 11271150.CrossRefGoogle Scholar
Hart, S., Mas-Colell, A. 2001. A reinforcement procedure leading to correlated equilibrium. In Economic Essays: A Festschrift for Werner Hildenbrand, Debreu, G., Neuefeind, W. & Trockel, W. (eds), Springer, 181200.CrossRefGoogle Scholar
Hayajneh, M., Abdallah, C. T. 2004. Distributed joint rate and power control game-theoretic algorithms for wireless data. IEEE Communications Letters 8, 511513.CrossRefGoogle Scholar
Heikkinen, T. 2006. A potential game approach to distributed power control and scheduling. Computer Networks 50, 22952311.CrossRefGoogle Scholar
Hirayama, K., Yokoo, M. 2005. The distributed breakout algorithms. Artificial Intelligence 161(1–2), 89115.CrossRefGoogle Scholar
Hofbauer, J., Sandholm, W. H. 2002. On the global convergence of stochastic fictitious play. Econometrica 70, 22652294.CrossRefGoogle Scholar
Kho, J., Rogers, A., Jennings, N. 2009. Decentralised control of adaptive sampling in wireless sensor networks. ACM Transactions on Sensor Networks 5(3), 19:119:35.CrossRefGoogle Scholar
Kirkpatrick, S., Gelatt, C. D., Vecchi, M. P. 1983. Optimisation by simulated annealing. Science 220, 671680.CrossRefGoogle Scholar
Kitano, H., Todokoro, S., Noda, I., Matsubara, H., Takahashi, T., Shinjou, A., Shimada, S. 1999. Robocup rescue: search and rescue in large-scale disaster as a domain for autonomous agents research. In Proceedings of the IEEE International Conference on System, Man, and Cybernetics (SMC-99), 6, Tokyo, Japan, 739–743.Google Scholar
Krainin, M., An, B., Lesser, V. 2007. An application of automated negotiation to distributed task allocation. In Proceedings of the 2007 IEEE/WIC/ACM International Conference on Intelligent Agent Technology (IAT-07). IEEE Computer Society Press, 138–145.Google Scholar
Kschischang, F. R., Frey, B. J., Loeliger, H. 2001. Factor graphs and the sum-product algorithm. IEEE Transactions on Information Theory 47, 498519.CrossRefGoogle Scholar
La Mura, P., Pearson, M. R. 2002. Simulated annealing of game equilibria: a simple adaptive procedure leading to Nash equilibrium. In International Workshop on the Logic and Strategy of Distributed Agents, Trento, Italy.Google Scholar
Leslie, D. S., Collins, E. J. 2006. Generalised weakened fictitious play. Games and Economic Behavior 56, 285298.CrossRefGoogle Scholar
Maheswaran, R. T., Pearce, J. P., Tambe, M. 2004. Distributed algorithms for DCOP: a graphical-game-based approach. In Proceedings of the 17th International Conference on Parallel and Distributed Computing Systems (PDCS-04), San Francisco, CA, USA, 432–439.Google Scholar
Maheswaran, R. T., Pearce, J. P., Tambe, M. 2005. A family of graphical-game-based algorithms for distributed constraint optimization problems. In Coordination of Large-Scale Multiagent Systems, Springer-Verlag, 127–146.Google Scholar
Mailler, R., Lesser, V. 2006. Asynchronous partial overlay: a new algorithm for solving distributed constraint satisfaction problems. Journal of Artificial Intelligence Research 25, 529576.CrossRefGoogle Scholar
Marden, J. R., Arslan, G., Shamma, J. S. 2009a. Connections between cooperative control and potential games illustrated on the consensus problem. IEEE Transactions on Systems, Man and Cybernetics, Part B: Cybernetics 39, 13931407.CrossRefGoogle Scholar
Marden, J. R., Arslan, G., Shamma, J. S. 2009b. Joint strategy fictitious play with inertia for potential games. IEEE Transaction on Automatic Control, in press.CrossRefGoogle Scholar
Matthews, G. M., Durrant-Whyte, H. F. 2006. Scalable decentralised control for multi-platform reconnaissance and information gathering tasks. In Proceedings of the 9th International Conference on Information Fusion (Fusion'06), Florence, Italy.CrossRefGoogle Scholar
Metropolis, N., Rosenbluth, A., Rosenbluth, M., Teller, A., Teller, E. 1953. Equation of state calculations by fast computing machines. Journal of Chemical Physics 21, 10871092.CrossRefGoogle Scholar
Mezzetti, C., Friedman, J. W. 2001. Learning in games by random sampling. Journal of Economic Theory 98(1), 5584.Google Scholar
Modi, P. J., Shen, W., Tambe, M., Yokoo, M. 2005. ADOPT: asynchronous distributed constraint optimization with quality guarantees. Artificial Intelligence 161(1–2), 149180.CrossRefGoogle Scholar
Monderer, D., Shapley, L. S. 1996a. Fictitious play property for games with identical interests. Journal of Economic Theory 68, 258265.CrossRefGoogle Scholar
Monderer, D., Shapley, L. S. 1996b. Potential games. Games and Economic Behavior 14, 124143.CrossRefGoogle Scholar
Morris, P. 1993. The breakout method for escaping from local minima. In Proceedings of the 11th National Conference on Artificial Intelligence (AAAI '93), Washington, DC, USA, 40–45.Google Scholar
Papadimitriou, C. H., Roughgarden, T. 2008. Computing correlated equilibria in multi-player games. Journal of the ACM 55(3), 14:114:29.CrossRefGoogle Scholar
Pearce, J. P., Tambe, M. 2007. Quality guarantees on k-optimal solutions for distributed constraint optimisation problems. In Proceedings of the 20th Internation Joint Conference on Artificial Intelligence (IJCAI-07), Hyderabad, India, 1446–1451.Google Scholar
Petcu, A., Faltings, B. 2005. DPOP: a scalable method for multiagent constraint optimization. In Proceedings of the 19th International Joint Conference on Artificial Intelligence (IJCAI-05), Edinburgh, Scotland, 266–271.Google Scholar
Press, W. H., Flannery, B. P., Teukolsky, S. A., Vetterling, W. T. 1992. Numerical Recipes: The Art of Scientific Computing. Cambridge University Press.Google Scholar
Robinson, J. 1951. An iterative method of solving a game. Annals of Mathematics 54, 296301.CrossRefGoogle Scholar
Rogers, A., Corkill, D. D., Jennings, N. R. 2009. Agent technologies for sensor networks. IEEE Intelligent Systems 24(2), 1317.CrossRefGoogle Scholar
Roughgarden, T. 2005. Selfish Routing and the Price of Anarchy. MIT Press.Google Scholar
Schiex, T., Fargier, H., Verfaillie, G. 1995. Valued constraint satisfaction problems: Hard and easy problems. In Proceedings of the 14th International Joint Conference on Artificial Intelligence (IJCAI-95), 631–639.Google Scholar
Singh, S., Jaakkola, T., Littman, M. L., Szepesvari, C. 2000. Convergence results for single-step on—policy reinforcement—learning algorithms. Machine Learning 39, 287308.CrossRefGoogle Scholar
Stranders, R., Farinelli, A., Rogers, A., Jennings, N. R. 2009. Decentralised coordination of mobile sensors using the max-sum algorithm. In Proceedings of the 21st International Joint Conference on Artifical Intelligence (IJCAI-09), 2009.Google Scholar
Stranjak, A., Dutta, P. S., Ebden, M., Rogers, A., Vytelingum, P. 2008. A multi-agent simulation system for prediction and scheduling of aero engine overhaul. In Proceedings of the 7th International Conference on Autonomous Agents and Multiagent Systems (AAMAS-08).Google Scholar
Tel, G. 2000. Introduction to Distributed Algorithms. Cambridge University Press.CrossRefGoogle Scholar
Tumer, K., Wolpert, D. H. (eds). 2004. Collectives and the Design of Complex Systems. Springer.CrossRefGoogle Scholar
van Leeuwen, P., Hesselink, H., Rohling, J. 2002. Scheduling aircraft using constraint satisfaction. In Electronic Notes in Theoretical Computer Science. Elsevier, 252268.Google Scholar
Weiss, Y. 2000. Correctness of local probability propagation in graphical models with loops. Neural Computation 12(1), 141.CrossRefGoogle Scholar
Xu, Y., Scerri, P., Yu, B., Okamoto, S., Lewis, M., Sycara, K. 2005. An integrated token-based algorithm for scalable coordination. In Proceedings of the 4th International Conference on Autonomous Agents and Multiagent Systems (AAMAS-05), 407–414.Google Scholar
Yokoo, M., Hirayama, K. 1996. Distributed breakout algorithm for solving algorithm for solving distributed constraint satisfaction and optimization problems. In Proceedings of the 2nd International Conference on Multiagent Systems (ICMAS '96), 401–408.Google Scholar
Young, H. P. 1998. Individual Strategy and Social Structure: An Evolutionary Theory of Institutions. Princeton University Press.CrossRefGoogle Scholar
Young, H. P. 1993. The evolution of conventions. Econometrica 61, 5784.CrossRefGoogle Scholar
Zhang, W., Xing, Z. 2002. Distributed breakout vs. distributed stochastic: a comparative evaluation on scan scheduling. In Proceedings of the AAMAS-02 workshop on Distributed Constraint Reasoning, 192–201.Google Scholar
Zhang, W., Wang, G., Xing, Z., Wittenburg, L. 2005. Distributed stochastic search and distributed breakout: properties, comparison and applications to constraint optimization problems in sensor networks. Artificial Intelligence 161, 5587.CrossRefGoogle Scholar