Multi-Actor Markov Decision Processes

Hyun-Soo Ahn; Rhonda Righter

doi:10.1239/jap/1110381367

Part of: Operations research and management science

Published online by Cambridge University Press: 14 July 2016

Hyun-Soo Ahn* (University of Michigan)
Rhonda Righter** (University of California, Berkeley)

* Postal address: Operations and Management Science, University of Michigan Business School, 701 Tappan Street, Ann Arbor, MI 48109-1234, USA. Email address: [email protected]
** Postal address: Department of Industrial Engineering and Operations Research, University of California, Berkeley, CA 94720, USA. Email address: [email protected]

Abstract

We give a very general reformulation of multi-actor Markov decision processes and show that there is a tendency for the actors to take the same action whenever possible. This considerably reduces the complexity of the problem, either facilitating numerical computation of the optimal policy or providing a basis for a heuristic.
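
As a rough illustration of the complexity reduction, the sketch below counts joint action profiles under an assumption not stated in the abstract: each of n actors chooses independently from the same finite set of m actions. The action labels and function names are hypothetical, and this is not the authors' formulation. It only shows that if attention can be restricted to profiles in which all actors act alike, the number of candidates per state drops from m^n to m.

```python
from itertools import product


def joint_actions(actions, n_actors):
    """All joint action profiles when each actor chooses independently."""
    return list(product(actions, repeat=n_actors))


def common_actions(actions, n_actors):
    """Only the profiles in which every actor takes the same action."""
    return [(a,) * n_actors for a in actions]


# Hypothetical action labels for a flexible-server setting (assumption).
actions = ["serve_class_1", "serve_class_2", "idle"]
n_actors = 4

full = joint_actions(actions, n_actors)   # m ** n profiles
same = common_actions(actions, n_actors)  # m profiles

print(len(full), len(same))
```

With three actions and four actors, the unrestricted search covers 3^4 = 81 profiles per state, while the common-action subset has only 3. Whether the restricted subset contains an optimal action in a given state depends on the model; the abstract claims only a tendency "whenever possible".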

Keywords

Markov decision process; multiarmed bandit; flexible server

MSC classification

Secondary: 90C40 (Markov and semi-Markov decision processes); 90B22 (Queues and service)

Type: Research Papers
Information: Journal of Applied Probability, Volume 42, Issue 1, March 2005, pp. 15–26

DOI: https://doi.org/10.1239/jap/1110381367
Copyright: © Applied Probability Trust 2005

