
Spinning plates and squad systems: policies for bi-directional restless bandits

Published online by Cambridge University Press:  01 July 2016

K. D. Glazebrook*
Affiliation:
Lancaster University
C. Kirkbride∗∗
Affiliation:
Lancaster University
D. Ruiz-Hernandez∗∗∗
Affiliation:
Universitat Pompeu Fabra
∗ Postal address: Department of Mathematics and Statistics, Lancaster University, Lancaster LA1 4YF, UK. Email address: [email protected]
∗∗ Postal address: Department of Management Science, Lancaster University, Lancaster LA1 4YX, UK.
∗∗∗ Department of Economics and Business, Universitat Pompeu Fabra, Barcelona, E-08005, Spain.

Abstract


This paper concerns two families of Markov decision problems that fall within the class of (bi-directional) restless bandits, an intractable class of decision processes introduced by Whittle. The spinning plates problem concerns the optimal management of a portfolio of reward-generating assets whose yields grow with investment but otherwise tend to decline. In the model of asset exploitation called the squad system, the yield from an asset tends to decline when it is used but will recover when the asset is at rest. In all cases, simply stated conditions are given that guarantee indexability of the problem, together with conditions necessary and sufficient for its strict indexability. The index heuristics for asset activation that emerge from the analysis are assessed numerically and found to perform very strongly.

Type
General Applied Probability
Copyright
Copyright © Applied Probability Trust 2006 

References

Ansell, P. S., Glazebrook, K. D., Niño-Mora, J. and O'Keeffe, M. (2003). Whittle's index policy for a multi-class queueing system with convex holding costs. Math. Meth. Operat. Res. 57, 21–39.
Gittins, J. C. (1979). Bandit processes and dynamic allocation indices. With discussion. J. R. Statist. Soc. Ser. B 41, 148–177.
Gittins, J. C. (1989). Multi-Armed Bandit Allocation Indices. John Wiley, Chichester.
Glazebrook, K. D., Lumley, R. R. and Ansell, P. S. (2003). Index heuristics for multi-class M/G/1 systems with non-preemptive service and convex holding costs. Queueing Systems 45, 81–111.
Glazebrook, K. D., Niño-Mora, J. and Ansell, P. S. (2002). Index policies for a class of discounted restless bandits. Adv. Appl. Prob. 34, 754–774.
Niño-Mora, J. (2001a). PCL-indexable restless bandits: diminishing marginal returns, optimal marginal reward rate index characterization, and a tiring–recovery model. Unpublished manuscript.
Niño-Mora, J. (2001b). Restless bandits, partial conservation laws and indexability. Adv. Appl. Prob. 33, 76–98.
Niño-Mora, J. (2002). Dynamic allocation indices for restless projects and queueing admission control: a polyhedral approach. Math. Program. 93, 361–413.
Papadimitriou, C. H. and Tsitsiklis, J. N. (1999). The complexity of optimal queueing network control. Math. Operat. Res. 24, 293–305.
Puterman, M. L. (1994). Markov Decision Processes: Discrete Stochastic Dynamic Programming. John Wiley, New York.
Tijms, H. C. (1994). Stochastic Models: An Algorithmic Approach. John Wiley, New York.
Weber, R. R. and Weiss, G. (1990). On an index policy for restless bandits. J. Appl. Prob. 27, 637–648.
Weber, R. R. and Weiss, G. (1991). Addendum to ‘On an index policy for restless bandits’. Adv. Appl. Prob. 23, 429–430.
Whittle, P. (1988). Restless bandits: activity allocation in a changing world. In A Celebration of Applied Probability (J. Appl. Prob. Spec. Vol. 25A), Applied Probability Trust, Sheffield, pp. 287–298.