On approximately optimal index strategies for generalised arm problems

Published online by Cambridge University Press:  14 July 2016

N. A. Fay*
Affiliation:
University of Durham
J. C. Walrand**
Affiliation:
University of California, Berkeley
* Postal address: Department of Mathematical Sciences, Science Laboratories, South Road, University of Durham, Durham, DH1 3LE, UK.
** Postal address: Department of Electrical Engineering and Computer Science, College of Engineering, University of California at Berkeley, Berkeley, California, CA 94720, USA.

Abstract

Nash has extended Gittins' work to describe optimal strategies for a class of generalised bandit problems. Here we use a forwards induction argument to analyse ε-optimal strategies for generalised bandit problems. An evaluation procedure for such problems is described; this may be used to analyse models in research planning and stochastic scheduling.

Type
Research Papers
Copyright
Copyright © Applied Probability Trust 1991 

References

Bergman, S. W. and Gittins, J. C. (1985) Statistical Methods for Pharmaceutical Research Planning. Marcel Dekker, New York.
Fay, N. A. and Glazebrook, K. D. (1989) A general model for the scheduling of alternative stochastic jobs which may fail. Prob. Eng. Inf. Sci. 3, 199–221.
Gittins, J. C. (1979) Bandit processes and dynamic allocation indices. J. R. Statist. Soc. B 41, 148–177.
Gittins, J. C. (1989) Multi-armed Bandit Allocation Indices. Wiley, Chichester.
Gittins, J. C. and Jones, D. M. (1974) A dynamic allocation index for the sequential design of experiments. In Progress in Statistics, ed. Gani, J. et al., North-Holland, Amsterdam, 241–266.
Glazebrook, K. D. (1982) On the evaluation of sub-optimal strategies for families of alternative bandit processes. J. Appl. Prob. 19, 716–722.
Glazebrook, K. D. (1987) Sensitivity analysis for stochastic scheduling problems. Math. Operat. Res. 12, 205–223.
Glazebrook, K. D. and Fay, N. A. (1987) On the scheduling of alternative stochastic jobs on a single machine. Adv. Appl. Prob. 19, 955–973.
Glazebrook, K. D. and Fay, N. A. (1988) Evaluating strategies for generalised bandit problems. Int. J. Syst. Sci. 19, 1605–1613.
Katehakis, M. N. and Veinott, A. F. (1987) The multi-armed bandit problem: decomposition and computation. Math. Operat. Res. 12, 262–268.
Nash, P. (1980) A generalised bandit problem. J. R. Statist. Soc. B 42, 165–169.
Ross, S. M. (1970) Applied Probability Models with Optimisation Applications. Holden-Day, San Francisco.
Varaiya, P., Walrand, J. and Buyukkoc, C. (1985) Extensions of the multi-armed bandit problem: the discounted case. IEEE Trans. Autom. Control 30, 426–439.