Randomised allocation of treatments in sequential trials

Published online by Cambridge University Press:  01 July 2016

John Bather*
Affiliation: University of Sussex
* Postal address: School of Mathematical and Physical Sciences, The University of Sussex, Falmer, Brighton BN1 9QH, U.K.

Abstract

Given a finite number of different experiments with unknown probabilities p_1, p_2, ..., p_k of success, the multi-armed bandit problem is concerned with maximising the expected number of successes in a sequence of trials. There are many policies which ensure that the proportion of successes converges to p = max(p_1, p_2, ..., p_k) in the long run. This property is established for a class of decision procedures which rely on randomisation, at each stage, in selecting the experiment for the next trial. Further, it is suggested that some of these procedures might perform well over any finite sequence of trials.
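
The long-run property described in the abstract can be illustrated with a small simulation. The sketch below is not the procedure analysed in the paper, whose details are not given on this page. It substitutes a simple randomised rule (epsilon-greedy with a decaying exploration probability) purely as an illustration. The function name `simulate`, the schedule `k / sqrt(t)` and the example probabilities are all assumptions made for this example.

```python
import random


def simulate(probs, n_trials, seed=0):
    """Run a randomised allocation rule on Bernoulli experiments.

    Illustrative only: a decaying epsilon-greedy rule, NOT the paper's
    procedure. `probs` holds the success probabilities, which the
    allocation rule itself never reads directly.
    """
    rng = random.Random(seed)
    k = len(probs)
    pulls = [0] * k   # number of trials allocated to each experiment
    wins = [0] * k    # number of successes observed on each experiment
    total = 0         # total successes over all trials

    for t in range(1, n_trials + 1):
        # Assumed schedule: the exploration probability shrinks to zero,
        # but slowly enough that every experiment keeps being sampled.
        eps = min(1.0, k / t ** 0.5)
        if 0 in pulls or rng.random() < eps:
            # Randomise: pick an experiment uniformly at random.
            arm = rng.randrange(k)
        else:
            # Otherwise pick the best observed success rate so far.
            arm = max(range(k), key=lambda i: wins[i] / pulls[i])

        success = rng.random() < probs[arm]
        pulls[arm] += 1
        wins[arm] += int(success)
        total += int(success)

    return total / n_trials, pulls


if __name__ == "__main__":
    rate, pulls = simulate([0.2, 0.5, 0.8], 20000, seed=1)
    print("proportion of successes:", rate)
    print("trials per experiment:", pulls)
```

With success probabilities 0.2, 0.5 and 0.8, the observed proportion of successes should approach 0.8 as the number of trials grows, because the fraction of trials spent on randomised exploration tends to zero while no experiment is ever abandoned.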

Type: Research Article
Copyright: © Applied Probability Trust 1980
