Continuous-time allocation indices and their discrete-time approximation

W. J. R. Eplett

doi:10.2307/1427185

Continuous-time allocation indices and their discrete-time approximation

Published online by Cambridge University Press: 01 July 2016

W. J. R. Eplett

Show author details

W. J. R. Eplett*: Affiliation:
University of Oxford
*: ∗Present address: Department of Statistics, University of Rochester, River Campus, Rochester, NY 14627, USA.

Article contents

Abstract
References

Get access

Rights & Permissions

Abstract

The theory of allocation indices for defining the optimal policy in multi-armed bandit problems developed by Gittins is presented in the continuous-time case where the projects (or ‘arms’) are strong Markov processes. Complications peculiar to the continuous-time case are discussed. This motivates investigation of whether approximation of the continuous-time problems by discrete-time versions provides a valid technique with convergent allocation indices and optimal expected rewards. Conditions are presented under which the convergence holds.

Keywords

MARKOV PROCESSES DYNAMIC ALLOCATION PROBLEM OPTIMAL STOPPING CONVERGENCE DIFFUSIONS AND MARKOV CHAINS

Type: Research Article
Information: Advances in Applied Probability , Volume 18 , Issue 3 , September 1986 , pp. 724 - 746

DOI: https://doi.org/10.2307/1427185 [Opens in a new window]
Copyright: Copyright © Applied Probability Trust 1986

Access options

Get access to the full version of this content by using one of the access options below. (Log in options will check for institutional or personal access. Content may require purchase if you do not have access.)

Article purchase

Temporarily unavailable

References

Bensoussan, A. and Lions, J. L. (1978) Applications des Inéquations variationnelles en Contrôle stochastique. Dunod, Paris.Google Scholar

El Karoui, N. (1981) Les aspects probabilistes du contrôle stochastique. In Springer Lecture Notes in Mathematics 876, 74–238.Google Scholar

Gittins, J. C. (1979) Bandit processes and dynamic allocation indices (with discussion). J. R. Statist. Soc. B 41, 148–177.Google Scholar

Gittins, J. C. and Jones, D. M. (1974) A dynamic allocation index for the sequential design of experiments. In Progress in Statistics ed. Gani, J., North Holland, Amsterdam, 241–266.Google Scholar

Ikeda, N. and Watanabe, S. (1981) Stochastic Differential Equations and Diffusion Processes. North-Holland, Oxford.Google Scholar

Karatzas, I. (1984) Gittins indices in the dynamic allocation problem for diffusion processes. Ann. Prob. 12, 173–192.Google Scholar

Nash, P. and Gittins, J. C. (1977) A Hamiltonian approach to optimal stochastic resource allocation. Adv. Appl. Prob. 9, 55–68.Google Scholar

Perkins, E. (1981) A global intrinsic characterization of Brownian local time. Ann. Prob. 9, 800–817.Google Scholar

Reuter, G. E. H. (1976) Denumerable Markov processes (IV): on C. T. Hou&s uniqueness theorem for Q-semigroups. Z. Wahrscheinlichkeitsth. 33, 309–315.Google Scholar

Ross, S. M. (1970) Applied Probability Models with Optimization Applications. Holden-Day, San Francisco.Google Scholar

Stroock, D. W. and Varadhan, S. R. S. (1979) Multidimensional Diffusion Processes. Springer-Verlag, Berlin.Google Scholar

Whittle, P. (1980) Multi-armed bandits and the Gittins index. J. R. Statist. Soc. B 42, 143–149.Google Scholar

Williams, D. (1979) Diffusions, Markov Processes, and Martingales , Vol. I. Wiley, New York.Google Scholar

Article contents

Continuous-time allocation indices and their discrete-time approximation

Abstract

Keywords

Access options

Article purchase

Temporarily unavailable

References

Save article to Kindle

Save article to Dropbox

Save article to Google Drive

Reply to: Submit a response

Your details

You have entered the maximum number of contributors

Conflicting interests