On transforming an index for generalised bandit problems
Published online by Cambridge University Press: 14 July 2016
Abstract
Nash (1980) demonstrated that index policies are optimal for a class of generalised bandit problems. A transform of the index concerned has many of the attributes of the Gittins index. The transformed index is positive-valued, with maximal values yielding optimal actions. It may be characterised as the value of a restart problem and is hence computable via dynamic programming methodologies. The transformed index can also be used in procedures for policy evaluation.
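To illustrate what a restart-problem characterisation looks like in practice, the following is a minimal sketch for the classical discounted Gittins index of a finite-state Markov reward process, using the restart-in-state formulation due to Katehakis and Veinott (1987). It is not the transformed Nash index of this paper, whose definition is not given in the abstract; the function name `gittins_index_restart`, its parameters, and the two-state example are all assumptions made for illustration.

```python
def gittins_index_restart(rewards, transitions, beta, state,
                          tol=1e-12, max_iter=100_000):
    """Classical Gittins index of `state` via the restart-in-state problem.

    Illustrative sketch only (hypothetical helper, not from the paper).
    rewards[x]        : reward earned when the process is in state x
    transitions[x][y] : probability of moving from x to y
    beta              : discount factor, 0 < beta < 1
    """
    n = len(rewards)
    v = [0.0] * n
    for _ in range(max_iter):
        # Value of continuing from each state x under the current estimate.
        cont = [
            rewards[x] + beta * sum(transitions[x][y] * v[y] for y in range(n))
            for x in range(n)
        ]
        # Restart option: behave as if the process were reset to `state`.
        restart = cont[state]
        new_v = [max(c, restart) for c in cont]
        converged = max(abs(a - b) for a, b in zip(new_v, v)) < tol
        v = new_v
        if converged:
            break
    # The index is the restart value at `state`, scaled to a per-period rate.
    return (1.0 - beta) * v[state]


# Hypothetical two-state example: state 1 pays 0 and moves to state 0,
# which is absorbing and pays 1 per period.
example_rewards = [1.0, 0.0]
example_transitions = [[1.0, 0.0], [1.0, 0.0]]
index_state_0 = gittins_index_restart(example_rewards, example_transitions, 0.9, 0)
index_state_1 = gittins_index_restart(example_rewards, example_transitions, 0.9, 1)
```

In this example the index of the absorbing state equals its reward, and the index of state 1 equals the discount factor, since one period of zero reward is followed by a stream of unit rewards. Value iteration is used here for simplicity; any dynamic programming method that solves the restart problem would serve.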
Type: Research Papers
Copyright © Applied Probability Trust 1995