2 results
EXPLORATION–EXPLOITATION POLICIES WITH ALMOST SURE, ARBITRARILY SLOW GROWING ASYMPTOTIC REGRET
- Journal: Probability in the Engineering and Informational Sciences / Volume 34 / Issue 3 / July 2020
- Published online by Cambridge University Press: 26 January 2019, pp. 406-428
- Article
Sample mean based index policies by O(log n) regret for the multi-armed bandit problem
- Journal: Advances in Applied Probability / Volume 27 / Issue 4 / December 1995
- Published online by Cambridge University Press: 01 July 2016, pp. 1054-1078
- Print publication: December 1995
- Article