A MARKOV CHAIN CHOICE PROBLEM
Published online by Cambridge University Press: 10 December 2012
Abstract
Consider two independent Markov chains, each with states 0 and 1 and with identical transition probabilities. At each stage one of the chains is observed, and a reward equal to the observed state is earned. Assuming prior probabilities on the initial states of the chains, it is shown that the myopic policy, which always chooses to observe the chain most likely to be in state 1, stochastically maximizes the sequence of rewards earned in each period.
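The myopic policy described in the abstract can be illustrated with a small simulation. The following is only a sketch under stated assumptions, not the paper's construction: the names `p01` and `p11` (the probabilities of moving to state 1 from state 0 and from state 1), the function names, and the belief update (standard one-step propagation of the probability of being in state 1) are all introduced here for illustration.

```python
import random


def propagate(belief, p01, p11):
    """Probability that an unobserved chain is in state 1 one step later.

    belief is the current probability of state 1; p01 and p11 are the
    (assumed) transition probabilities into state 1 from states 0 and 1.
    """
    return belief * p11 + (1.0 - belief) * p01


def myopic_choice(beliefs):
    """Index of the chain most likely to be in state 1 (ties go to the lowest index)."""
    return max(range(len(beliefs)), key=lambda i: beliefs[i])


def simulate_myopic(priors, p01, p11, horizon, seed=0):
    """Simulate the myopic policy and return the list of rewards earned.

    priors[i] is the prior probability that chain i starts in state 1.
    """
    rng = random.Random(seed)
    # Draw the hidden initial states from the priors.
    states = [1 if rng.random() < q else 0 for q in priors]
    beliefs = list(priors)
    rewards = []
    for _ in range(horizon):
        k = myopic_choice(beliefs)
        reward = states[k]  # the reward equals the observed state
        rewards.append(reward)
        # Every chain makes one transition, observed or not.
        states = [
            1 if rng.random() < (p11 if s == 1 else p01) else 0
            for s in states
        ]
        # The observed chain's belief resets to p11 or p01;
        # the unobserved chain's belief is propagated one step.
        beliefs = [
            (p11 if reward == 1 else p01) if i == k else propagate(b, p01, p11)
            for i, b in enumerate(beliefs)
        ]
    return rewards
```

For example, `simulate_myopic([0.6, 0.4], p01=0.2, p11=0.8, horizon=10)` returns a list of ten rewards, each 0 or 1. Comparing such runs against other observation rules gives only an empirical check; the stochastic optimality claim itself rests on the paper's proof.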
- Type: Research Article
- Information: Probability in the Engineering and Informational Sciences, Volume 27, Issue 1, January 2013, pp. 53-55
- Copyright: © Cambridge University Press 2013