
Optimal decision procedures for finite Markov chains. Part II: Communicating systems

Published online by Cambridge University Press:  01 July 2016

John Bather*
Affiliation:
University of Sussex

Abstract

A Markov process in discrete time with a finite state space is controlled by choosing the transition probabilities from a given convex family of distributions depending on the present state. The immediate cost is prescribed for each choice and it is required to minimise the average expected cost over an infinite future. The paper considers a special case of this general problem and provides the foundation for a general solution. The main result is that an optimal policy exists if each state of the system can be reached with positive probability from any other state by choosing a suitable policy.
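The setup described above, choosing a transition distribution and immediate cost in each state so as to minimise the long-run average expected cost, can be illustrated with a small numerical sketch. The example below is not taken from the paper: the transition matrices, costs, and the use of relative value iteration are all hypothetical, chosen only so that the chain is communicating (every state reachable from every other under a suitable policy, the paper's key condition).

```python
import numpy as np

# Hypothetical data (not from the paper): two actions per state, each giving
# a transition distribution and an immediate cost.  Every state has a
# self-loop and the chain is irreducible under any stationary policy, so the
# system is communicating in the sense of the abstract.
P = [
    np.array([[0.5, 0.5, 0.0],      # P[a][i, j]: prob. of i -> j under action a
              [0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5]]),
    np.array([[0.1, 0.0, 0.9],
              [0.9, 0.1, 0.0],
              [0.0, 0.9, 0.1]]),
]
c = [np.array([2.0, 1.0, 3.0]),     # c[a][i]: immediate cost of action a in state i
     np.array([1.5, 2.5, 0.5])]

def relative_value_iteration(P, c, iters=2000):
    """Approximate the optimal average cost (gain), relative values, and a
    minimising stationary policy by relative value iteration."""
    n = P[0].shape[0]
    h = np.zeros(n)                  # relative values, normalised at state 0
    for _ in range(iters):
        # One-step lookahead for each action; minimise over actions.
        Q = np.array([c[a] + P[a] @ h for a in range(len(P))])
        h_new = Q.min(axis=0)
        gain = h_new[0]              # gain estimate at the reference state
        h = h_new - gain             # subtract to keep the iterates bounded
    policy = Q.argmin(axis=0)        # action chosen in each state
    return gain, h, policy

gain, h, policy = relative_value_iteration(P, c)
print("average cost:", gain, "policy:", policy)
```

Relative value iteration is one standard way to compute such a policy numerically; the paper itself is concerned with proving that an optimal policy exists under the communicating condition, not with this particular algorithm.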

Type
Research Article
Copyright
Copyright © Applied Probability Trust 1973 

