Hostname: page-component-745bb68f8f-b95js Total loading time: 0 Render date: 2025-01-10T06:57:18.979Z Has data issue: false hasContentIssue false

Constrained Average Cost Markov Decision Chains

Published online by Cambridge University Press:  27 July 2009

Linn I. Sennott
Affiliation:
Department of Mathematics, Illinois State University, Normal, Illinois 61761

Abstract

A Markov decision chain with denumerable state space incurs two types of costs — for example, an operating cost and a holding cost. The objective is to minimize the expected average operating cost, subject to a constraint on the expected average holding cost. We prove the existence of an optimal constrained randomized stationary policy, for which the two stationary policies differ on at most one state. The examples treated are a packet communication system with reject option and a single-server queue with service rate control.

Type
Research Article
Copyright
Copyright © Cambridge University Press 1993

Access options

Get access to the full version of this content by using one of the access options below. (Log in options will check for institutional or personal access. Content may require purchase if you do not have access.)

References

1.Altman, E. (1990). Denumerable constrained Markov decision problems and finite approximations (preprint).Google Scholar
2.Altman, E. & Shwartz, A. (1991). Markov decision problems and state-action frequencies. SIAM Journal on Control and Optimization 29: 786806.CrossRefGoogle Scholar
3.Beutler, F.J. & Ross, K.W. (1985). Optimal policies for controlled Markov chains with a constraint. Journal of Mathematical Analysis and Applications 112: 236252.CrossRefGoogle Scholar
4.Beutler, F.J. & Ross, K.W. (1986). Time-average optimal constrained semi-Markov decision processes. Advances in Applied Probability 18: 341359.Google Scholar
5.Borkar, V. Ergodic control of Markov chains with constraints — The general case (preprint).Google Scholar
6.Cavazos-Cadena, R. (1991). Solution to the optimality equation in a class of Markov decision chains with the average cost criterion. Kybernetika 27: 2337.Google Scholar
7.Cavazos-Cadena, R. & Sennott, L. (1992). Comparing recent assumptions for the existence of average optimal stationary policies. Operations Research Letters 11: 3337.CrossRefGoogle Scholar
8.Chung, K.L. (1967). Markov chains with stationary transition probabilities. New York: Springer-Verlag.Google Scholar
9.Frid, E.B. (1972). On optimal strategies in control problems with constraints. Theory of Probability and Applications 17: 188191.CrossRefGoogle Scholar
10.Hewitt, & Stromberg, (1965). Real and abstract analysis. New York: Springer-Verlag.Google Scholar
11.Heyman, D. & Sobel, M. (1982). Stochastic models in operations research, Vol. 1. New York: McGraw-Hill.Google Scholar
12.Hordijk, A. (1974). Dynamic programming and Markov potential theory. Mathematical Centre Tracts, 51, Amsterdam.Google Scholar
13.Hordijk, A. & Kallenberg, L.C.M. (1984). Constrained undiscounted stochastic dynamic programming. Mathematics of Operations Research 9: 276289.Google Scholar
14.Ma, D.-J., Makowski, A.M., & Shwartz, A. (1986). Estimation and optimal control for constrained Markov chains. In Proceedings of the 25th IEEE Conference on Decision and Control. Athens, 994999.Google Scholar
15.Makowski, A.M. & Shwartz, A. (1991). On constrained optimization of the Klimov network and related Markov decision processes (preprint).Google Scholar
16.Sennott, L.I. (1989). Average cost optimal stationary policies in infinite state Markov decision processes with unbounded costs. Operations Research 37: 626633.Google Scholar
17.Sennott, L.I. (1991). Constrained discounted Markov decision chains. Probability in the Engineering and informational Sciences 5: 463475.CrossRefGoogle Scholar
18.Sennott, L.I. (1993). The average cost optimality equation and critical number policies. Probability in the Engineering and Informational Sciences 7: 4767.CrossRefGoogle Scholar
19.Spieksma, F.M. (1990). Geometrically ergodic Markov chains and the optimal control of queues. Ph.D. thesis, University of Leiden, The Netherlands.Google Scholar