Hostname: page-component-cd9895bd7-dzt6s Total loading time: 0 Render date: 2024-12-23T09:44:14.732Z Has data issue: false hasContentIssue false

Constrained Discounted Markov Decision Chains

Published online by Cambridge University Press:  27 July 2009

Linn I. Sennott
Affiliation:
Department of StatisticsUniversity of Illinois at Urbana-Champaign Champaign, Illinois 61820

Abstract

A Markov decision chain with countable state space incurs two types of costs: an operating cost and a holding cost. The objective is to minimize the expected discounted operating cost, subject to a constraint on the expected discounted holding cost. The existence of an optimal randomized simple policy is proved. This is a policy that randomizes between two stationary policies, that differ in at most one state. Several examples from the control of discrete time queueing systems are discussed.

Type
Articles
Copyright
Copyright © Cambridge University Press 1991

Access options

Get access to the full version of this content by using one of the access options below. (Log in options will check for institutional or personal access. Content may require purchase if you do not have access.)

References

Altman, E. (1990). Denumerable constrained Markov decision problems and finite approximations. Preprint.Google Scholar
Bertsekas, D. (1987). Dynamic programming: Deterministic and stochastic models. Englewood Cliffs, NJ: Prentice-Hall.Google Scholar
Beutler, F.J. & Ross, K.W. (1985). Optimal policies for controlled Markov chains with a constraint. Journal of Mathematical Analysis and Applications 112: 236252.CrossRefGoogle Scholar
Beutler, F.J. & Ross, K.W. (1986). Time-average optimal constrained semi-Markov decision processes. Advances in Applied Probability 18: 341359.CrossRefGoogle Scholar
Borkar, V. (1990). Controlled Markov chains with constraints. Preprint.CrossRefGoogle Scholar
Chitgopekar, S.S. (1975). Denumerable state Markovian sequential control processes: On randomizations of optimal policies. Naval Research Logistics Quarterly 22: 567573.CrossRefGoogle Scholar
Frid, E.B. (1972). On optimal strategies in control problems with constraints. Theory of Probability and Its Applications 17: 188192.CrossRefGoogle Scholar
Hewitt, & Stromberg, (1965). Real and abstract analysis. New York: Springer-Verlag.Google Scholar
Hordijk, A. (1974). Dynamic programming and Markov potential theory. Mathematical Centre Tracts 51. Amsterdam: CWI.Google Scholar
Hordijk, A. & Kallenberg, L.C.M. (1984). Constrained undiscounted stochastic dynamic programming. Mathematics of Operations Research 9: 276289.CrossRefGoogle Scholar
Kallenberg, L.C.M. (1980). Linear programming and finite Markovian control problems. Mathematical Centre Tracts 148. Amsterdam: CWI.Google Scholar
Ma, D.-J., Makowski, A.M., & Shwartz, A. (1986). Estimation and optimal control for constrained Markov chains. Proceedings of the 25th Conference on Decision and Control, Athens, Greece, pp. 994999.Google Scholar
Sennott, L.I. (1989). Average cost optimal stationary policies in infinite state Markov decision processes with unbounded costs. Operations Research 37: 626633.CrossRefGoogle Scholar
Sennott, L.I. (1989). Average cost semi-Markov decision processes and the control of queueng systems. Probability in the Engineering and Informational Sciences 3: 247272.CrossRefGoogle Scholar
Sennott, L.I. (1991). Value iteration in countable state average cost Markov decision processes with unbounded costs. Annals of Operations Research 28: 261272.CrossRefGoogle Scholar