
Markov decision programming–the moment optimal problem for the first-passage model

Published online by Cambridge University Press: 17 February 2009


Abstract


In this paper, we discuss the moment optimal problem for the first-passage model in Markov decision programming (MDP). A policy improvement iteration algorithm is given for finding a k-moment optimal stationary policy.

Type
Research Article
Copyright
Copyright © Australian Mathematical Society 1997
