
Strongly consistent estimation in a controlled Markov renewal model

Published online by Cambridge University Press: 14 July 2016

Michael Kolonko
Affiliation: Universität Karlsruhe
Postal address: Institut für Mathematische Statistik, Universität Karlsruhe, Englerstrasse 2, 7500 Karlsruhe, W. Germany.

Abstract

The optimal control of dynamic models which are not completely known to the controller often requires some kind of estimation of the unknown parameters. We present conditions under which a minimum contrast estimator will be strongly consistent independently of the control used. This kind of estimator is appropriate for the adaptive or ‘estimation and control’ approach in dynamic programming under uncertainty. We consider a countable-state Markov renewal model and we impose bounding and recurrence conditions of the so-called Liapunov type.

Type: Research Papers
Copyright © Applied Probability Trust 1982

