
Likelihood ratio gradient estimation for stochastic recursions

Published online by Cambridge University Press:  01 July 2016

Peter W. Glynn*
Affiliation:
Stanford University
Pierre L'Ecuyer*
Affiliation:
Université de Montréal
* Postal address: Department of Operations Research, Stanford University, Stanford, CA 94305-4022, USA.
** Postal address: Département d'IRO, Université de Montréal, C.P. 6128, Succ. Centre Ville, Montréal, H3C 3J7, Canada.

Abstract

In this paper, we develop mathematical machinery for verifying that a broad class of general state space Markov chains reacts smoothly to certain types of perturbations in the underlying transition structure. Our main result provides conditions under which the stationary probability measure of an ergodic Harris-recurrent Markov chain is differentiable in a certain strong sense. The approach is based on likelihood ratio ‘change-of-measure’ arguments, and leads directly to a ‘likelihood ratio gradient estimator’ that can be computed numerically.
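The likelihood ratio (score function) idea behind the estimator can be illustrated in a much simpler single-sample setting than the Markov-chain construction the paper develops. The sketch below is a hypothetical example, not the authors' method: it estimates d/dθ E[f(X)] for X ~ Exp(θ) with f(x) = x, using the identity d/dθ E[f(X)] = E[f(X) · d/dθ log p_θ(X)], where the score of the exponential density is 1/θ − x.

```python
import random

def lr_gradient_estimate(theta, n, seed=12345):
    """Score-function (likelihood ratio) estimate of d/dtheta E[f(X)]
    for X ~ Exp(theta) with f(x) = x.

    Since log p_theta(x) = log(theta) - theta * x, the score is
    d/dtheta log p_theta(x) = 1/theta - x, and the estimator averages
    f(x) times the score over i.i.d. samples.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.expovariate(theta)   # sample from p_theta
        score = 1.0 / theta - x      # derivative of the log-density
        total += x * score           # f(x) * score
    return total / n

theta = 2.0
est = lr_gradient_estimate(theta, 200_000)
# Here E[X] = 1/theta, so the true gradient is -1/theta**2 = -0.25,
# and the estimate should be close to that value.
```

The same interchange of derivative and expectation is what the paper justifies rigorously for the stationary measure of a Harris-recurrent chain, where the log-likelihood ratio accumulates over transitions rather than a single draw.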

Type
General Applied Probability
Copyright
Copyright © Applied Probability Trust 1995 


Footnotes

The research of this author was supported by the U.S. Army Research Office under Contract No. DAAL03-91-G-0101 and by the National Science Foundation under Contract No. DDM-9101580.

This author's research was supported by NSERC-Canada grant No. OGPO110050 and FCAR-Québec Grant No. 93ER-1654.
