
Is Partial-Dimension Convergence a Problem for Inferences from MCMC Algorithms?

Published online by Cambridge University Press:  19 August 2007

Jeff Gill*
Affiliation:
Center for Applied Statistics, Department of Political Science, Washington University, One Brookings Drive, St Louis, MO 63130-4899, e-mail: [email protected]

Abstract

Increasingly, political science researchers are turning to Markov chain Monte Carlo methods to solve inferential problems with complex models and problematic data. This is an enormously powerful set of tools based on replacing difficult or impossible analytical work with simulated empirical draws from the distributions of interest. Although practitioners are generally aware of the importance of convergence of the Markov chain, many are not fully aware of the difficulties of assessing convergence across multiple dimensions. In most applied circumstances, every parameter dimension must have converged before the chain as a whole can be considered converged. The usual culprit is slow mixing of the Markov chain and therefore slow convergence toward the target distribution. This work demonstrates the partial convergence problem for the two dominant algorithms and illustrates these issues with empirical examples.
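As a rough illustration of what per-dimension convergence checking involves (a minimal sketch, not drawn from the article's own empirical examples), the following Python code runs a toy Gibbs sampler for a highly correlated bivariate normal from overdispersed starting values and computes the Gelman-Rubin potential scale reduction factor separately for each parameter dimension. The bivariate-normal example, the function names, and the tuning values are illustrative assumptions.

```python
# Minimal sketch (assumed example, not from the article): per-dimension
# convergence checking with the Gelman-Rubin potential scale reduction factor.
import numpy as np

def gibbs_bivariate_normal(n_iter, rho, start, rng):
    """Gibbs sampler for a standard bivariate normal with correlation rho."""
    x, y = start
    draws = np.empty((n_iter, 2))
    sd = np.sqrt(1.0 - rho ** 2)
    for t in range(n_iter):
        x = rng.normal(rho * y, sd)   # x | y ~ N(rho*y, 1 - rho^2)
        y = rng.normal(rho * x, sd)   # y | x ~ N(rho*x, 1 - rho^2)
        draws[t] = (x, y)
    return draws

def gelman_rubin(chains):
    """R-hat for one parameter; chains has shape (m chains, n iterations)."""
    m, n = chains.shape
    B = n * chains.mean(axis=1).var(ddof=1)   # between-chain variance
    W = chains.var(axis=1, ddof=1).mean()     # within-chain variance
    var_hat = (n - 1) / n * W + B / n         # pooled variance estimate
    return np.sqrt(var_hat / W)

rng = np.random.default_rng(1)
rho = 0.995                                           # high correlation: slow mixing
starts = [(-10, -10), (10, 10), (-10, 10), (10, -10)] # overdispersed starting points
chains = np.stack([gibbs_bivariate_normal(500, rho, s, rng) for s in starts])

# Check every parameter dimension, not only the one of substantive interest.
for d, name in enumerate(["x", "y"]):
    print(f"R-hat for {name}: {gelman_rubin(chains[:, :, d]):.3f}")
```

With strongly correlated dimensions and a short run, the printed R-hat values typically remain well above 1, signaling that the slow-mixing chain has not yet converged in those dimensions even if trace plots for a single parameter look acceptable.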

Type
Research Article
Copyright
Copyright © The Author 2007. Published by Oxford University Press on behalf of the Society for Political Methodology 

