
The Statistics of Causal Inference: A View from Political Methodology

Published online by Cambridge University Press:  04 January 2017

Luke Keele*
Affiliation:
Department of Political Science, 211 Pond Lab, Penn State University, University Park, PA 16802
*e-mail: [email protected] (corresponding author)

Abstract

Many areas of political science focus on causal questions. Evidence from statistical analyses is often used to make the case for causal relationships. While statistical analyses can help establish causal relationships, they can also provide strong evidence of causality where none exists. In this essay, I provide an overview of the statistics of causal inference. Instead of focusing on specific statistical methods, such as matching, I focus on the assumptions needed to give statistical estimates a causal interpretation. Such assumptions are often referred to as identification assumptions, and they are critical to any statistical analysis of causal effects. I outline a wide range of identification assumptions and highlight the design-based approach to causal inference. I conclude with an overview of statistical methods that are frequently used for causal inference.
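
To make "identification assumption" concrete, here is a minimal sketch, not drawn from the article itself, of the standard potential-outcomes setup this literature builds on; the conditions shown (ignorability and overlap, the usual "selection on observables" assumptions) are illustrative rather than the essay's own notation:

% Potential outcomes Y_i(1), Y_i(0); binary treatment D_i; covariates X_i.
% Assumption (1), ignorability: (Y_i(1), Y_i(0)) \perp D_i \mid X_i.
% Assumption (2), overlap: 0 < \Pr(D_i = 1 \mid X_i) < 1.
% Under (1) and (2), the average treatment effect is identified from
% observable quantities alone:
\[
\tau_{\mathrm{ATE}} \;=\; \mathbb{E}\left[\,Y_i(1) - Y_i(0)\,\right]
\;=\; \mathbb{E}_{X}\!\left[\,\mathbb{E}[Y_i \mid D_i = 1, X_i]
- \mathbb{E}[Y_i \mid D_i = 0, X_i]\,\right].
\]
% Absent (1) and (2), the right-hand side is merely a descriptive contrast
% with no causal interpretation, which is the abstract's warning that
% statistical evidence alone can suggest causality where none exists.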

Type
Articles
Copyright
Copyright © The Author 2015. Published by Oxford University Press on behalf of the Society for Political Methodology 


Footnotes

Author's note: For comments I thank the editors and the four anonymous reviewers. I also thank Rocío Titiunik, Jasjeet Sekhon, Paul Rosenbaum, and Dylan Small for many insightful conversations about these topics over the years. In the online Supplementary Materials, I provide further information about software tools to implement many of the methodologies discussed in this essay. Supplementary materials for this article are available on the Political Analysis Web site.

Supplementary material: Keele supplementary material (PDF, 74.8 KB).