Estimating Heterogeneous Treatment Effects and the Effects of Heterogeneous Treatments with Ensemble Methods

Published online by Cambridge University Press: 04 September 2017

Justin Grimmer*
Affiliation:
Associate Professor, Department of Political Science, University of Chicago, 5828 S. University Ave., Chicago, IL 60637, USA
Solomon Messing
Affiliation:
Director, Data Labs, Pew Research Center, 1615 L Street NW, Washington, DC, USA
Sean J. Westwood
Affiliation:
Assistant Professor, Department of Government, Dartmouth College, USA

Abstract

Randomized experiments are increasingly used to study political phenomena because they can credibly estimate the average effect of a treatment on a population of interest. But political scientists are often interested in how effects vary across subpopulations (heterogeneous treatment effects) and in how differences in the content of the treatment affect responses (the response to heterogeneous treatments). Several new methods have been introduced to estimate heterogeneous effects, but it is difficult to know whether a method will perform well for a particular data set. Rather than using only one method, we show how an ensemble of methods (weighted averages of estimates from individual models, an approach increasingly used in machine learning) accurately measures heterogeneous effects. Building on a large literature on ensemble methods, we show how the weighting of methods can contribute to accurate estimation of heterogeneous treatment effects and demonstrate how pooling models leads to performance superior to that of individual methods across diverse problems. We apply the ensemble method to two experiments, illuminating how the ensemble method for heterogeneous treatment effects facilitates exploratory analysis of treatment effects.
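
The idea of weighting several estimators can be illustrated with a minimal sketch. It is not the authors' implementation: the simulated experiment, the two deliberately simple candidate models (`fit_pooled`, `fit_subgroup`), and the helper names are all assumptions made for illustration. Each model produces out-of-fold predictions, a weight constrained to lie between 0 and 1 is chosen to minimize squared prediction error, and the same weights are then used to average the models' subgroup effect estimates.

```python
import random
from statistics import mean

random.seed(0)

def simulate(n=2000):
    """Hypothetical experiment: no effect when x == 0, effect of 2.0 when x == 1."""
    rows = []
    for _ in range(n):
        x = random.randint(0, 1)   # subgroup indicator
        t = random.randint(0, 1)   # randomized treatment assignment
        y = 2.0 * t * x + random.gauss(0.0, 1.0)
        rows.append((x, t, y))
    return rows

def fit_pooled(rows):
    """Model A: one mean per treatment arm, ignoring the subgroup."""
    m = {t: mean(y for _, ti, y in rows if ti == t) for t in (0, 1)}
    return lambda x, t: m[t]

def fit_subgroup(rows):
    """Model B: one mean per (subgroup, arm) cell."""
    m = {(x, t): mean(y for xi, ti, y in rows if xi == x and ti == t)
         for x in (0, 1) for t in (0, 1)}
    return lambda x, t: m[(x, t)]

def out_of_fold(rows, fitters, k=5):
    """Predict each unit with models trained on the other k - 1 folds."""
    preds = [[0.0] * len(rows) for _ in fitters]
    for fold in range(k):
        train = [r for i, r in enumerate(rows) if i % k != fold]
        models = [fit(train) for fit in fitters]
        for i, (x, t, _) in enumerate(rows):
            if i % k == fold:
                for j, model in enumerate(models):
                    preds[j][i] = model(x, t)
    return preds

def simplex_weight(y, p_a, p_b):
    """Weight w on model A minimizing sum (y - w*p_a - (1-w)*p_b)^2, 0 <= w <= 1."""
    d = [a - b for a, b in zip(p_a, p_b)]
    denom = sum(v * v for v in d)
    if denom == 0.0:
        return 0.5  # identical predictions: any weight is equally good
    w = sum((yi - b) * v for yi, b, v in zip(y, p_b, d)) / denom
    return min(1.0, max(0.0, w))

rows = simulate()
fitters = [fit_pooled, fit_subgroup]
p_a, p_b = out_of_fold(rows, fitters)
y = [r[2] for r in rows]
w_a = simplex_weight(y, p_a, p_b)
weights = [w_a, 1.0 - w_a]

# Refit each model on all data, then average their subgroup effect estimates.
full_models = [fit(rows) for fit in fitters]

def ensemble_effect(x):
    return sum(w * (m(x, 1) - m(x, 0)) for w, m in zip(weights, full_models))

for x in (0, 1):
    print(f"subgroup x={x}: ensemble effect estimate = {ensemble_effect(x):.2f}")
```

With only two candidate models the constrained weight has a closed form. With more models the same step would need a constrained least-squares solver, which this sketch does not cover.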

Type
Articles
Copyright
Copyright © The Author(s) 2017. Published by Cambridge University Press on behalf of the Society for Political Methodology. 

Footnotes

Authors’ note: Replication data available in Grimmer, Messing, and Westwood (2017).

Contributing Editor: Dustin Tingley

References

Athey, Susan, and Imbens, Guido. 2015. Machine learning methods for estimating heterogeneous causal effects. Preprint, arXiv:1504.01132.
Berinsky, Adam J., Huber, Gregory A., and Lenz, Gabriel S. 2012. Evaluating online labor markets for experimental research: Amazon.com’s Mechanical Turk. Political Analysis 20:351–368.
Breiman, Leo. 2001. Random forests. Journal of Machine Learning 45(1):5–32.
Chatterjee, Arindam, and Lahiri, Soumendra Nath. 2011. Bootstrapping lasso estimators. Journal of the American Statistical Association 106(494):608–625.
Chipman, Hugh A., George, Edward I., and McCulloch, Robert E. 2010. BART: Bayesian additive regression trees. Annals of Applied Statistics 41(1):266–298.
Dietterich, Thomas. 2000. Ensemble methods in machine learning. In Multiple Classifier Systems. MCS 2000. Lecture Notes in Computer Science, vol. 1857. Heidelberg: Springer-Verlag.
Efron, Bradley, and Tibshirani, Robert J. 1994. An introduction to the bootstrap. Boca Raton, FL: CRC Press.
Fong, Christian, and Grimmer, Justin. 2016. Discovery of treatments from text corpora. In Proceedings of the Annual Meeting of the Association for Computational Linguistics, ACL 2016, Berlin, Germany.
Gelman, Andrew, Jakulin, Aleks, Pittau, Maria Grazia, and Su, Yu-Sung. 2008. A weakly informative default prior distribution for logistic and other regression models. The Annals of Applied Statistics 2(4):1360–1383.
Gelman, Andrew, Hill, Jennifer, and Yajima, Masanao. 2012. Why we (usually) don’t have to worry about multiple comparisons? Journal of Research on Educational Effectiveness 5(1):189–211.
Gerber, Alan S., and Green, Donald P. 2012. Field experiment: Design, analysis, and interpretation. New York: W.W. Norton & Company.
Green, Donald P., and Kern, Holger L. 2012. Modeling heterogeneous treatment effects in survey experiments with Bayesian additive regression trees. Public Opinion Quarterly 76(3):491–511.
Grimmer, Justin. 2013. Representational style: What legislators say and why it matters. Cambridge: Cambridge University Press.
Grimmer, Justin, Westwood, Sean J., and Messing, Solomon. 2014. The impression of influence: Legislator communication, representation, and democratic accountability. Princeton, NJ: Princeton University Press.
Grimmer, Justin, Messing, Solomon, and Westwood, Sean J. 2012. How words and money cultivate a personal vote: The effect of legislator credit claiming on constituent credit allocation. American Political Science Review 106(4):703–719.
Grimmer, Justin, Messing, Solomon, and Westwood, Sean. 2017. Replication data for estimating heterogeneous treatment effects and the effects of heterogeneous treatments with ensemble methods. doi:10.7910/DVN/BQMLQW.
Hainmueller, Jens, and Hazlett, Chad. 2013. Kernel regularized least squares: Reducing misspecification bias with a flexible and interpretable machine learning approach. Political Analysis 22(2):143–168.
Hainmueller, Jens, Hopkins, Daniel, and Yamamoto, Teppei. 2014. Causal inference in conjoint analysis: Understanding multi-dimensional choices via stated preference experiments. Political Analysis 22(1):1–30.
Hainmueller, Jens, and Hopkins, Daniel J. 2015. The hidden American immigration consensus: A conjoint analysis of attitudes toward immigrants. American Journal of Political Science 59(3):529–548.
Hartman, Erin, Grieve, Richard, Ramshai, Roland, and Sekhon, Jasjeet S. 2012. From SATE to PATT: Combining experimental with observational studies. University of California, Berkeley Mimeo.
Hastie, Trevor, Tibshirani, Robert, and Friedman, Jerome. 2001. The elements of statistical learning. Springer.
Hillard, Dustin, Purpura, Stephen, and Wilkerson, John. 2008. Computer-assisted topic classification for mixed-methods social science research. Journal of Information Technology & Politics 4(4):31–46.
Holland, Paul. 1986. Statistics and causal inference. Journal of the American Statistical Association 81(396):945–960.
Humphreys, Macartan, de la Sierra, Raul Sanchez, and van der Windt, Peter. 2013. Fishing, commitment, and communication: A proposal for comprehensive nonbinding research registration. Political Analysis 21(1):1–20.
Imai, Kosuke, and Strauss, Aaron. 2011. Estimation of heterogeneous treatment effects from randomized experiments, with application to the optimal planning of the get-out-the-vote campaign. Political Analysis 19(1):1–19.
Imai, Kosuke, and Ratkovic, Marc. 2013. Estimating treatment effect heterogeneity in randomized program evaluation. The Annals of Applied Statistics 7(1):443–470.
Kasperowicz, Pete. 2013. GOP seeks Planned Parenthood study with hope to strip funding. Politico.com.
Keerthi, S. S., Shevade, S. K., Bhattacharyya, C., and Murthy, K. R. K. 2001. Improvements to Platt’s SMO algorithm for SVM classifier design. Neural Computation 13(3):637–649.
King, Gary, Tomz, Michael, and Wittenberg, Jason. 2000. Making the most of statistical analyses: Improving interpretation and presentation. American Journal of Political Science 44(2):347–361.
Mayhew, David. 1974. Congress: The electoral connection. New Haven, CT: Yale University Press.
Montgomery, Jacob M., Hollenbach, Florian M., and Ward, Michael D. 2012. Improving predictions using ensemble Bayesian model averaging. Political Analysis 20(3):271–291.
Platt, J. 1998. Fast training of support vector machines using sequential minimal optimization. In Advances in Kernel Methods - Support Vector Learning, ed. Schoelkopf, B., Burges, C., and Smola, A. Cambridge, MA: MIT Press, http://research.microsoft.com/ jplatt/smo.html.
Raftery, Adrian E., Gneiting, Tilmann, Balabdaoui, Fadoua, and Polakowski, Michael. 2005. Using Bayesian model averaging to calibrate forecast ensembles. Monthly Weather Review 133:1155–1174.
Ratkovic, Marc, and Tingley, Dustin. 2017. Sparse estimation and uncertainty with application to subgroup analysis. Political Analysis 25(1):1–40.
Samii, Cyrus, Paler, Laura, and Daly, Sarah. 2017. Retrospective causal inference with machine learning ensembles: An application to anti-recidivism policies in Colombia. Political Analysis 24(4):434–456.
Skocpol, Theda, and Williamson, Vanessa. 2011. The Tea Party and the remaking of Republican conservatism. Oxford: Oxford University Press.
van der Laan, Mark, Polley, Eric, and Hubbard, Alan. 2007. Super learner. Statistical Applications in Genetics and Molecular Biology 6(1):1–21.
Van der Laan, Mark J., and Rose, Sherri. 2011. Targeted learning: Causal inference for observational and experimental data. New York: Springer Science & Business Media.
Wager, Stefan, and Athey, Susan. 2017. Estimation and inference of heterogeneous treatment effects using random forests. Journal of the American Statistical Association, forthcoming.
Supplementary material: File

Grimmer et al supplementary material

Online Appendix
