
Cluster–Robust Variance Estimation for Dyadic Data

Published online by Cambridge University Press: 04 January 2017

Peter M. Aronow
Department of Political Science, Yale University, 77 Prospect Street, New Haven, CT 06520
Cyrus Samii*
Department of Politics, New York University, 19 West 4th Street, New York, NY 10012
Valentina A. Assenova
School of Management, Yale University, 165 Whitney Avenue, New Haven, CT 06520
* Corresponding author

Abstract

Dyadic data are common in the social sciences, although inference for such settings involves accounting for a complex clustering structure. Many analyses in the social sciences fail to account for the fact that multiple dyads share a member, and that errors are thus likely correlated across these dyads. We propose a non-parametric, sandwich-type robust variance estimator for linear regression to account for such clustering in dyadic data. We enumerate conditions for estimator consistency. We also extend our results to repeated and weighted observations, including directed dyads and longitudinal data, and provide an implementation for generalized linear models such as logistic regression. We examine empirical performance with simulations and an application to interstate disputes.
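
The idea of a sandwich-type estimator that accounts for dyads sharing a member can be sketched in a few lines of NumPy. This is a minimal illustration written for this summary, not the authors' replication code. It assumes undirected dyads that each appear once, ordinary least squares, and no finite-sample correction; the function name `dyadic_cluster_robust_vcov` and its arguments are hypothetical. The "meat" of the sandwich sums the products of score vectors over every pair of dyads that share at least one member.

```python
import numpy as np


def dyadic_cluster_robust_vcov(X, y, unit_a, unit_b):
    """Return OLS coefficients and a dyadic cluster-robust covariance (sketch).

    X: (n_dyads, k) design matrix; y: (n_dyads,) outcomes.
    unit_a, unit_b: labels of the two members of each dyad.
    Assumes each unordered dyad appears once; no small-sample correction.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    unit_a = np.asarray(unit_a)
    unit_b = np.asarray(unit_b)

    bread = np.linalg.inv(X.T @ X)
    beta = bread @ X.T @ y
    resid = y - X @ beta
    scores = X * resid[:, None]  # row d holds x_d * e_d

    # For each unit, sum the scores of all dyads containing it and add the
    # outer product of that sum. This covers every pair of dyads sharing
    # that unit, including each dyad paired with itself.
    k = X.shape[1]
    meat = np.zeros((k, k))
    for unit in np.union1d(unit_a, unit_b):
        in_cluster = (unit_a == unit) | (unit_b == unit)
        s = scores[in_cluster].sum(axis=0)
        meat += np.outer(s, s)

    # Each dyad's own term was counted once per member (twice in total),
    # so remove one copy.
    meat -= scores.T @ scores

    return beta, bread @ meat @ bread


# Simulated example: 30 units, all unordered pairs, unit-level shocks that
# induce correlation between dyads sharing a member.
rng = np.random.default_rng(0)
n_units = 30
a, b = np.triu_indices(n_units, k=1)
shock = rng.normal(size=n_units)
x = rng.normal(size=a.size)
y = 1.0 + 0.5 * x + shock[a] + shock[b] + rng.normal(size=a.size)
X = np.column_stack([np.ones_like(x), x])

beta, vcov = dyadic_cluster_robust_vcov(X, y, a, b)
print("coefficients:", beta)
print("covariance diagonal:", np.diag(vcov))
```

Looping over units rather than over all pairs of dyads keeps the cost linear in the number of units times the number of dyads. Nothing in this sketch forces the resulting matrix to be positive semidefinite in small samples, so the diagonal should be checked before taking square roots.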

Type
Letter
Copyright
Copyright © The Author 2015. Published by Oxford University Press on behalf of the Society for Political Methodology 

Footnotes

Authors' note: The authors thank Neal Beck, Allison Carnegie, Dean Eckles, Donald Lee, Winston Lin, Kelly Rader, Olav Sorenson, the Political Analysis editors, and two reviewers for helpful comments. They thank Jonathan Baron and Lauren Pinson for research assistance. Supplementary materials for this article are available on the Political Analysis Web site. Replication materials are available on the Political Analysis Dataverse (https://dataverse.harvard.edu/dataverse/pan).
