
Apples and Oranges? The Problem of Equivalence in Comparative Research

Published online by Cambridge University Press:  04 January 2017

Daniel Stegmueller*
Affiliation:
Nuffield College, University of Oxford, New Road, Oxford, OX1 1NF, United Kingdom, and School of Social Sciences, University of Mannheim, Germany

Abstract


Comparative researchers increasingly rely on individual-level data to test theories involving unobservable constructs such as attitudes and preferences. Estimation is carried out using large-scale cross-national survey data that provide responses from individuals living in widely varying contexts. This strategy rests on the assumption of equivalence, that is, that the response behavior of individuals from different countries is not systematically distorted. However, this assumption is frequently violated, with rather grave consequences for comparability and interpretation. I present a multilevel mixture ordinal item response model with item bias effects that is able to establish equivalence. It corrects for systematic measurement error induced by unobserved country heterogeneity, and it allows for the simultaneous estimation of structural parameters of interest.
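To make the idea of an item bias effect in an ordinal item response model concrete, the following is a minimal sketch and not the article's specification. It assumes a cumulative-logit (graded response) form in which a hypothetical `item_bias` term shifts the linear predictor for respondents in one latent class of countries. The function names, the parameter values, and the logit link are all illustrative assumptions.

```python
import math


def category_probs(theta, loading, thresholds, item_bias=0.0):
    """Category probabilities for one ordinal item (illustrative sketch).

    Assumed form: P(y <= c) = logistic(threshold_c - (loading * theta + item_bias)).
    `thresholds` must be strictly increasing. `item_bias` is a hypothetical
    class-specific shift standing in for a systematic response distortion.
    """
    eta = loading * theta + item_bias
    # Cumulative probabilities at each threshold; the top category closes at 1.0.
    cumulative = [1.0 / (1.0 + math.exp(-(t - eta))) for t in thresholds] + [1.0]
    probs, previous = [], 0.0
    for c in cumulative:
        probs.append(c - previous)  # difference of adjacent cumulative probabilities
        previous = c
    return probs


def expected_response(probs):
    """Mean response when categories are scored 1..K."""
    return sum((k + 1) * p for k, p in enumerate(probs))


# Two respondents with the SAME latent attitude, in different (hypothetical) classes.
thresholds = [-1.0, 0.0, 1.0]
same_theta = 0.5
unbiased = category_probs(same_theta, loading=1.2, thresholds=thresholds, item_bias=0.0)
biased = category_probs(same_theta, loading=1.2, thresholds=thresholds, item_bias=0.8)

print("unbiased class:", [round(p, 3) for p in unbiased])
print("biased class:  ", [round(p, 3) for p in biased])
```

In this sketch the two respondents hold an identical latent attitude, yet the biased class is expected to choose higher categories. Comparing raw responses across the two classes would therefore confound attitude differences with response behavior, which is the kind of non-equivalence the abstract describes.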

Type
Articles
Copyright
Copyright © The Author 2011. Published by Oxford University Press on behalf of the Society for Political Methodology 
