Apples and Oranges? The Problem of Equivalence in Comparative Research

Daniel Stegmueller

doi:10.1093/pan/mpr028

Published online by Cambridge University Press: 04 January 2017

Affiliation: Nuffield College, University of Oxford, New Road, Oxford, OX1 1NF, United Kingdom, and School of Social Sciences, University of Mannheim, Germany. E-mail: [email protected]

Abstract

Comparative researchers increasingly rely on individual-level data to test theories involving unobservable constructs such as attitudes and preferences. Estimation is carried out using large-scale cross-national survey data that provide responses from individuals living in widely varying contexts. This strategy rests on the assumption of equivalence, that is, that no systematic distortion exists in the response behavior of individuals from different countries. However, this assumption is frequently violated, with rather grave consequences for comparability and interpretation. I present a multilevel mixture ordinal item response model with item bias effects that is able to establish equivalence. It corrects for systematic measurement error induced by unobserved country heterogeneity, and it allows for the simultaneous estimation of structural parameters of interest.

Type: Articles
Information: Political Analysis, Volume 19, Issue 4, Autumn 2011, pp. 471–487

DOI: https://doi.org/10.1093/pan/mpr028
Copyright: © The Author 2011. Published by Oxford University Press on behalf of the Society for Political Methodology


