Published online by Cambridge University Press: 13 June 2011
Qualitative analysts have received stern warnings that the validity of their studies may be undermined by selection bias. This article provides an overview of this problem for qualitative researchers in the field of international and comparative studies, focusing on selection bias that may result from the deliberate selection of cases by the investigator. Examples are drawn from studies of revolution, international deterrence, the politics of inflation, international terms of trade, economic growth, and industrial competitiveness. The article first explores how insights about selection bias developed in quantitative research can most productively be applied in qualitative studies. The discussion considers why qualitative researchers need to be concerned about selection bias, even if they do not care about the generality of their findings, and it considers distinctive implications of this form of bias for qualitative research, as in the problem of what is labeled “complexification based on extreme cases.” The article then considers pitfalls in recent discussions of selection bias in qualitative studies. These discussions at times get bogged down in disagreements and misunderstandings over how the dependent variable is conceptualized and what the appropriate frame of comparison should be, issues that are crucial to the assessment of bias within a given study. At certain points it becomes clear that the real issue is not just selection bias, but a larger set of trade-offs among alternative analytic goals.
1 King, Gary, Keohane, Robert O., and Verba, Sidney, Designing Social Inquiry: Scientific Inference in Qualitative Research (Princeton: Princeton University Press, 1994), 116; Geddes, Barbara, “How the Cases You Choose Affect the Answers You Get: Selection Bias in Comparative Politics,” in Stimson, James A., ed., Political Analysis, vol. 2 (Ann Arbor: University of Michigan Press, 1990), 131, n. 1; and Achen, Christopher H. and Snidal, Duncan, “Rational Deterrence Theory and Comparative Case Studies,” World Politics 41 (January 1989), 160, 161. The most important general statement by a political scientist on selection bias is Achen, Christopher H., The Statistical Analysis of Quasi-Experiments (Berkeley: University of California Press, 1986). See also King, Gary, Unifying Political Methodology: The Likelihood Theory of Statistical Inference (Cambridge: Cambridge University Press, 1989), chap. 9.
2 Heckman, James J., “The Common Structure of Statistical Models of Truncation, Sample Selection and Limited Dependent Variables and a Simple Estimator for Such Models,” Annals of Economic and Social Measurement 5 (Fall 1976); idem, “Sample Selection Bias as a Specification Error,” Econometrica 47 (January 1979); idem, “Varieties of Selection Bias,” American Economic Association Papers and Proceedings 80 (May 1990); Maddala, G. S., Limited-Dependent and Qualitative Variables in Economics (Cambridge: Cambridge University Press, 1983); Campbell, Donald T. and Erlebacher, Albert, “How Regression Artifacts in Quasi-Experimental Evaluations Can Mistakenly Make Compensatory Education Look Harmful,” in Struening, Elmer L. and Guttentag, Marcia, eds., Handbook of Evaluation Research, vol. 1 (Beverly Hills, Calif.: Sage Publications, 1975); and Cain, G. G., “Regression and Selection Models to Improve Nonexperimental Comparisons,” in Bennett, C. A. and Lumsdaine, A. A., eds., Evaluation and Experiment: Some Critical Issues in Assessing Social Programs (New York: Academic Press, 1975).
3 King, Keohane, and Verba (fn. 1), 125–26.
4 “Review Symposium: The Qualitative-Quantitative Disputation: Gary King, Robert O. Keohane, and Sidney Verba's Designing Social Inquiry: Scientific Inference in Qualitative Research,” American Political Science Review 89 (June 1995).
5 Collier, David, “Translating Quantitative Methods for Qualitative Researchers: The Case of Selection Bias,” American Political Science Review 89 (June 1995).
6 Rogowski, Ronald, “The Role of Theory and Anomaly in Social-Scientific Inference,” American Political Science Review 89 (June 1995), 468–70. For a cautionary treatment of selection bias within the field of quantitative sociology, see Stolzenberg, Ross M. and Relles, Daniel A., “Theory Testing in a World of Constrained Research Design: The Significance of Heckman's Censored Sampling Bias Correction for Nonexperimental Research,” Sociological Methods and Research 18 (May 1990).
7 See Kendall, Maurice G. and Buckland, William R., A Dictionary of Statistical Terms, 4th ed. (London: Longman, 1982), 18, 66; and Vogt, W. Paul, Dictionary of Statistics and Methodology (Newbury Park, Calif.: Sage Publications, 1993), 21, 82.
8 Achen (fn. 1).
9 Przeworski, Adam and Limongi, Fernando, “Political Regimes and Economic Growth,” Journal of Economic Perspectives 7 (Summer 1993), 62–64; and Adam Przeworski, contribution to “The Role of Theory in Comparative Politics: A Symposium,” World Politics 48 (October 1995). This specific problem is also referred to as “endogeneity.” It merits emphasis that even if scholars resolve the concerns about investigator-induced selection bias that are the focus of the present paper, they will still be faced with the selection issues raised by Przeworski.
10 Moses, Lincoln E., “Truncation and Censorship,” in Sills, David L., ed., International Encyclopedia of the Social Sciences, vol. 15 (New York: Macmillan and Free Press, 1968), 196. Moses refers to this as truncation “on the left” and “on the right.” We are not concerned with other forms of truncation, which he refers to as “inner” truncation (omitting cases within a given range of values, but including cases above and below that range) and “outer” truncation (omitting cases above and below a given range). In the discussion below, when we refer to truncation, we mean left and right truncation.
11 Heckman (fn. 2, 1976), 478–79.
12 It is important to emphasize that this does not involve the situation of causal heterogeneity discussed below, in which unit changes in the explanatory variables have different effects on the dependent variable. Rather, a different combination of extreme scores on the explanatory variables produces the high scores.
13 Putnam, Robert D., Making Democracy Work: Civic Traditions in Modern Italy (Princeton: Princeton University Press, 1993), chaps. 3–4, and esp. 91–99. His term is actually “civic-ness.”
14 King, Keohane, and Verba (fn. 1), 130. See also Heckman (fn. 2, 1976), 478, n. 4; and Winship, Christopher and Mare, Robert D., “Models for Sample Selection Bias,” Annual Review of Sociology 18 (1992), 330.
15 Discussions of these methods of inference are found in Frendreis, John P., “Explanation of Variation and Detection of Covariation: The Purpose and Logic of Comparative Analysis,” Comparative Political Studies 16 (July 1983); DeFelice, E. Gene, “Causal Inference and Comparative Methods,” Comparative Political Studies 19 (October 1986); George, Alexander L. and McKeown, Timothy J., “Case Studies and Theories of Organizational Decision Making,” in Advances in Information Processing in Organizations, vol. 2 (Santa Barbara, Calif.: JAI Press, 1985), 29–41; Ragin, Charles C., The Comparative Method: Moving beyond Qualitative and Quantitative Strategies (Berkeley: University of California Press, 1987), esp. chaps. 6–8; and Collier, David, “The Comparative Method,” in Finifter, Ada W., ed., Political Science: The State of the Discipline II (Washington, D.C.: American Political Science Association, 1993).
16 Garfinkel, Alan, Forms of Explanation: Rethinking the Questions in Social Theory (New Haven: Yale University Press, 1981), 22–24.
17 Bartels, Larry M., “Pooling Disparate Observations,” American Journal of Political Science 40 (August 1996), 906; emphasis in original.
18 Bartels offers an excellent example of such a model. See ibid.
19 Przeworski, Adam and Teune, Henry, The Logic of Comparative Social Inquiry (New York: Wiley, 1970), 20–23. “Causality” is achieved when the causal model is correctly specified. Although greater generality may at times be achieved at the cost of causality, discussions of selection bias point to the alternative view that greater generality may sometimes improve causal assessment.
20 Sartori, Giovanni, “Concept Misformation in Comparative Politics,” American Political Science Review 64 (December 1970); and Collier, David and Mahon, James E. Jr., “Conceptual ‘Stretching’ Revisited: Adapting Categories in Comparative Analysis,” American Political Science Review 87 (December 1993).
21 On discerning, see Komarovsky, Mirra, The Unemployed Man and His Family: The Effect of Unemployment upon the Status of the Man in Fifty-nine Families (New York: Dryden Press, 1940), esp. 135–46; on process analysis, see Barton, Allen H. and Lazarsfeld, Paul, “Some Functions of Qualitative Analysis in Social Research,” in McCall, G. J. and Simmons, J. L., eds., Issues in Participant Observation (Reading, Mass.: Addison-Wesley, 1969); on pattern matching, see Campbell, Donald T., “‘Degrees of Freedom’ and the Case Study,” Comparative Political Studies 8 (July 1975), 181–82; on process tracing, see George and McKeown (fn. 15); on causal narrative, see Sewell, William H. Jr., “Three Temporalities: Toward an Eventful Sociology,” in McDonald, Terrence J., ed., The Historic Turn in the Human Sciences (Ann Arbor: University of Michigan Press, forthcoming).
22 Campbell (fn. 21).
23 Putnam (fn. 13), 85, 118–19.
24 For a particularly interesting statement on the tendency of case studies to overturn prior understandings, see again Campbell (fn. 21), 182. On the use of case studies to discover new explanations and conceptualizations, see also Piore, Michael J., “Qualitative Research Techniques in Economics,” Administrative Science Quarterly 24 (December 1979); Lijphart, Arend, “Comparative Politics and Comparative Method,” American Political Science Review 65 (September 1971), 691–92; and Eckstein, Harry, “Case Study and Theory in Political Science,” in Greenstein, Fred I. and Polsby, Nelson W., eds., Handbook of Political Science, vol. 7 (Reading, Mass.: Addison-Wesley, 1975), 104–8. Some of these themes are incisively summarized in George, Alexander L., “Case Studies and Theory Development: The Method of Structured, Focused Comparison,” in Lauren, Paul Gordon, ed., Diplomacy: New Approaches in History, Theory, and Policy (New York: Free Press, 1979), 51–52.
25 In this latter case, scholars may actually look at a range of variation at the high or low extreme of the variable, yet they treat this range of variation as a single outcome, for example, as “high” or “low” growth.
26 King, Keohane, and Verba (fn. 1), 129; Geddes (fn. 1), 132–33.
27 King, Keohane, and Verba (fn. 1), 129.
28 Ibid., 129, 130. We might add that notwithstanding this emphatic advice, these authors state their position more cautiously at a later point (p. 134). They suggest that this type of design may be a useful first step in addressing a research question and can be used to develop interesting hypotheses.
29 Collier (fn. 5), 464. On counterfactual analysis, see Fearon, James D., “Counterfactuals and Hypothesis Testing in Political Science,” World Politics 43 (January 1991), 179–80; and Tetlock, Philip E. and Belkin, Aaron, eds., Counterfactual Thought Experiments in World Politics (Princeton: Princeton University Press, 1996). See also Mill, John Stuart, “Of the Four Methods of Experimental Inquiry,” in A System of Logic (1843; Toronto: University of Toronto Press, 1974).
30 King, Keohane, and Verba (fn. 1), 146, underscore this point.
31 Rogowski (fn. 6), 468–70; King, Gary, Keohane, Robert O., and Verba, Sidney, “The Importance of Research Design in Political Science,” American Political Science Review 89 (June 1995), 478–79; Katzenstein, Peter, Small States in World Markets (Ithaca, N.Y.: Cornell University Press, 1985); and Bates, Robert H., Markets and States in Tropical Africa: The Political Basis of Agricultural Policies (Berkeley: University of California Press, 1981).
32 Porter, The Competitive Advantage of Nations (New York: Free Press, 1990).
33 King, Keohane, and Verba (fn. 1), 134.
34 Porter (fn. 32), 6–10, 28–29, 33, 69, 577, 735.
35 Ibid., 683. See pp. 21–22 for Porter's discussion of his criteria for case selection.
36 Ibid., 675–80.
37 “The Rational Deterrence Debate: A Symposium,” World Politics 41 (January 1989).
38 Achen and Snidal (fn. 1), 160, 162.
39 Achen and Snidal (fn. 1), 161; George, Alexander L. and Smoke, Richard, Deterrence in American Foreign Policy: Theory and Practice (New York: Columbia University Press, 1974).
40 George and Smoke (fn. 39), 513–15, 519. See also George and Smoke, “Deterrence and Foreign Policy,” World Politics 41 (January 1989), 173.
41 George and Smoke (fn. 39), 534, 522–36. See more generally chap. 18.
42 Even the cases not classified as following one of their patterns are still treated as instances of deterrence failure. See George and Smoke (fn. 39), 547–48.
43 George and Smoke's (fn. 40) subsequent discussion of these issues appears to underscore the idea of thinking of this variability in terms of gradations (p. 172).
44 George and Smoke (fn. 39), 503.
45 Ibid., 2. Similar statements are found on pp. 503 and 589.
46 This is an adaptation of Tilly's term “variation finding.” See Tilly, Charles, Big Structures, Large Processes, Huge Comparisons (New York: Russell Sage Foundation, 1984), 82, 116–24.
47 Skocpol, Theda, States and Social Revolutions: A Comparative Analysis of France, Russia, and China (Cambridge: Cambridge University Press, 1979).
48 Geddes (fn. 1), 142, 145.
49 Skocpol (fn. 47), 33–42, 287–90.
50 Geddes (fn. 1), 134.
51 Ibid., 138.
52 Geddes (fn. 1), 135, introduces additional domain restrictions that seem highly appropriate, as in the exclusion of oil-exporting states.
53 See Geddes (fn. 1), 135–140, and esp. Figures 4, 5, 6.
54 This point is made by Haggard, one of the authors whom Geddes cites. See Haggard, Stephan, “The Newly Industrializing Countries in the International System,” World Politics 38 (January 1986), 343, n. 1.
55 See Bollen, Kenneth A. and Jackman, Robert W., “Regression Diagnostics: An Expository Treatment of Outliers and Influential Cases,” Sociological Methods and Research 13 (May 1985).
56 Geddes (fn. 1), 146–47.
57 Ibid., 145.
58 Prebisch, Raul, The Economic Development of Latin America and Its Principal Problems (New York: United Nations, 1950).
59 Geddes (fn. 1), 146.
60 Ibid., 145–47.
61 Prebisch (fn. 58), 9.
62 Hirschman, Albert O., Journeys toward Progress: Studies of Economic Policy-Making in Latin America (New York: W. W. Norton, 1973), originally published by the Twentieth Century Fund in 1963.
63 Geddes (fn. 1), 147, 148.
64 Ibid., 147.
65 Ibid.
66 Hirschman (fn. 62), 223.
67 Campbell, Donald T. and Stanley, Julian C., Experimental and Quasi-Experimental Designs for Research (Chicago: Rand McNally, 1963), 37–43, esp. Figure 3; Campbell, Donald T. and Ross, H. Laurence, “The Connecticut Crackdown on Speeding: Time-Series Data in Quasi-Experimental Analysis,” Law and Society Review 3 (August 1968); Hoole, Francis W., Evaluation Research and Development Activities (Beverly Hills, Calif.: Sage Publications, 1978); and Cook, Thomas D. and Campbell, Donald T., Quasi-Experimentation: Design and Analysis Issues for Field Settings (Boston: Houghton Mifflin, 1979), chap. 2.
68 For two perspectives on the role of probabilistic causation in small-N analysis, see Lieberson, Stanley, “Small N's and Big Conclusions: An Examination of the Reasoning in Comparative Studies Based on a Small Number of Cases,” Social Forces 70 (December 1991), 309–12; and Collier, Ruth Berins and Collier, David, Shaping the Political Arena: Critical Junctures, the Labor Movement, and Regime Dynamics in Latin America (Princeton: Princeton University Press, 1991), 20.
69 Ragin (fn. 15).
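
Illustrative sketch (not part of the article). Note 10 restricts the discussion to truncation that omits cases below or above a cutoff on a variable. The simulation below is a minimal sketch of why such truncation on the dependent variable matters, under assumptions that are ours and not the article's: a single explanatory variable, a linear effect, and normally distributed noise. The function names (`ols_slope`, `simulate`), the cutoff, and the sample size are all hypothetical choices made for illustration. It compares the estimated slope in the full set of cases with the slope estimated only from cases scoring above the cutoff on the outcome.

```python
import random


def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def simulate(n=5000, true_slope=1.0, cutoff=1.0, seed=0):
    """Compare the slope in all cases with the slope in a truncated sample.

    Assumed (hypothetical) data-generating process: y = true_slope * x + noise,
    with x and noise drawn from independent standard normal distributions.
    Truncation keeps only cases whose outcome y exceeds `cutoff`.
    """
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [true_slope * x + rng.gauss(0.0, 1.0) for x in xs]

    full_slope = ols_slope(xs, ys)

    # Select cases on the dependent variable: drop everything at or below the cutoff.
    kept = [(x, y) for x, y in zip(xs, ys) if y > cutoff]
    truncated_slope = ols_slope([x for x, _ in kept], [y for _, y in kept])

    return full_slope, truncated_slope, len(kept)


if __name__ == "__main__":
    full, truncated, n_kept = simulate()
    print(f"slope, all cases:        {full:.2f}")
    print(f"slope, truncated sample: {truncated:.2f} (n = {n_kept})")
```

In runs of this sketch the truncated-sample slope comes out smaller than the full-sample slope, which is the kind of underestimation that selecting cases on high values of the outcome would be expected to produce. The size of the gap depends entirely on the assumed cutoff and noise level.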