Testing Deterrence Theory: Rigor Makes a Difference
Published online by Cambridge University Press: 13 June 2011
Abstract
There is no consensus among scholars on how to test hypotheses about deterrence systematically. The disputes are sometimes rooted in differences about theory or sources of data, but they are magnified by methodological confusion, especially over concepts and operational definitions that produce perverse empirical results. Serious theoretical errors include inadequate appreciation of the role of uncertainty in deterrence as well as selection biases that undermine empirical tests. Rigorous examination of our previous work in light of recent criticism discloses very robust findings on the conditions for deterrence success and failure.
Copyright © Trustees of Princeton University 1990
References
1 The most influential early works include Kaufmann, William W., The Requirements of Deterrence, Policy Memorandum No. 7 (Princeton: Center of International Studies, Princeton University, 1954); Wohlstetter, Albert, “The Delicate Balance of Terror,” Foreign Affairs 37 (January 1959), 221–34; Snyder, Glenn, Deterrence and Defense (Princeton: Princeton University Press, 1961); and Schelling, Thomas, Strategy of Conflict (Cambridge: Harvard University Press, 1960). Empirical studies of deterrence include Betts, Richard, Nuclear Blackmail and Nuclear Balance (Washington, DC: Brookings, 1987); Bueno de Mesquita, Bruce, The War Trap (New Haven: Yale University Press, 1981); George, Alexander and Smoke, Richard, Deterrence in American Foreign Policy (New York: Columbia University Press, 1974); Kugler, Jacek, “Terror Without Weapons: Reassessing the Role of Nuclear Weapons,” Journal of Conflict Resolution 28 (December 1984), 470–506; Mearsheimer, John, Conventional Deterrence (Ithaca, NY: Cornell University Press, 1983); and Shimshoni, Jonathan, Israel and Conventional Deterrence (Ithaca, NY: Cornell University Press, 1988).
2 For a recent exchange on these two issues, see World Politics 41 (January 1989).
3 See, for example, Blalock, Hubert, Theory Construction: From Verbal to Mathematical Formulations (Englewood Cliffs, NJ: Prentice-Hall, 1969), and Stinchcombe, Arthur, Constructing Social Theories (New York: Brace & World, 1968).
4 Lebow, Richard Ned, Between Peace and War (Baltimore: The Johns Hopkins University Press, 1981); Jervis, Robert, Lebow, Richard Ned, and Stein, Janice Gross, Psychology and Deterrence (Baltimore: The Johns Hopkins University Press, 1985); Stein, Janice Gross, “Extended Deterrence in the Middle East: American Strategy Reconsidered,” World Politics 39 (April 1987), 326–52; Lebow, Richard Ned and Stein, Janice Gross, “Beyond Deterrence,” Journal of Social Issues 43 (Winter 1987), 5–72; Lebow, Richard Ned and Stein, Janice Gross, “Rational Deterrence Theory: I Think, Therefore I Deter,” World Politics 41 (January 1989), 208–24; and Lebow, Richard Ned and Stein, Janice Gross, “Deterrence: The Elusive Dependent Variable,” World Politics 42 (April 1990), 336–69.
5 Huth, Paul and Russett, Bruce, “What Makes Deterrence Work? Cases from 1900 to 1980,” World Politics 36 (July 1984), 496–526; Huth, Paul and Russett, Bruce, “Deterrence Failure and Crisis Escalation,” International Studies Quarterly 32 (March 1988), 29–46; Huth, Paul, “Extended Deterrence and the Outbreak of War,” American Political Science Review 82 (Summer 1988), 423–44; and Huth, Paul, Extended Deterrence and the Prevention of War (New Haven: Yale University Press, 1988).
6 We vigorously reject the assertions made by Lebow and Stein (fn. 4, 1990), 337, that in many cases we (1) improperly designated attacker and defender; (2) incorrectly identified third parties as targets of attack or of deterrence; (3) coded direct deterrence as extended deterrence; and (4) conflated deterrence with compellence. In addition, we reject the argument that one reason for the numerous differences in the coding of cases is our inadequate reading of secondary and primary sources. Many of their citations are familiar to us. Indeed, for a number of cases it seems that Lebow and Stein have not carefully read the case summaries provided at their request by Huth on his data set of 58 cases of extended-immediate deterrence. As a result, they make mistakes in characterizing Huth's coding of specific cases. In the Turkish case of 1946, for example, we utilized multiple secondary sources; most importantly, we relied on British intelligence reports, taken from the archives of the Foreign Office, to code Soviet military movements in the critical period of March 1946. Soviet verbal threats and military movements were consistent with a possible intention to use force. Finally, Turkish and U.S. assessments of Soviet intentions are irrelevant to coding the presence or absence of a Soviet threat to use force. Whatever evidence is used to assess the Soviet threat must be based on Soviet actions and behavior. We will discuss other instances at various points in this article.
Full documentation supporting our codings for all cases is available from the first author; it will be published in Kenneth A. Oye, ed., Specifying and Testing Theories of Deterrence (Ann Arbor: University of Michigan Press, forthcoming).
7 Modifications of expected utility theory, such as Quattrone, George and Tversky, Amos, “Contrasting Rational and Psychological Analyses of Political Choice,” American Political Science Review 82 (September 1988), 719–36, are not central to the substantive focus of this article.
8 The important exception is that analysts of nuclear deterrence have devoted attention to the problem of crisis stability and how the deterrent force posture and military actions of the deterrer could be read as a threat of offensive attack by the attacker, and thereby lead the attacker to launch what he believes to be a preemptive strike. See, for example, Schelling, Thomas, Arms and Influence (New Haven: Yale University Press, 1966), chap. 6.
9 See, for example, Russett, Bruce, “Pearl Harbor: Deterrence Theory and Decision Theory,” Journal of Peace Research, no. 2 (1967), 89–105.
10 Our definition is very similar to Glenn Snyder's: “One deters another party from doing something by the implicit or explicit threat of applying some sanction if the forbidden act is performed, or by the promise of a reward if the act is not performed.” See Snyder (fn. 1), 9.
11 Lebow and Stein (fn. 4, 1987), 40–63.
12 George and Smoke (fn. 1), 604–10, discuss the need to develop “inducement theory” as part of the broader study of deterrence.
13 Huth, Extended Deterrence (fn. 5), 51–53.
14 Jervis, Robert, Perception and Misperception in International Relations (Princeton: Princeton University Press, 1976), 58–113. The first two assumptions may illustrate the cold war origins of deterrence theory. As for the second two, we do not treat them as assumptions (contra Lebow and Stein, fn. 4, 1990, p. 355, n. 28), but rather reformulate them as hypotheses for empirical testing.
15 The need for variation in the dependent variable was emphasized at least as far back as 1834 by John Stuart Mill, A System of Logic Ratiocinative and Inductive; see Robson, J. M., ed., Collected Works of J. S. Mill, vols. 7, 8 (Toronto: University of Toronto Press, 1974). A recent and relevant illustration of the need is provided by Levite, Ariel, Intelligence and Strategic Surprise (New York: Columbia University Press, 1987). Lebow and Stein (fn. 4, 1990), 349, suggest longitudinal analysis as a promising means to establish greater variation, presumably for studies of general deterrence. We agree, and have been engaged in such work for the past two years. This analysis produces many data points (by year, month, or other temporal units) within a single “case.” On the relative virtues of statistical and case study methods for different purposes, see Achen, Christopher and Snidal, Duncan, “Rational Deterrence Theory and Comparative Case Studies,” World Politics 41 (January 1989), 143–69, and Russett, Bruce, “International Behavior Research: Case Studies and Cumulation,” in Haas, Michael and Kariel, Henry, eds., Approaches to the Study of Political Science (San Francisco: Chandler, 1970), 425–43. Huth uses both in Extended Deterrence (fn. 5).
16 In addition, the deterrer may also have used inducements in an attempt to persuade the adversary to not use force. In all cases, the threat of sanctions was invoked, but not necessarily the offer of rewards.
17 See Huth, Extended Deterrence (fn. 5), 213–20, for a discussion of how generalizable findings on extended-immediate deterrence are to other types of deterrence cases.
18 Lebow and Stein (fn. 4, 1990), 350, 361–62.
19 Schelling (fn. 8), 69–91.
20 Great Britain, Parliamentary Papers, vol. CXXXVII, “Correspondence Respecting the Turco-Egyptian Frontier in the Sinai Peninsula,” Cd. 3006, 1906, pp. 1–36.
21 Lebow and Stein (fn. 4, 1990), 337, 362–63, 366–68. On p. 350 they refer to “several” cases of extended-immediate deterrence that we allegedly omitted, but the accompanying footnote identifies only two. One is said to be a U.S. success in deterring a Turkish invasion of Cyprus in 1964. The other is a one-week “success” of Egypt against Israel in 1967. We consider such deterrence too short-lived to be coded as successful.
22 Ibid., 352.
23 For example, see Foreign Relations of the United States 1903 (Washington, DC: G.P.O., 1904), 242, 243, 250–51, 270–71, 279–81, 309–10, 312.
24 Lebow and Stein (fn. 4, 1990), 353–54.
25 For the 1954–1955 off-shore islands crisis, see American Foreign Policy, 1950–1955: Basic Documents vol. 2 (Washington, DC: G.P.O., 1957), 2487; Foreign Relations of the United States 1955–1957 vol. 2 (Washington, DC: G.P.O., 1986), 299; George and Smoke (fn. 12), 288; and Huth, Extended Deterrence (fn. 5), 111. Other cases that Lebow and Stein (fn. 4, 1990), 352, n. 25 mistakenly label as including only compellence and not also deterrence are Serbia against Albania in 1913, the U.S. against Panama in 1921, Britain against Turkey in 1922, and the U.S. against North Vietnam in 1964–1965. Two further alleged cases are the U.S. against the Soviet Union over Laos in 1961 (p. 352, n. 25) and the Soviet Union against the U.S. in the 1973 Arab-Israeli War (pp. 352–53). The U.S. made no military (or even economic) threat against the Soviet Union in 1961. In the 1973 case (which we dropped after 1984 on other grounds), the Soviets arguably were threatening (verbally and by alerting airborne divisions in the Soviet Union) to take direct action against Israel, but they were not threatening to take any military action against the United States, as is implied in Lebow and Stein's categorization. Thus, their definition of compellence frequently shifts in practice, and is hardly limited to the relevant meaning of extended military compellence. Elsewhere, their definition of which actions are to be deterred meanders back and forth across our definitional boundary of “use of military force.” That, rather than matters of compellence or direct deterrence (Lebow and Stein, fn. 4, 1990, p. 345), is why we omitted the Cuban missile crisis: we coded other cases differently than did George and Smoke (Lebow and Stein, fn. 4, 1990, p. 345, n. 13) because we were concerned with deterrence of different kinds of actions. This type of conceptual confusion is common; see, for example, Stein (fn. 4).
26 Lebow and Stein (fn. 4, 1990), 351–52.
27 Lebow and Stein (fn. 4, 1990), 347; italics added.
28 Watt, Donald Cameron, How War Came: The Immediate Origins of the Second World War, 1938–1939 (New York: Pantheon, 1989).
29 George and Smoke (fn. 1), chap. 18; Craig, Gordon and George, Alexander, Force and Statecraft (New York: Oxford University Press, 1983), 186–87; Smoke, Richard, War: Controlling Escalation (Cambridge: Harvard University Press, 1977), 245–51. In contrast, Lebow and Stein (fn. 4, 1990), 343, state that there is only one case in our data set in which the potential attacker's intentions were uncertain: Indonesia's conflict with the Netherlands over West Irian in 1962. Lebow and Stein argue that documentary evidence is critical in establishing intentions, but documentary evidence is simply not available in a number of cases; how, then, can they reach such a conclusion?
30 Lebow and Stein's desire to look only at cases where the attacker's intentions are “serious” is suited to their theoretical interest in whether strong motivations to pursue particular policies lead policy makers to distort and misjudge the risks of such policies. As a general criterion for selecting cases of deterrence to test theoretical propositions on success and failure, however, it is inappropriate. See Achen, Christopher, The Statistical Analysis of Quasi-Experiments (Berkeley: University of California Press, 1986), for a general examination of the implications of selection bias for empirical analysis. The implications of selection bias for testing deterrence are addressed by Achen and Snidal (fn. 15) and by Levy, Jack, “Quantitative Studies of Deterrence Success and Failure,” in Stern, Paul, Axelrod, Robert, Jervis, Robert, and Radner, Roy, eds., Perspectives on Deterrence (New York: Oxford University Press, 1989), 98–133.
31 Lebow (fn. 4), 229. Contemporary poststructural theorists share similar doubts about the mythological and self-justificatory functions of decision makers' texts, and the difficulty, therefore, of interpreting motivations. See Shapiro, Michael J., “Textualizing Global Politics,” in Der Derian, James and Shapiro, Michael, eds., International/Intertextual Relations: Post-modern Readings of World Politics (Lexington, MA: Lexington Books, 1989), 11–22.
32 This somewhat arbitrary attitude toward explicit operational criteria can be found in Lebow's earlier work. See, for example, Lebow (fn. 4), 24, where he states:
The assignation of a crisis to a particular category (brinkmanship, justification of hostility, spinoff) depends upon an analysis of the initiator's objectives in a crisis, its leaders' willingness to go to war, and the degree of control they exercise over events. Interpretation of these criteria are (sic) likely to vary from scholar to scholar regardless of the extent of available documentation. … When interpretations are diametrically opposed, the researcher must work his way through the controversy and adopt what appears to him the most convincing interpretation.
33 Krosby, H. Peter, Finland, Germany, and the Soviet Union, 1940–1941: The Petsamo Dispute (Madison: University of Wisconsin Press, 1968), 11–12, 142.
34 Ibid., 338, 340; also see Military Archives, Navy and Old Army Branch of National Archives, Military Intelligence Division Files, Record Group 165, Entry 65:2037–100/45.
35 Lebow and Stein (fn. 4, 1990), 345–46, 353–55.
36 Huth and Russett (fn. 5, 1988), 31; Huth, Extended Deterrence (fn. 5), 32. Our designation of “attacker” merely referred to the state that makes the first military move or verbal threat in a discrete sequence of activities. “Initiator” and “responder” might have been better labels. Our use of attacker and defender simply continued the tradition of earlier research in Russett, Bruce, “The Calculus of Deterrence,” Journal of Conflict Resolution 7 (June 1963), 97–109, that employed these terms, with a similar disclaimer about normative content.
37 Huth, Extended Deterrence (fn. 5), 49–53. Furthermore, the case studies in chap. 6 focused on defensive motivations which in a number of cases played an important role in the decisions of the potential attacker.
38 The individual case summaries compiled by Huth do not discuss motivations. They document the verbal and military actions of the attacker which constituted a threat to use force, the deterrent actions of the defender, and the outcome of each case.
39 Lebow and Stein (fn. 4, 1990), 368–69.
40 Lebow and Stein (fn. 4, 1990), 346. See Thies, Wallace, When Governments Collide: Coercion and Diplomacy in the Vietnam Conflict, 1964–1968 (Berkeley: University of California Press, 1980), 273–74, 328, n. 54, for information on North Vietnamese troop movements into South Vietnam.
41 Huth, Extended Deterrence (fn. 5), 105.
42 The case study of the first off-shore islands crisis (Ibid., chap. 5) makes this clear, as does the case summary.
43 People's China, no. 1 (1950), 6.
44 Karig, W. et al., Battle Report: The War in Korea, vol. 6 (New York: Farrar & Rinehart, 1952), 41, and Whiting, Allen, China Crosses the Yalu (New York: Macmillan, 1960), 178.
45 Ibid., 23. Melvin Gurtov and Byon-Moo Hwang describe Chinese communist policy and military preparations during this period in similar terms; see Gurtov and Hwang, China Under Threat (Baltimore: The Johns Hopkins University Press, 1980), 28, 30–32, 40–41, 49.
46 Lebow, Richard Ned, “Extended Deterrence: Military Fact or Fiction?” in Arnett, Eric H., ed., Technologies for Security and Arms Control: Threats and Promises (Washington, DC: American Association for the Advancement of Sciences, 1989), 58. Part of the explanation for this may be that Lebow and Stein are conflating auxiliary assumptions of the deterrence model, as summarized in Jervis (fn. 14) and discussed above, with the core assumptions of rational deterrence theory.
47 See Jakobson, Max, The Diplomacy of the Winter War (Cambridge: Harvard University Press, 1961), and Tanner, Viano, The Winter War (Stanford: Stanford University Press, 1957).
48 Findings in experimental psychology suggest that individuals are more willing to accept risks in order to avoid losses than to achieve gains. See Quattrone and Tversky (fn. 7).
49 Huth, Extended Deterrence (fn. 5), 208.
50 Lebow and Stein's insistence (fn. 4, 1990), 344, that “to qualify as a case of deterrence” the defender must “demonstrate the resolve” to punish or restrain transgressors illustrates a confounding of conditions hypothesized to promote deterrence success with merely the attempt to deter.
51 See Anderson, Eugene, The First Moroccan Crisis, 1904–1906 (Chicago: University of Chicago Press, 1930), 252, 327. Two additional verbal warnings, very similar to those quoted, were issued by the British.
52 Lebow and Stein (fn. 4, 1990), 342.
53 Lebow and Stein incorrectly state (Ibid., 344) that this component of the definition of deterrence failure was utilized only in our more recent work. We utilized this definitional component from the outset; see Huth and Russett (fn. 5, 1984), 505. For example, we coded the Czech crisis of 1938 as a deterrence failure. We agree with Lebow and Stein (fn. 4, 1990, pp. 364–65) that the outcome of the Czech case is mixed, and concur with them that, “overall, Munich must be considered a deterrence failure” (Ibid., p. 365). We do disagree, however, on whether Britain should be coded as a defender.
54 The adoption of the 200-fatality threshold in our 1988 work makes no difference in the coding of deterrence outcomes and simply reflects our ability to find more precise data on fatalities in 1988 than in 1984.
55 Any use of force sanctioned by the political and military leadership in a situation of general deterrence, however, should be coded as a deterrence failure.
56 Lebow and Stein (fn. 4, 1990), 344–45.
57 As Huth noted in Extended Deterrence (fn. 5, 25–26, n. 19), some cases were deleted because new information indicated that the defender's threats were initiated after the threshold of 200–250 fatalities had occurred (and that therefore compellence rather than deterrence had been attempted), or because new evidence indicated that the “potential attacker” never had any intention of attacking.
58 The percentage of correct predictions in the probit model presented in 1984 was 78 and the percentage for the model presented in 1988 was 84. See Huth and Russett (fn. 5, 1984), 515, and Huth, Extended Deterrence (fn. 5), 73.
59 For example, we dropped the 1935 Ethiopian case because the deterrent actions of Britain were sufficiently weak (only one verbal threat) that it is a borderline case of attempted deterrence. Originally, we had believed that British naval measures in the Mediterranean were in part intended as a deterrent bluff. We are now persuaded that this was most likely not the case, and that British naval measures were precautionary to deal with the possibility that Italy might attack British naval forces. See Marder, Arthur, “The Royal Navy and the Ethiopian Crisis of 1935–36,” American Historical Review 125 (June 1970), 1327–56, and Roskill, Stephen, Naval Policy Between the Wars, vol. 2 (London: Collins, 1976), chap. 9.
60 Briefly, the reasons are as follows: (1) In case 7 of the Appendix, U.S. deterrent actions are not based on the alleged ultimatum of President Roosevelt, but on the extensive buildup and movement of U.S. naval forces. German naval attacks had resulted in only very limited fatalities; thus U.S. threats were still deterrent in character. (2) In case 8, Korea was an independent state; hence extended deterrence was applicable, and Russia rather than Japan attempted deterrence. (3) In case 18, Russia's deterrent policy was coded on the basis of actions from January to May 1913, not June to July.
61 In cases 14 and 44, Lebow and Stein argue that Tripoli and Goa were integral parts of Turkey and Portugal, respectively, and that therefore these cases are direct rather than extended deterrence failures. Formally, this is correct; but internationally, both territories were treated like colonies and we coded them as such. We coded the Egyptian deterrent as a failure (case 49) because Israel did attack Syria as an extension of its attack on Egypt in June 1967. We coded Israeli threats as a deterrence success (case 51) because Syrian military intervention did not exceed the 200–250 threshold in fatalities.
62 See Huth, Extended Deterrence (fn. 5), 71–84, for a detailed discussion of these probit results.
63 Significance levels do not strictly apply to tests on a population of cases rather than a sample; we give them only to suggest the strength of relationships.
64 See Huth and Russett (fn. 5, 1984), 515–17.
65 Lebow and Stein's statement (fn. 4, 1990), 337, is misleading: the scientific method is primarily concerned with the replication of results, not the replication of data sets.
66 Lebow and Stein (fn. 4, 1990), 349.
67 Ibid., 340.
68 See, for example, Herek, Greg, Janis, Irving, and Huth, Paul, “Decision Making During International Crises,” Journal of Conflict Resolution 31 (June 1987), 203–26, and Zagare, Frank, “Rationality and Deterrence,” World Politics 42 (January 1990), 238–60.
69 As an example, see Janis, Irving, Crucial Decisions (New York: Free Press, 1989).