At several recent conferences revolving around the use of experimental methods in political science, I have increasingly noticed the implicit, inherent, often unrecognized tension between the underlying assumptions of psychological experiments and those that undergird experiments in behavioral economics, a tension that affects the evaluation of both ethics and substance in political science. These issues have cropped up time and time again during the past 15 years as the use of experimental methodology has infiltrated more widely into the study of political science in general, and the subfield of American politics in particular. While ostensibly appearing to fracture on issues of substance, this tug of war reflects the differing ontological assumptions and epistemological traditions, not to mention the divergent goals and incentives structuring these downstream consequences, that split the two major disciplines from which experimental research in political science borrows so heavily. Political science often borrows from other disciplines, and this learning enriches our discipline. The problem, of course, is that the use of any idea can quickly become divorced from its underlying theoretical purposes, strictures, and incentives, and the motivations that gave rise to it become subsumed in the novelty of the practice itself. For example, any student of the American war in Vietnam will recognize that the purported similarity between Ho Chi Minh and Hitler, although discouraging a policy of appeasement, nevertheless led to a long war of attrition that caused much destruction and produced few unscathed victors. Therefore, it is useful to understand the extent to which the processes that gave birth to a given strategy might inform the limits and utility of its application in other contexts. In that spirit, I attempt to explain the experimental method, broadly construed, within political science.
The use of experimentation in political science offers a powerful tool, and its deepening penetration into field contexts facilitates prospects for more comparativists and international relations scholars to take advantage of this useful methodology as well. Experimental methods can foster the kind of rich marriage between internal validity and external validity that all sophisticated methodologists celebrate. Because political science has drawn so heavily from two different disciplines in its application, we have often failed to note how each experimental tradition developed in ways that, while serving each field's primary goals, often present contradictory imperatives.
Specifically, the adherents of experimental methods in psychology and behavioral economics diverge in two critical areas: the role of deception and the nature of incentives. First, psychologists largely embrace the use of deception while economists eschew it. Experiments published in high-impact social psychology journals commonly use deception (Bortolotti and Mameli 2006; Christensen 1988), which is understood to represent an effective means of eliciting truthful beliefs and behaviors from subjects who will be less able to consciously affect their behavior to manage their impressions or to please the experimenter. Christensen (1988) argues that “research has revealed that subjects who have participated in deception experiments versus non-deception experiments enjoyed the experience more, received more educational benefit from it, and did not mind being deceived or having their privacy invaded. Such evidence suggests that deception, although unethical from a moral point of view, is not considered to be aversive, undesirable, or an unacceptable methodology from the research participant's point of view” (664). Christensen also suggests that it is more unethical to refuse to do research on important social problems because of unrealistic repugnance over the notion of deceiving subjects. Bortolotti and Mameli (2006) similarly argue for the value of deception in research, suggesting that
… methodological deception is at least at the moment the only effective means by which one can acquire morally significant information about certain behavioral tendencies. Individuals, in general, and research participants, in particular, gain self-knowledge which can help them improve their autonomous decision making. The community gains collective self-knowledge that, once shared, can play a role in shaping education, informing policies and in general creating a more efficient and just society (259).
Economists, by contrast, typically prohibit the use of deception in their experiments and will not publish articles that use subjects who are deceived. Economists believe that deceiving subjects contaminates the subject pool. As Jamison, Karlan, and Schechter (2008) argue:
Experimental economists believe (and enforce the idea) that researchers should not employ deception in the design of experiments. This rule exists in order to protect a public good: the ability of other researchers to conduct experiments and to have participants trust their instructions to be an accurate representation of the game being played…. We find significant differences in the selection of individuals who return to play after being deceived as well as (to a lesser extent) the behavior in the subsequent games, thus providing qualified support for the proscription of deception.
Note that the concerns that preoccupy each investigation reflect the primary disciplinary interests of each field: psychology remains interested in the effect of experimentation on self-knowledge, whereas economics is preoccupied with the allocation of resources across groups. These concerns affect the primary dependent variables investigated in the use of deception, showing that for the concerns that matter to psychologists, deception can enhance the understanding of socially significant and sensitive issues, whereas economists demonstrate that the use of deception, and the consequent lack of transparency, can affect subject behavior.
Second, psychologists and economists typically use different notions of acceptable incentives for participating in experiments. Psychologists often use subjects drawn from their large introductory classes who receive (required) credit for participation. Economists typically require financial reimbursement to subjects for participating, assuming that money provides the strongest reinforcement and assuring that subjects will pay attention to the tasks at hand. This more restrictive view, however, is beginning to change among some of the leading figures in the field (Camerer, Loewenstein, and Rabin 2003). Again, these divergences should not be surprising to those who are familiar with the interests of the respective fields: psychologists focus on processes underlying social affiliation, whereas economists are more interested in financial transactions and decision making.
Thus, the goals pursued by economists and psychologists, while serving their own disciplinary needs and often aligning in evaluation of best standard practices, may not always best advance the interests of political scientists. Therefore, we need to understand the origin of those traditions, the functions they serve, and the purposes they were designed to advance before reaching a consensus on which aspects political scientists should keep and which ones they might eliminate or adjust. To develop consensus around best practices for our discipline, we should be mindful not to default to a set of rules and strictures mired in path-dependent contingencies resulting from unrelated disciplinary incentives. Therefore, here is a brief history of the traditions in psychology and economics and a first cut at an integrated set of best practices for the use of experiments in political science.
HISTORY OF EXPERIMENTAL PSYCHOLOGY
The first experimental laboratory in psychology was set up by Wilhelm Wundt at the University of Leipzig in 1879. Originally trained in medicine at the University of Heidelberg, Wundt separated psychology from philosophy by developing rigorous methods of manipulation and experimentation. Wundt wanted to explore the foundation of conscious thought through the systematic investigation of introspection. He was particularly interested in human physiology and wrote the first psychology textbook, Principles of Physiological Psychology, on this topic in 1874. Wundt's foundational influence on psychology was felt not only through his incredibly prolific writings over the course of more than 60 years of scholarship, but also through the influence of his numerous prominent students. One of those students, G. Stanley Hall, the first president of Clark University, became the founder and first president of the American Psychological Association, which began in 1892, and started the American Journal of Psychology in 1887. Hall is also credited with setting up the first experimental laboratory in the United States at Johns Hopkins University in 1883. However, for understanding the history of psychological experimentation in America, perhaps the most important actor was Edward Titchener. Arguably the most devoted of Wundt's students, Titchener set up an experimental psychology laboratory at Cornell University that proved deeply influential for training generations of students. These students included Edwin Boring (who authored one of the first books documenting the use of experiments in psychology, History of Experimental Psychology (1929)) and, despite Titchener's noted sexism, Margaret Floy Washburn, the first female psychologist, as well as such luminaries as Abraham Maslow.
Following Wundt, Titchener sought to understand the nature of internal mental processes through a strategy he called “hard introspection,” designing and developing methods and measurements that attempted to standardize representations of internal experiences across individuals. His work became known as structuralism and vied for dominance in American psychology against William James' (1890) functionalist perspective, which primarily relied on observation. James had little patience for the structuralist perspective, claiming it had “plenty of school, but no thought” (James 1904), and the influence of the paradigm essentially died with Titchener, who passed away in 1927. Although James won the early battle, he lost the war for the heart of psychological theory as both perspectives quickly lost influence to more systematic models offered by psychoanalysis and later behaviorism. Ironically, much like the way that the survey methodology developed by sociologists, such as Paul Lazarsfeld at Columbia University, allowed political scientists to break away from both psychologists and sociologists in their investigation of mass political behavior, the very experimental methods established by Wundt and Titchener, applied to behavior and not internal mental representation, allowed behaviorism to take hold and dominate academic psychology throughout the mid-twentieth century.
HISTORY OF EXPERIMENTAL ECONOMICS
The development and widespread use of experiments in economics has a more recent history. Early work was not experimental. Most histories attribute the birth of experimental economics to the classic 1944 Theory of Games and Economic Behavior written by von Neumann and Morgenstern and the later work in game theory it helped generate, including the foundational work by Thomas Schelling (1960), Strategy of Conflict. Maurice Allais's famous paradox, published in Econometrica in 1953, provided the first experimental evidence of systematic violations of expected utility theory. Reinhard Selten, also working on this topic in Europe (1995), combined his work in mathematics and economics with courses he had taken in experimental psychology to explore the social implications of von Neumann and Morgenstern's work as well as Herbert Simon's notion of bounded rationality. Selten went on to develop the notion of sub-game perfection as an outgrowth of a larger experimental project, and he collaborated with Werner Guth, the first person to publish on the now ubiquitous Ultimatum game (Guth, Schmittberger, and Schwarz 1982). Around the time of Allais's early work, Vernon Smith (1962) began conducting some experiments in economics; the first work was published in the Journal of Political Economy. Smith (1976) is credited with creating experimental economics with his seminal piece on Induced Value Theory, in which he strongly advocated the use of stringent experimental methods to test economic theories and argued that such procedures constituted a rigorous empirical test of models developed using standard economic theory.
Interestingly, he was soon joined by psychologists Amos Tversky and Daniel Kahneman (Tversky and Kahneman 1974; Kahneman and Tversky 1979) as well as others such as Paul Slovic, Sarah Lichtenstein, and Robyn Dawes in conducting experiments on economic topics. Demonstrating that even the Nobel Prize Committee refused to privilege one discipline over the other, Smith and Kahneman shared the Nobel Prize in Economics in 2002 for this early work in developing the field of behavioral economics that fundamentally rested on psychological experimental methods and procedures. It may appear that psychological experimentation and experiments in behavioral economics developed in circular ways. Throughout the early 1990s these fields proceeded in recursive interaction to refine experimental methods involving the use of simple games, although economists, not surprisingly, tended to focus on those methods exploring economic allocations whereas psychologists tended to concentrate on topics related to conflict and cooperation.
Note that each of these disciplines developed experimental methods for their own purposes to explore those topics most closely aligned with their own central theoretical concerns. In psychology, experimentalists wanted to understand the nature of internal human consciousness and experience and how those influences affected behavior. Experimental economists remained primarily concerned with the nature of human choice around preference, utility, and value and on understanding how those forces helped shape primarily economic decision making. Although both these concerns are central to the social choices political scientists study, neither concentrates specifically on the primary institutional and political concerns, particularly around dominance and the mechanisms of coercion and governance, that tend to preoccupy political scientists. These differences remain critical: if political scientists want to study the effect of governance across countries (whether that investigation involves exploring the influence of regime type, or humanitarian intervention, or any of the other myriad potential topics of examination), questions of deception and incentives may prove much less challenging than the logistical challenges faced by experimenters. Simple economic games in the field may reveal interesting cultural divergences across regions; however, investigators may be more interested in the effect of much larger changes on downstream societal consequences such as democratization, globalization, or economic liberalization. In these areas, experimenters may need to use natural experiments (Dunning 2012) that instigate changes exogenously, but nonetheless offer scholars the opportunity to examine the consequences of macro changes on micro practices and beliefs.
As a result, although experiments remain the most powerful methodological tool for gaining traction on causation, political scientists often take sides in disciplinary differences that derive from disputes whose origins and meanings lie outside the purview of their interests; political scientists need not restrict themselves solely to the practices of economists or psychologists just because they developed particular styles best suited to their own needs and purposes.
TOWARD A POLITICAL SCIENCE CONSENSUS ON BEST PRACTICES IN EXPERIMENTAL METHODS
Several important explorations of the use of experiments in political science (Druckman et al. 2011; Morton and Williams 2010) provide useful information for political scientists to understand these issues in greater depth or to conduct experiments. However, as an unintentional result of divergent disciplinary mimicry, a lack of consensus about the proper use of experimentation in political science has emerged and hinders the inherent flexibility and power it offers. Rather than follow in lockstep with other disciplines, political science should adopt those experimental methods and techniques across disciplines of origin that best serve the purposes of the discipline and avenues of inquiry. As long as technical experimental procedure and protocol are followed, the particular plumage that surrounds these experiments can, and should, be dictated by what is most useful for answering the questions under investigation and not by what other disciplines dictate as proper protocol for investigations that serve different incentives, goals, and restraints.
Therefore, to begin a conversation within the field, if not to achieve such a consensus, I offer, with a nod to Robert Huckfeldt for suggesting the phrase, ten commandments to guide effective use of experiments for the purposes that define our disciplinary needs and purposes. I suggest these principles in true humor and humility, to generate more discussion and increased self-consciousness in the choices that we make. My purpose is to ensure practices that can simultaneously promote the health and safety of our subjects as well as our intellectual edification and professional advancement. I do not claim these commandments as the only way forward; I merely suggest that rendering such choices visible, rather than unconscious or automatic, serves our purposes better than simply following in paths laid down by others pursuing different journals simply as a result of disciplinary envy.
1. Thou shalt replicate and be plentiful upon the discipline. A single experiment tells something about the results of a specific test in a particular population. The ability to generalize results derives not so much from the external validity of that singular population, but from replication of the result across many populations. Robust findings will replicate across populations, environments, and even operationalizations. Experiments best help locate the limits, dimensions, and contingencies of particular behaviors through a strategy of conscious and aggregated replication.
2. Thou shalt search far and wide for representative population samples. Sometimes student populations are more than adequate, especially to investigate phenomena that are expected to be universal. Sometimes other restricted populations prove sufficient for studying particular phenomena; for example, experimenters would want soldiers in a study on the effects of combat on various psychological processes such as memory. Indeed, experimenters should seek to study the population that best represents the questions or problems posed by the investigator along with a relevant comparison control group.
3. Thou shalt randomize. This is the crux of the experimental method, upon which all analysis of results depends. Randomization is the process that allows for the reasonable exclusion of alternative explanations for observed findings. Randomization allows experimenters to discount the causal influence of any preexisting differences between subjects because such variance can safely be assumed to occur in a manner that will not systematically affect observed results.
4. Thou shalt publish null results. Of course, this admonition is directed more toward journal editors and field norms than toward individual scholars. But systematic publication bias wastes untold time, money, and energy, not to mention agony. Conscientious publication of null results would not only save numerous people countless wasted hours conducting an experiment unaware that others have tried and failed but also respect truth in seeking both positive and negative knowledge.
5. Thou shalt honor both laboratory and field experiments. Political scientists can, and should, welcome all experimental comers. For some questions, controlled laboratory experiments increase internal validity and allow the quickest and most cost-effective investigation. For other questions, taking the experiment into a field setting allows the investigator to explore the influence of real-world factors on subject responses, increases external validity, and expands the population it is possible to study. Experiments embedded in nationally representative samples also offer a particularly powerful and effective tool for achieving high levels of control in representative populations.
6. Thou shalt respect both internal and external validity. Psychologists privilege internal validity, whereas political scientists typically focus on, some might say obsess about, external validity. Both are important, but an inherent trade-off exists between the two. Nonetheless, careful design can pay attention to both factors, understanding their mutual interdependence, and the temporal nature of their evolution. If there is no internal validity, external validity becomes a moot issue. If there is no external validity, internal validity is rendered airless and musty and devoid of real-life meaning and purpose.
7. Thou shalt allow promiscuous incentive structures. Subjects can be rewarded and incentivized in many ways. Economists believe that money represents the only true incentive, and this may reflect a self-fulfilling prophecy for economists who only care about money. But political scientists need not publish exclusively in journals that require monetary remuneration of subject populations, from whence the real incentive structure for economists emerges. In reality, most people who are not economists care about other things besides money that can be difficult to purchase with money, belying the notion that money constitutes an infinitely fungible resource; these goals include things such as love and family and status and reputation. Even when proxies, such as sex, can be purchased, the monetary exchange by definition changes the meaning, and thus perhaps the experience. Similar to the way that adjusting the temperature in the room can affect behavior without conscious awareness of the subjects, things that might be exchanged for money even in the near term, such as food or drink, achieve powerful force in a moment of deprivation or discomfort and thus can be used to experimental advantage in short-term, mild ways. Money can be an incentive, but so can credit for grades for students or juice squirts for a thirsty person. Any use of incentives implicitly assumes that subjects only receive what is offered in return, yet subjects enter every experiment with their internal incentives, perhaps unknown, and certainly uncontrolled by the experimenter. Experimenters should remain humble and realize that subject incentives, like subject behavior, might reflect untested assumptions. And it can improve experimental outcomes for researchers to understand what internal incentive structures might exist for subjects inherent within any given design, including the desire for knowledge, appreciation, and approval.
Sometimes these desires can help an experimenter's purpose, sometimes they can undermine it. But like leaders who often issue statements to other leaders to impress domestic audiences, experimenters are always well served to remember that subjects may prefer to influence another subject rather than doing what the experimenter requests.
8. Thou shalt allow deception. Deception is not like kosher; it does not contaminate everything it touches. Leaving aside reflections about the conservative ideology that privileges concerns about purity over those associated with harm, just because an experiment that uses deception takes place in a particular room does not render said space forever unusable like some kind of viral radiation poisoning that inevitably infiltrates and violates any future experiment that takes place in that room. Again, political scientists need not restrict their publication venues to journals that prohibit deception. Like abortion, deception should be legal but rare. If it is not necessary, it should not be used. If it is the only, or best, way to explore important human phenomena, it should be used carefully and can be used legitimately. Some of the most important phenomena known about human psychology, from conformity to obedience to the capacity for violence and aggression, come from the use of deception. Few would argue that we would be better off without this knowledge, although most would agree that exquisite protection of subjects is absolutely demanded by such practices. Although some might argue it is better we not know the things we have learned through deception, such a debate rests on values outside this consideration. However, in my long experience, the majority of subjects care more about their time and their money or other incentives than they care about any duplicity being undertaken against them. Student populations, in particular, find these practices no more loathsome than lack of transparency in grading, and no economists seem to oppose the use of grades, lending some hypocrisy to their rejection of deception in all its forms. In my observation, only narcissism makes experimenters believe that subjects put the same kind of thought into their experimental purpose as they do. 
Experimenters may care about the use of deception, but subjects have better things to do with their time, and better things to think about than whether an experiment involves deception. If they want the incentive, they participate. If they do not, they will not. After all, human subjects protections require experimenters to tell subjects prior to participation that they can quit at any time. Note that the reason we know that such actions might prove difficult for subjects, and thus we inoculate them against this concern from the outset, is precisely because of the results we learned from experiments involving deception.
9. Thou shalt be thy own first subject. Like a good doctor who takes his own medicine first, and leaving aside those doctors who abuse their own drugs, experimenters should be their own first subject. They should first run through the experiment they want others to experience to see if any aspect of it makes them confused, uncomfortable, or uncertain. Experimenters should try to do this with a blind mindset without the preconceptions they derive from knowing the experiment's design and its purpose. This activity allows the experimenter to discover things that the subject will see that might have been lost on the experimenter during the complications of arriving at the design. This strategy can prevent unexpected problems from cropping up early in an experiment that cost time and possibly lose subject data. The experimenter may also gain important new ideas about possible extensions.
10. Thou shalt run and debrief at least some of one's own subjects. An experimenter never knows what happens in an experiment if he or she is not there to observe. And these observations can prove critical in subsequent understanding of unexpected findings or in generating ideas for follow-up studies. If an experiment is well designed, the experimenter should be able to run subjects blind as to condition, so that his or her presence does not introduce demand characteristics that systematically influence subject response. Only by talking to subjects about their experience can experimenters really know whether subjects are actually interpreting the task or situation as the experimenter assumes. Such variance in interpretation helps identify the source of unexpected findings or failed manipulations. This knowledge alone sets the experimenter free to uncover new sources of influence and novel interpretations of results.
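For readers who prefer a concrete illustration of the third commandment, the following is a minimal sketch in Python of random assignment, offered under stated assumptions and not as a prescribed procedure. The subject pool of 1,000 people, the preexisting trait (age), the seeds, and the function name assign_conditions are all hypothetical and chosen only for illustration. The sketch shuffles the pool, splits it into treatment and control groups, and then compares the average of the preexisting trait across the two groups.

```python
import random
import statistics


def assign_conditions(subject_ids, seed=None):
    """Randomly split subjects into treatment and control groups.

    Hypothetical helper for illustration: shuffles the subject list and
    cuts it in half, so each subject is equally likely to land in either group.
    """
    rng = random.Random(seed)
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}


# Hypothetical subject pool: 1,000 subjects with a preexisting trait (age)
# that the experimenter does not control.
trait_rng = random.Random(0)
ages = {subject: trait_rng.randint(18, 80) for subject in range(1000)}

groups = assign_conditions(list(ages), seed=1)

# Compare the preexisting trait across the randomly formed groups.
mean_treatment = statistics.mean([ages[s] for s in groups["treatment"]])
mean_control = statistics.mean([ages[s] for s in groups["control"]])

print(f"Mean age, treatment: {mean_treatment:.1f}")
print(f"Mean age, control:   {mean_control:.1f}")
```

With a pool of this size, the two group averages should typically be close, which is the sense in which randomization lets experimenters discount preexisting differences: such differences are not eliminated, but they are unlikely to line up systematically with the treatment.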
The beauty of experiments, as an art and a practice, is that they embody the conundrum that only discipline will set you free. Careful attention to fundamental aspects of random assignment, meticulous construction of the operationalization and measurement of variables, and precise treatment and control protocols will ensure that an experiment will achieve its greatest likelihood of success. In that precise attention to detail in the grounding of an experiment, creative and innovative ideas can take flight, and find rest on the branches of predictable human behavioral responses.