Intergroup attitudes and identity ties can shape foreign policy preferences. Anti-Muslim bias is particularly salient in the USA and the UK, but little work assesses whether this bias generalizes to other countries. We evaluate the extent of anti-Muslim bias in foreign policy attitudes through harmonized survey experiments in thirteen European countries (N=19,673). Experimental vignettes present factual reports of religious persecution by China, counter-stereotypically depicting Muslims as victims. We find evidence of anti-Muslim bias. Participants are less opposed to persecution and less likely to support intervention when Muslims, as opposed to other religious groups, are persecuted. However, this bias is not present in all countries. Exploratory analyses underscore that pre-existing intergroup attitudes and shared group identity moderate how group-based evaluations shape foreign policy attitudes. We provide extensive cross-national evidence that anti-Muslim bias is country-specific and that social identity ties and intergroup attitudes influence foreign policy preferences.
Three effects of apparently superficial changes in presentation (“framing effects” in a broad sense) were replicated together in the same repeated linear public goods experiment with real financial incentives. First, 32 repetitions were presented as four phases of 8 repetitions, with a break and a results summary in between. Contribution levels decayed during each phase but persistently returned to about 50% after each restart. Second, subjects contributed more when the payoff function was decomposed in terms of a gift that is multiplied and distributed to the other players, rather than the equivalent public good from which everyone benefits. Third, subjects contributed more following a comprehension task that asked them to calculate the benefits of various actions to the group (the “We” frame) rather than to themselves (the “I” frame). These results suggest that aspects of presentation may have strong and replicable effects on experimental findings, even when care is taken to make the language and presentation of instructions as neutral as possible. Experimental economists should therefore give careful consideration to potential framing effects—or, better still, explicitly test for them—before making claims about the external validity of results.
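The second result rests on the two presentations being algebraically equivalent. A minimal sketch of that equivalence (the function names, endowment, contributions, and marginal per-capita return m are illustrative placeholders, not the paper's parameters):

```python
def payoff_public_good(e, m, c, i):
    # Public-good frame: keep endowment e minus own contribution,
    # plus m times the group's total contribution (including one's own).
    return e - c[i] + m * sum(c)

def payoff_gift(e, m, c, i):
    # Gift frame: a gift costs its giver (1 - m) * c_i net, and each
    # other player's gift delivers m * c_j to you. Expanding the terms
    # shows this equals the public-good payoff exactly:
    # e - (1 - m)*c_i + m*sum(c_j, j != i) = e - c_i + m*sum(c).
    others = sum(cj for j, cj in enumerate(c) if j != i)
    return e - (1 - m) * c[i] + m * others
```

Because the two functions return identical payoffs for every contribution profile, any difference in contributions between the frames must come from presentation rather than from incentives.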
Accountability—the decision maker’s expectation that she may have to justify her decisions to somebody else—has been found by psychologists to strongly influence decision-making processes. Awareness of this issue, however, remains limited among economists, who tend to focus on the motivational effects of financial incentives. Accountability and incentives may provide different motivations for decision makers, and disentangling their effects is thus important for understanding real-world situations in which both are present. Separating accountability from incentives, I find that they have distinct effects. Accountability is found to reduce preference reversals between frames, where incentives have no effect. Incentives, on the other hand, are found to reduce risk seeking for losses, where accountability has no effect. In a choice task between simple and compound events, accountability increases the preference for the normatively superior simple event, while incentives have a weaker effect in the opposite direction.
We investigate the external validity of giving in the dictator game by using the misdirected letter technique in a within-subject design. First, subjects participated in standard dictator games (double blind) conducted in labs in two different studies. Second, after four to five weeks (study 1) or two years (study 2), we delivered prepared letters to the same subjects. The envelopes and the contents of the letters were designed to create the impression that they were misdirected by the mail delivery service. The letters contained 10 Euros (20 Swiss Francs in study 2) corresponding to the endowment of the in-lab experiments. We observe in both studies that subjects who showed other-regarding behavior in the lab returned the misdirected letters more often than subjects giving nothing, suggesting that in-lab behavior is related to behavior in the field.
This paper combines laboratory data with field data from professional sellers to study whether social preferences are related to performance in open-air markets. The data show that sellers who are more pro-social in a laboratory experiment are also more successful in natural markets: they achieve higher prices for similar quality, have superior trade relations, and are better able to signal trustworthiness to buyers. These findings suggest that social preferences play a significant role in outcomes in natural markets.
Laboratory experiments are frequently used to examine the nature of individuals’ social and risk preferences and inform economic theory. However, it is unknown whether the preferences of volunteer participants are representative of the population from which the participants are drawn, or whether they differ due to selection bias. To answer this question, we measured the preferences of 1,173 students in a classroom experiment using a trust game and a lottery choice task. Separately, we invited all students to participate in a laboratory experiment using common recruitment procedures. To evaluate whether there is selection bias, we compare the social and risk preferences of students who eventually participated in a laboratory experiment to those who did not, and find that they do not differ significantly. However, we also find that people who sent less in a trust game were more likely to participate in a laboratory experiment, and discuss possible explanations for this behavior.
In this study, we conduct a laboratory experiment in which the subjects make choices between real-world lottery tickets typically purchased by lottery customers. In this way, we can reliably offer extremely high potential payoffs, something rarely possible in economic experiments. In a between-subject design, we separately manipulate several features that distinguish the situation faced by the customers in the field and by subjects in typical laboratory experiments. We also have the unique opportunity to compare our data to actual sales data provided by the operator of the lottery. We find the distributions to be highly similar (meaning high external validity for this particular setting). The only manipulation that makes a major difference is that when the probabilities of winning specific amounts are explicitly provided (which is not the case in the field), choices shift towards options with lower maximum possible payoff and lower payoff variance. We also find that subjects generally show preference for long shots and that standard laboratory measures of risk posture fail to explain their behavior in the main task.
In this paper we compare behaviour in a newspaper experiment with behaviour in the laboratory. Our workhorse is the Yes-No game. Unlike in ultimatum games, responders in the Yes-No game do not know the proposal when deciding whether or not to accept. We use two different amounts that can be shared (100€ and 1000€). Unlike in other experiments with the ultimatum game, we find a (small) effect of the size of the stakes. In line with findings for the ultimatum game, we find more generosity among women, among older participants, and among participants who submit their decision via postal mail rather than via the Internet. By comparing our results with other studies (using executives or students), we demonstrate, at least for this type of game, the external validity of lab research.
Laboratory experiments are an important methodology in economics, especially in the field of behavioral economics. However, it is still debated to what extent results from laboratory experiments are informative about behavior in field settings. One highly important question about the external validity of experiments is whether the same individuals act in experiments as they would in the field. This paper presents evidence on how individuals behave in donation experiments and how the same individuals behave in a naturally occurring decision situation on charitable giving. While we find evidence that pro-social behavior is more accentuated in the lab, the data show that pro-social behavior in experiments is correlated with behavior in the field.
Past studies of laboratory corruption games have not found consistent evidence that subjects make “immoral” decisions. A possible reason, and also a critique of laboratory corruption games, is that the experiment may fail to trigger the intended immorality frame in the minds of the participants, leading many to question the very raison d’être of laboratory corruption games. To test this idea, we compare behavior in a harassment bribery game with a strategically identical but neutrally framed ultimatum game. The results show that fewer people, both as briber and bribee, engage in corruption in the bribery frame than in the alternative, and the average bribe amount is lower in the former than in the latter. These results suggest that moral costs are indeed at work. A third treatment, which relabels the bribery game in neutral language, indicates that the observed treatment effect arises not from the neutral language of the ultimatum game but from a change in the sense of entitlement between the bribery and ultimatum game frames. To provide further support that the bribery game does measure moral costs, we elicit the shared perceptions of the appropriateness of the actions, or the social norm, under the two frames. We show that the social norms governing the bribery game frame and the ultimatum game frame are indeed different and that the perceived sense of social appropriateness plays a crucial role in determining actual behavior in the two frames. Furthermore, merely relabelling the bribery game in neutral language makes no difference to the social appropriateness norm governing it. This indicates that, just as with actual behavior, the observed difference in social appropriateness norms between the bribery game and the ultimatum game also comes from the difference in entitlement. Finally, we comment on the external validity of behavior in lab corruption games.
The house-money effect, understood as people’s tendency to be more daring with easily gotten money, is a behavioral pattern that raises questions about the external validity of experiments in economics: to what extent do people behave in experiments as they would in a real-life situation, given that they play with easily gotten house money? We ran an economic experiment with 122 students to measure the house-money effect on their risk preferences. They received an amount of money with which they made risky decisions involving losses and gains; a randomly selected treatment group received the money 21 days in advance, and a control group received it on the day of the experiment. From a simple calculation we found that participants in the treatment group spent on average only approximately 35% of their cash advance. The data confirm the well-documented results that men are more risk tolerant than women, and that individuals in general are more risk tolerant towards losses than towards gains. With our preferred specification, we find a mean CRRA risk aversion coefficient of 0.34, with a standard deviation of 0.09. Furthermore, if subjects in the treatment group spent 35% of the endowment, their CRRA risk aversion coefficient is higher than that of the control group by approximately 0.3 standard deviations. We interpret this result as evidence of a small and indirect house-money effect operating through the amount of the cash advance that was actually spent. We conclude that the house-money effect may play a small role in decisions under uncertainty, especially those involving losses. Our novel design, however, could be used in other domains of decision making, both in the lab and for the calibration of economic models in micro- and macroeconomics.
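For readers unfamiliar with the measure, a brief sketch of the CRRA functional form behind the reported coefficient (the helper names and the gamble used below are illustrative, not taken from the study):

```python
import math

def crra_utility(x, r):
    # CRRA utility: u(x) = x**(1 - r) / (1 - r) for r != 1, ln(x) for r = 1.
    # r is the coefficient of relative risk aversion (0 = risk neutral).
    if abs(r - 1.0) < 1e-12:
        return math.log(x)
    return x ** (1.0 - r) / (1.0 - r)

def certainty_equivalent(outcomes, probs, r):
    # The sure amount whose utility equals the gamble's expected utility;
    # for a risk-averse agent (r > 0) it lies below the expected value.
    eu = sum(p * crra_utility(x, r) for x, p in zip(outcomes, probs))
    if abs(r - 1.0) < 1e-12:
        return math.exp(eu)
    return (eu * (1.0 - r)) ** (1.0 / (1.0 - r))
```

With a coefficient of r = 0.34, the certainty equivalent of a 50/50 gamble over 100 and 0 falls below the gamble's expected value of 50, which is what "moderate risk aversion" means in this metric.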
This paper reports results of a natural field experiment on the dictator game where subjects are unaware that they are participating in an experiment. Three other experiments explore, step by step, how laboratory behavior of students relates to field behavior of a general population. In all experiments, subjects display an equally high amount of pro-social behavior, whether they are students or not, participate in a laboratory or not, or are aware of their participating in an experiment or not. This paper shows that there are settings where laboratory behavior of students is predictive for field behavior of a general population.
We examine the generalizability of single-topic studies, focusing on how often their confidence intervals capture the typical treatment effect from a larger population of possible studies. We show that the confidence intervals from these single-topic studies capture the typical effect from a population of topics at well below the nominal rate. For a plausible scenario, the confidence interval from a single-topic study might only be half as wide as an interval that captures the typical effect at the nominal rate. We highlight three important conclusions. First, we emphasize that researchers and readers must take care when generalizing the inferences from single-topic studies to a larger population of possible studies. Second, we demonstrate the critical importance of similarity across topics in drawing inferences and encourage researchers to consider designs that explicitly estimate and leverage similarity. Third, we emphasize that, despite their limitations, single-topic experiments have some important advantages.
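The coverage gap described above can be illustrated with a small simulation (all names and parameter values here are illustrative, not the paper's): topic-level true effects vary around a typical effect, but a single-topic study's confidence interval accounts only for its own sampling error.

```python
import random

def coverage_of_typical_effect(mu=0.5, tau=0.3, sigma=0.1,
                               n_sim=20000, z=1.96, seed=1):
    # Each simulated "study" draws a topic-specific true effect from
    # N(mu, tau^2), then estimates it with sampling error N(0, sigma^2).
    # The study's 95% CI is built from sigma alone, so it targets its own
    # topic's effect and captures the *typical* effect mu at well below
    # 95% whenever between-topic variation tau is non-negligible.
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        theta = rng.gauss(mu, tau)      # this topic's true effect
        est = rng.gauss(theta, sigma)   # the study's point estimate
        if abs(est - mu) <= z * sigma:  # does the CI capture mu?
            hits += 1
    return hits / n_sim
```

With these illustrative parameters the simulated coverage is far below the nominal 95%, while setting tau = 0 (no between-topic variation) restores roughly nominal coverage.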
This paper studies the impact of human subjects in the role of a seller on bidding in experimental second-price auctions. Overbidding is a robust finding in second-price auctions, and spite among bidders has been advanced as an explanation. If spite extends to the seller, then the absence of human sellers who receive the auction revenue may bias upwards the bidding behavior in existing experimental auctions. We derive the equilibrium bidding function in a model where bidders have preferences regarding both the payoffs of other bidders and the seller’s revenue. Overbidding is optimal when buyers are spiteful only towards other buyers. However, optimal bids are lower and potentially even truthful when spite extends to the seller. We experimentally test the model predictions by exogenously varying the presence of human subjects in the roles of the seller and competing bidders. We do not detect a systematic effect of the presence of a human seller on overbidding. We conclude that overbidding is not an artefact of the standard experimental implementation of second-price auctions in which human sellers are absent.
We develop a formal framework for accumulating evidence across studies and apply it to develop theoretical foundations for replication. Our primary contribution is to characterize the relationship between replication and distinct formulations of external validity. Whereas conventional wisdom holds that replication facilitates learning about external validity, we show that this is not, in general, the case. Our results show how comparisons of the magnitude or sign of empirical findings link to distinct concepts of external validity. However, without careful attention to the research design of constituent studies, replication can mislead efforts to assess external validity. We show that two studies must have essentially the same research designs, i.e., be harmonized, in order for their estimates to provide information about any kind of external validity. This result shows that even minor differences in research design between a study and its replication can introduce a discrepancy that is typically overlooked, a problem that becomes more pronounced as the number of studies increases. We conclude by outlining a design-driven approach to replication, which responds to the issues our framework identifies and details how a research agenda can manage them productively.
The validity of conclusions drawn from specific research studies must be evaluated in light of the purposes for which the research was undertaken. We distinguish four general types of research: description and point estimation, correlation and prediction, causal inference, and explanation. For causal and explanatory research, internal validity is critical – the extent to which a causal relationship can be inferred from the results of variation in the independent and dependent variables of an experiment. Random assignment is discussed as the key to avoiding threats to internal validity. Internal validity is distinguished from construct validity (the relationship between a theoretical construct and the methods used to operationalize that concept) and external validity (the extent to which the results of a research study can be generalized to other contexts). Construct validity is discussed in terms of multiple operations and discriminant and convergent validity assessment. External validity is discussed in terms of replicability, robustness, and relevance of specific research findings.
The accumulation of empirical evidence that has been collected in multiple contexts, places, and times requires a more comprehensive understanding of empirical research than is typically required for interpreting the findings from individual studies. We advance a novel conceptual framework in which causal mechanisms are central to characterizing social phenomena that transcend context, place, or time. We distinguish various concepts of external validity, all of which characterize the relationship between the effects produced by mechanisms in different settings. Approaches to evidence accumulation require careful consideration of cross-study features, including theoretical considerations that link constituent studies and measurement considerations about how phenomena are quantified. Our main theoretical contribution is developing uniting principles that constitute the qualitative and quantitative assumptions that form the basis for a quantitative relationship between constituent studies. We then apply our framework to three approaches to studying general social phenomena: meta-analysis, replication, and extrapolation.
This chapter examines the generalizability of the book’s main argument. It synthesizes the conclusions of other studies on the consequences of three similar episodes of forced migration in the twentieth century: the Greek-Turkish population exchange, the Partition of India, and the repatriation of Pied-Noirs to France. It then considers ways in which the argument can be extended to other cases of forced and voluntary migration.
From Part II - The Practice of Experimentation in Sociology
Davide Barrera, Università degli Studi di Torino, Italy; Klarita Gërxhani, Vrije Universiteit, Amsterdam; Bernhard Kittel, Universität Wien, Austria; Luis Miller, Institute of Public Goods and Policies, Spanish National Research Council; Tobias Wolbring, School of Business, Economics and Society at the Friedrich-Alexander-University Erlangen-Nürnberg
Field experiments have a long tradition in some areas of the social and behavioral sciences and have become increasingly popular in sociology. Field experiments are staged in "natural" research settings where individuals usually interact in everyday life and regularly complete the task under investigation. Implementation in the field is the core feature distinguishing the approach from laboratory experiments. It is also one of the major reasons why researchers use field experiments: they allow researchers to incorporate social context, investigate subjects under "natural" conditions, and collect unobtrusive measures of behavior. However, these advantages of field experiments come at the price of reduced control. In contrast to the controlled setting of the laboratory, many factors in the field can influence the outcome but are not under the experimenter’s control and are often hard to measure. Using field experiments on the broken windows theory, the strengths and potential pitfalls of experimenting in the field are illustrated. The chapter also covers the nascent area of digital field experiments, which share key features with other types of experiments but offer exciting new ways to study social behavior by enabling the collection of large-scale data with fine-grained and unobtrusive behavioral measures at relatively low variable costs.
Survey experiments on probability samples are a popular method for investigating population-level causal questions due to their strong internal validity. However, lower survey response rates and an increased reliance on online convenience samples raise questions about the generalizability of survey experiments. We examine this concern using data from a collection of 50 survey experiments which represent a wide range of social science studies. Recruitment for these studies employed a unique double sampling strategy that first obtains a sample of “eager” respondents and then employs much more aggressive recruitment methods with the goal of adding “reluctant” respondents to the sample in a second sampling wave. This approach substantially increases the number of reluctant respondents who participate and also allows for straightforward categorization of eager and reluctant survey respondents within each sample. We find no evidence that treatment effects for eager and reluctant respondents differ substantially. Within demographic categories often used for weighting surveys, there is also little evidence of response heterogeneity between eager and reluctant respondents. Our results suggest that social science findings based on survey experiments, even in the modern era of very low response rates, provide reasonable estimates of population average treatment effects among a deeper pool of survey respondents in a wide range of settings.