
How to study cognitive decision algorithms: The case of the priority heuristic

Published online by Cambridge University Press:  01 January 2023

Klaus Fiedler*
Affiliation:
University of Heidelberg
* Correspondence should be addressed to Klaus Fiedler, Department of Psychology, University of Heidelberg, Hauptstrasse 47–51, 69117 Heidelberg, FRG. Email: [email protected]

Abstract

Although the priority heuristic (PH) is conceived as a cognitive-process model, some of its critical process assumptions remain to be tested. The PH makes very strong ordinal and quantitative assumptions about the strictly sequential, non-compensatory use of three cues in choices between lotteries: (1) the difference between worst outcomes, (2) the difference in worst-case probabilities, and (3) the best outcome that can be obtained. These aspects were manipulated orthogonally in the present experiment. No support was found for the PH. Although the main effect of the primary worst-outcome manipulation was significant, it came along with other effects that the PH excludes. A strong effect of the secondary manipulation of worst-outcome probabilities was not confined to small differences in worst-outcomes; it was actually stronger for large worst-outcome differences. Overall winning probabilities that the PH ignores exerted a systematic influence. The overall rate of choices correctly predicted by the PH was close to chance, although high inter-judge agreement reflected systematic responding. These findings raise fundamental questions about the theoretical status of heuristics as fixed modules.

Type
Research Article
Creative Commons
The authors license this article under the terms of the Creative Commons Attribution 3.0 License.
Copyright
Copyright © The Authors [2010] This is an Open Access article, distributed under the terms of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.

1 Introduction

For almost four decades, theoretical and empirical work on judgment and decision making has been inspired by the notion of cognitive heuristics. On this view, people rarely try to utilize all available information exhaustively in order to make perfectly accurate judgments. They are usually content with non-optimal but satisficing solutions (Simon, 1983). The cognitive tools that afford such satisficing solutions are commonly called heuristics. Their reputation has improved enormously. Having first been devalued as mental short-cuts, sloppy rules of thumb, and sources of biases and shortcomings, in the more recent literature heuristics are often characterized as fast, frugal, and functional. “Simple heuristics that make us smart” (Gigerenzer, Todd and the ABC research group, 1999; Katsikopoulos et al., 2008) were shown to outperform more ambitious models of rational inference in simulation studies (Gigerenzer & Goldstein, 1996). Yet, in addition to the mathematical proof and simulation that heuristics may perform well when they are applied, the crucial psychological assumption says that decision makers actually do use such heuristics, which are sometimes explained as reflecting phylogenetic, evolutionary learning (Cosmides & Tooby, 2006; Todd, 2000).

Although correlational evidence for the correspondence of a simulated heuristic and a validity criterion is sufficient to study the first (functional) aspect, hypothesis testing about the actual cognitive process supposed in a heuristic calls for the repertoire of experimental cognitive psychology. Thus, for a crucial test of the assumption that judgments of frequency or probability actually follow the availability heuristic (Tversky & Kahneman, 1973), it is essential to manipulate its crucial feature, namely the ease with which information comes to mind. Likewise, for a cogent test of the anchoring heuristic (Tversky & Kahneman, 1974), it has to be shown that judges actually adjust an initial extreme anchor insufficiently. Without appropriate experimental manipulations of the presumed mental operations, it is impossible to prove the causal role of the hypothesized heuristic process. The percentage of a focal heuristic’s correct predictions of judgments or decisions cannot provide cogent and distinct evidence about the underlying process (Hilbig & Pohl, 2008; Roberts & Pashler, 2000).

In the early stage of the heuristics-and-biases research program, though, serious experimental attempts to assess the postulated cognitive operations had been remarkably rare. Hardly any experiment had manipulated the ease of a clearly specified retrieval operation supposed to underlie the availability heuristic (some exceptions were Schwarz et al., 1991, and Wänke & Bless, 2000) or the gradual adjustment process supposed to underlie the anchoring heuristic (Fiedler et al., 2000).Footnote 1 More recently, though, this situation has been changing. A number of fast and frugal heuristics have been specified precisely enough to allow for strict experimental tests of underlying cognitive processes. Tests of the Take-the-Best heuristic (Gigerenzer et al., 1999) have been concerned with the assumption that cues can be ordered by validity (Bröder & Schiffer, 2003; Newell et al., 2004; Rieskamp & Otto, 2006). Research on the recognition heuristic tested whether comparative judgments are really determined by mere exposure rather than a substantive evaluation of the comparison objects (Hilbig & Pohl, 2008). It seems fair to conclude that strict empirical tests have resulted in a more critical picture of the validity and scope of the postulated heuristics (Dougherty, Franco-Watkins, & Thomas, 2008; but see Gigerenzer, Hoffrage, & Goldstein, 2008).

1.1 The case of the priority heuristic (PH)

The present research aims to test the cognitive-process assumptions underlying another heuristic that was recently published in prominent journals, the priority heuristic (PH). The PH (Brandstätter, Gigerenzer & Hertwig, 2006, 2008) constitutes a non-compensatory heuristic that compares decision options on only one cue dimension at a time, rather than trying to integrate compensatory influences of two or more cues. The PH affords an algorithm for choosing between two outcome gambles or lotteries. It consists of three successive steps, concentrating first on the worst outcomes, then on the likelihood of the worst outcomes, and finally on the best possible outcome.Footnote 2

Specifically, let A and B be two lotteries, each with a maximal outcome, o maxA, o maxB and a minimal outcome o minA, o minB, with corresponding probabilities p(o maxA), p(o maxB), p(o minA), and p(o minB), respectively. Let A for convenience always be the risky lottery with the lowest possible outcome, o minA < o minB. Then PH involves the following three steps:

(1) Consider first the difference in worst outcome o minB – o minA. Choose B and truncate the choice process if the worst-outcome difference in favor of B is larger than 1/10 of the overall maximum (i.e., of both o maxA and o maxB).

(2) If no choice in terms of clearly different worst outcomes is possible, consider next the worst-outcome probabilities, p(o minA), and p(o minB). Truncate the process if the worst-outcome probabilities differ by at least 1/10. In that case, choose the lottery with the lower p(o min); otherwise proceed to the third stage.

(3) Choose the lottery with the higher maximum outcome by comparing o maxA and o maxB. Otherwise guess randomly.
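The three steps can be sketched in code as follows. This is only an illustrative reading of the verbal description above, not the implementation used by the PH authors: the `Lottery` type, its field names, and the function name are assumptions introduced here, and exact fractions are used so that the 1/10 thresholds are not blurred by floating-point rounding.

```python
from fractions import Fraction
from typing import NamedTuple, Optional


class Lottery(NamedTuple):
    """Two-outcome lottery; the field names are illustrative assumptions."""
    o_min: Fraction  # worst outcome
    p_min: Fraction  # probability of the worst outcome
    o_max: Fraction  # best outcome


def priority_heuristic(a: Lottery, b: Lottery) -> Optional[str]:
    """Return 'A', 'B', or None (random guess) following the three steps."""
    tenth = Fraction(1, 10)
    # Step 1: worst outcomes, compared against 1/10 of the overall maximum.
    threshold = tenth * max(a.o_max, b.o_max)
    if abs(a.o_min - b.o_min) > threshold:
        return "A" if a.o_min > b.o_min else "B"
    # Step 2: worst-outcome probabilities must differ by at least 1/10.
    if abs(a.p_min - b.p_min) >= tenth:
        return "A" if a.p_min < b.p_min else "B"
    # Step 3: higher maximum outcome; a tie leaves only a random guess.
    if a.o_max != b.o_max:
        return "A" if a.o_max > b.o_max else "B"
    return None
```

For a hypothetical pair with worst outcomes 20 and 30, worst-outcome probabilities .45 and .85, and maxima 140 and 280/3, the sketch skips step 1 (10 does not exceed 14) and returns "A" at step 2. If the maxima are halved to 70 and 140/3, the worst-outcome difference exceeds the threshold of 7 and the sketch returns "B" at step 1.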

How to test the PH model. Although the PH relies on very strong and specific, hierarchically ordered assumptions about cue priority, truncation rules, and quantitative parameters (i.e., the critical 1/10 factor, called aspiration level, Simon, 1983), it is supposed to be generally applicable to gambles in any content domain. Brandstätter et al. (2006) explicitly propose the PH as a model of the actual cognitive process informing risky choices, assuming that the three cues of the PH, o min, p(o min), and o max, are used in a strictly sequential, non-compensatory way, with only one cue active at each process stage.Footnote 3 Although Brandstätter et al. (2006) provide choice and latency data to support the PH’s process assumptions, other empirical tests (Ayal & Hochman, 2009; Birnbaum, 2008a; Gloeckner & Betsch, 2008; Hilbig, 2008) suggest that PH assumptions may be hard to maintain. The assumption of a strictly sequential use of singular cues was tackled and refuted by Hilbig (2008) and by Birnbaum (2008a), who found PH-incompatible cue interactions. Ayal and Hochman (2009) showed that reaction times, choice patterns, confidence level, and accuracy were better predicted by compensatory models. Johnson, Schulte-Mecklenbeck, and Willemsen (2008) complained about the paucity of process data obtained to substantiate the PH assumptions. However, in spite of this growing evidence, no experiment so far has directly manipulated the aspiration levels of the three-step PH decision process, which together make up the design of the present study. While this is certainly not the only possible way to test the PH, the following considerations provide a straightforward test of its core assumptions.

First, if the initial step involves a comparison of o minB – o minA with 1/10 of the maximal outcome, it is essential to manipulate this difference to be either smaller or larger than 1/10 of the maximal outcome. In the study reported below, the worst-outcome difference factor varies such that o minB – o minA is either 1/7 or 1/14 of o maxA. If the primary PH process assumption is correct, people should uniformly (or at least mostly) choose B in the 1/7 condition. No other factors should influence the decision process in this condition. In particular, no probability should be considered. Even when, say, p(o maxA) is very high, or p(o maxB) is very low, this should be radically ignored if PH assumptions are taken seriously.

Second, for a critical test of the next process step, which should only be reached for worst-outcome differences of 1/14, it is necessary to manipulate the worst-case probability difference, p(o minB) – p(o minA), to be either greater or smaller than 10% (i.e., 1/10 of the possible range of p). Specifically, we manipulate this difference to be +5% versus +40%. If, and only if, p(o minB) and p(o minA) do differ by 40% (i.e., more than 1/10), lottery A should be uniformly chosen. Otherwise, if the worst outcome of B is not clearly more likely than the worst case of A, people should proceed to the third step.

In this final step, the lottery with the higher maximum outcome should be chosen, regardless of all other differences between A and B. Neither the cues considered earlier during the first two steps nor any other aspect of the lotteries should have any influence. For a step-3 test of the strong single-cue assumption, it is appropriate to manipulate one other plausible factor that is irrelevant to the PH. Specifically, we manipulate the overall winning probability for p(o maxB) to be either small or moderate (10% or 30% + increments). According to PH, this irrelevant factor should never play any role for the preference decision. Because p(o maxB) = 1 – p(o minB) and p(o maxA) = 1 – p(o minA), the remaining probability p(o maxA) is equal to p(o maxB) plus the same difference (i.e., 5% or 40%) that holds between p(o minB) – p(o minA).

Such an experiment, to be sure, provides ample opportunities to falsify the PH model: Participants may not uniformly choose B at step 1 when the worst outcome difference is 1/7 of the maximal outcome. Other factors or interactions thereof may influence the choice even when the primary worst-case difference is large enough for a quick step-1 decision. At step 2, people may not choose A if the worst-case probability difference is marked (.40). Rather, the manipulation of p(o minB) – p(o minA) may interact with other factors, contrary to the single-cue assumption of a non-compensatory heuristic. Similarly, at step 3, many people may not choose the option with the highest outcome, or a tendency to do so may interact with the baseline winning probability, or any other factor. While the long list of possible violations of the PH algorithm may appear too strict and almost “unfair”, it only highlights how strong and demanding a process model the PH represents. Rather than protecting the PH from strict tests, my strategy here is to test it critically, taking its precise assumptions seriously. However, such an exercise may yield results that are more generally applicable.

1.2 How to select lotteries for a PH test

It is worthwhile reasoning about an appropriate sampling of test cases. In prior publications, PH proponents have referred to often-cited choice sets (e.g., Lopes & Oden, 1999; Tversky & Kahneman, 1992) and extreme cases like the Allais (1979) paradox to substantiate the viability and validity of the heuristic. Critiques (Gloeckner & Betsch, 2008; Hilbig, 2008) have applied the PH to randomly constructed tests that were not meant to represent an explicitly defined universe of lotteries or gambles. The stimulus sampling for the present research is guided by the following rationale.

Many economic, social, or health-related decisions involve a trade-off between risk and payoff. To increase one’s payoff or satisfaction, one has to accept an elevated risk level. Accordingly, pairs of lotteries, A and B, were used such that A leads to a high payoff in the fortunate case, whereas in the unfortunate case A’s payoff is lower than in the worst case of a less risky lottery B, which has a flatter outcome distribution. Lotteries that fit this scenario are generated from different basic worst outcomes (i.e., 20,30; 10,80; 80,90; 30,60 for o minA and o minB, respectively), to which a small random increment (chosen from a rectangular distribution of natural numbers from 1 to 10) is added to create different versions. The maximum outcome o maxA is then either 7 times or 14 times as large as the worst-outcome difference o minB – o minA. The other lottery’s maximum outcome, o maxB, is set to o maxA · o minA/o minB so that B’s relative advantage in the worst case is proportional to a corresponding disadvantage in the fortunate case. Next, the baseline probability for a success on the non-risky lottery B, p(o maxB), is set to a starting value (either 10% or 30%, again plus a rectangular random increment between 1% and 10%); the complementary probability p(o minB) = 1 – p(o maxB) holds for the unfortunate outcome of B. For the risky option A, a probability increment (either 5% or 40%) is added to the fortunate outcome, p(o maxA) = p(o maxB) + increment, to compensate for A’s disadvantage in o min, while the same decrement is subtracted from the unfortunate outcome, p(o minA) = p(o minB) – increment.
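The generative rules just described can be sketched as a small function. This is a hedged reconstruction, not the program actually used to build Appendix A: the function name, argument names, and dictionary keys are hypothetical, and the sketch assumes that one shared random increment is added to both worst outcomes, a detail the description leaves open.

```python
import random
from fractions import Fraction


def make_lottery_pair(base_min_a, base_min_b, ratio, base_win, increment, rng):
    """Build one (A, B) pair from the stated generative rules.

    ratio: 7 or 14; base_win: 10 or 30 (percent); increment: 5 or 40 (percent).
    Assumption: the same random increment is added to both worst outcomes.
    """
    bump = rng.randint(1, 10)                      # natural number from 1 to 10
    o_min_a, o_min_b = base_min_a + bump, base_min_b + bump
    o_max_a = ratio * (o_min_b - o_min_a)          # 7 or 14 times the difference
    o_max_b = Fraction(o_max_a * o_min_a, o_min_b)
    p_max_b = Fraction(base_win + rng.randint(1, 10), 100)
    p_min_b = 1 - p_max_b                          # complementary probability
    p_max_a = p_max_b + Fraction(increment, 100)   # increment favors A
    p_min_a = p_min_b - Fraction(increment, 100)
    a = {"o_min": o_min_a, "p_min": p_min_a, "o_max": o_max_a, "p_max": p_max_a}
    b = {"o_min": o_min_b, "p_min": p_min_b, "o_max": o_max_b, "p_max": p_max_b}
    return a, b


# Example: one pair for the 20,30 starting values, 1/14 condition, 40% increment.
pair_a, pair_b = make_lottery_pair(20, 30, 14, 10, 40, random.Random(0))
```

The sketch applies the stated rules literally and adds no further plausibility checks on the resulting outcomes.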

The resulting lotteries (see Appendix A) are meant to be representative of real life trade-offs between risk and hedonic payoff. No boundaries are placed on the ratio of the two options’ expected values (EV). Yet, regarding Brandstätter et al.’s (2006, 2008) contention that the PH functions well only for gambles of similar EV, it should be noted that three of the four worst-outcome starting pairs (20,30; 80,90; 30,60) produce mainly EV ratios well below 2. Only 10,80 tasks often yield higher EV ratios in the range of 2 to 6. In any case, the lotteries’ EV ratio will be controlled in the data analysis as a relevant boundary condition.
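For concreteness, the EV ratio of a lottery pair can be computed as in the following sketch. The function names are hypothetical, and defining the ratio as the larger EV divided by the smaller EV is an assumption made here; the text does not spell out the direction of the ratio.

```python
from fractions import Fraction


def expected_value(o_min, p_min, o_max):
    """EV of a two-outcome lottery whose best outcome has probability 1 - p_min."""
    return p_min * o_min + (1 - p_min) * o_max


def ev_ratio(ev_a, ev_b):
    """Ratio of the larger to the smaller EV (assumed definition)."""
    return max(ev_a, ev_b) / min(ev_a, ev_b)


# Hypothetical pair: worst outcomes 20 and 30, o maxA = 140, o maxB = 140 * 20/30.
ev_a = expected_value(20, Fraction(80, 100), 140)
ev_b = expected_value(30, Fraction(85, 100), Fraction(280, 3))
ratio = ev_ratio(ev_a, ev_b)
```

In this illustrative case the EVs are 44 and 39.5, so the ratio is 88/79, or roughly 1.11.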

2 Methods

Participants and design. Fifty male and female students participated either for payment or to meet a study requirement. They were randomly assigned to one of two questionnaires containing different sets of 32 lottery tasks. Each questionnaire (odd and even row numbers in Appendix A) included four lottery pairs for each of 8 = 2 x 2 x 2 combinations of three within-subjects manipulations: worst-outcome difference (1/7 vs. 1/14 of o max) x worst-case probability difference (5% vs. 40%) x baseline winning probability (10% vs. 30% plus increments). The major dependent variable was the proportion of choices of the risky option (out of the 4 replications per condition), supplemented by ratings of the experienced difficulty of the decision and of an appropriate price to pay for the lottery.

Materials and procedure. Two independent computer-generated sets of 32 lottery tasks comprised the two forms of a questionnaire, both constructed according to the aforementioned generative rules but with different added random components and in different random order. Four lottery pairs were included on each page, each consisting of a tabular presentation of the lottery pair, a prompt to tick the preferred lottery, a pricing task prompted by the sentence “I would be willing to pay the following price for the chosen lottery ____ €”,Footnote 4 and a five-point rating of how easy or difficult it was to choose A or B (1 = very easy, 5 = very difficult). The original questionnaire format is presented in Appendix B.

Instructions were provided on the cover page. A cover story mentioned that the research aimed at finding out how the attractiveness of lotteries can be increased. In particular, we were allegedly interested in whether the frustration of not winning can be ameliorated when the bad outcome is not zero but some smaller payoff that is still higher than zero. To help the researchers investigate this question, participants were asked to imagine fictitious horse-betting games, each one involving two horses. For each lottery task presented in the format of Appendix B, they had to tick a preferred horse, to indicate their willingness to pay for the lottery in Euro, and to indicate the experienced difficulty on a five-point scale.

3 Results

Quality of judgments. For a first check on the reliability and regularity of the preference data, I examined the extent to which the 32 lottery tasks solicited similar responses from different judges, using an index of inter-judge agreement suggested by Rosenthal (1987). This index, which is comparable to Cronbach’s internal consistency, can vary between 0 and 1. It increases to the extent that there is little variance between decision makers in discriminating between lotteries. Inter-judge agreement was considerable (r = .73 and .79 for the two sets), testifying to the participants’ accuracy and motivation. Given this initial check on the quality of data, the present results cannot be discarded as due to noise or low motivation.

Lottery preferences. Let us now turn to the crucial question of whether the influences of the experimental manipulations on the lottery preferences support the PH predictions. For a suitable empirical measure, the number of A choices (i.e., choices of the risky alternative) per participant was computed for each of the eight within-subjects conditions, across all four tasks in each condition. Dividing these numbers by four resulted in convenient proportions, which can be called risky choice (RC) scores. These proportion scores were roughly normally distributed with homogeneous variance.
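The RC score is a simple per-condition proportion. A minimal sketch, assuming each participant's responses are available as (condition, choice) pairs; this data layout and the function name are hypothetical:

```python
from collections import defaultdict
from fractions import Fraction


def risky_choice_scores(choices):
    """Map each condition to the proportion of A choices for one participant.

    choices: iterable of (condition, choice) pairs with choice 'A' or 'B'.
    """
    counts = defaultdict(lambda: [0, 0])  # condition -> [A choices, tasks seen]
    for condition, choice in choices:
        counts[condition][0] += choice == "A"
        counts[condition][1] += 1
    return {cond: Fraction(n_a, n) for cond, (n_a, n) in counts.items()}
```

With four tasks per condition, a participant who chooses A three times in one condition receives an RC score of 3/4 for that condition.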

To recapitulate, the PH predicts that RC should be uniformly low for a worst-outcome difference of 1/7, when quick and conflictless step-1 decisions should produce unambivalent choices of the safer alternative, B. Only for an outcome difference of 1/14 should the PH allow for A choices, thus producing high RC scores. Note also that within the present design, the PH always predicts A choices when the worst-outcome difference is 1/14. This clear-cut prediction follows from two design features. In the 40% probability-difference condition, option B is too likely to yield its worst outcome, so that A is chosen in step 2. In the 5% probability-difference condition, no choice is made in step 2, and step 3 also leads to an A choice, because o maxA is maximal. In any case, in a three-factorial analysis of variance (ANOVA), the main effect of the worst-outcome difference factor represents a condensed test of all three PH predictions, all implying higher RC scores for the 1/14 than the 1/7 condition.
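These predictions can be enumerated for all eight design cells. The sketch below is an illustration under stated assumptions: the function name and the encoding of the factors are hypothetical, and step 1 relies on o maxA being the overall maximum, which follows from o maxB = o maxA · o minA/o minB with o minA < o minB.

```python
from fractions import Fraction
from itertools import product

TENTH = Fraction(1, 10)


def ph_prediction(outcome_ratio, prob_diff):
    """PH choice for one design cell.

    outcome_ratio: worst-outcome difference as a fraction of o maxA.
    prob_diff: p(o minB) - p(o minA).
    """
    if outcome_ratio > TENTH:   # step 1: B has the clearly better worst outcome
        return "B"
    if prob_diff >= TENTH:      # step 2: A's worst outcome is clearly less likely
        return "A"
    return "A"                  # step 3: o maxA is the larger maximum by design


cells = product(
    [Fraction(1, 7), Fraction(1, 14)],       # worst-outcome difference
    [Fraction(5, 100), Fraction(40, 100)],   # worst-case probability difference
    [10, 30],                                # baseline winning probability (ignored)
)
predictions = {cell: ph_prediction(cell[0], cell[1]) for cell in cells}
```

All four cells with a 1/7 difference map to B and all four cells with a 1/14 difference map to A, regardless of the two probability factors.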

The PH does not predict any significant influence of the worst-case probability difference manipulation in the present design, because for a worst-outcome difference of 1/7 the probability difference should be bypassed anyway, and for 1/14 both probability differences should equally lead to A choices. Thus, any main effect or interaction involving the probability difference factor can only disconfirm the PH. (Notice that this prediction holds regardless of the probability-difference cutoff.) Similarly, any impact of the third factor, basic winning probability of the risky option, would contradict the PH. To the extent that this PH-irrelevant factor exerts a direct influence, or moderates the influence of other factors, this could only disconfirm the PH.

A glance at Figure 1a, which displays mean RC as a function of experimental conditions, reveals a pattern that diverges from the PH predictions in many respects. Although a worst-outcome main effect, F(1,49) = 9.37, p < .01, reflects the predicted tendency for RC to increase from 1/7 to 1/14 outcome differences (Mean RC = .734 vs. .833), this tendency is restricted to small (5%) differences of worst-outcome probability (.611 vs. .721). It completely disappears for large (40%) probability differences (.875 vs. .880), as manifested in a worst-outcome x probability difference interaction, F(1,49) = 12.17, p < .01. This pattern is inconsistent with the assumed priority of worst outcomes over worst-outcome probabilities (regardless of the ratio between the worst-outcome difference and the maximum outcome, which the PH requires as a condition for using the worst outcome).

Figure 1: Mean proportions of risky choices (a), mean difficulty ratings (b), and mean pricing index (c) as a function of experimental conditions.

The secondary factor, difference in worst-outcome probability, produces a strong main effect, F(1,49) = 45.39, p < .001. The preference for the risky option A is clearly stronger when the worst-outcome probability difference favors A by 40% rather than only 5% (Mean RC = .878 vs. .666). This dominant effect is not nested within the small (1/14) worst-outcome difference (.880 vs. .760). The interaction (Figure 1a) shows that it is indeed stronger for large (1/7) differences (.875 vs. .611), contrary to the PH’s prediction that probability differences are ignored for worst-outcome differences exceeding 1/10 of the maximal outcome. (Again, this result does not depend on any assumption about the threshold for considering the worst-outcome difference.)

Finally, the main effect of the third factor, overall winning probability, is also significant, F(1,49) = 9.75, p < .01. Risky choices were more frequent for moderate winning chances of 30% (+ increments) than for small chances of 10% (+ increments). The corresponding difference in mean RC amounts to .814 versus .730. According to the PH, winning probabilities should be totally ignored. No other main effect or interaction was significant in the ANOVA of RC scores.

Altogether, this pattern diverges markedly from the core predictions derived from the PH model. The homogeneous main effect for the worst-outcome factor that the PH predicts for the present design was not obtained. Neither the PH’s assumption about the strictly sequential nature of the choice process nor the assumption of a strictly non-compensatory process, with only one cue operating at a time, can be reconciled with the present findings.

Consistency with PH predictions. Within each participant, a fit score was computed as the average proportion of choices consistent with the PH. The average fit score amounts to .526, hardly different from the chance rate of .50. A small standard deviation of only .103 suggests that this low fit holds for most individual participants. Forty-two of the 50 participants had fit scores between .40 and .60; only eight were above .60.Footnote 5
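The fit score amounts to a simple proportion of matches. A minimal sketch, assuming a participant's choices and the corresponding PH predictions are given as two equally long sequences of 'A' and 'B'; the function name and data layout are hypothetical:

```python
def fit_score(choices, predictions):
    """Proportion of one participant's choices that match the PH prediction."""
    pairs = list(zip(choices, predictions))
    matches = sum(choice == predicted for choice, predicted in pairs)
    return matches / len(pairs)
```

A participant who matches the PH on half of the tasks obtains a fit score of .50, which is also the value expected under random responding between two options.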

When only difficult lotteries with EV ratios of at most 1.2 are counted, as suggested by Brandstätter et al. (2008), the average fit score does not increase (M = .529).Footnote 6 The number of participants whose fit score exceeds .50 is 29 (out of 50) for all lotteries and 31 for difficult lotteries with an EV ratio ≤ 1.2. The corresponding numbers of participants exceeding a fit of .65 are 3 and 5, respectively.Footnote 7

Within participants, though, PH-consistent choices varied strongly as a function of experimental conditions (see Appendix A). A three-factorial ANOVA of the fit scores yielded a strong main effect for the worst-outcome difference, F(1,49) = 84.92, p < .001. PH fit was much lower in the 1/7 than in the 1/14 condition (.256 vs. .795). Apparently, this difference is due to the fact that the PH did not predict the relatively high rate of risky choices that were made even when the difference in worst outcomes exceeded 1/10 of the maximal outcome.

However, a strong worst-outcome difference x worst-case probability difference interaction, F(1,49) = 44.87, p < .001, indicates that the dependence of the PH fit on the first factor was greatly reduced when the probability difference was 5% (.388 vs. .715) rather than 40% (.125 vs. .875). The worst-outcome factor also interacted with the winning probability, F(1,48) = 9.62, p < .01; the impact of worst outcomes increased from low (10%) to moderate (30%) winning probability. The only other significant result was a main effect for the worst-case probability difference, F(1,49) = 11.33, p < .01, reflecting a higher PH fit for probability differences of 5% (.551) rather than 40% (.500). Thus, neither the absolute fit nor the relative differences of PH fit between task conditions lend support to the PH (see Appendix A).

Subjective difficulty. It is interesting to look at the ratings of subjective difficulty. A main effect for the worst-outcome difference in an ANOVA of the difficulty ratings, F(1,48) = 11.57, p < .01,Footnote 8 shows that choices were experienced as more difficult when the worst-outcome difference decreased from 1/7 (M = 1.95) to 1/14 (M = 2.20), although the PH fit was higher for the latter condition. However, a worst-case probability difference main effect, F(1,48) = 7.53, p < .01, and a worst-outcome difference x probability difference interaction, F(1,48) = 7.51, p < .05, together preclude an interpretation of the worst-outcome main effect as a reflection of the number of process steps. Figure 1b shows that when the probability difference was low (5%) rather than high (40%), the worst-outcome influence on perceived difficulty disappeared completely.Footnote 9 I refrain from interpreting the three-way interaction, which was also significant, F(1,48) = 9.54, p < .01.

Willingness to Pay (WTP). Only 26 participants responded persistently to the WTP tasks. For these participants, I calculated a WTP index by multiplying the indicated price by +1 and –1 for A and B choices, respectively. Analogously to the RC index, the WTP index increases to the extent that WTP is higher for A than for B. The WTP ANOVA yielded only a significant main effect for the worst-case probability difference, F(1,25) = 7.06, p < .05, reflecting higher WTP when the probability difference was high (40%) rather than low (5%) (see Figure 1c). Across all 64 lottery items, the correlation between RC and WTP was r = .57.
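The signed-price index can be sketched as follows. The function names are hypothetical, and averaging the signed prices within a condition is an assumption made here for illustration; the text specifies only the sign rule.

```python
def wtp_index(price, choice):
    """Signed price: positive for an A choice, negative for a B choice."""
    return price if choice == "A" else -price


def mean_wtp_index(responses):
    """Mean signed price over (price, choice) pairs (assumed aggregation)."""
    signed = [wtp_index(price, choice) for price, choice in responses]
    return sum(signed) / len(signed)
```

For example, prices of 10 and 4 for two A choices and a price of 6 for one B choice give signed values of 10, 4, and –6.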

4 Discussion

In summary, the results obtained in the present experiment do not support the PH. Preferential choices between pairs of lotteries did not follow the PH’s three-step decision process that was captured by the three design factors. Although the PH model predicted only one dominant main effect for the worst-outcome difference between the two lotteries, this main effect was strongly moderated by other factors. The strongest result was due to the manipulation of the worst-case probability difference, pertaining to the second stage of the PH process. The impact of this manipulation was not confined to the weak (1/14) worst-outcome difference; it was actually enhanced for the strong (1/7) worst-outcome difference condition, in which the worst-case probability should play no role. Moreover, risky choices also increased as a function of increasing winning probabilities, which should be totally ignored. Altogether, then, this pattern is inconsistent with a sequentially ordered three-stage process, with only one cue being active at every stage. Neither the individual participants’ choices nor the average choices per decision item reached a satisfactory fit with the PH predictions.

One might argue that certain boundary conditions for the domain of the PH were not met. For instance, the PH may be confined to difficult choice problems with EV ratios of at most 1.2 (Brandstätter et al., 2008). Although this condition was not met for a subset of tasks, the remaining subset of lotteries with an EV ratio of at most 1.2 yielded the same poor fit. Thus, in the context of 64 problems constructed to represent challenging conditions for the PH, the negative correlation between EV ratio and PH fit seems to disappear.

From a logical or psychological perspective, indeed, introducing such a restrictive EV assumption is highly unsatisfactory for two reasons. First, restricting the PH domain to nearly equal-EV gambles implies that compensation (of a high p by a low o and vice versa) is guaranteed by design. A strong test of the non-compensatory process assumption is not possible if the non-attended dimension (e.g., the p dimension when the focus is on o in the first stage) is not allowed to take values that produce discrepant EVs. And secondly, it is hard to understand why an EV must be determined as a precondition for the selection of a heuristic supposed to be much simpler and faster than EV calculation. Even when some proxy is used to estimate EV, rather than computing EV properly, the question is why the heuristic does not use that proxy but resorts instead to a refined three-stage algorithm.

One might also object that the PH is but one item from an adaptive toolbox containing many different heuristics. The failure of the PH under the present task conditions may mean only that some other heuristic was at work. However, while the adaptive toolbox affords an intriguing theoretical perspective, it has to go beyond the truism that many heuristics can explain many behaviors. It is rather necessary to figure out constraints on the operation of the PH. What conditions delimit the heuristic’s domain, and what behavioral outcomes does the underlying model exclude when the domain-specific conditions are met? Elaborating on these two kinds of constraints is essential for theoretical progress (Platt, 1964; Roberts & Pashler, 2000). Most findings obtained in the present study are of the kind that the PH model would exclude, even though the starting PH conditions were met in most of the lottery tasks. Although the EV ratio exceeded 1.2 in a small subset of tasks, the exclusion of these tasks did not increase the PH’s ability to account for the present decision data.

The purpose of the present study was to test critical implications of the PH. It was not designed to provide a comprehensive test of all alternative models, such as cumulative prospect theory (Tversky & Kahneman, 1992) or the transfer of attention exchange model (Birnbaum, 2008a). Again, an informed test of these models would have to rely on the controlled manipulation of distinct constraints imposed by these models, rather than the mere correlation of their predictions with the present data. Given the extended debate instigated by the PH (Birnbaum, 2008b; Johnson et al., 2008) and the attention it is receiving in decision research, its critical analysis should be a valuable research topic in its own right.

While I defend Popperian scrutiny as constructive and enlightening rather than merely sceptical, I hasten to add that the purpose of the present study has never been only to disconfirm a model as specific as the PH. It is rather motivated by general concerns about the manner in which heuristics as explanatory constructs should be tested. The PH is but a welcome example to illustrate this theoretical and methodological issue. It highlights the notion that rates of correct predictions do not afford an appropriate test of cognitive-process assumptions (Roberts & Pashler, 2000). If correspondence alone counts (i.e., the predictive success of a model across a range of applications), then we would have to accept that Clever Hans, the horse whose enumerative motor responses corresponded to the correct solution of arithmetic problems, was actually able to calculate, rather than using his owner’s subtle non-verbal signals (Pfungst, 1911). Just as explaining Clever Hans’ miracle required more than a correctness count, tests of heuristic models also call for manipulations of their critical features. If a heuristic is to explain the decision process, rather than only providing a paramorphic model (Hoffman, 1960), it is essential to test its distinct process features.

One problematic feature of the PH that I believe deserves to be discussed more openly is the overly strong assumption that only one cue is utilized at a time, in a strictly sequential order. Brunswik’s (1952) notion of vicarious functioning has taught us that organisms flexibly change and combine the cues they are using, rather than always adhering to fixed sequential algorithms. Just as depth perception calls for a flexible use of different cues when it is dark rather than light, when one eye is closed, or when sound is available in addition to visual cues, the evaluation of preferences under risk need not obey any specific sequence of domain-unspecific cues, processed strictly one at a time.

From the cognitive psychology of concept learning (Evans et al., 2003) and scientific hypothesis testing (Mynatt, Doherty, & Tweney, 1977), we know how difficult it is to verify a complex, conjunctive hypothesis. The PH postulates a sophisticated interplay of three specific cues, ordered in one and only one sequence, constrained to only one active cue at a time, applying an ominous 1/10 parameter as a stopping rule, and excluding all other cues. Logically, testing such a complex hypothesis means excluding hundreds of alternative hypotheses that deviate from the PH in one or two or more aspects, or in countless combinations thereof. A research strategy that focuses on such complex concepts requires hundreds of parametrically organized studies to rule out alternative accounts.
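To illustrate how many conjoined assumptions this amounts to, the following minimal Python sketch spells out a three-stage lexicographic rule of the kind the PH describes for lotteries involving gains. It is a simplified reading rather than the authors' specification: it assumes that the first stage stops when the worst outcomes differ by at least 1/10 of the largest outcome, that the second stage stops when the worst-outcome probabilities differ by at least .10, and that ties in the third stage go to the first lottery; any rounding of the aspiration level is omitted. The function name, the `threshold` and `eps` parameters, and the example lotteries are hypothetical.

```python
from typing import List, Tuple

# A lottery is a list of (outcome, probability) pairs; gains only (assumption).
Lottery = List[Tuple[float, float]]


def priority_heuristic(a: Lottery, b: Lottery,
                       threshold: float = 0.1,
                       eps: float = 1e-9) -> Tuple[str, int]:
    """Return (chosen lottery 'A' or 'B', stage at which search stopped)."""
    min_a, p_min_a = min(a, key=lambda op: op[0])  # worst outcome of A and its probability
    min_b, p_min_b = min(b, key=lambda op: op[0])  # worst outcome of B and its probability
    max_a = max(outcome for outcome, _ in a)
    max_b = max(outcome for outcome, _ in b)

    # Stage 1: worst outcomes; stop if they differ by >= 1/10 of the best outcome.
    # eps guards against floating-point error at the boundary.
    if abs(min_a - min_b) >= threshold * max(max_a, max_b) - eps:
        return ("A" if min_a > min_b else "B", 1)

    # Stage 2: probabilities of the worst outcomes; stop if they differ by >= .10.
    if abs(p_min_a - p_min_b) >= threshold - eps:
        return ("A" if p_min_a < p_min_b else "B", 2)

    # Stage 3: best outcomes decide (ties go to A by assumption).
    return ("A" if max_a >= max_b else "B", 3)


# Worst outcomes 0 vs. 50 differ by more than 1/10 of 100, so stage 1 decides.
print(priority_heuristic([(100.0, 0.5), (0.0, 0.5)], [(50.0, 1.0)]))
```

Each line of the sketch corresponds to a separate assumption (cue identity, cue order, threshold value, exclusion of all other information), and changing any one of them yields a distinct alternative hypothesis.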

Adaptive cognition is the ability to utilize and combine elementary cues in countless ways, depending on the requirements of the current situation. Organisms can quickly re-learn and invert vertical orientation when wearing mirror glasses (Kohler, 1956). They can reverse the fluency cue, learning that truth is associated with easy rather than difficult stimuli in a certain task context (Unkelbach, 2006). In priming experiments, they can learn to expect incongruent rather than congruent prime-target transitions. Given this amazing flexibility, or vicarious functioning, at the level of elementary cues, the question that suggests itself is what learning process, ontogenetic or phylogenetic, should support the acquisition of a strictly sequential, syntactically ordered cue utilization process that is restricted to one and only one cue, let alone the fundamental question of how singular cues can be distinguished from relational cues and interactions of multiple subordinate cues.

Raising these theoretical and logical questions is the ultimate purpose of the present paper. The PH is but a provocative exemplar of a research program that continues to fascinate psychologists, while at the same time reminding them of persisting theoretical and methodological problems.

Appendix A: Overview of results for all lotteries used to study the Priority Heuristic (PH)

Note: Odd and even row numbers represent the two questionnaire versions.

Appendix B: Presentation format for lottery tasks

Footnotes

*

The research underlying this paper was supported by the Deutsche Forschungsgemeinschaft. Very helpful comments on a draft of this paper came from Benjamin Hilbig, Ralph Hertwig, Ulrich Hoffrage, and Florian Kutzner.

1 Although Schwarz et al. (1991) are often cited for their valuable attempt to separate ease and amount of generated information, their manipulation pertains to the experienced ease of a task rather than ease of a specific retrieval operation.

2 This characterization holds only for lotteries involving gains. For losses, the PH starts by considering best outcomes.

3 Note that, strictly speaking, the assumption of a single-cue strategy contradicts the fact that the o_min difference cue entails a comparison with o_max.

4 One might argue that the inclusion of a pricing task across all 32 lottery pairs may induce an analytical mindset that works against the use of PH. However, theoretically, the PH is not constrained to task settings that inhibit price calculations. If so, this would reduce the PH’s generality and its applicability in reality.

5 The raw data and other files are found in http://journal.sjdm.org/vol5.1.html.

6 Across all 64 tasks, the correlation of EV ratios and average fit score per item was even slightly positive, r = .22. PH fit slightly increased with increasing EV ratio.

7 For all 32 lotteries, .65 has a p-level of .055 one tailed, uncorrected for multiple tests.

8 Only 59 participants, who provided all difficulty ratings, could be included in this analysis.

9 The finding that difficulty was highest when a small difference in worst outcomes (1/14) coincided with a large probability difference (40%) is not quite consistent with Hilbig’s (2008) contention that subjective difficulty reflects the combined difficulty of different cues.

References

Allais, M. (1979). The so-called Allais paradox and rational decisions under uncertainty. In M. Allais & O. Hagen (Eds.), Expected utility hypotheses and the Allais paradox (pp. 437–681). Dordrecht, the Netherlands: Reidel. doi:10.1007/978-94-015-7629-1_17
Ayal, S., & Hochman, G. (2009). Ignorance or integration: The cognitive processes underlying choice behavior. Journal of Behavioral Decision Making, 22, 455–474.
Birnbaum, M. H. (2008a). New tests of cumulative prospect theory and the priority heuristic: Probability-outcome tradeoff with branch splitting. Judgment and Decision Making, 3(4), 304–316.
Birnbaum, M. H. (2008b). Evaluation of the priority heuristic as a descriptive model of risky decision making: Comment on Brandstätter, Gigerenzer, and Hertwig (2006). Psychological Review, 115, 253–262. doi:10.1037/0033-295X.115.1.253
Brandstätter, E., Gigerenzer, G., & Hertwig, R. (2006). The Priority Heuristic: Making choices without trade-offs. Psychological Review, 113, 409–432. doi:10.1037/0033-295X.113.2.409
Brandstätter, E., Gigerenzer, G., & Hertwig, R. (2008). Risky choice with heuristics: Reply to Birnbaum (2008), Johnson, Schulte-Mecklenbeck, and Willemsen (2008), and Rieger and Wang (2008). Psychological Review, 115, 281–289. doi:10.1037/0033-295X.115.1.281
Bröder, A., & Schiffer, S. (2003). Take The Best versus simultaneous feature matching: Probabilistic inferences from memory and effects of representation format. Journal of Experimental Psychology: General, 132, 277–293. doi:10.1037/0096-3445.132.2.277
Brunswik, E. (1952). Conceptual framework of psychology. Chicago, IL: University of Chicago Press.
Cosmides, L., & Tooby, J. (2006). Evolutionary psychology, moral heuristics, and the law. Heuristics and the law (pp. 175–205). Cambridge, MA: MIT Press. doi:10.7551/mitpress/3488.003.0012
Dougherty, M., Franco-Watkins, A., & Thomas, R. (2008). Psychological plausibility of the theory of probabilistic mental models and the fast and frugal heuristics. Psychological Review, 115, 199–211. doi:10.1037/0033-295X.115.1.199
Evans, J. St. B. T., Clibbens, J., Cattani, A., Harris, A., & Dennis, I. (2003). Explicit and implicit processes in multicue judgement. Memory and Cognition, 31, 608–618. doi:10.3758/BF03196101
Fiedler, K., Schmid, J., Kurzenhaeuser, S., & Schroeter, V. (2000). Lie detection as an attribution process: The anchoring effect revisited. In V. De Pascalis, V. Gheorghiu, P. W. Sheehan & I. Kirsch (Eds.), Suggestion and suggestibility: Advances in theory and research (pp. 113–136). Munich: M.E.G. Stiftung.
Gigerenzer, G., & Goldstein, D. (1996). Reasoning the fast and frugal way: Models of bounded rationality. Psychological Review, 103, 650–669.
Gigerenzer, G., Hoffrage, U., & Goldstein, D. (2008). Postscript: Fast and frugal heuristics. Psychological Review, 115, 238–239.
Gigerenzer, G., Todd, P. M., and the ABC group. (1999). Simple heuristics that make us smart. Evolution and cognition. New York, NY: Oxford University Press.
Gloeckner, A., & Betsch, T. (2008). Do people make decisions under risk based on ignorance? An empirical test of the priority heuristic against cumulative prospect theory. Organizational Behavior and Human Decision Processes, 107, 75–95.
Hilbig, B. E. (2008). One-reason decision making in risky choice? A closer look at the priority heuristic. Judgment and Decision Making, 3(6), 457–462. doi:10.1017/S1930297500000036
Hilbig, B. E., & Pohl, R. F. (2008). Recognizing users of the recognition heuristic. Experimental Psychology, 55, 394–401. doi:10.1027/1618-3169.55.6.394
Hoffman, P. J. (1960). The paramorphic representation of clinical judgment. Psychological Bulletin, 57, 116–131.
Johnson, E. J., Schulte-Mecklenbeck, M., & Willemsen, M. C. (2008). Process models deserve process data: Comment on Brandstätter, Gigerenzer, and Hertwig (2006). Psychological Review, 115, 263–272.
Katsikopoulos, K., Pachur, T., Machery, E., & Wallin, A. (2008). From Meehl to fast and frugal heuristics (and back): New insights into how to bridge the clinical-actuarial divide. Theory & Psychology, 18, 443–464.
Kohler, I. (1956). Der Brillenversuch in der Wahrnehmungspsychologie mit Bemerkungen zur Lehre von der Adaptation [The mirror glass experiment in perception psychology with comments on the study of adaptation]. Zeitschrift für Experimentelle und Angewandte Psychologie, 3, 381–417.
Lopes, L. L., & Oden, G. C. (1999). The role of aspiration level in risky choice: A comparison of cumulative prospect theory and SP/A theory. Journal of Mathematical Psychology, 43, 286–313. doi:10.1006/jmps.1999.1259
Mynatt, C., Doherty, M., & Tweney, R. (1977). Confirmation bias in a simulated research environment: An experimental study of scientific inference. The Quarterly Journal of Experimental Psychology, 29, 85–95. doi:10.1080/00335557743000053
Newell, B. R., Rakow, T., Weston, N. J., & Shanks, D. R. (2004). Search strategies in decision making: The success of “success.” Journal of Behavioral Decision Making, 17, 117–137.
Platt, J. R. (1964). Strong inference. Science, 146, 347–353. doi:10.1126/science.146.3642.347
Pfungst, O. (1911). Clever Hans. New York: Holt.
Rieskamp, J. (2008). The probabilistic nature of preferential choice. Journal of Experimental Psychology: Learning, Memory, and Cognition, 34, 1446–1465.
Rieskamp, J., & Otto, P. (2006). SSL: A theory of how people learn to select strategies. Journal of Experimental Psychology: General, 135, 207–236.
Roberts, S., & Pashler, H. (2000). How persuasive is a good fit? A comment on theory testing. Psychological Review, 107, 358–367. doi:10.1037/0033-295X.107.2.358
Rosenthal, R. (1987). Judgment studies: Design, analysis, and meta-analysis. New York: Cambridge University Press.
Schwarz, N., Bless, H., Strack, F., Klumpp, G., Rittenauer-Schatka, H., & Simons, A. (1991). Ease of retrieval as information: Another look at the availability heuristic. Journal of Personality and Social Psychology, 61, 195–202.
Simon, H. A. (1983). Reason in human affairs. Stanford, CA: Stanford University Press.
Todd, P. (2000). The ecological rationality of mechanisms evolved to make up minds. American Behavioral Scientist, 43, 940–956.
Tversky, A., & Kahneman, D. (1973). Availability: A heuristic for judging frequency and probability. Cognitive Psychology, 5, 207–232. doi:10.1016/0010-0285(73)90033-9
Tversky, A., & Kahneman, D. (1974). Judgment under uncertainty: Heuristics and biases. Science, 185, 1124–1131. doi:10.1126/science.185.4157.1124
Tversky, A., & Kahneman, D. (1992). Advances in prospect theory: Cumulative representation of uncertainty. Journal of Risk and Uncertainty, 5, 297–323.
Unkelbach, C. (2006). The learned interpretation of cognitive fluency. Psychological Science, 17, 339–345. doi:10.1111/j.1467-9280.2006.01708.x
Wänke, M., & Bless, H. (2000). How ease of retrieval may affect attitude judgments. In H. Bless & J. Forgas (Eds.), The role of subjective states in social cognition and behavior (pp. 143–161). New York: Psychology Press.

Figure 1: Mean proportions of risky choices (a), mean difficulty ratings (b), and mean pricing index (c) as a function of experimental conditions.

Supplementary material: Eight files (Fiedler supplementary material 1–8) accompany the online version of this article.