Introduction
People frequently make intertemporal choices, that is, tradeoffs between gains and losses occurring at different times (Loewenstein & Thaler, 1989). Researchers have found that the way people make intertemporal tradeoffs in laboratory studies correlates with how they make various intertemporal tradeoffs in life, ranging from drug use and exercise to savings and credit card debt (Bickel et al., 1999; Chabris et al., 2008; Chapman, 1996; Hardisty et al., 2013; Hardisty & Weber, 2009; MacKillop et al., 2011; Madden et al., 1997; Meier & Sprenger, 2012; Reimers et al., 2009). The implicit assumption is that people’s intertemporal choices are driven by their time preferences, that is, the degree to which they devalue a future outcome as it is delayed. However, the predictive power of time preferences inferred from intertemporal choices in the lab is usually modest and sometimes close to zero (Bartels et al., 2023).
One reason for time preferences’ modest predictive power could be that intertemporal choice studies have almost entirely utilized choices between pure gains available at different times, such as (A) $50 today versus (B) $100 in a year. Indeed, when we reviewed a sample of 100 published papers examining correlations between time preferences and real-world behaviors, 93% of the studies used pure gain intertemporal choices to measure time preferences (see Appendix A for the search criteria, list of papers, and coding).
In contrast, most intertemporal choices in life involve a mix of gains and losses. For example, when a person considers smoking, the tradeoff is between the short-term pleasure of smoking and the long-term costs of an increased likelihood of lung cancer and respiratory diseases. Or, when a person chooses to take care of their health by exercising, the tradeoff consists of short-term costs, while the main benefits are usually reaped months or years later. Indeed, although the pure gain choice is the ‘fruit fly’ of intertemporal choice studies in the lab, such choices seem to be rare outside of the lab.
Along with the overwhelming reliance on pure gain intertemporal choices, most past research has also implicitly assumed that various types of intertemporal choices all share the same underlying process (i.e., the temporal discounting of future outcomes). However, this assumption is mainly for convenience, and it is likely that there are multiple processes underlying intertemporal choices. For example, while some intertemporal choices reflect a fight against a short-term desire (e.g., smoking, eating junk food), others (e.g., exercise, flossing) do not seem as impulsive and rather require a mental push to act. Given that many of these real-world behaviors involve both gains and losses, might mixed-sign intertemporal choices represent a more ecologically valid measure of time preferences that tap into different psychological processes than pure gain choices? If so, time preferences inferred from mixed-sign intertemporal choices in the lab might be better able to predict such mixed-sign real-life behaviors.
Specifically, when people are offered a choice between (A) gaining a small amount today and (B) paying a small amount today but gaining a larger amount later, how will their choice processes differ as compared with the pure gain choices typically employed by researchers? This question has not been addressed: relatively few papers have considered losses in intertemporal choice (e.g., Chapman, 1996; Hardisty et al., 2013; Hardisty & Weber, 2009, 2020; Molouki et al., 2019; Thaler, 1981); even fewer have considered mixtures of gains and losses in intertemporal choice (Ostaszewski, 2007); and none have explicitly compared pure gain choices with mixed-sign choices. We fill this gap in this article.
How might people make different choices when facing mixed-sign intertemporal choices compared to pure gain choices? We assess differences in both the magnitude of time preferences estimated from intertemporal choices and the predictive validity of those time preferences. In terms of time preference magnitudes, previous studies using pure gain intertemporal choices have estimated time preferences to be remarkably impatient when compared to prevailing market interest rates (Frederick et al., 2003). Conversely, the few studies using pure loss intertemporal choices have yielded much more patient time preferences (e.g., Chapman, 1996; Hardisty & Weber, 2009; Thaler, 1981). Pure gains and losses represent two extremes on a continuum; thus, we explore whether mixed-sign intertemporal choices might yield more moderate time preferences, falling between these two extremes. Stated formally, our first research question is:
RQ1: Are average time preferences elicited from mixed-sign intertemporal choices different from time preferences elicited from pure gain intertemporal choices and/or pure loss intertemporal choices?
In terms of predictive validity, prior work has found correlations between time preferences derived from intertemporal choice measures in the lab and real-world behaviors that involve delayed consequences to be modest at best (e.g., MacKillop et al., 2011; Bartels et al., 2023). We posited that the modesty of these correlations could at least partially be attributed to a mismatch between the structure of intertemporal decisions made in the lab and those made in the real world. Real-world behaviors rarely involve decisions between smaller gains in the short term and larger gains in the long term. Rather, these decisions more often involve tradeoffs between a mixture of gains and losses over time. Because the psychology of mixed-sign (loss-now-gain-later and gain-now-loss-later) intertemporal choices may more closely mirror the psychology of mixed-sign real-world intertemporal behaviors (e.g., exercise and smoking, respectively), we explore whether the predictive power of mixed-sign discounting measures might be stronger for such behaviors. Stated formally, our second research question is:
RQ2: Are time preferences elicited from mixed-sign (loss-now-gain-later and gain-now-loss-later) intertemporal choices correlated more strongly (vs. single-sign pure gain or pure loss intertemporal choice measures) with real-world intertemporal choice behaviors, especially those that involve tradeoffs of short-term costs for future benefits or short-term gains for future costs?
Empirical overview
To address these research questions, we first ran a series of exploratory pilot studies (the largest of these pilot studies, with N = 3,200, is reported in Appendix B). These pilot studies yielded consistent results for RQ1 and suggestive results for RQ2 that were inconsistent between studies. A major challenge in answering RQ2 is that correlations between time preferences and intertemporal choice behaviors in the real world are typically modest, and a large sample size is required to reliably detect the small differences between these modest correlations. Therefore, to answer RQ1 and RQ2 more definitively, we ran a large-scale (N = 7,000), preregistered online study. The study materials, preregistration, data, analysis code, and additional analyses are available on OSF: https://osf.io/vayzt/
Methods
Overview
The sample size needed to detect a difference between two correlations with a difference of r = .1, α = .05, and power = .8 is 1,573 per condition (Eid et al., 2013; Lenhard & Lenhard, 2014). Accounting for multiple comparisons and expected preregistered exclusions of data due to perverse responses, duplicate IP addresses, and so forth, we selected a target sample size of 1,750 per cell for a total sample size of 7,000.
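The per-condition figure can be reproduced with a standard normal-approximation formula for comparing two independent correlations on the Fisher z scale. The sketch below is illustrative only: it assumes a two-tailed test, equal group sizes, and that the stated difference of .1 is expressed in Fisher z units (Cohen's q); the function name is hypothetical and is not taken from the authors' materials.

```python
import math
from statistics import NormalDist


def n_per_group_for_correlation_difference(q, alpha=0.05, power=0.80):
    """Per-group n needed to detect a difference q (Fisher z units)
    between two independent correlations, two-tailed test.

    Uses Var(z1 - z2) = 2 / (n - 3) for equal group sizes (assumption).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value, two-tailed
    z_power = NormalDist().inv_cdf(power)          # quantile for desired power
    return math.ceil(2 * ((z_alpha + z_power) / q) ** 2 + 3)


required_n = n_per_group_for_correlation_difference(q=0.1)
print(required_n)
```

Under these assumptions the function returns 1,573 per condition, matching the figure reported above.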
Materials
All participants first read the following instructions: ‘In the next set of questions, we will ask you about gaining (i.e., receiving $__) or losing money (i.e., having to pay $__) at different points in time. Although these questions are hypothetical, please do your best to treat them as if they were real’. Next, participants completed an intertemporal choice measure consisting of a series of 17 choices between immediate options and options delayed by 3 months: ‘These questions are about [both] [receiving (i.e., gaining)] [and] [paying (i.e., losing)] money. Please choose which option you would prefer in each pair’. The parts in brackets changed depending on the condition; bolding was used in the original text. Participants in the Gain condition considered choice options such as ‘receive $50 today’ versus ‘receive $100 in 3 months’, where the later amount was fixed and the sooner amount varied from $25 to $105 in increments of $5 (i.e., a choice ‘staircase’).
For participants in the loss-now-gain-later condition, a $25 immediate loss was paired with the delayed gain, and, relative to the gain-only condition, the range of today options across trials was shifted down by $25, from receiving $0 to $80 today. Thus, the ‘later’ option in the Loss-Now-Gain-Later condition was a mixed outcome with both an immediate loss and a future gain, creating choices such as ‘receive $25 today’ versus ‘pay $25 today and receive $100 in 3 months’. This transformation leaves the choice normatively equivalent to the untransformed choice, assuming utility is linear. Furthermore, note that this technique of creating normatively equivalent mixed-outcome intertemporal choice pairs can be adapted for any intertemporal choice measure (e.g., Kirby et al., 1999) simply by subtracting a fixed amount (e.g., $25 today) from the gain outcomes or adding a fixed amount to loss outcomes.
Participants in the loss and gain-now-loss-later conditions saw the same sets of options as those in the gain and loss-now-gain-later conditions, respectively, but with gains and losses reversed (i.e., we flipped the signs for all outcomes). Thus, participants in the Loss condition considered choice options such as ‘pay $50 today’ versus ‘pay $100 in 3 months’. Correspondingly, participants in the gain-now-loss-later condition considered choice options such as ‘pay $25 today’ versus ‘receive $25 today and pay $100 in 3 months’.
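The construction of the four choice staircases can be sketched in a few lines. This is a hypothetical illustration, not the authors' survey code: the condition labels, the function name, and the dictionary representation (time point mapped to signed dollars, with negative values meaning payments) are all assumptions made for the example.

```python
def build_staircase(condition, later_amount=100, shift=25):
    """Return 17 (sooner_option, later_option) pairs for one condition.

    Each option maps a time point to a signed dollar amount:
    positive = receive, negative = pay. Labels are hypothetical.
    """
    # Loss-type conditions flip the sign of every outcome.
    sign = 1 if condition in ("gain", "loss_now_gain_later") else -1
    # Mixed-sign conditions subtract a fixed amount from both "today" outcomes.
    mixed = condition in ("loss_now_gain_later", "gain_now_loss_later")
    offset = shift if mixed else 0
    trials = []
    for sooner in range(25, 110, 5):  # $25 to $105 in $5 steps = 17 trials
        sooner_option = {"today": sign * (sooner - offset)}
        later_option = {"today": sign * (-offset), "3 months": sign * later_amount}
        trials.append((sooner_option, later_option))
    return trials


lg_trials = build_staircase("loss_now_gain_later")
gl_trials = build_staircase("gain_now_loss_later")
print(lg_trials[5])  # the trial derived from '$50 today' versus '$100 in 3 months'
print(gl_trials[5])
```

Because the same offset is subtracted from both options on the "today" dimension, each mixed-sign pair differs from its pure-sign counterpart only by a constant, which is what makes the pairs normatively equivalent under linear utility.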
We measured 17 real-world intertemporal choice behaviors that are a subset of those examined by Bartels et al. (2023). Our pilot testing found that many of the original 36 behaviors did not bear significant correlations with time preferences, were largely redundant with other measures, or yielded too little variance to be meaningfully analyzed. The measures that we retained (see Table 1) were: alcohol use, body mass index (BMI), credit card late payment frequency, drug use, nicotine use, likelihood of paying credit card bills in full, dental cleaning frequency, doctor exam frequency, education level, physical activity, prescription drug compliance, percent income saved, tendency to start tasks well before deadlines, sunscreen use, wealth accumulation, coupon use, and punctuality.
Note: Italicized correlations are significant at p < .05 and bolded correlations are significant at p < .01 or lower. In each row, pairwise equivalence or differences between correlations (as calculated by z-tests between Fisher’s z-transformed correlations) are denoted using superscripts; correlations that share a superscript are NOT significantly different from each other (p > .05). For example, in the second row of the table (‘percent income saved’), the correlation in the gain condition (.13a) is not significantly different from the correlation in the loss-now-gain-later condition (.09ab), because they share an ‘a’ superscript. Meanwhile, the Gain condition *is* significantly different from the Loss condition (.03bc) because they do not share a superscript. Note that these comparison tests are not adjusted for multiple comparisons. For detailed test statistics, see Supplementary Materials.
Additionally, we included an attention check question: ‘How often do you pay attention to questions while taking surveys? If you are paying attention, please do not answer this question by leaving it blank. Answer I prefer not to answer if you already clicked something.’ (As preregistered, this attention check was included to assess data quality but was not used to exclude any participants.)
Data collection
To calibrate the study length, we first collected data from 190 undergraduate students. Prior to collecting this data, we were not sure of the limits of the Amazon Mechanical Turk (MTurk) participant pool and were concerned that data quality would be less stable as we sampled deeper and deeper from the pool. As such, we preregistered a stringent data cleaning protocol as briefly summarized below. Additionally, we used the CloudResearch platform to maximize data quality, minimize duplicate participants, and collect data in batches (Litman et al., 2017). We gathered data in 10 batches over the course of several months as each relaunching of the survey (while excluding participants who had already completed it) gained renewed attention from potential participants. We first requested two small samples (n = 110, n = 100), then two medium-sized samples (n = 500 each), and finally six large samples (n = 1,000 each). The actual sample sizes for each batch were slightly lower than the requested sizes because of the exclusion criteria.
The full dataset started with 7,584 responses. We cleaned the data following our preregistered protocol with the aid of a research assistant, which excluded 389 responses. We also deleted 246 responses containing duplicate IP addresses, leaving a sample size of 6,949. For the attention check, 81.9% of participants correctly left it blank, 13.8% somewhat correctly clicked ‘I prefer not to answer’, and 4.3% failed the attention check altogether by giving a different answer. Failure to pass the attention check did not merit exclusion from analysis (see Footnote 1). Some participants selected ‘I prefer not to answer’ to questions such as those about drug use, alcohol consumption, or sunscreen use. Additionally, some participants indicated that they did not use credit cards in the credit card payment frequency and likelihood of paying credit card in full measures. We coded these responses as missing data.
Data processing
For each participant, we inferred their indifference point between the sooner and later options by taking the midpoint of the choices in which they switched their choice between the sooner and later options. For example, if a participant chose $100 today over $100 in 3 months, but chose $100 in 3 months over $95 today, we would infer that they would be indifferent between $97.50 today and $100 in 3 months. If a participant always chose the sooner option or always chose the later option (3.1% and 6.5% of the sample, respectively), we conservatively calculated their indifference point at $2.50 beyond the endpoint. For example, if a participant in the Gain condition always chose the ‘today’ option, we would infer that they would be indifferent between $22.50 today and $100 in 3 months. For an interpretable measure of time preference that can easily be compared across conditions, we used the number of patient choices as our primary measure. (We also calculated log transformed hyperbolic discount rates and exponential discount rates, which yielded highly similar results; these can be found in the Supplementary Materials.)
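As a minimal sketch of the scoring rule just described, the function below infers an indifference point from one participant's responses in the Gain condition. It assumes monotonic responding (all patient choices occur at the lowest sooner amounts) and uses hypothetical names; it is not the authors' analysis code.

```python
def indifference_point(sooner_amounts, chose_later, step=5.0):
    """Midpoint between the amounts where the participant switched.

    sooner_amounts: ascending 'today' amounts, e.g. 25, 30, ..., 105.
    chose_later: parallel list of booleans (True = chose the later option).
    Assumes monotonic responses: True for low amounts, then False.
    """
    n_patient = sum(chose_later)  # number of patient (later) choices
    if n_patient == 0:
        # Always chose 'today': place the point half a step below the range.
        return sooner_amounts[0] - step / 2
    if n_patient == len(sooner_amounts):
        # Always chose 'later': place the point half a step above the range.
        return sooner_amounts[-1] + step / 2
    return (sooner_amounts[n_patient - 1] + sooner_amounts[n_patient]) / 2


amounts = list(range(25, 110, 5))
# Participant prefers $100 in 3 months over $95 today, but $100 today over waiting.
switch_at_100 = [a <= 95 for a in amounts]
print(indifference_point(amounts, switch_at_100))          # 97.5
print(indifference_point(amounts, [False] * len(amounts)))  # 22.5
print(sum(switch_at_100))                                   # 15 patient choices
```

The count of `True` values corresponds to the number of patient choices, the primary measure used in the analyses.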
Some participants appeared to mistake losses for gains in the loss and gain-now-loss-later conditions and thus made choices suggesting that they preferred larger losses to smaller ones (ns = 189 and 128, respectively). A few participants also seemed to misunderstand the choices in the Gain condition (n = 4). Following our preregistered protocol, we excluded these ‘perverse’ responses from analysis.
A total of 1,098 participants made non-monotonic responses (i.e., they switched back and forth between sooner and later options) for the intertemporal choices. For example, if someone chose to receive $100 in 3 months over $100 today, and then chose $95 today over $100 in 3 months, their choices would be considered non-monotonic, perhaps indicating that they were not paying attention, did not understand the instructions, or had a large range of indifference. The proportions of non-monotonic responses by condition were 12.7% for Gain, 19.9% for gain-now-loss-later, 14.9% for Loss, and 16.7% for loss-now-gain-later. A chi-square test indicated that the rate of non-monotonic responding varied by condition (χ²(3, N = 5814) = 38.42, p < .001), which might be attributable to differences in how easy it was for participants to interpret each measure.
For participants whose choices were non-monotonic, we calculated a maximum likelihood indifference point as the point that yielded the maximum consistency among that participant’s responses and excluded participants whose consistency was below 75% (Kirby et al., 1999; see Footnote 2). For our analyses, we translated the maximum likelihood indifference point to the number of patient choices for a more easily interpretable measure of time preference.
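One way to picture the maximum-consistency procedure is to score every candidate indifference point by the share of a participant's choices it correctly predicts, then keep the best-scoring point. The sketch below is an illustration under stated assumptions: candidate points sit midway between adjacent sooner amounts (plus one beyond each endpoint), ties are resolved by averaging the tied points, and all names are hypothetical. The authors' exact tie-breaking rule is not described in this section.

```python
def best_fitting_indifference_point(sooner_amounts, chose_later, step=5.0):
    """Return (indifference_point, consistency) for possibly noisy choices.

    A choice is consistent with a candidate point when the later option is
    chosen exactly for sooner amounts below that point.
    """
    candidates = [sooner_amounts[0] - step / 2]
    candidates += [amount + step / 2 for amount in sooner_amounts]
    scored = []
    for point in candidates:
        hits = sum(
            (amount < point) == later
            for amount, later in zip(sooner_amounts, chose_later)
        )
        scored.append((hits / len(sooner_amounts), point))
    best = max(score for score, _ in scored)
    tied = [point for score, point in scored if score == best]
    # Averaging tied candidates is an assumption made for this sketch.
    return sum(tied) / len(tied), best


amounts = list(range(25, 110, 5))
# Patient up to $60 today, with one stray impatient choice at $40.
noisy = [(a <= 60 and a != 40) for a in amounts]
point, consistency = best_fitting_indifference_point(amounts, noisy)
print(point, round(consistency, 3))
keep_participant = consistency >= 0.75  # exclusion threshold from the text
```

In this example a single stray response leaves 16 of 17 choices consistent with the best-fitting point, so the participant would be retained under the 75% threshold.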
Exclusion of participants with inconsistent (<75% consistency) or perverse responses left 1,546 participants in the Gain condition; 1,531 in the Loss-Now-Gain-Later condition; 1,356 in the Loss condition; and 1,381 in the Gain-Now-Loss-Later condition (see Footnote 3).
Some of the behavioral measures were reverse coded to ensure consistent directionality among variables and to facilitate interpretation. Specifically, we recoded the following behaviors such that lower values corresponded to a preference for temporally proximal rewards and higher values corresponded to a preference for temporally distal rewards: BMI, Credit Card Late Payment Frequency, Nicotine Use, Drug Use, and Alcohol Use. This directionality is consistent with our time preference measure, with larger time preferences corresponding to greater patience.
Averages of correlation coefficients by condition were calculated by applying a Fisher’s z transformation to the correlation coefficients, calculating the condition-wise means of the absolute values of these z coefficients, and then reverse transforming the resulting means into correlation coefficients for descriptive reporting in the table (Silver & Dunlap, 1987). To test whether time preference-behavior correlations differed across conditions, we conducted pairwise z-tests on the Fisher’s z-transformed correlations, as reported in Table 1.
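Both steps can be sketched with the standard library. The example below assumes independent samples and a two-tailed test, and it plugs in the post-exclusion condition sizes as the n for each correlation; the actual n per correlation may be smaller because of missing data, so the resulting statistic is illustrative rather than a reproduction of the reported tests. Function names are hypothetical.

```python
import math
from statistics import NormalDist


def mean_absolute_correlation(correlations):
    """Average correlations via Fisher's z: transform, take absolute
    values, average, then back-transform to the r scale."""
    zs = [abs(math.atanh(r)) for r in correlations]
    return math.tanh(sum(zs) / len(zs))


def compare_independent_correlations(r1, n1, r2, n2):
    """z-test for the difference between two independent correlations."""
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    z = (math.atanh(r1) - math.atanh(r2)) / se
    p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_tailed


# 'Percent income saved': Gain (r = .13) versus Loss (r = .03), using the
# condition sizes as an assumed n for each correlation.
z_stat, p_value = compare_independent_correlations(0.13, 1546, 0.03, 1356)
print(round(z_stat, 2), round(p_value, 4))
```

With these assumed sample sizes the difference is significant at p < .05, consistent with the Table 1 note that the Gain and Loss correlations for this behavior do not share a superscript.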
Results
Time preferences
As shown in Figure 1, time preferences for the Gain condition were least patient (M_G = 10.3), those for the Loss condition were most patient (M_L = 13.7), and those for the mixed-sign conditions (loss-now-gain-later and gain-now-loss-later) fell in the middle (M_LG = 11.3, M_GL = 13.5). To formally test RQ1, we conducted a one-way ANOVA and the nonparametric equivalent, a Kruskal–Wallis test, on the time preferences by condition. We then did planned contrasts using t tests and Dunn tests, a nonparametric test commonly used for planned comparisons (Dunn, 1961). These tests revealed significant differences in time preferences between conditions (F(3, 5810) = 221.7, p < .001; Kruskal–Wallis χ²(3) = 1143.3, p < .001). Time preferences were less patient in the Gain condition than in the Loss-Now-Gain-Later (t(5810) = −6.52, p < .001; z = −6.95, p < .001); Loss (t(5810) = −21.43, p < .001; z = −25.34, p < .001); and Gain-Now-Loss-Later conditions (t(5810) = −20.16, p < .001; z = −28.07, p < .001). Likewise, time preferences in the loss-now-gain-later condition were less patient than in the Loss (t(5810) = −15.08, p < .001; z = −18.57, p < .001) and Gain-Now-Loss-Later conditions (t(5810) = −13.77, p < .001; z = −21.25, p < .001). The difference in time preferences between the loss condition and the gain-now-loss-later condition was small but in the predicted direction (t(5810) = 1.34, p = .18; z = −2.52, p = .01).
Correlations with ‘real-world’ behaviors
To examine RQ2, we computed the Pearson correlations between time preferences and self-reported behaviors in each condition (see Table 1; Spearman correlations are reported in the Supplementary Materials). Several observations are of note. First, correlations were modest at best overall; the strongest correlations we observed were for credit card late payment frequency in the Loss condition (r = .28, p < .001); gain condition (r = .23, p < .001); and gain-now-loss-later condition (r = .22, p < .001) and likelihood of paying credit card bills in full in the gain condition (r = .22, p < .001); loss condition (r = .19, p < .001); and loss-now-gain-later condition (r = .18, p < .001). Notably, the fact that the Loss time preference is most correlated with the two credit card repayment behaviors makes intuitive sense because credit card repayment requires choosing between smaller sooner payments versus larger later payments.
Second, while most behaviors correlated with time preferences in the expected directions, many behaviors were not significantly correlated with time preferences in one or more conditions. For example, four behaviors, coupon usage (r = .06, p = .02), doctor exam frequency (r = .06, p = .02), physical activity (r = −.08, p = .003), and alcohol use (r = −.08, p = .003), only significantly correlated with time preferences in the loss-now-gain-later condition, with the latter two in the ‘wrong’ direction such that more patience was correlated with greater alcohol use and less physical activity. Education level, percent of income saved, and sunscreen use were only correlated with time preferences in the gain and loss-now-gain-later conditions. Drug use was correlated with time preferences only in the gain and loss conditions. BMI and the tendency to start tasks well before deadlines did not significantly correlate with time preferences obtained in any condition (ps > .05).
Third, contrary to our expectations (RQ2), mixed-sign (loss-now-gain-later and gain-now-loss-later) time preference measures did not outperform the Gain time preferences. Instead, time preferences in the Gain condition predicted every individual behavioral measure better than or equal to time preferences obtained in the mixed-sign and Loss conditions. Time preferences in the Loss condition also had comparably good predictive validity, underperforming the gain condition only for percentage of income saved and education level (ps < .01). For a few behaviors (credit card late payment frequency, punctuality, prescription drug compliance, and drug use), predictive validities in the Loss condition were directionally higher but the differences were not statistically significant. In terms of the mixed-sign measures, the loss-now-gain-later condition also had reasonable predictive validity, equaling the gain condition for 13 of the 17 behaviors but predicting worse for credit card late payment frequency, punctuality, prescription drug compliance, and nicotine use. Time preferences were the least predictive in the gain-now-loss-later condition, predicting 9 of 17 behaviors worse than in the Gain condition.
Intertemporal choice behavioral factors
Since correlations of 17 behaviors across four conditions may be hard to parse, we used factor analysis to organize the 17 behaviors into factors. To do this, we first conducted an exploratory principal component analysis (PCA). A scree plot yielded an inflection point after five factors, so we conducted a confirmatory PCA specifying five factors and applied an oblimin rotation, which allows for correlations between factors. We assigned each of the 17 behavioral measures to the factor on which it loaded most strongly, with the exception of two variables (likelihood of paying credit card in full and coupon use), which each loaded highly on two factors. These variables were assigned to the factor that seemed most theoretically suitable. This analysis led to the following factor model:
Financial ~ Likelihood of paying credit card in full + Percent income saved + Wealth accumulation + BMI + Credit card late payment frequency
Promptness ~ Punctuality + Tendency to start tasks well before deadlines + Prescription drug compliance
Self-care ~ Dental cleaning frequency + Doctor exam frequency + Sunscreen use + Education level + Coupon use
Fitness ~ Physical activity
Vices ~ Nicotine use + Drug use + Alcohol use
Using the lavaan (Version 0.6-3) package in R, we conducted a confirmatory factor analysis using our five-factor model (model fit test statistic = 2596.906, df = 110, p < .001) and generated factor scores using the regression method, addressing missing data using the full information maximum likelihood method.
The bottom five rows of Table 1 display the correlations between time preferences in each condition and the factor scores for each of these five behavior factors (financial, promptness, self-care, fitness, and vice behaviors). Again, correlations were overall higher for the gain and loss time preferences, with the loss-now-gain-later and gain-now-loss-later time preferences exhibiting fair but weaker predictive performance. The Loss time preferences predicted vice behaviors, promptness behaviors, and financial behaviors equivalently well, and yielded the highest predictive performance relative to the other conditions for the vice and promptness behaviors. Overall, the strongest relationships were observed between time preferences and financial behaviors, which is intuitive given that our time preference measure employed financial intertemporal choices. Fitness (a factor consisting of only hours of physical activity) was the least correlated, with a significant correlation (p < .05) only in the loss-now-gain-later condition and in the wrong direction (i.e., more patient participants were less fit).
The superscripts in the last row of Table 1 indicate the results of pairwise z-tests comparing the differences between time preference-behavior factor correlation coefficients by condition. The gain and loss conditions yielded higher predictive performance than the gain-now-loss-later condition (zs = 2.19 and 1.82, ps = .01 and .02). We also observed marginally better predictive performance for the gain condition versus the loss-now-gain-later condition (z = 1.41, p = .08) and for the loss-now-gain-later condition versus the loss condition (z = 1.37, p = .09).
Discussion
Originally, we expected that time preferences obtained using mixed-sign framing that is more congruent with the structure of a given real-world intertemporal behavior would yield stronger correlations. Instead, we observed that time preferences obtained from the gain measure overall correlated with real-world intertemporal behaviors better than or equal to the time preferences obtained from the other measures, with the loss measure and loss-now-gain-later measure close behind. The gain-now-loss-later measure generally underperformed.
Interpretation of results
One potential explanation for why the gain measure was more highly correlated with real-world intertemporal choice behaviors is that it was easier for participants to understand. Indeed, fewer participants in the gain condition responded non-monotonically relative to the other conditions (see above). An easier measure may thus have yielded a more valid and reliable estimate of participants’ time preferences.
This ease explanation is supported by an exploratory comparison of response times (which we log transformed to address skew for analysis, and then exponentiated for interpretable descriptives). Participants in the gain condition took an average of 37 s to make their intertemporal choices, while those in the loss-now-gain-later condition took 53 s, those in the Loss condition took 42 s, and those in the gain-now-loss-later condition took 70 s (roughly twice as long as those in the gain condition). An ANOVA comparing response times across conditions, and all pairwise comparisons, were all significant at p < .001. This analysis supports the idea that the non-gain conditions introduce additional psychological factors that may add noise to the measurement of time preferences.
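Log transforming, averaging, and then exponentiating yields a geometric mean, which is less sensitive to a few very slow responders than the arithmetic mean. The sketch below illustrates the computation with made-up response times; the data and the function name are hypothetical and are not drawn from the study.

```python
import math


def geometric_mean_seconds(response_times):
    """Average on the log scale, then exponentiate back to seconds."""
    logs = [math.log(t) for t in response_times]
    return math.exp(sum(logs) / len(logs))


# Hypothetical response times in seconds, with one slow responder.
example_times = [10, 40, 160]
geo_mean = geometric_mean_seconds(example_times)
arith_mean = sum(example_times) / len(example_times)
print(round(geo_mean, 1), round(arith_mean, 1))
```

In this toy example the back-transformed mean sits well below the arithmetic mean, which is why it gives a more interpretable descriptive for right-skewed response times.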
A second possible explanation is that our initial theory is correct in principle, but in practice did not receive empirical support due to issues pertaining to data quality such as inattention or low effort responding. Such data quality issues might be magnified for more complex measures, leading to even lower data quality for the mixed-sign measures.
A third possible explanation is that making intertemporal choices reflects an underlying trait that is unidimensional rather than multidimensional and, contrary to our initial theorizing, this trait uniquely predicts the variety of real-world behavioral measures in our study. In the gain condition, choosing between receiving $75 today and receiving $100 in 3 months entails a comparison of two values for each of two dimensions: monetary value ($75 and $100) and time (today and 3 months from today). By contrast, choosing between ‘receive $50 today’ and ‘pay $25 today and receive $100 in 3 months’ (a loss-now-gain-later choice) involves comparing and integrating values that differ along three dimensions: monetary value ($50, $25, and $100), time (today and 3 months from today), and sign (gains and losses). The mixture of losses and gains introduces additional psychological factors not present in the Gain measure, such as loss aversion, negativity bias, and the positive utility gained from not having a looming payment due in the future (Hardisty & Weber, 2020). These factors would add noise to what should be a measure of time preferences.
Contributions to prior literature
Our results build on previous literature in several areas. First, our findings contribute to a growing literature exploring better ways to measure time preferences (Chapman, 1996; Fidanoski & Johnson, 2023; Hardisty et al., 2013; Li et al., 2022; Pezzuto & Urminsky, 2018; Toubia et al., 2013). Our primary result (that the predictive validity of time preferences was best for the simple, ‘pure gain’ intertemporal choice measure, and second best for the ‘pure loss’ measure) is broadly consistent with previous findings that simple measures of time preferences yield equivalent or superior predictive validity as compared to more complex measures (Hardisty et al., 2013; Li et al., 2022; Pezzuto & Urminsky, 2018; Toubia et al., 2013). Indeed, the more complex mixed-sign time preference measures had generally lower predictive validity, more non-monotonic responses, and slower response times.
Second, our finding that time preferences are more patient when losses are involved (whether pure loss or mixed losses and gains) is consistent with previous research on the sign effect (e.g., Hardisty & Weber, 2020; Loewenstein & Thaler, 1989; Molouki et al., 2019; Ostaszewski, 2007). In essence, while people often exhibit high degrees of ‘impatience’ when choosing between a smaller reward now versus a larger reward later, people are more future oriented when intertemporal choices involve losses.
Third, our results build on previous literature finding differences in how people process intertemporal losses versus intertemporal gains (Hardisty & Weber, 2020; Molouki et al., 2019; Myerson et al., 2017; Yeh et al., 2020), and extend into an exploration of the processing of mixed outcomes. Specifically, our finding that perverse and non-monotonic responses were more common for losses and mixed outcomes indicates either confusion or distinct and unusual preferences. Also, our finding of notably longer response times for intertemporal losses and especially mixed-valence choices likely indicates greater difficulty in processing these choices. Furthermore, our finding that predictive validity was generally lower for intertemporal choices involving mixed gain-loss outcomes indicates that participants are responding to these choices in a different way. In other words, time preferences for losses and mixed outcomes are not only quantitatively more patient than for gains, but are also psychologically processed in distinct ways that are likely more effortful and error-prone. Future research might explore whether this may partly explain poor decision-making on mixed-sign intertemporal financial choices, such as decisions around purchasing and debt.
Conclusion
Intertemporal choice researchers have overwhelmingly relied on pure gain intertemporal choice questions, in spite of the fact that most real-world behaviors of interest involve a mixture of positive and negative outcomes. In a highly powered study, we found that these classic pure gain questions performed as well as or better than alternative formulations involving losses or a mix of losses and gains. Overall, this is reassuring news for researchers, who can have more confidence in previously published findings, and can continue to rely on the simple and effective smaller-sooner gain versus larger-later gain format in future studies. However, the average correlations between lab-measured time preferences and real-world behaviors remain low (e.g., average r = .11 for the Gain condition), so the exploration of alternative paradigms should continue.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/jdm.2024.30.
Data availability statement
The study materials, preregistration, data, analysis code, and additional analyses are available on OSF: https://osf.io/vayzt/.
Appendix A: Coded sample of 100 papers on intertemporal choice
We instructed a research assistant who was blind to the objectives and hypotheses of the research project to search for research articles that included both intertemporal choice survey questions (i.e., choices between different amounts at different times) and measures of (often self-reported) real-world behaviors. To be included, the articles had to introduce new data (i.e., review papers and reanalyses of previously published datasets were not included). The research assistant was instructed to search Google Scholar by using keywords and to follow citations (both forward citations and backward citations) until a sample of 100 qualifying articles was found. Google Scholar sorts results by search term relevance and citation count, so this convenience sample was not random; it was biased toward more highly cited papers. The keywords used for search were as follows: intertemporal choice, delay discounting, discounting, impatience, time preference, hyperbolic discounting, exponential discounting, smaller-sooner versus larger-later, impulsivity, delay of gratification, myopia, vice and virtue, and delayed reward discounting. Subsequently, the same research assistant coded each article for whether the intertemporal choice questions only used money (vs. including other domains such as health), and whether the intertemporal choice questions only used gains (vs. including losses or mixed outcomes). The results of the article sampling and coding are shown in the following table. In summary, 93% of articles using intertemporal choice questions and measuring real-world behaviors relied on gains only for their intertemporal choice questions, 85% relied on money only for the intertemporal choices, and 82% relied on both gains only and money only.
Appendix B: Pilot Study
Methods
Overview
We recruited 3,200 U.S. residents with 95% or better approval ratings from Amazon Mechanical Turk for a 7-min study on decisions over time with a compensation of $0.85. The sample size was selected to have at least 800 participants per condition, to have adequate power to detect differences between conditions. Participants were randomly assigned to complete one of four intertemporal choice measures (gain, loss-now-gain-later, loss, or gain-now-loss-later), as described below. Subsequently, all participants answered a series of 31 questions about their own real-world intertemporal behaviors.
Choice and matching measures
All participants first read the following instructions: ‘In the next set of questions, we will ask you about gaining (i.e., receiving $__) or losing money (i.e., having to pay $__) at different points in time. Although these questions are hypothetical, please do your best to treat them as if they were real’. Next, participants completed a choice measure and a matching measure (described below), in counterbalanced order. While the choice and matching measures were positively correlated, the matching measures had much lower correlations with the ‘real-world’ intertemporal behaviors, consistent with previous research (Hardisty et al., 2013). Furthermore, the order in which participants responded to the choice and matching measures had no effect on the results. For these reasons, the matching data will not be discussed further here but can be found on OSF: https://osf.io/vayzt/.
For the choice measures, participants read, ‘These questions are about [both] [receiving (i.e., gaining)] [and] [paying (i.e., losing)] money. Please choose which option you would prefer in each pair’. The parts in brackets changed depending on the condition. Bolding was used in the original text, as shown above. Next, participants faced a series of 17 choices between immediate options and delayed options in 3 months. Participants in the Gain condition considered choice options such as ‘receive $50 today’ versus ‘receive $100 in 3 months’, where the later amount was fixed and the sooner amount varied from receiving $25 to $105 today, in increments of $5 (i.e., a choice ‘staircase’).
For participants in the Loss-Now-Gain-Later condition, a $25 immediate loss was paired with the delayed gain (i.e., ‘pay $25 now and receive $100 in 3 months’), and—relative to the Gain-only condition—the range of today options across trials was shifted down by $25, from receiving $0 to $80 today. Thus, the ‘later’ option in the Loss-Now-Gain-Later condition was a mixed outcome with both an immediate loss and a future gain. This transformation leaves the choice normatively equivalent to the untransformed choice, assuming utility is linear. Participants in the loss and gain-now-loss-later conditions saw the same set of options, but with gains and losses reversed (i.e., we flipped the signs for all outcomes).
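To make the structure of the four choice staircases concrete, the following Python sketch generates them from the description above. The function name `build_staircase`, the condition labels, and the tuple layout are illustrative assumptions; this is not the authors' survey code.

```python
def build_staircase(condition):
    """Return 17 trials as (sooner_today, later_today, later_delayed) in dollars.

    Positive values are gains (receive); negative values are losses (pay).
    Hypothetical helper that reconstructs the design described in the text.
    """
    mixed = condition in ("loss_now_gain_later", "gain_now_loss_later")
    # Loss and gain-now-loss-later flip the sign of every outcome.
    sign = 1 if condition in ("gain", "loss_now_gain_later") else -1
    # Mixed conditions move $25 from the sooner option into the later option.
    shift = 25 if mixed else 0
    trials = []
    for sooner in range(25, 110, 5):  # $25 to $105 in $5 steps
        trials.append((sign * (sooner - shift), sign * -shift, sign * 100))
    return trials


gain = build_staircase("gain")
mixed = build_staircase("loss_now_gain_later")
# With linear utility, the immediate difference between options is unchanged.
same = all(g[0] - g[1] == m[0] - m[1] for g, m in zip(gain, mixed))
```

Subtracting $25 from the immediate component of both options leaves their difference intact, which is the sense in which the transformed choice is normatively equivalent under linear utility.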
‘Real-world’ intertemporal behavior questions
Next, all participants answered a series of 31 questions about real-world behaviors with intertemporal aspects, listed in Table A1. This set of questions was adapted from a similar set of questions used by Bartels et al. (2023), which were generated by combining the behaviors measured by prominent papers relating time preferences to real-world behaviors, such as those cited in the introduction. Among these, there was an attention check question, ‘How often do you pay attention to questions while taking surveys? If you are paying attention, please do not answer this question by leaving it blank. Answer I prefer not to answer if you already clicked something’. Overall, 89.5% of participants correctly left the question blank, 9% somewhat correctly clicked ‘I prefer not to answer’, and 1.5% failed the attention check altogether and gave a different answer. All participants (including those who failed the check) were kept in the dataset for further analysis.
Note: Also depicts correlations between discount rates and factor values and the coding of each behavior based on the type of intertemporal choice it represented. * p < .05, ** p < .01.
Results
Data cleaning and processing
We cleaned the data file as follows (although this study was not preregistered, all cleaning decisions were made prior to data analysis, following Bartels et al., 2023): We first removed participants with duplicate IP addresses (keeping the first survey attempt and removing the second) or incomplete survey attempts, leaving 3,121 legitimate completions for further analysis. Height and weight were converted to a BMI score using the standard formula, 703 × weight (lb) / height² (in²). The number of packs of cigarettes smoked had high variance and was largely redundant with the nicotine use question, so we dropped it from the dataset. Thus, 29 ‘real-world’ behavior variables remained for further analysis. On other questions, any answers of ‘I prefer not to answer’ or ‘I don’t drive’ (on the driving questions) or similar were treated as missing data. If at least one of the three ‘driving’ answers was ‘I don’t drive’, then the other two answers from that participant were also treated as missing data. Likewise, if the person answered ‘I don’t have a credit card’ to any of the three credit card questions, all three questions were treated as missing data. Finally, the following ‘bizarre’ answers were treated as missing data: % of income saved greater than 100%, height in feet less than 4 or greater than 7, height in inches less than 0 or greater than 12, BMI less than 10 or greater than 60, hours of physical activity greater than 112 h per week, or fitness hours greater than active hours.
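The BMI conversion and the plausibility screens for height and BMI can be sketched in Python as follows. The function names and the assumption that height was recorded as separate feet and inches fields are ours, inferred from the exclusion rules; the authors' actual cleaning code is on OSF and may differ.

```python
def bmi_from_imperial(weight_lb, height_ft, height_in):
    """Standard imperial BMI formula: 703 * weight (lb) / height (in) squared."""
    total_inches = 12 * height_ft + height_in  # assumed two-field height entry
    return 703 * weight_lb / total_inches ** 2


def clean_bmi(weight_lb, height_ft, height_in):
    """Return BMI, or None when the answer is treated as missing data."""
    if height_ft < 4 or height_ft > 7:
        return None  # implausible height in feet
    if height_in < 0 or height_in > 12:
        return None  # implausible inches component
    bmi = bmi_from_imperial(weight_lb, height_ft, height_in)
    if bmi < 10 or bmi > 60:
        return None  # implausible BMI
    return bmi


plausible = clean_bmi(150, 5, 5)    # 150 lb at 5 ft 5 in
too_tall = clean_bmi(150, 8, 0)     # height in feet out of range
too_heavy = clean_bmi(400, 4, 0)    # BMI above 60
```

Returning `None` rather than dropping the participant mirrors the text, where implausible answers are treated as missing on that variable only.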
Some participants gave non-monotonic answers (switching back and forth between sooner or later options) or perverse answers on the choice measure. For example, if someone chose to receive $100 in 3 months over $100 today, and then chose $95 today over $100 in 3 months, this indicates either inattention or a preference for less money over more. Importantly, this varied by experimental condition. Only 2% of those in the Gain condition and 3% of those in the Loss-Now-Gain-Later condition gave non-monotonic or perverse answers, compared with 17% of those in the Loss condition and 16% of those in the Gain-Now-Loss-Later condition, a significant difference (p < .001). Thus, one potential disadvantage of the loss and gain-now-loss-later choice measures is an increased likelihood of errors. We excluded these participants (10% of the sample overall) from further analysis, because it is difficult to interpret their answers and calculate discount rates. Alternative analyses with the full sample (using proportion of ‘now’ answers rather than discount rates) reach similar conclusions (see OSF page).
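One simple way to flag such response patterns in the Gain condition is sketched below in Python. The rule used here (once a participant picks the sooner option, they should keep picking it as the sooner amount grows) is an assumption about how the screening could be operationalized; the exact rule the authors applied is not spelled out in the text, and `is_consistent` is a hypothetical name.

```python
def is_consistent(chose_sooner):
    """Check a Gain-condition staircase for a single, sensible switch.

    chose_sooner: list of booleans ordered by increasing immediate amount,
    True when the participant picked the 'today' option on that trial.
    Returns False if a 'later' choice follows an earlier 'today' choice.
    """
    seen_sooner = False
    for choice in chose_sooner:
        if choice:
            seen_sooner = True
        elif seen_sooner:
            return False  # switched back to 'later' at a higher today amount
    return True


amounts = list(range(25, 110, 5))  # $25 to $105 today
# Text example: takes $95 today, but prefers $100 in 3 months to $100 today.
perverse = [a == 95 for a in amounts]
# A typical pattern: switches to 'today' once the amount reaches $80.
typical = [a >= 80 for a in amounts]
```

Participants who always choose one option pass this check, which matches the text: they are retained and assigned an indifference point beyond the endpoint.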
For each participant, we inferred their indifference point between the sooner and later options by taking the midpoint of the choices in which they switched their choice between the sooner and later options. For example, if a participant chose $100 today over $100 in 3 months, but chose $100 in 3 months over $95 today, we would infer that they would be indifferent between $97.50 today and $100 in 3 months. If a participant always chose the sooner option or always chose the later option (3% and 6% of the sample, respectively), we conservatively calculated their indifference point at $2.50 beyond the endpoint. For example, if a participant in the Gain condition always chose the ‘today’ option, we would infer that they would be indifferent between $22.50 today and $100 in 3 months.
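The midpoint rule and the endpoint convention can be written as a short Python function. This is a minimal sketch for the Gain condition that assumes monotonic responses and the $5 step described above; `indifference_point` is a hypothetical name.

```python
def indifference_point(amounts, chose_sooner, step=5.0):
    """Infer the immediate amount judged equal to the delayed option.

    amounts: immediate amounts in increasing order.
    chose_sooner: parallel list of booleans (True = picked 'today').
    Assumes responses are monotonic, as required after exclusions.
    """
    if all(chose_sooner):
        return amounts[0] - step / 2   # always 'today': half a step below range
    if not any(chose_sooner):
        return amounts[-1] + step / 2  # always 'later': half a step above range
    first_sooner = chose_sooner.index(True)
    # Midpoint between the last 'later' trial and the first 'today' trial.
    return (amounts[first_sooner - 1] + amounts[first_sooner]) / 2


amounts = list(range(25, 110, 5))
switch_at_100 = indifference_point(amounts, [a >= 100 for a in amounts])
always_today = indifference_point(amounts, [True] * len(amounts))
always_later = indifference_point(amounts, [False] * len(amounts))
```

The three calls reproduce the cases in the text: a switch between $95 and $100 gives $97.50, and a participant who always chooses ‘today’ is placed at $22.50.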
To compare time preferences more easily across conditions and with previous research, we converted indifference points to annual discount rates using the continuously compounded exponential discounting formula, $V_1 = V_2 + A e^{-kD}$, where $V_1$ is the amount for the sooner option, $V_2$ is the today amount for the later option (zero for gain and loss conditions), $A$ is the later amount, $k$ is the discount rate, and $D$ is the later option’s delay in years. For example, for a participant indifferent between ‘receive $72.50 today’ and ‘pay $25 today and receive $100 in 3 months’, solving the formula $72.50 = -25 + 100 \times e^{-k \times 3/12}$ gives k = .10, meaning an annual discount rate of 10%. We chose exponential discounting rather than hyperbolic discounting (e.g., Mazur, 1987) because the discount rate is more interpretable (i.e., k = .20 is a 20% annual discount rate in standard economic terms) and the results are nearly identical (see OSF page).
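Rearranging the formula gives k = −ln((V1 − V2) / A) / D, which the Python sketch below applies to the worked example. The function name `annual_discount_rate` is illustrative and not taken from the authors' analysis code.

```python
import math


def annual_discount_rate(v1, v2, a, delay_years):
    """Solve v1 = v2 + a * exp(-k * delay_years) for k.

    v1: indifference amount for the sooner option.
    v2: immediate component of the later option (0 in pure gain or loss).
    a: delayed amount; delay_years: delay of the later option in years.
    """
    return -math.log((v1 - v2) / a) / delay_years


# Worked example from the text: $72.50 today versus
# 'pay $25 today and receive $100 in 3 months'.
k_mixed = annual_discount_rate(72.50, -25, 100, 3 / 12)

# The same formula applies when all signs are flipped (Loss condition),
# because the ratio (v1 - v2) / a is unchanged.
k_loss = annual_discount_rate(-97.50, 0, -100, 3 / 12)
```

Under this formula, an indifference point above the delayed amount (for example, $107.50 for a participant who always chose the later option) yields a negative discount rate.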
Discount rates
As seen in Figure 1 and confirmed with a between-subjects ANOVA, mean discount rates varied by condition, F(3, 2820) = 80.5, p < .001, $\eta_p^2 = .08$. Pairwise comparisons confirmed that discount rates were higher in the gain condition than in the loss-now-gain-later condition, t(1535) = 4.3, p < .001, as well as the other two conditions, both ps < .001. Likewise, discount rates in the loss-now-gain-later condition were higher than in the loss condition and the gain-now-loss-later condition (ps < .001). There was no difference, however, between the loss and gain-now-loss-later conditions, t(1285) = 0.3, p = .76. We therefore found partial support for RQ1.
Correlation with ‘real-world’ behaviors
Having established differences in the average discount rate between conditions, we next examined whether the four intertemporal choice measures had differing levels of predictive validity: do they differ in their ability to ‘predict’ real-world intertemporal choice behaviors? Table A1 shows the correlations between the discount rates and each of the self-reported behaviors. Correlations ranged from null to modest, with absolute values mostly under 0.2. Of the 27 behaviors measured, 17 were significantly predicted by at least one of the four intertemporal choice measures. The average absolute correlation across the 27 behaviors was highest for the Loss-Now-Gain-Later measure, and lowest for the Loss measure.
As a first pass analysis, we counted the number of times each intertemporal choice measure had the highest predictive validity across the 27 behaviors. The loss-now-gain-later measure was the best predictor of 9 behaviors, the gain-now-loss-later measure was the best predictor of 11 behaviors, and the gain and loss measures were the best predictors of 4 behaviors each. If we restrict this analysis to just behaviors that are significantly predicted by at least one measure, then the loss-now-gain-later measure was still the best predictor of nine behaviors, the gain-now-loss-later measure was the best predictor of four behaviors, and the Gain and Loss measures were the best predictors of only two behaviors each.
To gain more insight into the large number of behavioral measures, we conducted an exploratory PCA. Because PCA is sensitive to non-normality in variable distributions, we first log transformed five variables that were positively skewed: credit card debt, percent income saved, BMI, fitness activity, and physical activity. A scree plot yielded an inflection point after five factors, so we conducted a PCA specifying five factors and applied an oblimin rotation, which allows for correlations between factors. We assigned each of the 27 behavioral measures in our dataset to the factor on which it loaded most strongly. Doing so led to the following model:
Financial decision-making ~ Likelihood of paying credit card in full + Credit card debt (−) + Percent income saved + Wealth accumulation + BMI (−) + Credit card late payment frequency (−) + Propensity to leave dishes unwashed (−)
Impulsivity ~ Use of cell phones while driving + Punctuality (−) + Driving recklessly + Tendency to start tasks well before deadlines (−) + Number of speeding tickets + Prescription drug compliance (−) + Propensity to overeat
Self-care ~ Dental cleaning frequency + Doctor exam frequency + Sunscreen use + Education level + Coupon use + Flossing
Fitness ~ Fitness activity + Physical activity + Diet monitoring
Vices ~ Nicotine use + Drug use + Alcohol use + Gambling
Using the lavaan (version 0.6-3) package in R, we conducted a confirmatory factor analysis using our five-factor model (model fit test statistic = 3804.794, df = 314, p < .001), generated factor scores using the regression method, and addressed missing data using full information maximum likelihood.
Table A1 shows the correlations between exponential discount rates and each of these five behavior factors across the four conditions. Table A2 displays the results of pairwise z tests comparing the average discount rate-behavior correlations by condition. Table A3 displays the results of the same comparisons conducted on the discount rate-behavior factor correlations. None of these comparisons were statistically significant, meaning that we were unable to detect differences in predictive power between the four intertemporal choice measures. Post hoc, we attribute the absence of a statistically significant difference between groups to a lack of statistical power. The sample size necessary to detect relatively small differences between correlations is substantial; we calculated that in order to detect a difference of r = .1 with 80% power at α = .05, we would need 1,573 participants per cell (Eid et al., 2013; Lenhard & Lenhard, 2014). This was the primary motivation for the larger study featured in the main body of this article.
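The order of magnitude of this sample size can be checked with the usual normal approximation for comparing two independent correlations after Fisher's z transformation. The Python sketch below assumes a two-tailed test and treats the .1 difference as a difference on the Fisher z scale; these settings are our assumptions, since the exact options used in the cited calculator are not described here.

```python
import math
from statistics import NormalDist


def n_per_group(q, alpha=0.05, power=0.80):
    """Per-group n needed to detect a Fisher-z difference q between
    two independent correlations (two-tailed normal approximation).

    The difference of two Fisher z values has variance 2 / (n - 3),
    so n = 2 * (z_alpha + z_beta)^2 / q^2 + 3, rounded up.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_beta) ** 2 / q ** 2 + 3)


required_n = n_per_group(0.1)  # 1573 under the assumptions above
```

Because the required n scales with 1 / q², halving the detectable difference roughly quadruples the sample size, which is why comparisons between small correlations demand such large cells.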
Alternative categorization of behaviors
As a robustness check, we considered an alternative categorization of the behaviors as a function of whether they involved immediate and/or future gains and losses. Two independent research assistants, blind to the four experimental conditions, coded each behavior on four dimensions: ‘Does this issue involve a salient [immediate/future] [gain/loss]?’ Each of these dimensions was coded independently—thus a given behavior could potentially be ‘yes’ on all four dimensions, or ‘no’ on all four dimensions. The coders were instructed to take the perspective of an ‘average’ person, rather than rating their personal views. Interrater reliability was acceptable (kappa = .79) and inconsistencies were resolved by discussion between the coders.
Perhaps surprisingly, all 29 behaviors were classified as mixed-sign, involving both gains and losses at different points in time, as seen on the right side of Table 1. The behaviors fell into three main groups: those involving gains now and losses later (GL), those involving losses now and gains later (LG), and a group of other behaviors that involved both gains and losses in the present and either gains or losses later. Examining the behaviors in each classification in Table 1, the GL behaviors consist mostly of classic impulsive behaviors with the possibility of immediate gratification, such as overeating, drug use, and smoking. The LG behaviors consist mostly of preventative behaviors that require an immediate sacrifice for a future benefit, such as flossing, putting on sunscreen, and saving money. The other behaviors consist mostly of risky behaviors such as speeding and gambling.
Discussion
Our findings revealed that people exhibit high discount rates for intertemporal choices that involve only gains, but exhibit significantly lower discount rates for mixed-sign choices and pure loss choices. Surprisingly, behaviors associated with impulsivity, such as one’s propensity to use a cellphone while driving, were not correlated at all or correlated very weakly with discount rates.
Overall, there are patterns in the data suggesting potential domain differences in predictive power across intertemporal choice measures—with gain and loss-now-gain-later measures performing better for certain domains of real-world behavior—but the large number of measures collected and exploratory nature of the study indicate caution in drawing firm conclusions. We addressed this with a large, preregistered study in the main manuscript.