Learning how to reason and deciding when to decide
Published online by Cambridge University Press: 18 July 2023
Abstract
Research on human reasoning has both popularized and struggled with the idea that intuitive and deliberate thoughts stem from two different systems, raising the question of how people switch between them. Inspired by research on cognitive control and conflict monitoring, we argue that detecting the need for further thought relies on an intuitive, context-sensitive process that is itself learned.
- Type: Open Peer Commentary
- Copyright: © The Author(s), 2023. Published by Cambridge University Press
References
Abrahamse, E., Braem, S., Notebaert, W., & Verguts, T. (2016). Grounding cognitive control in associative learning. Psychological Bulletin, 142, 693–728. https://doi.org/10.1037/bul0000047
Berlyne, D. (1960). Conflict, arousal, and curiosity. McGraw-Hill. https://doi.org/10.1037/11164-000
Bogacz, R., Brown, E., Moehlis, J., Holmes, P., & Cohen, J. D. (2006). The physics of optimal decision making: A formal analysis of models of performance in two-alternative forced-choice tasks. Psychological Review, 113(4), 700. https://doi.org/10.1037/0033-295X.113.4.700
Botvinick, M. M., Braver, T. S., Barch, D. M., Carter, C. S., & Cohen, J. D. (2001). Conflict monitoring and cognitive control. Psychological Review, 108(3), 624. https://doi.org/10.1037/0033-295X.108.3.624
Braem, S. (2017). Conditioning task switching behavior. Cognition, 166, 272–276. https://doi.org/10.1016/j.cognition.2017.05.037
Braem, S., & Egner, T. (2018). Getting a grip on cognitive flexibility. Current Directions in Psychological Science, 27(6), 470–476. https://doi.org/10.1177/0963721418787475
Bustamante, L., Lieder, F., Musslick, S., Shenhav, A., & Cohen, J. (2021). Learning to overexert cognitive control in a Stroop task. Cognitive, Affective, & Behavioral Neuroscience, 21(3), 453–471. https://doi.org/10.3758/s13415-020-00845-x
Callaway, F., Rangel, A., & Griffiths, T. L. (2021). Fixation patterns in simple choice reflect optimal information sampling. PLoS Computational Biology, 17(3), e1008863. https://doi.org/10.1371/journal.pcbi.1008863
Cavanagh, J. F., Wiecki, T. V., Cohen, M. X., Figueroa, C. M., Samanta, J., Sherman, S. J., & Frank, M. J. (2011). Subthalamic nucleus stimulation reverses mediofrontal influence over decision threshold. Nature Neuroscience, 14(11), 1462–1467. https://doi.org/10.1038/nn.2925
Dasgupta, I., & Gershman, S. J. (2021). Memory as a computational resource. Trends in Cognitive Sciences, 25(3), 240–251. https://doi.org/10.1016/j.tics.2020.12.008
Diamond, A. (2013). Executive functions. Annual Review of Psychology, 64, 135. https://doi.org/10.1146/annurev-psych-113011-143750
Doebel, S. (2020). Rethinking executive function and its development. Perspectives on Psychological Science, 15(4), 942–956. https://doi.org/10.1177/1745691620904771
Frömer, R., & Shenhav, A. (2022). Filling the gaps: Cognitive control as a critical lens for understanding mechanisms of value-based decision-making. Neuroscience & Biobehavioral Reviews, 134, 104483. https://doi.org/10.1016/j.neubiorev.2021.12.006
Gershman, S. (2021). What makes us smart: The computational logic of human cognition. Princeton University Press.
Grahek, I., Frömer, R., Prater Fahey, M., & Shenhav, A. (2023). Learning when effort matters: Neural dynamics underlying updating and adaptation to changes in performance efficacy. Cerebral Cortex, 33(5), 2395–2411. https://doi.org/10.1093/cercor/bhac215
Griffiths, T. L., Callaway, F., Chang, M. B., Grant, E., Krueger, P. M., & Lieder, F. (2019). Doing more with less: Meta-reasoning and meta-learning in humans and machines. Current Opinion in Behavioral Sciences, 29, 24–30. https://doi.org/10.1016/j.cobeha.2019.01.005
Hunt, L. T., Daw, N. D., Kaanders, P., MacIver, M. A., Mugan, U., Procyk, E., … Kolling, N. (2021). Formalizing planning and information search in naturalistic decision-making. Nature Neuroscience, 24(8), 1051–1064. https://doi.org/10.1038/s41593-021-00866-w
Hutcherson, C. A., Bushong, B., & Rangel, A. (2015). A neurocomputational model of altruistic choice and its implications. Neuron, 87(2), 451–462. https://doi.org/10.1016/j.neuron.2015.06.031
Jang, A. I., Sharma, R., & Drugowitsch, J. (2021). Optimal policy for attention-modulated decisions explains human fixation behavior. eLife, 10, e63436. https://doi.org/10.7554/eLife.63436
Lieder, F., Shenhav, A., Musslick, S., & Griffiths, T. L. (2018). Rational metareasoning and the plasticity of cognitive control. PLoS Computational Biology, 14(4), e1006043. https://doi.org/10.1371/journal.pcbi.1006043
Logan, G. D. (1988). Toward an instance theory of automatization. Psychological Review, 95(4), 492. https://doi.org/10.1037/0033-295X.95.4.492
Miller, E. K., & Cohen, J. D. (2001). An integrative theory of prefrontal cortex function. Annual Review of Neuroscience, 24(1), 167–202. https://doi.org/10.1146/annurev.neuro.24.1.167
Otto, A. R., Braem, S., Silvetti, M., & Vassena, E. (2022). Is the juice worth the squeeze? Learning the marginal value of mental effort over time. Journal of Experimental Psychology: General, 151(10), 2324–2341. https://doi.org/10.1037/xge0001208
Ratcliff, R., & Frank, M. J. (2012). Reinforcement-based decision making in corticostriatal circuits: Mutual constraints by neurocomputational and diffusion models. Neural Computation, 24(5), 1186–1229. https://doi.org/10.1162/NECO_a_00270
Ratcliff, R., Smith, P. L., Brown, S. D., & McKoon, G. (2016). Diffusion decision model: Current issues and history. Trends in Cognitive Sciences, 20(4), 260–281. https://doi.org/10.1016/j.tics.2016.01.007
Shadlen, M. N., & Shohamy, D. (2016). Decision making and sequential sampling from memory. Neuron, 90(5), 927–939. https://doi.org/10.1016/j.neuron.2016.04.036
Shenhav, A. (2017). The perils of losing control: Why self-control is not just another value-based decision. Psychological Inquiry, 28(2–3), 148–152. https://doi.org/10.1080/1047840X.2017.1337407
Shenhav, A., Botvinick, M. M., & Cohen, J. D. (2013). The expected value of control: An integrative theory of anterior cingulate cortex function. Neuron, 79(2), 217–240. https://doi.org/10.1016/j.neuron.2013.07.007
Smith, S. M., & Krajbich, I. (2019). Gaze amplifies value in decision making. Psychological Science, 30(1), 116–128. https://doi.org/10.1177/0956797618810521
Son, J. Y., Bhandari, A., & FeldmanHall, O. (2019). Crowdsourcing punishment: Individuals reference group preferences to inform their own punitive decisions. Scientific Reports, 9(1), 1–15. https://doi.org/10.1038/s41598-019-48050-2
Ulrich, R., Schröter, H., Leuthold, H., & Birngruber, T. (2015). Automatic and controlled stimulus processing in conflict tasks: Superimposed diffusion processes and delta functions. Cognitive Psychology, 78, 148–174. https://doi.org/10.1016/j.cogpsych.2015.02.005
Wang, J. X. (2021). Meta-learning in natural and artificial intelligence. Current Opinion in Behavioral Sciences, 38, 90–95. https://doi.org/10.1016/j.cobeha.2021.01.002
Yang, Q., Xing, J., Braem, S., & Pourtois, G. (2022). The selective use of punishments on congruent versus incongruent trials in the Stroop task. Neurobiology of Learning and Memory, 193, 107654. https://doi.org/10.1016/j.nlm.2022.107654
Research on reasoning about moral dilemmas or logical problems has traditionally dissociated fast, intuitive modes of responding from slow, deliberate response strategies, often referred to as system 1 versus system 2. For example, when deciding between taking the plane or the train, our system 1 might push us toward the plane because of its speed, whereas our system 2 could weigh its environmental impact and opt for the train. De Neys proposes a new working model wherein both intuitive and deliberate reasoning originate from initial "system 1" intuitions whose activations build up over time and can trigger an uncertainty signal. When this uncertainty signal reaches a certain threshold, it signals the need for deliberate reasoning, upon which deliberate thought, or "system 2," is called upon to further resolve the reasoning problem. Here, we question the need for assuming a separate, deliberate system that is activated only conditional on uncertainty detection. Although we are sympathetic to the idea that uncertainty is monitored and can trigger changes in the thought process, we believe these changes may result from adaptations in decision boundaries (i.e., deciding when to decide) or other control parameters, rather than from invoking qualitatively different thought strategies.
Research on cognitive control often focuses on how goal-directed control processes can help us correct, inhibit, or switch away from interfering action tendencies, such as those originating from overtrained associations (Diamond, 2013; Miller & Cohen, 2001). For example, when deciding between the train and the plane, our prior habit of taking the plane might trigger the same decision at first, while our current goal of being more environmentally friendly should lead us to the train. Importantly, recent theories of cognitive control have emphasized that these goal representations and control processes should not be treated as separate "higher"-order processes studied in isolation; rather, they are deeply embedded in the same associative network that hosts habits and overtrained responses. That is, goals and control functions can be learned, triggered, and regulated by the same learning principles that govern other forms of behavior (Abrahamse, Braem, Notebaert, & Verguts, 2016; Braem & Egner, 2018; Doebel, 2020; Lieder, Shenhav, Musslick, & Griffiths, 2018; Logan, 1988). For example, much like the value of simple actions, the value of control functions can be learned (Braem, 2017; Bustamante, Lieder, Musslick, Shenhav, & Cohen, 2021; Grahek, Frömer, Prater Fahey, & Shenhav, 2023; Otto, Braem, Silvetti, & Vassena, 2022; Shenhav, Botvinick, & Cohen, 2013; Yang, Xing, Braem, & Pourtois, 2022).
In this way, similar to De Neys's suggestion that we can learn intuitions for the putative system 1 and system 2 responses (or habitual versus goal-directed responses), we argue that people also learn intuitions for different control functions or parameters (see below).
One popular way to study the dynamic interaction between goal-directed and more automatic, habitual response strategies is through evidence accumulation models. In these models, decisions are thought to be the product of a noisy evidence accumulation process that triggers a response once a predetermined decision boundary is reached (Bogacz, Brown, Moehlis, Holmes, & Cohen, 2006; Ratcliff, Smith, Brown, & McKoon, 2016; Shadlen & Shohamy, 2016). However, this accumulation of evidence does not qualitatively distinguish between the activation of intuitions and goal-directed or "controlled" deliberation. Instead, both processes start accumulating evidence at the same time, although potentially from different starting points (e.g., biased toward previous choices or goals) or at different rates (e.g., Ulrich, Schröter, Leuthold, & Birngruber, 2015). Depending on how high a decision maker sets their decision boundary, that is, how cautious versus impulsive they are, the goal-directed process will sometimes be too slow to shape the decision, or will merely slow it down. These models have been successfully applied to social decision-making problems (e.g., Hutcherson, Bushong, & Rangel, 2015; Son, Bhandari, & FeldmanHall, 2019).
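This dynamic can be made concrete with a toy diffusion simulation (a minimal sketch, not any of the cited models, and all parameter values are invented for illustration): the habitual tendency is coded as a starting-point bias toward one option, the goal-directed evaluation as a constant drift toward the other, and the height of the boundary determines which influence dominates.

```python
import random

def ddm_trial(start, drift, boundary, noise=1.0, dt=0.001, seed=None):
    """One two-boundary diffusion trial: evidence x starts at a biased
    point (habit), drifts at a constant rate (goal-directed evaluation),
    and a response fires once |x| reaches the boundary.
    Returns (choice in {+1, -1}, reaction time in seconds)."""
    rng = random.Random(seed)
    x, t = start, 0.0
    while abs(x) < boundary:
        x += drift * dt + rng.gauss(0.0, noise) * dt ** 0.5
        t += dt
    return (1 if x > 0 else -1), t

def choice_rate(start, drift, boundary, n=200):
    """Fraction of trials ending at the habit-consistent (+1) boundary."""
    wins = sum(ddm_trial(start, drift, boundary, seed=i)[0] == 1
               for i in range(n))
    return wins / n

# Habit biases the start toward +1 (the plane); the goal drifts toward
# -1 (the train). An impulsive, low boundary mostly yields the habitual
# choice; a cautious, high boundary gives the slow goal signal time to win.
impulsive = choice_rate(start=0.5, drift=-1.0, boundary=0.6)
cautious = choice_rate(start=0.5, drift=-1.0, boundary=3.0)
```

With these illustrative settings, the habit-consistent option wins most low-boundary trials while the goal-consistent option wins most high-boundary trials, mirroring the point that boundary height determines whether the goal-directed process arrives in time to shape the decision.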
In line with De Neys's proposal, we agree that competing evidence accumulation processes could trigger an uncertainty signal (e.g., directional deviations in drift rate) once uncertainty reaches a certain threshold, similar to how this has been formalized in the seminal conflict monitoring theory (Botvinick, Braver, Barch, Carter, & Cohen, 2001), itself inspired by Berlyne (1960). However, in our view, resolving this signal does not require activating an independent system; rather, it induces controlled changes in parameter settings. Thus, rather than activating a system 2 that provides answers by using a different strategy, cognitive control changes the parameters of the ongoing decision process (for a similar argument, see Shenhav, 2017). For example, it could evoke a simple increase in the decision boundary, allowing the evidence accumulation process more time before a decision is made (e.g., Cavanagh et al., 2011; Frömer & Shenhav, 2022; Ratcliff & Frank, 2012). The second-order parameters that govern these adaptive control processes (e.g., how high one's uncertainty threshold should be before calling for adaptations, or how much one should increase the boundary) need not be set in the moment, but can be learned (e.g., Abrahamse et al., 2016).
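A monitoring-and-adjustment loop of this kind can be sketched in a few lines (a hypothetical fragment: the conflict measure follows the co-activation idea of conflict monitoring theory, but the threshold, step, decay, and floor values are ours and purely illustrative):

```python
def conflict(act_a, act_b):
    """Conflict as the co-activation of two incompatible responses,
    in the spirit of conflict monitoring theory's energy measure."""
    return act_a * act_b

def adapt_boundary(boundary, act_a, act_b,
                   threshold=0.15, step=0.5, decay=0.05, floor=0.5):
    """When monitored conflict exceeds a threshold, raise the decision
    boundary (buying time for more evidence); otherwise let it relax
    back toward a floor. The threshold and step size are the
    second-order parameters that could themselves be learned."""
    if conflict(act_a, act_b) > threshold:
        return boundary + step
    return max(floor, boundary - decay)
```

On a trial where two responses are strongly co-active, `adapt_boundary(1.0, 0.6, 0.5)` returns a raised boundary of 1.5; when one response clearly dominates, the boundary decays back instead. No separate system is consulted: the same decision process simply continues under a more cautious parameter setting.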
Although we focused on the boundary as mapping most closely onto fast and slow processing, we believe other process parameters can be altered too. For example, the response to uncertainty may require, or could be aided by, directed attention (Callaway, Rangel, & Griffiths, 2021; Jang, Sharma, & Drugowitsch, 2021; Smith & Krajbich, 2019), the memory of previous computations (Dasgupta & Gershman, 2021), learned higher-order strategies (Griffiths et al., 2019; Wang, 2021), or the parsing of a problem into different (evidence accumulation) subprocesses (Hunt et al., 2021). Moreover, decision makers might even mentally simulate several similar decisions to evaluate their (un)certainty before responding (e.g., by covertly solving the same problem multiple times; Gershman, 2021). In sum, we argue that both intuitive and deliberate reasoning result from similar evidence accumulation processes, whose parameter adjustments rely on conflict monitoring and learning from previous experiences.
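The covert-simulation idea can likewise be illustrated with a toy stand-in (not Gershman's actual model; the noisy evaluation function and its noise level are hypothetical): the same problem is solved several times under internal noise, and confidence is read off the agreement among the simulated answers.

```python
import random

def noisy_solve(value, noise=1.0, rng=random):
    """One covert, noisy evaluation of a decision problem: the sign of
    a noisy estimate of its value. Hypothetical toy function."""
    return 1 if value + rng.gauss(0.0, noise) > 0 else -1

def covert_confidence(value, n_samples=50, seed=0):
    """Covertly solve the same problem n_samples times; the majority
    answer is the response, and the agreement rate serves as a
    confidence readout (low agreement = high uncertainty)."""
    rng = random.Random(seed)
    answers = [noisy_solve(value, rng=rng) for _ in range(n_samples)]
    majority = 1 if sum(answers) >= 0 else -1
    agreement = answers.count(majority) / n_samples
    return majority, agreement

easy_choice, easy_conf = covert_confidence(value=3.0)  # clearly valenced
hard_choice, hard_conf = covert_confidence(value=0.1)  # near-tie
```

A clearly valenced problem yields near-unanimous simulated answers, whereas a near-tie yields disagreement, exactly the kind of graded uncertainty signal that could feed the control adjustments discussed above.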
Financial support
This work was supported by an ERC Starting grant awarded to S.B. (European Union's Horizon 2020 research and innovation program, grant agreement 852570), a fellowship by the FWO (Fonds Voor Wetenschappelijk Onderzoek–Research Foundation Flanders 11C2322N) awarded to L.H., and grant R01MH124849 from the National Institute of Mental Health awarded to A.S.
Competing interest
None.