Binz et al. provide a meritorious survey of the prospects offered by meta-learning for building models of human cognition. Among the main assets of meta-learning discussed by Binz et al. is the capacity to learn inductive bias from experience independently of the constraints enforced by the modeller. Further, Binz et al. showcase the capacity of meta-learning algorithms to approximate Bayesian inference, the gold standard for modelling rational analysis. Finally, they claim that meta-learning offers an unequalled framework for constructing rational models of human cognition that incorporate insights from neuroscience. We propose that an alternative theory of cognition, active inference, shares the same strengths as Binz et al.'s proposal while establishing precise and empirically validated connections to neurobiological mechanisms underlying cognition.
Learning from experience has become a benchmark in all fields aiming to understand and emulate natural intelligence and might be the next driver of developments in artificial intelligence (Zador & Tsao, 2023). In this regard, meta-learning joins other frameworks with the potential to advance the understanding of human cognition, as it allows learning algorithms to adapt to experience beyond the modeller's intervention. However, active inference provides a distinct advantage because it is expressly designed to model and explain how agents engage with their environment. In active inference, cognitive agents – or algorithms – learn through experience by continually refining their internal model of the environment or the task at hand (Friston, FitzGerald, Rigoli, Schwartenbeck, & Pezzulo, 2016).
Another critical aspect of Binz et al.'s proposal is their insistent reference to Bayes' optimality. The concept has well-grounded theoretical and empirical foundations (Clark, 2013) that make it a good standard, justifying Binz et al.'s eagerness to probe their approach against Bayes' optimality. Their algorithm approximates Bayes' optimality with the mathematical consequence that any cognitive phenomenon accounted for by Bayesian inference can, in theory, be accounted for by meta-learning. However, the flexibility of Binz et al.'s approach to meta-learning entails a reduced interpretability of the resulting models. In active inference, posterior distributions are inferred using the free-energy principle, a variational approach to Bayesian inference that also approximates intractable computations but within a fully interpretable architecture (see below).
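For readers less familiar with the variational formulation, the quantity being minimised can be written out explicitly. The notation is our assumption rather than part of the argument above: o denotes observations, s hidden states, p(o, s) the agent's generative model and q(s) the approximate posterior.

```latex
\begin{aligned}
F[q] &= \mathbb{E}_{q(s)}\big[\ln q(s) - \ln p(o, s)\big] \\
     &= D_{\mathrm{KL}}\big[\,q(s)\,\|\,p(s \mid o)\,\big] - \ln p(o)
\end{aligned}
```

Because the Kullback-Leibler divergence is non-negative, F is an upper bound on surprise, -ln p(o). Minimising F with respect to q therefore brings q(s) towards the exact posterior p(s | o) without requiring the intractable evidence term p(o) to be computed.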
We also commend Binz et al. for describing meta-learning's capacity to incorporate insights from neuroscience, a requisite for a computational understanding of cognition (Kriegeskorte & Douglas, 2018). Yet, the biologically inspired elements introduced are ad hoc and case-dependent, as explicitly stated in their conclusion. Their meta-learning models are not motivated by a fundamental biological principle but are conceived as a powerful tool to enhance learning. By contrast, active inference directly translates into neural mechanisms (Friston, FitzGerald, Rigoli, Schwartenbeck, & Pezzulo, 2017) and originates from a single unifying principle: the imperative for organisms to avoid surprising states, implemented by a continuous loop between drawing hypotheses on hidden states (e.g., mean length of an insect species) and observations (e.g., length of a particular specimen) (Friston, 2010). This principle aligns with the Helmholtzian perspective of perception as inference and subsequent Bayesian brain theories. The variational inferential dynamic when receiving new observations can be naturally cast into a constant bidirectional message passing with a direct neural implementation as ascending prediction errors and descending predictions (Pezzulo, Parr, & Friston, 2024). Importantly, these mechanisms are common to all active inference models and enjoy empirical support (e.g., Bastos et al., 2012; Schwartenbeck, FitzGerald, Mathys, Dolan, & Friston, 2015).
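The loop between hypotheses and observations can be illustrated with the insect example. The following is a minimal sketch, assuming a single Gaussian hidden state (the mean length of the species), a Gaussian prior and one noisy measurement; the function name, parameter names and numbers are hypothetical and chosen only for illustration. The belief is nudged repeatedly by two precision-weighted prediction errors until they balance.

```python
def infer_mean_length(observation, prior_mean, prior_precision,
                      sensory_precision, step_size=0.05, n_steps=500):
    """Gradient descent on precision-weighted prediction errors (Gaussian case)."""
    mu = prior_mean  # current hypothesis about the hidden state
    for _ in range(n_steps):
        sensory_error = observation - mu  # ascending error: data minus prediction
        prior_error = mu - prior_mean     # deviation of the belief from the prior
        # Move the belief so as to reduce both errors, each weighted by its precision.
        mu += step_size * (sensory_precision * sensory_error
                           - prior_precision * prior_error)
    return mu


# Hypothetical numbers: prior mean of 10 mm, one specimen measured at 14 mm.
posterior_mean = infer_mean_length(observation=14.0, prior_mean=10.0,
                                   prior_precision=1.0, sensory_precision=3.0)

# Analytic Gaussian posterior mean, for comparison with the iterative scheme.
exact_mean = (1.0 * 10.0 + 3.0 * 14.0) / (1.0 + 3.0)
print(posterior_mean, exact_mean)
```

Under these assumptions, the iterative scheme settles on the same precision-weighted average that exact Bayesian updating gives, which is the sense in which prediction-error minimisation implements inference in this simple case.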
Learning is a central construct in active inference. Agents constantly update their generative models based on observations and prediction errors, with the imperative of reducing prediction errors. Generative models represent alternative hypotheses about task execution and associated outcomes (e.g., estimating the average length of a species). These hypotheses, and possibly all components of the generative models, are tested and refined through the agent's experience (Friston et al., 2016). This experience-dependent plasticity has two fundamental assets.
First, learning in active inference is directly and naturally interpreted in terms of biologically plausible neuronal mechanisms. The updates of all components of the generative models are driven by co-occurrences between predicted outcomes (in postsynaptic units in the neuronal interpretation sketched above) and (presynaptic) observational inputs in a process reminiscent of Hebbian learning (Friston et al., 2016). Consequently, active inference is cast as a process theory that can draw specific empirical predictions on neuronal dynamics (Whyte & Smith, 2021).
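One common way to formalise this co-occurrence-driven learning in discrete-state models, which we assume here for illustration, is to accumulate Dirichlet concentration parameters ("counts") for the likelihood mapping from hidden states to outcomes. In the sketch below, all names and numbers are hypothetical: each update adds the outer product of the observed outcome and the current belief about hidden states, so that associations which are active together are strengthened.

```python
def update_likelihood_counts(counts, observation, state_belief, learning_rate=1.0):
    """Add the outer product of observation and state belief to the counts."""
    return [
        [counts[o][s] + learning_rate * observation[o] * state_belief[s]
         for s in range(len(state_belief))]
        for o in range(len(observation))
    ]


def expected_likelihood(counts):
    """Normalise each column (one per hidden state) to obtain P(outcome | state)."""
    n_states = len(counts[0])
    column_sums = [sum(row[s] for row in counts) for s in range(n_states)]
    return [[row[s] / column_sums[s] for s in range(n_states)] for row in counts]


# Hypothetical task with 2 outcomes and 2 hidden states, starting from flat counts.
counts = [[1.0, 1.0],
          [1.0, 1.0]]
observation = [1.0, 0.0]   # outcome 0 was observed (one-hot vector)
state_belief = [0.9, 0.1]  # posterior belief over the two hidden states

counts = update_likelihood_counts(counts, observation, state_belief)
likelihood = expected_likelihood(counts)
print(counts)
print(likelihood)
```

Only the row of the observed outcome changes, and it changes most for the hidden state the agent believes it is in, which is what gives the rule its Hebbian flavour.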
Second, and crucially, agents in active inference learn the reliability of inputs and prediction errors. This precision estimation is akin to learning meta-parameters (as per Binz et al.) as it entails a weighting process that prioritises reliable sources over uninformative inputs. The balance between exploration and exploitation (central constructs in cognition reflecting epistemic affordances and pragmatic value, respectively) rests upon mechanisms with a direct neurobiological substrate in terms of dopamine release, with important implications for rational decision-making – for example, in two-armed bandit tasks (Schwartenbeck et al., 2019), maze navigation (Kaplan & Friston, 2018) and computational psychiatry (Smith, Badcock, & Friston, 2021). Another essential strength of active inference for implementing meta-learning is its natural hierarchical extension. Upper levels can control parameters of lower levels, enabling inference at different timescales whereby learning at lower levels is optimised over time by top-down adjustments from upper levels, which has a direct neuronal interpretation in multi-scale hierarchical brain organisation (Pezzulo, Rigoli, & Friston, 2018).
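The role of precision in action selection can be illustrated with a two-armed bandit. The sketch below assumes the softmax formulation commonly used in discrete active inference models, in which the probability of a policy decreases with its expected free energy and a precision parameter scales how deterministic the choice is. The function name, variable names and numerical values are hypothetical, and the expected free energies are simply given rather than derived from a generative model.

```python
import math


def policy_probabilities(expected_free_energy, precision):
    """Softmax over negative expected free energy, scaled by a precision term."""
    scores = [-precision * g for g in expected_free_energy]
    max_score = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(score - max_score) for score in scores]
    total = sum(weights)
    return [weight / total for weight in weights]


# Hypothetical expected free energies for the two arms (lower is better).
G = [1.0, 2.0]

low_precision = policy_probabilities(G, precision=0.5)
high_precision = policy_probabilities(G, precision=4.0)
print(low_precision)
print(high_precision)
```

With low precision the agent still samples the less favoured arm fairly often, whereas with high precision it commits to the arm with the lower expected free energy. Learning this single parameter thus changes behaviour across all policies at once, which is why we liken it to learning a meta-parameter.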
Model preference depends on performance and, primarily, on the scientific question at hand. To understand cognition and its mechanistic underpinnings, models whose components and articulations can be directly interpreted in terms of neural mechanisms are essential. Active inference is a principled, biologically plausible and fully interpretable model of cognition with promising applications to artificial intelligence that accounts for neurobiological and psychological phenomena. We contend that it provides a comprehensive model for understanding biological systems and improving artificial cognition.
Financial support
O. P. was funded by a Maria Zambrano Fellowship for the attraction of international talent for the requalification of the Spanish university system—Next Generation EU.
Competing interests
None.