By one definition of “explanation,” enjoining deep neural network researchers to directly address the functional capabilities and behaviors exhibited by the neurological networks of human brains – as Bowers et al. do – is similar to requiring engineers of digital clocks to use their understanding of digital timekeeping to explain the specific workings of analog watch subassemblies and components. The analogy highlights the differences between simulation and emulation on the one hand, and between explanation and prediction or enaction on the other. Each of these approaches spells out a very different conception of what explanations are for, and how they explain.
A digital and an analog clock both keep (relative) time. An analog clock does so by converting energy stored in the translational movement of a spring into the rotational energy of a set of gears whose ratios are designed to register time lapses marked off by an escapement that beats at a set interval (of, say, 1 second). A digital clock starts with a crystal oscillator vibrating at a high frequency (e.g., 32,768 Hz), which is digitally subdivided down to the desired frequency (1 Hz) of a wave whose consecutive peaks mark the desired interval. Although the inner components and assemblies of the clocks are very different, the digital clock “models” the analog clock in the sense that it replicates its function, and vice versa. But the two clocks do not share common components, which is why we cannot use our understanding of the digital clock's mechanisms to understand “how the analog clock works,” if by understand we mean emulate, or replicate, at the component and subassembly level, the function of each component or subassembly of an entity – in the way in which one digital computer can emulate the workings of another by executing all of the functions of each of its assemblies.
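To make the functional equivalence concrete, here is a minimal sketch (not drawn from any clock specification; the 32,768 Hz figure is the conventional quartz frequency and is used only for illustration) of how a digital clock “keeps time” by counting oscillator cycles and dividing the count down to seconds:

```python
# Illustrative sketch: a digital clock registers time by counting cycles of a
# fast oscillator and dividing the count down to whole seconds. The frequency
# below is the conventional quartz value, used here purely for illustration.

OSCILLATOR_HZ = 32_768  # 2**15 cycles per second

def seconds_elapsed(oscillator_cycles: int) -> int:
    """Divide the raw cycle count down to whole seconds."""
    return oscillator_cycles // OSCILLATOR_HZ

# After 3 * 32,768 cycles the clock has registered 3 seconds, with no component
# that corresponds to the spring, gears, or escapement of an analog watch.
assert seconds_elapsed(3 * OSCILLATOR_HZ) == 3
```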
Explanation as means for emulation: The “artificial replication” approach. If we take the “emulation” route to explanation, we are confronted with a component-level “modeling mismatch”: neurons in an artificial neural network store information in weights that are integer or rational (i.e., finite-precision) numbers, whereas the synaptic weights of biological neurons can be real numbers that are theoretically capable of storing infinite amounts of information and whose resolution, even if truncated, can be adaptively varied (Balcazar, Gavalda, & Siegelmann, 1997). This mismatch cannot be offset by creating neural nets that merely mimic the heterogeneity of human neurons and the topology of brain networks to test for relationships between structure and function “one at a time,” even if we model a single neuron by a deep neural net (Beniaguev, Segev, & London, 2021): A degree of freedom (weight quantization) is missing from the model. Moreover, deep neural networks (DNNs) work in discrete and fixed time steps and therefore do not adequately replicate the fluidly adaptive time constants of real neurological assemblies. Finally, the smoothing nonlinearities introduced in neural networks to satisfy regularity and convergence properties are introduced ad hoc, to optimize for the properties of an output, rather than allowed to emerge and evolve as a function of time.
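As an illustration of the missing degree of freedom, the toy sketch below shows how rounding real-valued weights onto a finite grid discards information that a synapse with adaptively variable resolution could, in principle, retain. The uniform quantizer and the Gaussian “weights” are assumptions made for illustration, not a model of any cortical circuit:

```python
import numpy as np

# Illustrative sketch of the quantization degree of freedom: rounding
# real-valued weights to a fixed grid of 2**bits levels loses information,
# and the loss shrinks only as precision is added back.

rng = np.random.default_rng(0)
weights = rng.normal(size=1000)          # stand-in for "real-valued" weights

def quantize(w: np.ndarray, bits: int) -> np.ndarray:
    """Uniformly quantize weights to 2**bits levels over their observed range."""
    levels = 2 ** bits
    lo, hi = w.min(), w.max()
    step = (hi - lo) / (levels - 1)
    return lo + np.round((w - lo) / step) * step

for bits in (2, 4, 8):
    err = np.mean((weights - quantize(weights, bits)) ** 2)
    print(f"{bits}-bit weights: mean squared rounding error = {err:.5f}")
```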
So, if by “understanding” we mean that explanantia must emulate their explananda, then we need to engineer building blocks for deep neural networks that heed the continuity and adaptive informational breadth of neurological networks. One example is the design of liquid time-constant networks (Hasani, Lechner, Amini, Rus, & Grosu, 2020), built from assemblies of linear, first-order dynamical systems connected by nonlinear gates; these embody variable (“liquid”) time constants and achieve higher levels of expressiveness (while maintaining stability) than do their counterparts with fixed time steps and hard-wired nonlinearities. One can alternatively seek to relax the constraint on the quantization or resolution of a neural network's weights (Jia, Lam, & Althoefer, 2022) so that they more closely resemble their cortical counterparts.
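The following is a schematic of a single “liquid” unit in the spirit of a liquid time-constant cell, not the published architecture: a first-order linear system whose effective time constant is modulated by a nonlinear gate on the input. The specific gate, parameters, and Euler step are assumptions made for illustration.

```python
import numpy as np

# Schematic single "liquid" unit: the input-dependent gate f both drives the
# state and shrinks its effective time constant, so the unit's dynamics adapt
# to the input rather than ticking at a fixed rate. Simplified illustration only.

def liquid_unit(inputs, tau=1.0, A=1.0, w=2.0, b=0.0, dt=0.05):
    """Euler integration of dx/dt = -(1/tau + f(I)) * x + f(I) * A."""
    x, trajectory = 0.0, []
    for I in inputs:
        f = 1.0 / (1.0 + np.exp(-(w * I + b)))   # nonlinear gate on the input
        dxdt = -(1.0 / tau + f) * x + f * A      # gate also tightens the time constant
        x += dt * dxdt
        trajectory.append(x)
    return np.array(trajectory)

# A step input arriving halfway through changes both the drive and the speed
# with which the state responds.
print(liquid_unit(np.concatenate([np.zeros(20), np.ones(20)]))[-5:])
```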
Explanation as means to prediction and production: The “invisible hand” approach. Alternatively, we can take the view that “understanding” an entity means nothing more, and nothing less, than predicting and producing the behaviors it exhibits. This is the approach deep neural net designers have taken, at the cost of abstracting well-defined tasks such as object classification and time series prediction away from the panoply of human capabilities in order to engineer simple reward functions. Plausibly, this simplificatory approach to neural net engineering has contributed to the divergence between visual neuroscience and automatic pattern recognition and image classification that Bowers et al. point to. It is, then, unsurprising that deep neural networks currently in use do not replicate human functions such as the combinatorial generation of the “possible ways an object can look when turned” or the parsing of two-dimensional (2D) scenes for depth reconstruction and whole-part decomposition: Learning to perform these tasks requires different – and often more complicated – reward functions that track the multiplicity of ways in which a human uses vision in the wild and the multiplicity of goals one might have when “looking.” Introducing a human into the training loop of a machine is equivalent to creating rewards that encode the complex credit-assignment map of a task (designing successful communicative acts) without having to specify, ex ante, why or how that complexity arises. Tellingly, the recent advances in the performance of large language models (e.g., from GPT-2 to GPT-3.5 via InstructGPT) are traceable not only to the increase in the parameter space of the new models but, more importantly, to the use of “human-in-the-loop” reinforcement learning (Ouyang et al., 2022), which incorporates feedback from untrained humans who do not “understand the underlying model” but answer questions (e.g., “Helpful? Honest? Harmless?”) in ways that help fine-tune it in accordance with a set of human preferences over sequences of acts; those preferences induce a multidimensional objective function (“what is a successful communicative act?”) which the raters themselves may not fully understand. One does not have to “know what one is doing” to sufficiently “understand” an environment from sparse inputs provided by people who also do not explicitly “know what they are doing” when they provide them.
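To see how sparse, “unknowing” human feedback can nonetheless induce a usable objective, consider the toy sketch below. It fits a reward model to pairwise preferences with a Bradley-Terry-style loss, which is the general idea behind preference-based fine-tuning; the features, the simulated raters, and the hidden criterion are invented for illustration and do not reproduce the Ouyang et al. (2022) pipeline.

```python
import numpy as np

# Toy sketch: raters only say which of two outputs they prefer; a linear reward
# model is fit to those comparisons with a pairwise (Bradley-Terry-style) loss.
# All data and features here are simulated stand-ins.

rng = np.random.default_rng(1)
dim = 8
theta = np.zeros(dim)                              # reward-model parameters

def reward(features, params):
    return features @ params

def pairwise_update(preferred, rejected, params, lr=0.05):
    """One gradient step on -log sigmoid(r(preferred) - r(rejected))."""
    margin = reward(preferred, params) - reward(rejected, params)
    grad = -(1.0 - 1.0 / (1.0 + np.exp(-margin))) * (preferred - rejected)
    return params - lr * grad

# Simulated raters prefer whichever output has the larger (hidden) first
# feature; they never articulate that rule, yet the reward model recovers it.
for _ in range(500):
    a, b = rng.normal(size=dim), rng.normal(size=dim)
    preferred, rejected = (a, b) if a[0] > b[0] else (b, a)
    theta = pairwise_update(preferred, rejected, theta)

print("learned weight on the hidden criterion:", round(theta[0], 2))
```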
Financial support
This work was supported by the Desautels Centre for Integrative Thinking, Rotman School of Management, University of Toronto.
Competing interest
None.