
The model-resistant richness of human visual experience

Published online by Cambridge University Press: 06 December 2023

Jianghao Liu
Affiliation:
Sorbonne Université, Inserm, CNRS, Paris Brain Institute, ICM, Hôpital de la Pitié-Salpêtrière, Paris, France; Dassault Systèmes, Vélizy-Villacoublay, France
Paolo Bartolomeo
Affiliation:
Sorbonne Université, Inserm, CNRS, Paris Brain Institute, ICM, Hôpital de la Pitié-Salpêtrière, Paris, France

Abstract

Current deep neural networks (DNNs) are far from being able to model the rich landscape of human visual experience. Beyond visual recognition, we explore the neural substrates of visual mental imagery and other visual experiences. Rather than shared visual representations, the temporal dynamics and functional connectivity of these processes appear to be essential. Generative adversarial networks may drive future developments in simulating human visual experience.

Type
Open Peer Commentary
Copyright
Copyright © The Author(s), 2023. Published by Cambridge University Press

Bowers et al. report several lines of evidence challenging the alleged similarities between deep neural network (DNN) models of visual recognition and their biological counterparts. However, human visual experience is not limited to visual recognition. Beyond the case of visual illusions presented by Bowers et al., models of the human visual system should also account for a range of other visual experiences, including visual hallucinations, dreams, and mental imagery. For example, most of us can "visualize" objects in their absence by engaging in visual mental imagery. By recruiting neural machinery partially shared with visual perception, visual mental imagery allows us to make predictions based on past experiences, imagine future possibilities, and simulate the possible outcomes of our decisions. Our commentary focuses on these relationships and is structured into four key points.

First, shared neural substrates of visual perception and visual mental imagery include high-level visual regions in the ventral temporal cortex (Bartolomeo, Hajhajate, Liu, & Spagna, 2020; Spagna, Hajhajate, Liu, & Bartolomeo, 2021). In the absence of visual input, these regions are activated top-down by other systems, such as the semantic system and the frontoparietal attention networks. Bowers et al. highlighted the challenge of modeling top-down activity with feedforward DNNs. Current evidence suggests that the visual system relies on distinct feedback signals to distinct cortical layers, with different temporal dynamics for different visual experiences. In particular, visual stimulation modulates activity in middle cortical layers, whereas contextual or illusory content feeds back to superficial layers and visual imagery feeds back to deeper layers (Bergmann, Morgan, & Muckli, 2019; Muckli et al., 2015). Visual imagery overlaps temporally with perceptual processing during late processing stages (Dijkstra, Mostert, Lange, Bosch, & van Gerven, 2018), likely corresponding to activity in the ventral temporal cortex but not in the early visual cortex (Spagna et al., 2021). In contrast, patients with Charles Bonnet hallucinations show a gradual build-up of activity in the early visual cortex, which then gradually decreases further along the visual hierarchy (Hahamy, Wilf, Rosin, Behrmann, & Malach, 2021).
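
To make this laminar scheme concrete, the following toy sketch (our illustration, not a model drawn from the cited studies) routes each signal type to its own compartment of a small network; the module names, dimensions, and pooling rule are illustrative assumptions, written here in PyTorch.

    import torch
    import torch.nn as nn

    class LayeredFeedbackNet(nn.Module):
        """Toy 'cortical column': feedforward input drives the middle compartment,
        contextual/illusory feedback the superficial one, imagery feedback the deep one.
        All names and sizes are illustrative assumptions."""
        def __init__(self, dim=64):
            super().__init__()
            self.middle = nn.Linear(dim, dim)       # feedforward, stimulus-driven pathway
            self.superficial = nn.Linear(dim, dim)  # top-down contextual/illusory signal
            self.deep = nn.Linear(dim, dim)         # top-down imagery signal

        def forward(self, stimulus, context_fb, imagery_fb):
            # Each signal enters its own laminar compartment; the output pools them.
            return (torch.relu(self.middle(stimulus))
                    + torch.relu(self.superficial(context_fb))
                    + torch.relu(self.deep(imagery_fb)))

    dim = 64
    net = LayeredFeedbackNet(dim)
    zero = torch.zeros(1, dim)
    percept = net(torch.randn(1, dim), zero, zero)   # perception: feedforward drive only
    imagined = net(zero, zero, torch.randn(1, dim))  # imagery: deep feedback carries content

In this sketch, "perception" and "imagery" differ only in which compartment receives drive, echoing the idea that the same high-level machinery can be engaged through different routes.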

Second, evidence from neuropsychology, neuroimaging, and direct cortical stimulation suggests striking differences in the activity of the ventral temporal cortex in the two hemispheres when processing visual information (Liu, Spagna, & Bartolomeo, 2022b). Direct cortical electrical stimulation tends to produce visual hallucinatory experiences predominantly when applied to the right temporal lobe, whereas voluntary visual mental imagery is strongly lateralized to the left hemisphere. These asymmetries may stem from the predispositions of particular hemispheric networks toward constructing mental models of the external environment or testing them against the real world (Bartolomeo & Seidel Malkinson, 2022). After unilateral stroke, the healthy hemisphere can in some cases compensate for the visual deficit (Bartolomeo & Thiebaut de Schotten, 2016). At present, DNN models incorporate neither hemispheric asymmetries nor the potential reorganization of these asymmetries after a stroke.

Third, some otherwise neurotypical individuals show unusually weak or strong visual mental imagery (aphantasia and hyperphantasia, respectively) (Keogh, Pearson, & Zeman, 2021; Milton et al., 2021). Aphantasic individuals perform visual imagery and visual perceptual tasks with accuracy similar to that of typical imagers, but with slower response times (Liu & Bartolomeo, 2023). Consistent with these behavioral results, ultra-high field fMRI shows similar activation patterns in typical imagers and individuals with congenital aphantasia (Liu, Zhan, et al., 2023). The fusiform imagery node, a high-level visual region in the left-hemisphere ventral temporal cortex (Spagna et al., 2021), coactivates with dorsolateral frontoparietal networks in typical imagers, but is functionally isolated from these networks in aphantasic individuals during both imagery and perception. These findings suggest that high-level visual information in the ventral cortical stream is not sufficient to generate a conscious visual experience, and that a functional disconnection from frontoparietal networks may explain the lack of experiential content in the visual mental imagery of aphantasic individuals.

Fourth, in line with the previous point on the importance of frontoparietal networks, the way we subjectively experience both percepts and mental images relies heavily on interactions with other cognitive processes, such as attention and visual working memory. Despite their importance, these factors are not taken into account in DNN modeling. A recent study combining human intracerebral recordings with single-layer recurrent neural network modeling found that dynamic interactions between specific frontoparietal attentional networks and high-level visual areas, including an attentional gain on sensory input, play a crucial role in conscious visual perception (Liu, Bayle, et al., 2023).
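
As a minimal illustration of how such an attentional gain could enter a single-layer recurrent network (a hypothetical sketch, not the model published in Liu, Bayle, et al., 2023; all layer sizes and signal shapes are assumptions):

    import torch
    import torch.nn as nn

    class GainModulatedRNN(nn.Module):
        """Hypothetical single-layer RNN: a frontoparietal-like 'attention' signal
        multiplicatively scales the sensory input reaching the recurrent visual layer."""
        def __init__(self, n_in=32, n_hidden=128):
            super().__init__()
            self.w_in = nn.Linear(n_in, n_hidden)
            self.w_rec = nn.Linear(n_hidden, n_hidden)
            self.readout = nn.Linear(n_hidden, 1)  # stand-in for a "seen"/"unseen" report

        def forward(self, x, gain):
            # x: (time, batch, n_in) sensory input; gain: (time, batch, 1) attention signal
            h = torch.zeros(x.shape[1], self.w_rec.out_features)
            reports = []
            for t in range(x.shape[0]):
                h = torch.tanh(self.w_in(gain[t] * x[t]) + self.w_rec(h))
                reports.append(self.readout(h))
            return torch.stack(reports)  # report time course

    rnn = GainModulatedRNN()
    x = torch.randn(50, 8, 32)   # 50 time steps, 8 trials
    gain = torch.ones(50, 8, 1)
    gain[20:30] *= 2.0           # transient attentional amplification
    out = rnn(x, gain)

Here a transient doubling of the gain amplifies the sensory drive reaching the recurrent layer, which in turn raises the readout standing in for conscious report.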

This evidence from the biological human brain can inspire future developments of DNNs in simulating the cognitive architecture of human visual experience. Generative adversarial networks may be promising candidates to drive these efforts forward. For instance, imagery mechanisms could act as the generator of quasi-perceptual experiences, while reality monitoring could serve as the discriminator that judges whether sensory signals arise from real or imagined sources (Gershman, 2019; Lau, 2019). Recent studies have investigated involuntary visual experiences using generative neural network models, for example in memory replay (van de Ven, Siegelmann, & Tolias, 2020), intrusive imagery (Cushing et al., 2023), and adversarial dreaming (Deperrois, Petrovici, Senn, & Jordan, 2022). Regarding voluntary visual mental imagery, key strategies may involve modeling the retrieval of representations of semantic information and visual features (Liu, Zhan, et al., 2023), and incorporating biologically inspired recurrence in visual imagery processing (Lindsay, Mrsic-Flogel, & Sahani, 2022).
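
As a minimal sketch of this proposed division of labor (our illustration under stated assumptions, not an implementation from the cited work), a standard generative-adversarial training loop can be relabeled so that the generator plays the role of voluntary imagery and the discriminator that of perceptual reality monitoring; the random stand-in data and all dimensions are assumptions:

    import torch
    import torch.nn as nn

    dim = 128  # assumed dimensionality of a high-level "visual representation"

    # Generator ~ voluntary imagery: maps a semantic cue to a quasi-percept.
    imagery_generator = nn.Sequential(nn.Linear(dim, 256), nn.ReLU(), nn.Linear(256, dim))

    # Discriminator ~ reality monitoring: is this activity sensory or self-generated?
    reality_monitor = nn.Sequential(nn.Linear(dim, 256), nn.ReLU(), nn.Linear(256, 1))

    bce = nn.BCEWithLogitsLoss()
    opt_g = torch.optim.Adam(imagery_generator.parameters(), lr=1e-4)
    opt_d = torch.optim.Adam(reality_monitor.parameters(), lr=1e-4)

    for step in range(1000):
        sensory = torch.randn(64, dim)   # stand-in for stimulus-driven activity
        cue = torch.randn(64, dim)       # stand-in for a semantic retrieval cue
        imagined = imagery_generator(cue)

        # Reality monitor learns to label sensory activity "real", imagery "imagined".
        d_loss = (bce(reality_monitor(sensory), torch.ones(64, 1))
                  + bce(reality_monitor(imagined.detach()), torch.zeros(64, 1)))
        opt_d.zero_grad(); d_loss.backward(); opt_d.step()

        # The imagery generator learns to produce increasingly percept-like activity.
        g_loss = bce(reality_monitor(imagined), torch.ones(64, 1))
        opt_g.zero_grad(); g_loss.backward(); opt_g.step()

In this scheme, imagery becomes more percept-like precisely to the extent that it challenges the reality monitor, which is the intuition behind the proposals of Gershman (2019) and Lau (2019).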

In conclusion, we suggest that shared representations in the visual cortex are not the primary factor in generating and distinguishing distinct visual experiences; rather, the temporal dynamics and functional connectivity of the underlying processes are essential. Current DNNs cannot accurately model the complexity of human visual experience. Biologically inspired generative adversarial networks may provide novel ways of simulating the varieties of human visual experience.

Financial support

J. L. received funding from Dassault Systèmes. The work of P. B. is supported by the Agence Nationale de la Recherche through ANR-16-CE37-0005 and ANR-10-IAIHU-06, and by the Fondation pour la Recherche sur les AVC through FR-AVC-017.

Competing interest

None.

References

Bartolomeo, P., Hajhajate, D., Liu, J., & Spagna, A. (2020). Assessing the causal role of early visual areas in visual mental imagery. Nature Reviews Neuroscience, 21(9), 517. https://doi.org/10.1038/s41583-020-0348-5
Bartolomeo, P., & Seidel Malkinson, T. (2022). Building models, testing models: Asymmetric roles of SLF III networks? Comment on "Left and right temporal-parietal junctions (TPJs) as 'match/mismatch' hedonic machines: A unifying account of TPJ function" by Doricchi et al. Physics of Life Reviews, 44, 70–72. https://doi.org/10/grrsd8
Bartolomeo, P., & Thiebaut de Schotten, M. (2016). Let thy left brain know what thy right brain doeth: Inter-hemispheric compensation of functional deficits after brain damage. Neuropsychologia, 93, 407–412. https://doi.org/10/f9g9wb
Bergmann, J., Morgan, A. T., & Muckli, L. (2019). Two distinct feedback codes in V1 for "real" and "imaginary" internal experiences. bioRxiv, 664870. https://doi.org/10.1101/664870
Cushing, C. A., Dawes, A. J., Hofmann, S. G., Lau, H., LeDoux, J. E., & Taschereau-Dumouchel, V. (2023). A generative adversarial model of intrusive imagery in the human brain. PNAS Nexus, 2(1), pgac265. https://doi.org/10.1093/pnasnexus/pgac265
Deperrois, N., Petrovici, M. A., Senn, W., & Jordan, J. (2022). Learning cortical representations through perturbed and adversarial dreaming. eLife, 11, e76384. https://doi.org/10.7554/eLife.76384
Dijkstra, N., Mostert, P., Lange, F. P., Bosch, S., & van Gerven, M. A. (2018). Differential temporal dynamics during visual imagery and perception. eLife, 7, e33904. https://doi.org/10.7554/eLife.33904
Gershman, S. J. (2019). The generative adversarial brain. Frontiers in Artificial Intelligence, 2, 18. https://doi.org/10.3389/frai.2019.00018
Hahamy, A., Wilf, M., Rosin, B., Behrmann, M., & Malach, R. (2021). How do the blind "see"? The role of spontaneous brain activity in self-generated perception. Brain, 144(1), 340–353. https://doi.org/10.1093/brain/awaa384
Keogh, R., Pearson, J., & Zeman, A. (2021). Aphantasia: The science of visual imagery extremes. Handbook of Clinical Neurology, 178, 277–296. https://doi.org/10.1016/B978-0-12-821377-3.00012-X
Lau, H. (2019). Consciousness, metacognition, & perceptual reality monitoring. PsyArXiv. https://doi.org/10.31234/osf.io/ckbyf
Lindsay, G. W., Mrsic-Flogel, T. D., & Sahani, M. (2022). Bio-inspired neural networks implement different recurrent visual processing strategies than task-trained ones do. bioRxiv, 2022.03.07.483196. https://doi.org/10.1101/2022.03.07.483196
Liu, J., & Bartolomeo, P. (2023). Probing the unimaginable: The impact of aphantasia on distinct domains of visual mental imagery and visual perception. Cortex, 166, 338–347. https://doi.org/10.1016/j.cortex.2023.06.003
Liu, J., Bayle, D. J., Spagna, A., Sitt, J. D., Bourgeois, A., Lehongre, K., … Bartolomeo, P. (2023). Fronto-parietal networks shape human conscious report through attention gain and reorienting. Communications Biology, 6, 730. https://doi.org/10.1038/s42003-023-05108-2
Liu, J., Spagna, A., & Bartolomeo, P. (2022b). Hemispheric asymmetries in visual mental imagery. Brain Structure and Function, 227(2), 697–708. https://doi.org/10.1007/s00429-021-02277-w
Liu, J., Zhan, M., Hajhajate, D., Spagna, A., Dehaene, S., Cohen, L., & Bartolomeo, P. (2023). Ultra-high field fMRI of visual mental imagery in typical imagers and aphantasic individuals. bioRxiv. https://doi.org/10.1101/2023.06.14.544909
Milton, F., Fulford, J., Dance, C., Gaddum, J., Heuerman-Williamson, B., Jones, K., … Zeman, A. (2021). Behavioral and neural signatures of visual imagery vividness extremes: Aphantasia versus hyperphantasia. Cerebral Cortex Communications, 2(2), tgab035. https://doi.org/10.1093/texcom/tgab035
Muckli, L., De Martino, F., Vizioli, L., Petro, L. S., Smith, F. W., Ugurbil, K., … Yacoub, E. (2015). Contextual feedback to superficial layers of V1. Current Biology, 25(20), 2690–2695. https://doi.org/10.1016/j.cub.2015.08.057
Spagna, A., Hajhajate, D., Liu, J., & Bartolomeo, P. (2021). Visual mental imagery engages the left fusiform gyrus, but not the early visual cortex: A meta-analysis of neuroimaging evidence. Neuroscience & Biobehavioral Reviews, 122, 201–217. https://doi.org/10.1016/j.neubiorev.2020.12.029
van de Ven, G. M., Siegelmann, H. T., & Tolias, A. S. (2020). Brain-inspired replay for continual learning with artificial neural networks. Nature Communications, 11, 4069. https://doi.org/10.1038/s41467-020-17866-2