
Is evidence of language-like properties evidence of a language-of-thought architecture?

Published online by Cambridge University Press: 28 September 2023

Nuhu Osman Attah
Affiliation:
Department of History and Philosophy of Science, University of Pittsburgh, Pittsburgh, PA, USA; www.nuhuosmanattah.com
Edouard Machery
Affiliation:
Department of History and Philosophy of Science, University of Pittsburgh, Pittsburgh, PA, USA; Center for Philosophy of Science, University of Pittsburgh, Pittsburgh, PA, USA; African Centre for Epistemology and Philosophy of Science, University of Johannesburg, Johannesburg, South Africa; www.edouardmachery.com

Abstract

We argue that Quilty-Dunn et al.'s commitment to representational pluralism undermines their case for the language-of-thought hypothesis as the evidence they present is consistent with the operation of the other representational formats that they are willing to accept.

Type
Open Peer Commentary
Copyright
Copyright © The Author(s), 2023. Published by Cambridge University Press

Quilty-Dunn et al. have convincingly shown that a variety of cognitive domains are characterized by some of the six properties they delineate: (1) discrete constituents, (2) role-filler independence, (3) predicate–argument structure, (4) logical operators, (5) inferential promiscuity, and (6) abstract conceptual content. (We refer to these as the six “core properties.”) Foregrounding these properties is a worthwhile contribution because it establishes a framework and terminology for discussing features of cognition that hypotheses about representation must explain. We hope that this taxonomy will be expanded and refined: As it stands, some of the properties are defined so vaguely that they are too readily discoverable in cognition. For instance, role-filler independence requires that the same representational constituents can be deployed in different syntactic roles. But both the criterion for sameness of representational constituents and the relevant notion of syntax are left intuitive, so that even the swapping of visual features of objects (e.g., misattributing the color of one object to another) counts as a demonstration of role-filler independence. On the other hand, the authors conveniently take successful demonstrations of compositionality in connectionist networks to fall short of role-filler independence because they “fail to preserve identity of the original representational elements” (target article, sect. 2, para. 7), even though no account of representational identity is given.

However, the authors’ aim is not merely to characterize these properties, but to show that they form the homeostatic cluster that marks a language of thought. This ambitious project fails because of the authors’ commitment to representational pluralism. The authors concede that language-like representations and the many other formats of representation that they are happy to accept share some properties: “Many, perhaps all, of these properties are not necessary for a representational scheme to count as a LoT, and some may be shared with other formats” (target article, sect. 2, para. 3). To give examples from the connectionist literature, even simple twentieth-century-style connectionist networks form abstract representations (Clark, 1993) and modern networks fare even better (Stoianov & Zorzi, 2012); older networks can also bind values and variables (Smolensky, 1990); there has even been progress on the use of logical operators (Irving et al., 2016; Bansal, Loos, Rabe, Szegedy, & Wilcox, 2019; Dai, Xu, Yu, & Zhou, 2019). And in any case, because neural networks are universal function approximators, there is ground for optimism about the prospects of architectures that do not implement a language of thought.
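To make the claim about binding values and variables concrete, here is a minimal sketch of tensor product binding in the spirit of Smolensky (1990). The filler vectors, the role vectors, and all function names are illustrative assumptions rather than anything taken from the target article or from Smolensky's paper, and the roles are chosen to be orthonormal so that unbinding is exact. Whether recovery of this kind satisfies the target article's notion of role-filler independence is precisely the point under dispute; the sketch only shows that the same filler vector can be bound to different roles and recovered from a distributed code.

```python
# Minimal sketch of tensor product variable binding (after Smolensky, 1990).
# All vectors and names are illustrative assumptions, not taken from the source.

def outer(filler, role):
    """Bind a filler to a role: the outer product filler (x) role."""
    return [[f * r for r in role] for f in filler]

def add(a, b):
    """Superimpose two bindings by element-wise addition."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

def unbind(structure, role):
    """Recover the filler bound to `role`; exact when roles are orthonormal."""
    return [sum(x * r for x, r in zip(row, role)) for row in structure]

# Hypothetical distributed codes for two entities.
FILLERS = {"john": [1.0, 0.0, 1.0], "mary": [0.0, 1.0, 1.0]}
# Hypothetical orthonormal role vectors.
ROLES = {"agent": [1.0, 0.0], "patient": [0.0, 1.0]}

def encode(agent, patient):
    """Encode a two-place structure as a sum of filler/role bindings."""
    return add(outer(FILLERS[agent], ROLES["agent"]),
               outer(FILLERS[patient], ROLES["patient"]))

john_loves_mary = encode("john", "mary")
mary_loves_john = encode("mary", "john")

# The same filler vector is recovered from either role.
print(unbind(john_loves_mary, ROLES["agent"]))    # filler in the agent role
print(unbind(mary_loves_john, ROLES["patient"]))  # filler in the patient role
```

The two encoded structures differ as wholes, yet unbinding the agent role of the first and the patient role of the second returns the same vector for "john".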

The authors’ representational pluralism undermines the inference to language-like representations from the observation of some of the core properties in some cognitive domain: These properties could simply result from any of the many other representational formats that the authors are willing to accept. The fallacy is similar to the issue with reverse inference (Machery, 2014; Poldrack, 2006): Although the likelihood of observing the core properties if representations are language-like is high, it is fallacious to infer that representations are language-like if these properties are observed, because core properties could be observed even if representations are not language-like.
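The structure of this reverse-inference worry can be spelled out with Bayes' theorem. In the sketch below, $L$ stands for "the representations are language-like" and $C$ for "the core properties are observed"; the numerical values are purely hypothetical and chosen only to illustrate the shape of the argument, not estimates from any study.

```latex
\[
P(L \mid C) \;=\; \frac{P(C \mid L)\,P(L)}{P(C \mid L)\,P(L) + P(C \mid \neg L)\,P(\neg L)}
\]
% Hypothetical values: P(C | L) = 0.9, P(C | not-L) = 0.6, P(L) = 0.5.
\[
P(L \mid C) \;=\; \frac{0.9 \times 0.5}{0.9 \times 0.5 + 0.6 \times 0.5}
           \;=\; \frac{0.45}{0.75} \;=\; 0.6
\]
```

On these assumed numbers, a high likelihood $P(C \mid L)$ moves the probability of $L$ only from 0.5 to 0.6, because $P(C \mid \neg L)$ is also substantial. The more readily non-language-like formats produce the core properties, the weaker the inference becomes.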

Quilty-Dunn et al. might reply that while some of the properties can be realized by non-language-like representational formats, we are entitled to infer language-like representational structures where they cluster: As they say, such clustering “would be surprising from a theory-neutral point of view, but not from the perspective of LoTH” (target article, sect. 2, para. 13). We see two issues with this reply. First, only some of the six core properties are observed in each of the few cognitive domains discussed in the paper: Three properties are demonstrated by implicit social cognition and four (the maximal number of co-occurring core properties) in the object-files case. Shall we conclude that only a few cognitive domains involve language-like representations? An interesting conclusion surely, but one that is much less exciting than the one touted by Quilty-Dunn et al.

Second, Quilty-Dunn et al. haven't even shown that clustering of the core properties is a unique prediction of the language-of-thought hypothesis. Many of the core properties are in fact coinstantiated in neural networks. For instance, the outputs of a transformer-based language model like BERT evince (at the very least) role-filler independence and predicate–argument structure (in addition to the general capacity for abstraction demonstrated by neural networks). Evidence suggests that these characteristics are underlain by systematic syntactic and semantic competences (Clark, Khandelwal, Levy, & Manning, 2019; Tenney, Das, & Pavlick, 2019). Thus, other architectures are consistent with the clustering of the core properties.

Perhaps the authors think that the burden of proof is on their opponents to show that these other formats exist and can account for the apparent clustering. But outside philosophy, such burden-of-proof claims are as weak an argument as it gets. Inferring a language-of-thought architecture on such shaky grounds also runs the risk of slowing research in computational neuroscience on new alternative cognitive architectures that are both neuroscientifically plausible and able to account for the core properties. Most important, alternatives to language-of-thought cognitive architectures have been investigated for decades, and the properties discussed by Quilty-Dunn et al. are known to result from these (Eliasmith, 2013; Eliasmith & Anderson, 2003; Smolensky, 1990, 1991). In none of these cases do the architectures merely implement a language of thought.

Finally, Quilty-Dunn et al. rely on the epistemic virtues of explanatory breadth and unification to support the language-of-thought hypothesis: As they say, “The chief aim […] is to showcase LoTH's explanatory breadth and power in light of recent developments in cognitive science” (target article, sect. 1, para. 3). But an appeal to explanatory breadth runs against their pluralistic commitment: If the authors are serious about representational pluralism, it is hard to understand why they believe that explanatory breadth is a virtue or why any unification should be expected.

Although their defense of the language-of-thought hypothesis fails, Quilty-Dunn et al. are onto something important: We should expect cognition to exploit the core properties to solve some types of cognitive challenges, and we should thus predict their occurrence in some cognitive domains. Which tasks are facilitated by these properties and which life forms in the phylogenetic tree had to solve such tasks (and why) are exciting empirical questions.

Competing interest

None.

References

Bansal, K., Loos, S., Rabe, M., Szegedy, C., & Wilcox, S. (2019). HOList: An environment for machine learning of higher order logic theorem proving. Proceedings of the 36th international conference on machine learning (Vol. 97, pp. 454–463).
Clark, A. (1993). Associative engines: Connectionism, concepts, and representational change. MIT Press.
Clark, K., Khandelwal, U., Levy, O., & Manning, C. D. (2019). What does BERT look at? An analysis of BERT's attention. Proceedings of the 2019 ACL workshop BlackboxNLP: Analyzing and interpreting neural networks for NLP (pp. 276–286). Florence, Italy: Association for Computational Linguistics.
Dai, W. Z., Xu, Q., Yu, Y., & Zhou, Z. H. (2019). Bridging machine learning and logical reasoning by abductive learning. Proceedings of the 33rd international conference on neural information processing systems (pp. 2811–2822). Red Hook, NY: Curran Associates Inc.
Eliasmith, C. (2013). How to build a brain: A neural architecture for biological cognition. Oxford University Press.
Eliasmith, C., & Anderson, C. H. (2003). Neural engineering: Computation, representation and dynamics in neurobiological systems. MIT Press.
Irving, G., Szegedy, C., Alemi, A. A., Eén, N., Chollet, F., & Urban, J. (2016). DeepMath-deep sequence models for premise selection. In Lee, D. D., Sugiyama, M., Luxburg, U. V., Guyon, I., & Garnett, R. (Eds.), Advances in neural information processing systems (Vol. 29, pp. 2235–2243).
Machery, E. (2014). In defense of reverse inference. The British Journal for the Philosophy of Science, 65, 251–267.
Poldrack, R. A. (2006). Can cognitive processes be inferred from neuroimaging data? Trends in Cognitive Sciences, 10(2), 59–63.
Smolensky, P. (1990). Tensor product variable binding and the representation of symbolic structures in connectionist systems. Artificial Intelligence, 46(1–2), 159–216.
Smolensky, P. (1991). Connectionism, constituency, and the language of thought. In Loewer, B. M. & Rey, G. (Eds.), Meaning in mind: Fodor and his critics (pp. 201–227). Oxford: Blackwell.
Stoianov, I., & Zorzi, M. (2012). Emergence of a “visual number sense” in hierarchical generative models. Nature Neuroscience, 15(2), 194–196.
Tenney, I., Das, D., & Pavlick, E. (2019). BERT rediscovers the classical NLP pipeline. Proceedings of the 57th annual meeting of the association for computational linguistics (pp. 4593–4601). Florence, Italy: Association for Computational Linguistics.