Is human compositionality meta-learned?

Jacob Russin; Sam Whitman McGrath; Ellie Pavlick; Michael J. Frank

doi:10.1017/S0140525X24000189

Published online by Cambridge University Press: 23 September 2024

Jacob Russin: Affiliation:
Department of Computer Science, Brown University, Providence, RI, USA; Department of Cognitive and Psychological Sciences, Brown University, Providence, RI, USA. https://jlrussin.github.io/
Sam Whitman McGrath: Affiliation:
Department of Philosophy, Brown University, Providence, RI, USA. https://scholar.google.com/citations?user=B3b7kAYAAAAJ&hl=en
Ellie Pavlick: Affiliation:
Department of Computer Science, Brown University, Providence, RI, USA. https://cs.brown.edu/people/epavlick/
Michael J. Frank*: Affiliation:
Department of Cognitive and Psychological Sciences, Carney Institute for Brain Science, Brown University, Providence, RI, USA. http://ski.clps.brown.edu/
*Corresponding author.

Abstract

Recent studies suggest that meta-learning may provide an original solution to an enduring puzzle about whether neural networks can explain compositionality – in particular, by raising the prospect that compositionality can be understood as an emergent property of an inner-loop learning algorithm. We elaborate on this hypothesis and consider its empirical predictions regarding the neural mechanisms and development of human compositionality.

Type: Open Peer Commentary
Information: Behavioral and Brain Sciences, Volume 47, 2024, e162

DOI: https://doi.org/10.1017/S0140525X24000189
Copyright: © The Author(s), 2024. Published by Cambridge University Press
