
Meta-learned models of cognition

Published online by Cambridge University Press:  23 November 2023

Marcel Binz*
Affiliation: Max Planck Institute for Biological Cybernetics, Tübingen, Germany; Helmholtz Institute for Human-Centered AI, Munich, Germany

Ishita Dasgupta

Akshay K. Jagadish
Affiliation: Max Planck Institute for Biological Cybernetics, Tübingen, Germany; Helmholtz Institute for Human-Centered AI, Munich, Germany

Matthew Botvinick

Jane X. Wang

Eric Schulz
Affiliation: Max Planck Institute for Biological Cybernetics, Tübingen, Germany; Helmholtz Institute for Human-Centered AI, Munich, Germany

*Corresponding author: Marcel Binz; Email: [email protected]

Abstract

Psychologists and neuroscientists extensively rely on computational models for studying and analyzing the human mind. Traditionally, such computational models have been hand-designed by expert researchers. Two prominent examples are cognitive architectures and Bayesian models of cognition. Whereas the former requires the specification of a fixed set of computational structures and a definition of how these structures interact with each other, the latter necessitates a commitment to a particular prior and a likelihood function that – in combination with Bayes' rule – determine the model's behavior. In recent years, a new framework has established itself as a promising tool for building models of human cognition: the framework of meta-learning. In contrast to the previously mentioned model classes, meta-learned models acquire their inductive biases from experience, that is, by repeatedly interacting with an environment. However, a coherent research program around meta-learned models of cognition does not yet exist. The purpose of this article is to synthesize previous work in this field and establish such a research program. We accomplish this by pointing out that meta-learning can be used to construct Bayes-optimal learning algorithms, allowing us to draw strong connections to the rational analysis of cognition. We then discuss several advantages of the meta-learning framework over traditional methods and reexamine prior work in the context of these new insights.

Type: Target Article

Copyright: © The Author(s), 2023. Published by Cambridge University Press


Todorov, E., Erez, T., & Tassa, Y. (2012). Mujoco: A physics engine for model-based control. In 2012 IEEE/RSJ international conference on intelligent robots and systems (pp. 5026–5033).Google Scholar
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., … Polosukhin, I. (2017). Attention is all you need. Advances in Neural Information Processing Systems, 30, 60006010.Google Scholar
Vinyals, O., Blundell, C., Lillicrap, T., & Wierstra, D. (2016). Matching networks for one shot learning. Advances in Neural Information Processing Systems, 29, 36373645.Google Scholar
Wang, J. X. (2021). Meta-learning in natural and artificial intelligence. Current Opinion in Behavioral Sciences, 38, 9095.Google Scholar
Wang, J. X., King, M., Porcel, N., Kurth-Nelson, Z., Zhu, T., Deck, C., … Botvinick, M. (2021). Alchemy: A structured task distribution for meta-reinforcement learning. CoRR, abs/2102.02926. Retrieved from https://arxiv.org/abs/2102.02926Google Scholar
Wang, J. X., Kurth-Nelson, Z., Kumaran, D., Tirumala, D., Soyer, H., Leibo, J. Z., … Botvinick, M. (2018). Prefrontal cortex as a meta-reinforcement learning system. Nature Neuroscience, 21(6), 860868.Google Scholar
Wang, J. X., Kurth-Nelson, Z., Tirumala, D., Soyer, H., Leibo, J. Z., Munos, R., … Botvinick, M. (2016). Learning to reinforcement learn. arXiv preprint arXiv:1611.05763.Google Scholar
Wilson, R. C., Geana, A., White, J. M., Ludvig, E. A., & Cohen, J. D. (2014). Humans use directed and random exploration to solve the explore–exploit dilemma. Journal of Experimental Psychology: General, 143(6), 2074.Google Scholar
Wolpert, D. H. (1996). The lack of a priori distinctions between learning algorithms. Neural Computation, 8(7), 13411390.Google Scholar
Wolpert, D. H., & Macready, W. G. (1997). No free lunch theorems for optimization. IEEE Transactions on Evolutionary Computation, 1(1), 6782.Google Scholar
Wu, C. M., Schulz, E., Speekenbrink, M., Nelson, J. D., & Meder, B. (2018). Generalization guides human exploration in vast decision spaces. Nature Human Behaviour, 2(12), 915924.Google Scholar
Yang, G. R., Joglekar, M. R., Song, H. F., Newsome, W. T., & Wang, X.-J. (2019). Task representations in neural networks trained to perform many cognitive tasks. Nature Neuroscience, 22(2), 297306.Google Scholar
Yang, Y., & Piantadosi, S. T. (2022). One model for the learning of language. Proceedings of the National Academy of Sciences of the United States of America, 119(5), e2021865119.Google Scholar
Yu, T., Quillen, D., He, Z., Julian, R., Hausman, K., Finn, C., & Levine, S. (2020). Meta-world: A benchmark and evaluation for multi-task and meta reinforcement learning. In Conference on robot learning (pp. 1094–1100).Google Scholar
Zednik, C., & Jäkel, F. (2016). Bayesian reverse-engineering considered as a research strategy for cognitive science. Synthese, 193(12), 39513985.Google Scholar
Zenil, H., Marshall, J. A., & Tegnér, J. (2015). Approximations of algorithmic and structural complexity validate cognitive-behavioural experimental results. arXiv preprint arXiv:1509.06338.Google Scholar