Curiosity-Driven Exploration

doi:10.1017/9781009026949.004

Chapter 3 - Curiosity-Driven Exploration

Diversity of Mechanisms and Functions

from Part I - What Drives Humans to Seek Information?

Published online by Cambridge University Press: 19 May 2022

Alexandr Ten, Pierre-Yves Oudeyer and Clément Moulin-Frier

Edited by Irene Cogliati Dezza, Eric Schulz and Charley M. Wu


Irene Cogliati Dezza, Affiliation: University College London
Eric Schulz, Affiliation: Max-Planck-Institut für biologische Kybernetik, Tübingen
Charley M. Wu, Affiliation: Eberhard-Karls-Universität Tübingen, Germany


Summary

Intrinsically motivated information-seeking, also called curiosity-driven exploration, is widely believed to be a key ingredient of autonomous learning in the real world. Such spontaneous exploration has been studied in multiple independent lines of computational research, producing a diverse range of algorithmic models that capture different aspects of the process. These algorithms resolve some of the limitations of neurocognitive theories by formally describing the computational functions and algorithmic implementations of intrinsically motivated learning. They also reveal a wide diversity of effective forms of intrinsically motivated information-seeking, which can be characterized along different mechanistic and functional dimensions. This chapter reviews different classes of algorithms and highlights several important dimensions of variation among them. Identifying these dimensions provides a means of structuring a comprehensive taxonomy of approaches. We believe this exercise is useful in working toward a general computational account of information-seeking. Such an account should help generate new hypotheses about information-seeking in humans and complement the existing psychological theory of curiosity.

Keywords

Curiosity; exploration; intrinsic motivation; algorithms; AI

Type: Chapter
Information: The Drive for Knowledge: The Science of Human Information Seeking, pp. 53–76

DOI: https://doi.org/10.1017/9781009026949.004

Publisher: Cambridge University Press

Print publication year: 2022

