Action learning and grounding in simulated human–robot interactions

Oliver Roesler; Ann Nowé

doi:10.1017/S0269888919000079

Part of: Adaptive Learning Agents 2018

Published online by Cambridge University Press: 12 November 2019

Oliver Roesler*: Artificial Intelligence Lab, Vrije Universiteit Brussel, Pleinlaan 9, 1050 Brussels, Belgium
Ann Nowé: Artificial Intelligence Lab, Vrije Universiteit Brussel, Pleinlaan 9, 1050 Brussels, Belgium

Abstract

To interact with humans in a natural way, robots need to be able to learn new tasks autonomously. The most natural way for humans to tell another agent, whether human or robot, to perform a task is through natural language. Natural human–robot interaction therefore also requires robots to understand natural language, i.e. to extract the meaning of words and phrases. To do this, words and phrases need to be linked to their corresponding percepts through grounding. Afterward, agents can learn the optimal micro-action patterns to reach the goal states of the desired tasks. Most previous studies investigated either the learning of actions or the grounding of words, but not both. Additionally, they often used only a small set of tasks and very short, unnaturally simplified utterances. In this paper, we introduce a framework that uses reinforcement learning to learn actions for several tasks and cross-situational learning to ground actions, object shapes and colors, and prepositions. The proposed framework is evaluated through a simulated interaction experiment between a human tutor and a robot. The results show that the framework can be used for both action learning and grounding.
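
As a rough illustration of the cross-situational learning idea mentioned in the abstract, the sketch below counts how often each word co-occurs with each percept across several situations and maps every word to its most frequent companion. It is not the authors' framework: the function names, the percept labels (such as `color:red`) and the four toy situations are all assumptions made for this example, and a plain co-occurrence count is only one simple way the idea could be realised.

```python
from collections import defaultdict


def cross_situational_counts(situations):
    """Count word/percept co-occurrences over all situations.

    Each situation is a (words, percepts) pair. The representation is a
    hypothetical simplification chosen for this sketch.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for words, percepts in situations:
        for word in set(words):
            for percept in set(percepts):
                counts[word][percept] += 1
    return counts


def ground(counts):
    """Map each word to the percept it co-occurred with most often."""
    return {
        word: max(percept_counts, key=percept_counts.get)
        for word, percept_counts in counts.items()
    }


# Toy data (assumed): every word appears in two situations, and only its
# true percept is present in both, so no ties occur.
situations = [
    (["lift", "red", "cube"], ["action:lift", "color:red", "shape:cube"]),
    (["lift", "blue", "ball"], ["action:lift", "color:blue", "shape:ball"]),
    (["push", "red", "ball"], ["action:push", "color:red", "shape:ball"]),
    (["push", "blue", "cube"], ["action:push", "color:blue", "shape:cube"]),
]

lexicon = ground(cross_situational_counts(situations))
```

In a single situation the word "red" is equally compatible with every percept present; only across situations does `color:red` stand out, which is the core intuition behind cross-situational learning.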

Type: Adaptive and Learning Agents
Information: The Knowledge Engineering Review, Volume 34, 2019, e13

DOI: https://doi.org/10.1017/S0269888919000079
Copyright: © Cambridge University Press, 2019
