AI-Based Learning Approach with Consideration of Safety Criteria on Example of a Depalletization Robot

Mark Jocas; Philip Kurrek; Firas Zoghlami; Mario Gianni; Vahid Salehi

doi:10.1017/dsi.2019.210

Part of: Industry 4.0

Published online by Cambridge University Press: 26 July 2019

Affiliations:
Mark Jocas*: Munich University of Applied Sciences
Philip Kurrek: Munich University of Applied Sciences
Firas Zoghlami: Munich University of Applied Sciences
Mario Gianni: University of Plymouth
Vahid Salehi: Munich University of Applied Sciences

Abstract

Robotic systems must achieve a certain level of process safety while performing a task and, at the same time, comply with the safety criteria for the expected behaviour. To do so, the system must be aware of the risks associated with the task so that it can take them into account. Once the system has learned the safety aspects, learning the task performance must no longer influence them. We therefore present a concept for the design of a neural network that combines these characteristics: it allows safe behaviour to be learned and then fixed. The subsequent training of the task execution no longer influences safety and, compared with a conventional neural network, achieves targeted results.
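The idea of learning safe behaviour first, fixing it, and only then training the task can be illustrated with a small sketch. This is not the authors' network: it is a toy model under stated assumptions, with two linear parts, a hypothetical input of `[distance_to_obstacle, load]`, a hypothetical safety rule (safe when the distance exceeds 0.5), a hypothetical task target, and a fallback output of `0.0` standing in for "stop". All class, method, and variable names are invented for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TwoPartModel:
    """Toy model with a safety part and a task part (both linear, both hypothetical)."""

    def __init__(self, n_inputs):
        self.safety_w = [0.0] * n_inputs
        self.safety_b = 0.0
        self.task_w = [0.0] * n_inputs
        self.task_b = 0.0
        self.safety_frozen = False

    def safety_score(self, x):
        """Estimated probability that state x is safe."""
        z = sum(w * xi for w, xi in zip(self.safety_w, x)) + self.safety_b
        return sigmoid(z)

    def task_output(self, x):
        return sum(w * xi for w, xi in zip(self.task_w, x)) + self.task_b

    def act(self, x):
        """Return the task output only if the safety part judges x safe, else 0.0 (assumed 'stop')."""
        return self.task_output(x) if self.safety_score(x) >= 0.5 else 0.0

    def train_safety(self, data, lr=0.5, epochs=200):
        """Phase 1: logistic-regression SGD on (state, safe_label) pairs."""
        if self.safety_frozen:
            raise RuntimeError("safety part is frozen and cannot be retrained")
        for _ in range(epochs):
            for x, safe in data:
                err = self.safety_score(x) - safe  # log-loss gradient w.r.t. the pre-activation
                self.safety_w = [w - lr * err * xi for w, xi in zip(self.safety_w, x)]
                self.safety_b -= lr * err

    def freeze_safety(self):
        self.safety_frozen = True

    def train_task(self, data, lr=0.05, epochs=200):
        """Phase 2: squared-error SGD that updates the task parameters only."""
        for _ in range(epochs):
            for x, target in data:
                err = self.task_output(x) - target
                self.task_w = [w - lr * err * xi for w, xi in zip(self.task_w, x)]
                self.task_b -= lr * err


# Hypothetical data on a fixed grid: state = [distance_to_obstacle, load].
states = [[d / 10, load] for d in range(11) if d != 5 for load in (0.0, 0.5, 1.0)]
safety_data = [(x, 1.0 if x[0] > 0.5 else 0.0) for x in states]
task_data = [(x, 2.0 * x[1] + 1.0) for x in states]

model = TwoPartModel(2)
model.train_safety(safety_data)  # learn the safety behaviour
model.freeze_safety()            # fix it
frozen = (list(model.safety_w), model.safety_b)
model.train_task(task_data)      # task training leaves the safety part untouched
unchanged = frozen == (model.safety_w, model.safety_b)
```

In this sketch the separation is structural: `train_task` has no code path that writes to the safety parameters, so `unchanged` holds by construction. A conventional single network trained on both objectives would offer no such guarantee, which is the contrast the abstract draws.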

Keywords

Artificial intelligence; Industry 4.0; Machine learning

Type: Article
Information: Proceedings of the Design Society: International Conference on Engineering Design, Volume 1, Issue 1, July 2019, pp. 2041-2050

DOI: https://doi.org/10.1017/dsi.2019.210
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives licence (http://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is unaltered and is properly cited. The written permission of Cambridge University Press must be obtained for commercial re-use or in order to create a derivative work.
Copyright: © The Author(s) 2019
