
Safety supervision framework for legged robots through safety verification and fall protection

Published online by Cambridge University Press: 10 April 2025

Ming Sun
Affiliation: Department of Automation, Shanghai Jiao Tong University, Shanghai, China

Yue Gao*
Affiliation: MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University, Shanghai, China; Shanghai Innovation Institute, Shanghai, China

*Corresponding author: Yue Gao; Email: [email protected]

Abstract

Safety is an essential requirement, as well as a major bottleneck, for legged robots in the real world. Learning-based methods in particular have raised widespread concerns because of their trial-and-error nature and opaque policies. Existing methods usually treat this challenge as a trade-off between safety assurance and task performance, partly because they infer the robot's safety inaccurately. In this paper, we re-examine the segmentation of the robot's state space in terms of safety. Based on the current state and a prediction of the state transition trajectory, the states of legged robots are classified as safe, recoverable, unsafe, or failure, and a safety verification method is introduced to infer the robot's safety online. Task, recovery, and fall protection policies are then trained to ensure the robot's safety in the corresponding states, forming a safety supervision framework that is independent of the learning algorithm. To validate the proposed method and framework, experiments are conducted both in simulation and on a real-world robot, and the results indicate improvements in both safety and efficiency.
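To make the supervision loop described above concrete, the following minimal sketch shows one plausible structure for such a supervisor: classify the state from the current measurement and a short-horizon trajectory prediction, then dispatch to the task, recovery, or fall protection policy. All names here (`SafetyStatus`, `classify_state`, `supervise`, the `predict_trajectory`, `is_safe`, and `is_failure` callables, and the mapping from status to policy) are illustrative assumptions, not the paper's actual verification method.

```python
from enum import Enum, auto

class SafetyStatus(Enum):
    """The four state classes described in the abstract."""
    SAFE = auto()
    RECOVERABLE = auto()
    UNSAFE = auto()
    FAILURE = auto()

def classify_state(state, predict_trajectory, is_safe, is_failure, horizon=50):
    """Classify the robot's safety status from its current state and a
    short-horizon prediction of its state transition trajectory.

    `predict_trajectory`, `is_safe`, and `is_failure` are placeholders for
    the paper's safety verification components, which the abstract does
    not specify."""
    if is_failure(state):
        return SafetyStatus.FAILURE
    if not is_safe(state):
        # Already outside the safe set: the task policy must not stay in control.
        return SafetyStatus.UNSAFE
    # Roll the predicted trajectory forward and check whether any
    # future state leaves the safe set or reaches failure.
    for future_state in predict_trajectory(state, horizon):
        if is_failure(future_state):
            return SafetyStatus.UNSAFE
        if not is_safe(future_state):
            return SafetyStatus.RECOVERABLE
    return SafetyStatus.SAFE

def supervise(state, task_policy, recovery_policy, fall_protection_policy,
              predict_trajectory, is_safe, is_failure):
    """Dispatch to the policy responsible for the current safety status."""
    status = classify_state(state, predict_trajectory, is_safe, is_failure)
    if status is SafetyStatus.SAFE:
        return task_policy(state)
    if status is SafetyStatus.RECOVERABLE:
        return recovery_policy(state)
    # UNSAFE or FAILURE: a fall is imminent or underway, so protect the robot.
    return fall_protection_policy(state)

if __name__ == "__main__":
    # Toy stand-ins: the "state" is a single pitch angle in radians.
    action = supervise(
        state=0.1,
        task_policy=lambda s: "walk",
        recovery_policy=lambda s: "brace",
        fall_protection_policy=lambda s: "tuck",
        predict_trajectory=lambda s, h: [s + 0.01 * k for k in range(h)],
        is_safe=lambda s: abs(s) < 0.5,
        is_failure=lambda s: abs(s) > 1.0,
    )
    print(action)  # -> "brace": the predicted trajectory leaves the safe set
```

The key design point this sketch illustrates is that the supervisor sits outside the learned policies: it only observes states and selects which policy acts, which is what makes the framework independent of the underlying learning algorithm.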

Type: Research Article
Copyright: © The Author(s), 2025. Published by Cambridge University Press

