Zero-shot sim-to-real transfer of reinforcement learning framework for robotics manipulation with demonstration and force feedback

Yuanpei Chen; Chao Zeng; Zhiping Wang; Peng Lu; Chenguang Yang

doi:10.1017/S0263574722001230

Published online by Cambridge University Press: 07 September 2022

Affiliations:
Yuanpei Chen: College of Automation Science and Engineering, South China University of Technology, Guangzhou, China
Chao Zeng: School of Automation, Guangdong University of Technology, Guangzhou, China
Zhiping Wang: School of Electronic Engineering, Dongguan University of Technology, Dongguan, China
Peng Lu: Department of Mechanical Engineering, The University of Hong Kong, Hong Kong, China
Chenguang Yang*: College of Automation Science and Engineering, South China University of Technology, Guangzhou, China
*Corresponding author. E-mail: [email protected]

Abstract

In robot reinforcement learning (RL), the reality gap has long limited the robustness and generalization of algorithms. We propose Simulation Twin (SimTwin), a deep RL framework that transfers a model directly from simulation to reality without any real-world training. SimTwin consists of an RL module and an adaptive correct module. We train the policy with the soft actor-critic algorithm entirely in a simulator, using demonstration and domain randomization. In the adaptive correct module, we design and train a neural network that uses force feedback to simulate the human error correction process. We then combine the two modules through a digital twin to control real-world robots: the framework automatically corrects simulator parameters by comparing the simulator with reality, and the trained policy network then generalizes the correct action without additional training. We demonstrate the proposed method on an open-cabinet task; the experiments show that our framework can reduce the reality gap without any real-world training.
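
The parameter-correction idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: SimTwin uses a trained neural network and a full physics simulator, whereas the code below assumes a hypothetical one-parameter contact model (force proportional to displacement) and a hand-written proportional update rule. All names, ranges, and constants are assumptions chosen for illustration.

```python
import random

REAL_STIFFNESS = 120.0  # hypothetical ground truth, unknown to the simulator


def measure_force(stiffness, displacement):
    """Toy contact model (assumption): force is proportional to displacement."""
    return stiffness * displacement


def sample_stiffness(rng, low=50.0, high=200.0):
    """Domain randomization: draw the simulator parameter from a range."""
    return rng.uniform(low, high)


def correct_stiffness(sim_stiffness, displacement, real_force, gain=0.5):
    """Shift the simulator parameter to shrink the sim/real force mismatch.

    A proportional rule stands in for the learned correction network.
    """
    sim_force = measure_force(sim_stiffness, displacement)
    force_error = real_force - sim_force
    return sim_stiffness + gain * force_error / displacement


def run_twin_loop(seed=0, steps=20, displacement=0.1):
    """Start from a randomized parameter and correct it from force feedback."""
    rng = random.Random(seed)
    sim_stiffness = sample_stiffness(rng)
    for _ in range(steps):
        # In a real setup this reading would come from a force sensor.
        real_force = measure_force(REAL_STIFFNESS, displacement)
        sim_stiffness = correct_stiffness(sim_stiffness, displacement, real_force)
    return sim_stiffness


if __name__ == "__main__":
    print(run_twin_loop())
```

With a gain of 0.5 the parameter error halves on every step, so the simulated stiffness approaches the real value after a few iterations. In the framework described above, the corrected simulator state would then be passed to the trained policy, which is outside the scope of this sketch.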

Keywords

reinforcement learning; sim-to-real transfer; digital twin

Type: Research Article
Information: Robotica, Volume 41, Issue 3, March 2023, pp. 1015–1024

DOI: https://doi.org/10.1017/S0263574722001230
Copyright: © The Author(s), 2022. Published by Cambridge University Press
