
A model-free deep reinforcement learning approach for control of exoskeleton gait patterns

Published online by Cambridge University Press: 15 December 2021

Lowell Rose
Affiliation:
Autonomous Systems and Biomechatronics Laboratory, Department of Mechanical and Industrial Engineering, University of Toronto, Toronto, Canada
Michael C. F. Bazzocchi*
Affiliation:
Autonomous Systems and Biomechatronics Laboratory, Department of Mechanical and Industrial Engineering, University of Toronto, Toronto, Canada; Department of Mechanical and Aeronautical Engineering, Clarkson University, Potsdam, NY, USA
Goldie Nejat
Affiliation:
Autonomous Systems and Biomechatronics Laboratory, Department of Mechanical and Industrial Engineering, University of Toronto, Toronto, Canada; Toronto Rehabilitation Institute, Toronto, Canada
*Corresponding author. E-mail: [email protected]

Abstract

Lower-body exoskeleton control that adapts to users and provides assistance-as-needed can increase user participation and motor learning and allow for more effective gait rehabilitation. Adaptive model-based control methods have previously been developed to consider a user’s interaction with an exoskeleton; however, the predefined dynamics models required are challenging to define accurately, due to the complex dynamics and nonlinearities of the human-exoskeleton interaction. Model-free deep reinforcement learning (DRL) approaches can provide accurate and robust control in robotics applications and have shown potential for lower-body exoskeletons. In this paper, we present a new model-free DRL method for end-to-end learning of desired gait patterns for over-ground gait rehabilitation with an exoskeleton. This control technique is the first to accurately track any gait pattern desired in physiotherapy without requiring a predefined dynamics model and is robust to varying post-stroke individuals’ baseline gait patterns and their interactions and perturbations. Simulated experiments of an exoskeleton paired to a musculoskeletal model show that the DRL method is robust to different post-stroke users and is able to accurately track desired gait pattern trajectories both seen and unseen in training.

Type
Research Article
Copyright
© The Author(s), 2021. Published by Cambridge University Press

