Immersive remote telerobotics: foveated unicasting and remote visualization for intuitive interaction

Yonas T. Tefera; Yaesol Kim; Sara Anastasi; Paolo Fiorini; Darwin G. Caldwell; Nikhil Deshpande

doi:10.1017/S0263574724001784

Immersive remote telerobotics: foveated unicasting and remote visualization for intuitive interaction

Part of: The 40th Anniversary of Robotica

Published online by Cambridge University Press: 30 October 2024

Yonas T. Tefera

Yaesol Kim ,

Sara Anastasi ,

Paolo Fiorini ,

Darwin G. Caldwell and

Nikhil Deshpande

Show author details

Yonas T. Tefera*: Affiliation:
Advanced Robotics, Istituto Italiano di Tecnologia (IIT), Genova, Italy
Yaesol Kim: Affiliation:
Advanced Robotics, Istituto Italiano di Tecnologia (IIT), Genova, Italy
Sara Anastasi: Affiliation:
Istituto Nazionale per l’Assicurazione contro gli Infortuni sul Lavoro (INAIL), Rome, Italy
Paolo Fiorini: Affiliation:
Department of Computer Science, University of Verona, Verona, Italy
Darwin G. Caldwell: Affiliation:
Advanced Robotics, Istituto Italiano di Tecnologia (IIT), Genova, Italy
Nikhil Deshpande: Affiliation:
Advanced Robotics, Istituto Italiano di Tecnologia (IIT), Genova, Italy
*: Corresponding author: Yonas Teodros Tefera; Email: [email protected]

Article contents

Abstract
References

Get access

Rights & Permissions

Abstract

Precise and efficient performance in remote robotic teleoperation relies on intuitive interaction. This requires both accurate control actions and complete perception (vision, haptic, and other sensory feedback) of the remote environment. Especially in immersive remote teleoperation, the complete perception of remote environments in 3D allows operators to gain improved situational awareness. Color and Depth (RGB-D) cameras capture remote environments as dense 3D point clouds for real-time visualization. However, providing enough situational awareness needs fast, high-quality data transmission from acquisition to virtual reality rendering. Unfortunately, dense point-cloud data can suffer from network delays and limits, impacting the teleoperator’s situational awareness. Understanding how the human eye works can help mitigate these challenges. This paper introduces a solution by implementing foveation, mimicking the human eye’s focus by smartly sampling and rendering dense point clouds for an intuitive remote teleoperation interface. This provides high resolution in the user’s central field, which gradually reduces toward the edges. However, this systematic visualization approach in the peripheral vision may benefit or risk losing information and burdening the user’s cognitive load. This work investigates these advantages and drawbacks through an experimental study and describes the overall system, with its software, hardware, and communication framework. This will show significant enhancements in both latency and throughput, surpassing 60% and 40% improvements in both aspects when compared with state-of-the-art research works. A user study reveals that the framework has minimal impact on the user’s visual quality of experience while helping to reduce the error rate significantly. Further, a 50% reduction in task execution time highlights the benefits of the proposed framework in immersive remote telerobotics applications.

Keywords

telerobotics telepresence gaze tracking mixed reality 3D reconstruction point clouds

Type: Research Article
Information: Robotica , Volume 42 , Special Issue 12: The 40th Anniversary of Robotic , December 2024 , pp. 4223 - 4248

DOI: https://doi.org/10.1017/S0263574724001784 [Opens in a new window]
Copyright: © The Author(s), 2024. Published by Cambridge University Press

Access options

Get access to the full version of this content by using one of the access options below. (Log in options will check for institutional or personal access. Content may require purchase if you do not have access.)

Article purchase

Temporarily unavailable

References

Cobos-Guzman, S., Torres, J. and Lozano, R., “Design of an underwater robot manipulator for a telerobotic system,” Robotica 31(6), 945–953 (2013).CrossRef Google Scholar

Hokmi, S., Haghi, S. and Farhadi, A., “Remote monitoring and control of the 2-DoF robotic manipulators over the internet,” Robotica 40(12), 4475–4497 (2022).CrossRef Google Scholar

Muscolo, G. G., Marcheschi, S., Fontana, M. and Bergamasco, M., “Dynamics modeling of human-machine control interface for underwater teleoperation - erratum,” Robotica 40(4), 1255–1255 (2022).CrossRef Google Scholar

Naceri, A., Mazzanti, D., Bimbo, J., Tefera, Y. T., Prattichizzo, D., Caldwell, D. G., Mattos, L. S. and Deshpande, N., “The Vicarios Virtual reality interface for remote robotic teleoperation,” J. Intell. Robot. Syst. 101(80), 1–16 (2021).CrossRef Google Scholar

Tefera, Y., Mazzanti, D., Anastasi, S., Caldwell, D., Fiorini, P. and Deshpande, N., “Towards Foveated Rendering for Immersive Remote Telerobotics,” In: The International Workshop on Virtual Augmented, and Mixed-Reality for Human-Robot Interactions at HRI 2022, Boulder, USA (2022) pp. 1–4.Google Scholar

Mallem, M., Chavand, F. and Colle, E., “Computer-assisted visual perception in teleoperated robotics,” Robotica 10(2), 93–103 (1992).CrossRef Google Scholar

Mossel, A. and Kröter, M., “Streaming and Exploration of Dynamically Changing Dense 3D Reconstructions in Immersive Virtual Reality,” In: 2016 IEEE International Symposium on Mixed and Augmented Reality (ISMAR), IEEE, Merida, Mexico (2016) pp. 43–48.Google Scholar

Slawiñski, E. and Mut, V., “Control scheme including prediction and augmented reality for teleoperation of mobile robots,” Robotica 28(1), 11–22 (2010).CrossRef Google Scholar

Stotko, P., Krumpen, S., Schwarz, M., Lenz, C., Behnke, S., Klein, R. and Weinmann, M., “A VR System for Immersive Teleoperation and Live Exploration with a Mobile Robot,” In: IEEE/RSJ IROS, Macau, China (2019) pp. 3630–3637.Google Scholar

Stotko, P., Krumpen, S., Hullin, M. B., Weinmann, M. and Klein, R., “SLAMCast: Large-scale, real-time 3D reconstruction and streaming for immersive multi-client live telepresence,” IEEE Trans. Vis. Comput. Graph. 25a(5), 2102–2112 (2019).CrossRef Google Scholar

Kamezaki, M., Yang, J., Iwata, H. and Sugano, S., “A basic framework of virtual reality simulator for advancing disaster response work using teleoperated work machines,” J. Robot. Mechatron. 26(4), 486–495 (2014).CrossRef Google Scholar

Dima, E., Brunnström, K., Sjöström, M., Andersson, M., Edlund, J., Johanson, M. and Qureshi, T., “View Position Impact on QoE in an Immersive Telepresence System for Remote Operation,” In: 2019 Eleventh International Conference on Quality of Multimedia Experience (QoMEX), IEEE, Berlin, Germany (2019) pp. 1–3.Google Scholar

Rosen, E., Whitney, D., Fishman, M., Ullman, D. and Tellex, S., “Mixed Reality as a Bidirectional Communication Interface for Human-Robot Interaction,” In: 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Las Vegas, NV, USA (2020) pp. 11431–11438.Google Scholar

Stauffert, J.-P., Niebling, F. and Latoschik, M. E., “Latency and cybersickness: Impact, causes, and measures. A review,” Front. Virtual Real. 1, 31 (2020).CrossRef Google Scholar

Orts-Escolano, S., Rhemann, C., Fanello, S., Chang, W., Kowdle, A., Degtyarev, Y., Kim, D., Davidson, P. L., Khamis, S., Dou, M., Tankovich, V., Loop, C., Cai, Q., Chou, P. A., Mennicken, S., Valentin, J., Pradeep, V., Wang, S., Kang, S. B., Kohli, P., Lutchyn, Y., Keskin, C. and Izadi, S.. “Holoportation: Virtual 3D Teleportation in Real-Time,” In: 29th Annual Symposium on User Interface Software and Technology (UIST), New York, NY, USA, Association for Computing Machinery (2016) pp. 741–754.Google Scholar

Hendrickson, A., “Organization of the Adult Primate Fovea,” In: Macular Degeneration (Penfold, P. L. and Provis, J. M, eds.) (Springer Berlin Heidelberg, Berlin, Heidelberg, 2005) pp. 1–23.Google Scholar

Guenter, B., Finch, M., Drucker, S., Tan, D. and Snyder, J., “Foveated 3D graphics,” ACM Trans. Graph. 31(6), 1–10 (2012).CrossRef Google Scholar

Maimone, A. and Fuchs, H., “Encumbrance-Free Telepresence System with Real-Time 3D Capture and Display Using Commodity Depth Cameras,” In: 10th IEEE International Symposium on Mixed and Augmented Reality, IEEE,Basel, Switzerland (2011) pp. 137–146.Google Scholar

Ni, D., Song, A., Xu, X., Li, H., Zhu, C. and Zeng, H., “3D-point-cloud registration and real-world dynamic modelling-based virtual environment building method for teleoperation,” Robotica 35(10), 1958–1974 (2017).CrossRef Google Scholar

Fairchild, A. J., Campion, S. P., García, A. S., Wolff, R., Fernando, T. and Roberts, D. J., “A mixed reality telepresence system for collaborative space operation,” IEEE Trans. Circ. Syst. Vid. Technol. 27(4), 814–827 (2017).CrossRef Google Scholar

Weinmann, M., Stotko, P., Krumpen, S. and Klein, R., “Immersive VR-Based Live Telepresence for Remote Collaboration and Teleoperation,” In: Wissenschaftlich-Technische Jahrestagung der DGPF, Stuttgart, Germany (2020) pp. 391–399.Google Scholar

Su, Y.-P., Chen, X.-Q., Zhou, C., Pearson, L. H., Pretty, C. G. and Chase, J. G., “Integrating virtual, mixed, and augmented reality into remote robotic applications: A brief review of extended reality-enhanced robotic systems for intuitive telemanipulation and telemanufacturing tasks in hazardous conditions,” Appl. Sci. 13(22), 12129 (2023).CrossRef Google Scholar

Izadi, S., Kim, D., Hilliges, O., Molyneaux, D., Newcombe, R., Kohli, P., Shotton, J., Hodges, S., Freeman, D.and Davison, A., “Kinectfusion: Real-Time 3D Reconstruction and Interaction Using a Moving Depth Camera,” In: Proceedings of the 24th Annual ACM Symposium on User Interface Software and Technology, Association for Computing Machinery, New York, NY, USA (2011) pp. 559–568.Google Scholar

Whelan, T., Leutenegger, S., Salas-Moreno, R., Glocker, B. and Davison, A., “ElasticFusion: Dense SLAM without a pose graph,” In Robot.: Sci. Syst. 11, 1697–1716 (2015).Google Scholar

Schöps, T., Sattler, T. and Pollefeys, M., “SurfelMeshing: Online surfel-based mesh reconstruction,” IEEE Trans. Pattern Anal. Mach. Intell. 42(10), 2494–2507 (2020).CrossRef Google Scholar

Mekuria, R., Blom, K. and Cesar, P., “Design, implementation, and evaluation of a point cloud codec for tele-immersive video,” IEEE Trans. Circ. Syst. Vid. Technol. 27(4), 828–842 (2016).CrossRef Google Scholar

Schwarz, S., Sheikhipour, N., Sevom, V. F. and Hannuksela, M. M., “Video coding of dynamic 3D point cloud data,” APSIPA Trans. Signal Info. Process. 8(1), 31–43 (2019).Google Scholar

Huang, Y., Peng, J., Kuo, C.-C. J. and Gopi, M., “A generic scheme for progressive point cloud coding,” IEEE Trans. Vis. Comput. Graph. 14(2), 440–453 (2008).CrossRef Google Scholar PubMed

Shi, Y., Venkatram, P., Ding, Y. and Ooi, W. T., “Enabling Low Bit-Rate mpeg v-pcc Encoded Volumetric Video Streaming with 3D Sub-Sampling,” In: Proceedings of the 14th Conference on ACM Multimedia Systems, New York, NY, USA (2023) pp. 108–118.Google Scholar

Van Der Hooft, J., Wauters, T., De Turck, F., Timmerer, C. and Hellwagner, H., “Towards 6DoF http Adaptive Streaming Through Point Cloud Compression,” In: Proceedings of the 27th ACM International Conference on Multimedia, New York, NY, USA (2019) pp. 2405–2413.Google Scholar

De Pace, F., Gorjup, G., Bai, H., Sanna, A., Liarokapis, M. and Billinghurst, M., “Leveraging Enhanced Virtual Reality Methods and Environments for Efficient, Intuitive, and Immersive Teleoperation of Robots,” In: 2021 IEEE International Conference on Robotics and Automation (ICRA), IEEE, Xi’an, China (2021) pp. 12967–12973.Google Scholar

Huey, E.. The Psychology and Pedagogy of Reading: With a Review of the History of Reading and Writing and of Methods, Texts, and Hygiene in Reading, M.I.T. Press paperback series (M.I.T. Press, New York, NY, USA, 1968).Google Scholar

Fitts, P. M., Jones, R. E. and Milton, J. L., “Eye movements of aircraft pilots during instrument-landing approaches,” Aeronaut. Eng. Rev. 9(2), 24–29 (1949).Google Scholar

Yarbus, A. L., Eye Movements and Vision (Springer Berlin/Heidelberg, Germany, 1967).CrossRef Google Scholar

Stein, N., Niehorster, D. C., Watson, T., Steinicke, F., Rifai, K., Wahl, S. and Lappe, M., “A comparison of eye tracking latencies among several commercial head-mounted displays,” i-Perception 12(1), 2041669520983338 (2021).CrossRef Google Scholar PubMed

Stengel, M., Grogorick, S., Eisemann, M. and Magnor, M., “Adaptive image-space sampling for gaze-contingent real-time rendering,” Comput. Graph. Forum 35(4), 129–139 (2016).CrossRef Google Scholar

Bruder, V., Schulz, C., Bauer, R., Frey, S., Weiskopf, D. and Ertl, T., “Voronoi-Based Foveated Volume Rendering,” In: EuroVis (Short Papers) pp. 67–71 (2019).Google Scholar

Charlton, A., What is foveated rendering? Explaining the VR technology key to lifelike realism (2021) (Accessed: 05-Sep-2021).Google Scholar

Kaplanyan, A. S., Sochenov, A., Leimkühler, T., Okunev, M., Goodall, T. and Rufo, G., “Deepfovea: Neural reconstruction for foveated rendering and video compression using learned statistics of natural videos,” ACM Trans. Graph. 38(6), 1–13 (2019).CrossRef Google Scholar

Lungaro, P., Sjöberg, R., Valero, A. J. F., Mittal, A. and Tollmar, K., “Gaze-aware streaming solutions for the next generation of mobile VR experiences,” IEEE Trans. Vis. Comput. Graph. 24(4), 1535–1544 (2018).CrossRef Google Scholar PubMed

Ananpiriyakul, T., Anghel, J., Potter, K. and Joshi, A., “A gaze-contingent system for foveated multiresolution visualization of vector and volumetric data,” Electron. Imaging 32(1), 374-1–374-11 (2020).CrossRef Google Scholar

Schütz, M., Krösl, K. and Wimmer, M., “Real-Time Continuous Level of Detail Rendering of Point Clouds,” In: 2019 IEEE Conference on Virtual Reality and 3D User Interfaces (VR), Osaka, Japan (2019) pp. 103–110.Google Scholar

Guyton, A. C. and Hall, J. E., Guyton and Hall Textbook of Medical Physiology (Elsevier, New York, NY, USA, 2011) pp. 597–608.Google Scholar

Quinn, N., Csincsik, L., Flynn, E., Curcio, C. A., Kiss, S., Sadda, S. R., Hogg, R., Peto, T. and Lengyel, I., “The clinical relevance of visualising the peripheral retina,” Prog. Retin. Eye Res. 68, 83–109 (2019).CrossRef Google Scholar PubMed

Sherman, W. R. and Craig, A. B., “Chapter 3 - the Human in the Loop,” In: Understanding Virtual Reality, The Morgan Kaufmann Series in Computer Graphics, (Sherman, W. R. and Craig, A. B., eds.) (Second Edition) (Morgan Kaufmann, Boston, 2018) pp. 108–188.CrossRef Google Scholar

Hyönä, J., “Foveal and Parafoveal Processing during Reading,” In: The Oxford Handbook of Eye Movements (Liversedge, S. P., Gilchrist, I. and Everling, S., eds.) (Oxford University Press, New York, NY, USA, 2011) pp. 820–838.Google Scholar

Ishiguro, Y. and Rekimoto, J., “Peripheral Vision Annotation: Noninterference Information Presentation Method for Mobile Augmented Reality,” In Proceedings of the 2nd Augmented Human International Conference, New York, NY, USA (2011) pp. 1–5.Google Scholar

Strasburger, H., Rentschler, I. and Jüttner, M., “Peripheral vision and pattern recognition: A review,” J. Vision 11(5), 13–13 (2011).CrossRef Google Scholar PubMed

Simpson, M. J., “Mini-review: Far peripheral vision,” Vision Res. 140, 96–105 (2017).CrossRef Google Scholar PubMed

Gordon, J. and Abramov, I., “Color vision in the peripheral retina II Hue and saturation,” JOSA 67(2), 202–207 (1977).CrossRef Google Scholar PubMed

Weymouth, F. W., “Visual sensory units and the minimal angle of resolution,” Am. J. Ophthalmol. 46(1), 102–113 (1958).CrossRef Google Scholar PubMed

Eckstein, M. P., “Visual search: A retrospective,” J. Vision 11(5), 14–14 (2011).CrossRef Google Scholar PubMed

Olk, B., Dinu, A., Zielinski, D. J. and Kopper, R., “Measuring visual search and distraction in immersive virtual reality,” Roy. Soc. Open. Sci. 5(5), 172331 (2018).CrossRef Google Scholar PubMed

Handa, A., Whelan, T., McDonald, J. B. and Davison, A. J., “A Benchmark for RGB-D Visual Odometry, 3D Reconstruction and SLAM,” In: IEEE International Conference on Robotics and Automation, ICRA, Hong Kong, China (2014) pp. 1524–1531.Google Scholar

Sanders, C., Practical Packet Analysis: Using Wireshark to Solve Real-World Network Problems (No Starch Press, San Francisco, CA, USA, 2017) pp. 1–155.Google Scholar

M. 3DG and Requirements 2017. Call for Proposals for Point Cloud Compression V2. Technical report, MPEG 3DG and Requirements, Hobart, AU.Google Scholar

Girardeau-Montaut, D., “Cloudcompare-open source project,” OpenSource Project, Paris, France 588 (2011) pp. 2–37.Google Scholar

International Telecommunication Union, Recommendation ITU-T P.919: Subjective Test Methodologies for 360° Video On Head-Mounted Displays (ITU, Geneva, Switzerland, 2020), pp. 1–38.Google Scholar

Richardson, J. T., “The use of latin-square designs in educational and psychological research,” Educ. Res. Rev. 24, 84–97 (2018).CrossRef Google Scholar

Bruder, V., Müller, C., Frey, S. and Ertl, T., “On evaluating runtime performance of interactive visualizations,” IEEE Trans. Vis. Comput. Graph. 26(9), 2848–2862 (2019).CrossRef Google Scholar PubMed

Article contents

Immersive remote telerobotics: foveated unicasting and remote visualization for intuitive interaction

Abstract

Keywords

Access options

Article purchase

Temporarily unavailable

References

Save article to Kindle

Save article to Dropbox

Save article to Google Drive

Reply to: Submit a response

Your details

You have entered the maximum number of contributors

Conflicting interests