1. Introduction
Developing robots that are ultimately useful and acceptable to humans has always been one of the major motivations for research in robotics. Potentially, robots can relieve humans of dangerous jobs and of work in hazardous conditions. They can handle heavy lifting, toxic substances, and repetitive tasks. Inspired by this, in labs and research centers across the world, interdisciplinary teams of experts coordinate their everyday efforts to pursue the goal of developing intelligent robotic systems that serve this purpose. It is their duty and dream to push the boundary of robotics as a science, overcoming the current theoretical and technological limits, and making robots work closer to humans in our everyday living spaces. In this article, we review the main results achieved in this direction during the last decade by the robotics research carried out at PRISMA Lab of the University of Naples Federico II. The lab has been active in robotics research for 35 years now, and its team is internationally recognized in the community for its achievements. Given this long-standing expertise, the research work carried out at PRISMA Lab rests on a solid basis and aims to bring groundbreaking results that have far-reaching impacts.
Over the years, the team effort has been directed mainly toward six rapidly growing areas (and related sub-areas) of robotics, namely dynamic manipulation and locomotion, aerial robotics, physical human-robot interaction (HRI), artificial intelligence (AI) and cognitive robotics, industrial robotics, and medical robotics (see Fig. 1). These six research areas serve, in different ways, the primary goal of supporting humans in their daily activities. Advanced manipulation skills allow robots to naturally act in anthropic environments by exploiting available affordances that are typically designed for humans. In this context, dynamic and non-prehensile manipulation techniques allow robots to extend their manipulative capabilities as described in Sec. 2. Surprisingly, many methodologies used for non-prehensile manipulation also apply to legged robot locomotion. Motivated by this, the same section provides insights from recent legged robotics research. Aerial robots have been developed to perform tasks at high altitudes and in difficult-to-access scenarios that cannot be easily reached or are too dangerous for human operators. To this end, the capability of interacting with the environment was recently integrated into control frameworks for aerial robots as can be seen in Sec. 3. Robots can support humans by substituting or by cooperating with them either proximally or remotely. In both cases, issues related to the interaction between a human and a robot may arise. As detailed in Sec. 4, physical HRI techniques must be considered to guarantee a safe and dependable behavior of collaborative robots (or cobots), for example, by designing suitable control schemes for reactive collision avoidance, compliance, and task-based interaction. In addition, in both human-robot cooperation and autonomous task execution, robots exhibiting cognitive capabilities are beneficial.
We tackle the issue of deploying robots in dynamic and human-populated environments by integrating AI-based methods with cognitive control frameworks into robotic systems to allow flexible execution, planning, and monitoring of structured tasks as proposed in Sec. 5. The manipulation and AI methodologies were recently adopted in the field of industrial robotics by considering logistics as a main application as can be seen in Sec. 6. In this case, intelligent robotic systems are deployed to relieve human operators of tedious and repetitive tasks. In contrast, in the medical field, robots are directly conceived and programmed to extend human capabilities by performing super-precise surgical operations or acting as limb substitutes as described in Sec. 7.
In the following sections, we report the main achievements in each of these areas, highlighting the adopted methodologies and the key contributions with respect to the state of the art on the topic. Finally, potential future research directions in each field are discussed in Sec. 8. Thus, the main contributions of this paper can be listed as follows:
• We present a thorough review of the most recent work in the above-mentioned six research areas dealt with by the PRISMA Lab, highlighting the adopted methodologies and the key results achieved in the fields;
• For each research area, we propose an overview of the field, reporting both seminal and state-of-the-art works, and identify potential future research directions on the topics.
2. Dynamic manipulation and locomotion
Robots transport themselves and move objects in remarkably similar ways. They realize manipulation and locomotion tasks by physically establishing contacts and regulating the exchange of forces with the world around them [Reference Suomalainen, Karayiannidis and Kyrki1]. With the technological advancements in both sensing and actuation speed, it is now possible to manipulate an object quickly and achieve stable locomotion across challenging terrains [Reference Yang, Zhang, Zeng, Agrawal and Sreenath2]. In dynamic manipulation and locomotion, an important role is played by forces and accelerations, which are used together with kinematics, statics, and quasi-static forces to achieve the task. Dynamic non-prehensile manipulation of an object extends its feasible movements by exploiting motion primitives such as rolling [Reference Serra, Ruggiero, Donaire, Buonocore, Lippiello and Siciliano3], pushing [Reference Chai, Peng and Tsao4], throwing, and tossing [Reference Satici, Ruggiero, Lippiello and Siciliano5], which inherently use the dynamics of both the robot and the manipulated object [Reference Ruggiero, Lippiello and Siciliano6]. Non-prehensile manipulation, specifically juggling, exhibits connections with legged locomotion regarding the hybrid nature of the related dynamics, the zero-moment-point stability [Reference Sardain and Bessonnet7], and the dynamic balancing conditions [Reference Farid, Siciliano and Ruggiero8]. It was observed that the stability conditions for non-prehensile dynamic object manipulation and the support phase of a walking biped share the same set of equations. This fundamental concept can be leveraged to seamlessly transfer sensing, planning, and control frameworks developed for one field to the other. Among such control frameworks, energy-based control approaches can be exploited for both dynamic non-prehensile manipulation tasks and locomotion ones.
The key role played by energy during biped locomotion was highlighted in passive-dynamic walking [Reference McGeer9]. Consequently, several control frameworks exploiting energy-related concepts were proposed over the years [Reference Holm and Spong10–Reference Spong, Holm and Lee12] to realize specific gaits with the sought features. Locomotion considered in the aforementioned papers occurs in ideal conditions, that is, in the absence of external forces acting on legs. On the other hand, the investigation of resilience to external disturbances has been a prominent focus over the years, encompassing both quadruped and biped robots. This emphasis stems from the crucial ability of legged robots to navigate challenging terrain, where the irregularity of the ground may result in an early impact of the foot, leading to external forces affecting the system [Reference Mao, Gao, Tian and Zhao13]. A momentum-based observer detecting the anticipated foot touchdown was presented in [Reference Bledt, Wensing, Ingersoll and Kim14], while only disturbances applied to the center of mass were considered in [Reference Fahmi, Mastalli, Focchi and Semini15], neglecting the presence of external forces acting on swing legs. While using an observer for external wrenches on the center of mass or stance feet can enhance locomotion on uneven terrains, it does not prevent the robot from falling after a significant impact on the swing leg. This collision results in a deviation of the foot from the planned motion, potentially causing the touchdown to occur far from the intended foothold. This, in turn, reduces the support polygon, destabilizing the robot. In severe cases, the swing leg might not touch the ground or might collide with another leg, leading to a robot fall. Consequently, there is a need to estimate external forces acting on swing legs and compensate for these disturbances.
In the following sections, we provide an overview of the main achievements in the two research fields, while Table I provides a summary of the recent contributions related to these aspects.
2.1. Dynamic non-prehensile manipulation
Manipulation pertains to making an intentional change in the environment or to objects that are being manipulated. When realized without completely restraining the object, manipulation is denoted as non-prehensile [Reference Ruggiero, Lippiello and Siciliano6]. The object is then subject to unilateral constraints and, in order to reach the goal, the dynamics both of the object and of the hand manipulating it, together with the related kinematics, static and quasi-static forces, must be exploited [Reference Ruggiero, Lippiello and Siciliano6]. The literature on the topic states that the conventional way to cope with a non-prehensile dynamic manipulation task is to split it into simpler subtasks, usually referred to as non-prehensile manipulation primitives, that is rolling, dynamic grasp, sliding, pushing, throwing, etc.
Seminal works carried out in this direction investigate the non-prehensile rolling manipulation problem, where a single object rolls on the surface of a controlled manipulator. In [Reference Ryu, Ruggiero and Lynch26], backstepping was used to derive a control technique to stabilize a disk-on-disk rolling manipulation system. The goal was to stabilize a circular object balanced on top of a circular hand in the vertical plane. The effect of shapes in the input-state linearization of the considered non-prehensile planar rolling dynamic manipulation systems was later investigated in [Reference Lippiello, Ruggiero and Siciliano40]. Given the shapes of both the object and the manipulator, a state transformation was found that makes it possible to exploit linear controls to stabilize the system.
In tray-based non-prehensile manipulation (see Fig. 2 – upper row), the tasks of interest for the robotic system are opposite: (1) reconfigure objects in the hand by allowing them to intentionally slide or roll in the right direction; (2) transport objects placed on the tray while preventing them from sliding and falling. In the first case, the pose reconfiguration of a spherical object rolling on a tray-shaped hand, which is in turn actuated by a robot manipulator, was investigated in [Reference Serra, Ruggiero, Donaire, Buonocore, Lippiello and Siciliano3, Reference Serra, Ferguson, Ruggiero, Siniscalco, Petit, Lippiello and Siciliano27]: the control law is derived following an interconnection-and-damping-assignment passivity-based approach using a port-Hamiltonian (pH) dynamic model of the system. Full pose regulation of the sphere was achieved thanks to a purposely developed planner. In the second case, the objective is to prevent objects’ sliding induced by inertial forces while carrying the object from one place to another. Adaptive tray orientation was shown to help achieve higher linear accelerations during the tracking of a fast trajectory, minimizing the occurrence of object slipping. The idea behind this is to let the tray surface completely counteract the net force acting on the object. A quadratic program was used to compute the optimal robot manipulator torque control input to enforce non-sliding conditions for the object with adaptive tray orientation while also considering the system’s kinematic and dynamic constraints in [Reference Subburaman, Selvaggio and Ruggiero21]. 
Instead, keeping the tray in the upright configuration, a jerk-based model predictive non-sliding manipulation control was proposed in [Reference Selvaggio, Garg, Ruggiero, Oriolo and Siciliano20] for the same task showing superior performance: considering the rate-of-change of the joint torque as the output of the controller, a smooth torque control profile is obtained while allowing direct control of the contact forces. Tray-based non-prehensile manipulation was recently used to develop a shared control teleoperation framework for users to safely transport objects using a remotely located robot [Reference Selvaggio, Cacace, Pacchierotti, Ruggiero and Giordano19]. The proposed shared control approach shapes the motion commands imparted by the user to the remote robot and automatically regulates the end-effector orientation to more robustly prevent the object from sliding over the tray. Tray-based non-prehensile manipulation with a mobile manipulator dynamically balancing objects on its end-effector without grasping them was presented in [Reference Heins and Schoellig41]. A whole-body constrained model predictive controller for a mobile manipulator that balances objects and avoids collisions was developed for the considered task. More recently, researchers have focused on fast slosh-free fluid transportation [Reference Muchacho, Laha, Figueredo and Haddadin42]. Here the goal was to generate slosh-free trajectories by controlling the pendulum model of the liquid surface with constrained quadratic program optimization to obtain valid control inputs. This online technique allowed the motion generator to be used for real-time non-prehensile slosh-free teleoperation of liquids [Reference Muchacho, Bien, Laha, Naceri, Figueredo and Haddadin43].
In those cases in which the object is too heavy or too large to be grasped, pushing an object is a simple solution widely adopted by humans, and the same concept can thus be transferred to robots (see Fig. 2 – bottom row). A technique to manipulate an object with a non-holonomic mobile robot using the pushing non-prehensile manipulation primitive was presented in [Reference Bertoncelli, Ruggiero and Sabattini16]. Such a primitive involves a unilateral constraint associated with the friction between the robot and the manipulated object. Violating this constraint causes the object to slip during the manipulation. A linear time-varying model predictive control was designed to properly include the unilateral constraint within the control action. The framework can be extended to the multi-robot case: a task-oriented contact placement optimization strategy for object pushing that allows calculating optimal contact points minimizing the amplitude of forces required to execute the task was presented in [Reference Bertoncelli, Selvaggio, Ruggiero and Sabattini17].
Many of the proposed methods handle flat objects with primitive geometric shapes moving quasi-statically on high-friction surfaces, yet they usually make use of complex analytical models or utilize specialized physics engines to predict the outcomes of various interactions. On the other hand, an experience-based approach, which does not require any explicit analytical model or the help of a physics engine was proposed in [Reference Meriçli, Veloso and Akın44] where a mobile robot simply experiments with pushable complex 3D real-world objects to observe and memorize their motion characteristics together with the associated motion uncertainties resulting from varying initial caster wheel orientations and potential contacts between the robot and the object. A probabilistic method for autonomous learning of an approximate dynamics model for these objects was presented in [Reference Novin, Yazdani, Merryweather and Hermans45]. In this method, the dynamic parameters were learned using a small dataset consisting of force and motion data from interactions between the robot and objects. Based on these concepts, a rearrangement algorithm that relies on only a few known straight-line pushes for some novel object and requires no analytical models, force sensors, or large training datasets was proposed in [Reference Chai, Peng and Tsao4]. The authors experimentally verified the performance of their algorithm by rearranging several types of objects by pushing them to any target planar pose.
Research on other non-prehensile manipulation primitives further includes sliding (for pizza-baking applications) [Reference Gutiérrez-Giles, Ruggiero, Lippiello and Siciliano28], throwing [Reference Satici, Ruggiero, Lippiello and Siciliano5], stretching a deformable object [Reference Kim, Ruggiero, Lippiello, Siciliano, Siciliano and Ruggiero29], and related ones [Reference Ruggiero, Kim, Gutiérrez-Giles, Satici, Donaire, Cacace, Buonocore, Fontanelli, Lippiello, Siciliano, Gusikhin and Madani30, Reference Ruggiero, Petit, Serra, Satici, Cacace, Donaire, Ficuciello, Buonocore, Fontanelli, Lippiello, Villani and Siciliano31].
2.2. Legged robotics
Motivated by the connection between bipedal locomotion and non-prehensile manipulation [Reference Farid, Siciliano and Ruggiero8], the methodology proposed initially in [Reference Serra, Ruggiero, Donaire, Buonocore, Lippiello and Siciliano3] to achieve the stabilization of non-prehensile planar rolling manipulation tasks was subsequently extended to tackle the gait-generation problem of a simple compass-like biped robot in [Reference Arpenti, Ruggiero and Lippiello34]. The common control framework is based on a modification of the well-known interconnection-and-damping-assignment passivity-based control (IDA-PBC) of pH systems, where an appropriate parameterization of the inertia matrix was proposed to avoid the explicit solution of the matching partial differential equations (PDEs) arising during control synthesis. Due to the critical role played by energy exchange during walking, the methodology was profitably applied to passive-dynamic walking. Thanks to the novel control strategy, new gaits were generated, which are manifestly different from the passive gait. The result was a controlled planar walker moving manifestly slower or faster (depending on control tuning) than the open-loop system while preserving the system’s passivity due to the closed-loop pH structure.
An alternative constructive methodology, addressing some issues present in [Reference Serra, Ruggiero, Donaire, Buonocore, Lippiello and Siciliano3], was proposed in [Reference Arpenti, Ruggiero and Lippiello35]. In line with the same problem, the effect of dissipative forces deployed in the controller on gait generation was investigated in [Reference Nacusse, Arpenti, Ruggiero and Lippiello36]. There, two alternative control methodologies exploiting dissipative forces, termed simultaneous interconnection-and-damping-assignment passivity-based control (SIDA-PBC) and energy pumping-and-damping passivity-based control (EPD-PBC), respectively, demonstrated better results in achieving slow gaits, characterized by small step lengths and large step periods, compared to the performance of the IDA-PBC. SIDA-PBC carries out the energy shaping and the damping injection simultaneously, thanks to dissipative forces in the desired dynamics, unlike IDA-PBC, where these two control actions are carried out in two distinct steps. On the other hand, EPD-PBC proved to be an efficient control strategy for another control task belonging to the realm of legged locomotion, namely the gait robustification problem, that is, the enlargement of the basin of attraction of the limit cycle associated with the natural passive gait of the compass-like biped [Reference Arpenti, Donaire, Ruggiero and Lippiello32]. This was achieved by alternating energy injection and dissipation into/from the system to stabilize the walker at the target energy value corresponding to the natural gait. Moreover, the EPD-PBC methodology was also used with the IDA-PBC approach, showing that not only the natural passive gaits but also the gaits generated through energy shaping can be robustified using the proposed design [Reference Arpenti, Donaire, Ruggiero and Lippiello32].
This work was carried out within the hybrid zero dynamics (HZD) framework which also served as a starting point for the development of a tracking controller based on IDA-PBC able to guarantee the exponentially fast convergence of suitably defined output dynamics to the HZD manifold [Reference Arpenti, Donaire, Ruggiero and Lippiello33]. The proposed strategy conferred robustness concerning parametric uncertainties to the closed-loop system by assigning desired error dynamics described through the pH formalism, thus preserving passivity.
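The energy pumping-and-damping idea described above (inject energy when the system is below a target energy level, dissipate it when above) can be illustrated on a toy system. The Python sketch below applies it to a unit mass-spring oscillator, not to the compass-like biped, and is not the controller of the cited works: the feedback law `u = -gain * (E - E_target) * v`, the gain, and the function name are illustrative assumptions. Since the energy rate is `dE/dt = v * u = -gain * (E - E_target) * v**2`, the energy is pushed toward the target whenever the velocity is nonzero.

```python
def simulate_energy_regulation(q0, v0, target_energy,
                               gain=1.0, dt=1e-3, duration=40.0):
    """Drive a unit mass-spring oscillator to a target energy level.

    Dynamics (assumed toy model): q'' = -q + u, with E = 0.5*v^2 + 0.5*q^2.
    Returns the energy at the end of the simulation.
    """
    q, v = q0, v0
    for _ in range(int(duration / dt)):
        energy = 0.5 * v**2 + 0.5 * q**2
        # Pump energy if below the target, damp if above.
        u = -gain * (energy - target_energy) * v
        # Semi-implicit (symplectic) Euler integration step.
        v += dt * (-q + u)
        q += dt * v
    return 0.5 * v**2 + 0.5 * q**2


energy_from_below = simulate_energy_regulation(0.2, 0.0, target_energy=0.5)
energy_from_above = simulate_energy_regulation(2.0, 0.0, target_energy=0.5)
```

Both runs are expected to settle close to the target energy, starting either from a lower or a higher initial energy. Note that the origin (`q0 = 0`, `v0 = 0`) is an equilibrium of this toy law, so the oscillator must start away from it.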
On the quadrupedal locomotion side, an estimator of external disturbances independently acting on stance and swing legs was proposed in [Reference Morlando, Teimoorzadeh and Ruggiero39]. Based on the system’s momentum, the estimator was leveraged along with a suitable motion planner for the trajectory of the robot’s center of mass and an optimization problem based on the modulation of ground reaction forces in a whole-body control strategy. Such a control architecture allows the locomotion of a legged robot inside an unstructured environment where collisions could happen and where irregularities in the terrain cause disturbances on legs. When significant forces act on both the center of mass and the robot’s legs, momentum-based observers are insufficient. Therefore, the work in [Reference Morlando and Ruggiero38] proposed a “hybrid” observer, an estimator that combines a momentum-based observer for the angular term and an acceleration-based observer for the translational one, employing directly measurable values from the sensors. An approach based on two observers was also proposed in [Reference Morlando, Lippiello and Ruggiero37], where a framework to control a quadruped robot tethered to a visually impaired person was presented, as illustrated in Fig. 3 (left). Finally, in [Reference Morlando, Selvaggio and Ruggiero18], the problem of non-prehensile object transportation through a legged manipulator was addressed, combining the topics covered in this section. An alternative whole-body control architecture was devised to prevent the sliding of the object placed on the tray at the manipulator’s end-effector while retaining the quadruped robot balance during walking, as shown in Fig. 3 (right). Both contact forces between the tray and the object and between the legs and the ground were kept within their respective friction cones by solving a quadratic optimization problem while achieving the sought transportation task.
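The principle of a momentum-based observer, mentioned several times in this section, can be sketched on the simplest possible case. The Python snippet below considers a single point mass obeying `m * dv/dt = u + f_ext` and builds a residual from the measured momentum and the commanded force only; in continuous time this residual follows `dr/dt = gain * (f_ext - r)`, that is, a first-order filtered estimate of the external force. The one-dimensional model, the gain, and the function name are assumptions made for illustration; the observers in the cited works operate on the full legged-robot dynamics.

```python
def estimate_external_force(mass, control_forces, external_force,
                            gain=50.0, dt=1e-3):
    """Momentum-based residual for a 1-D point mass (illustrative sketch).

    control_forces: sequence of commanded forces, one per time step.
    external_force: unknown constant disturbance, used ONLY to simulate
                    the plant; the observer never reads it directly.
    Returns the list of residual values (force estimates) over time.
    """
    velocity = 0.0
    initial_momentum = mass * velocity
    integral = 0.0
    residual = 0.0
    history = []
    for u in control_forces:
        # Plant simulation (explicit Euler): the disturbance acts here.
        velocity += dt * (u + external_force) / mass
        # Observer: integrate the known force plus the current estimate.
        integral += dt * (u + residual)
        # Residual: mismatch between measured and predicted momentum.
        residual = gain * (mass * velocity - initial_momentum - integral)
        history.append(residual)
    return history


estimates = estimate_external_force(mass=2.0,
                                    control_forces=[1.0] * 2000,
                                    external_force=3.0)
```

In this discrete-time form the estimate converges as long as `0 < gain * dt < 2`; a larger gain gives a faster but noisier estimate when the velocity measurement is corrupted by noise, which is not modeled here.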
3. Aerial robotics
Aerial robotics has been consolidated in the last decade as a research topic of interest for modeling and control, perception, planning, manipulation, and design. As such, it constitutes an effective technological solution for various applications such as inspection and maintenance, search and rescue, transportation and delivery, monitoring and patrolling, or 3D mapping. The maturity level reached in this field has led to the rise of several applications of aerial robots, with a focus on high altitude and challenging access scenarios that human operators cannot easily reach. The time, risk, and cost associated with conventional solutions involving the deployment of heavy vehicles and infrastructures motivate the development of aerial robots capable of quickly reaching these workspaces and performing visual or contact inspection operations. The research community faced two main problems during the deployment of reliable autonomous aerial robots. Firstly, conventional Vertical Takeoff and Landing (VToL) devices, like multirotor Unmanned Aerial Vehicles (UAVs) with parallel axes, faced challenges due to underactuation, impacting stabilization and trajectory tracking. Commonly, a hierarchical controller [Reference Mahony and Hamel46, Reference Nonami, Kendoul, Suzuki and Wang47] addresses this with time-scale separation between linear and angular dynamics. Position and yaw angle of VToL UAVs are flat outputs [Reference Spica, Franchi, Oriolo, Bülthoff and Giordano48], allowing trajectory tracking and solving the underactuated problem. Secondly, as UAV aerodynamic models are complex, these require robust control designs. Most designs incorporated integral action to handle disturbances and cope with uncertainties (e.g., battery level). 
Adaptive controls [Reference Antonelli, Cataldi, Giordano, Chiaverini and Franchi49–Reference Roberts and Tayebi51], force observers [Reference Yüksel, Secchi, Bülthoff and Franchi52], and passivity-based controllers [Reference Egeland and Godhavn53] enhanced robustness. PH methods [Reference Yüksel, Secchi, Bülthoff and Franchi52] and passive backstepping [Reference Ha, Zuo, Choi and Lee54] were explored for improved control. For further exploration, comprehensive literature reviews can be found in [Reference Valvanis55, Reference Valvanis and Vachtsevanos56], among others.
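To make the hierarchical scheme mentioned above more concrete, the following Python sketch shows a generic outer position loop for an underactuated multirotor: a PD law on the position error yields a commanded acceleration, from which the total thrust and a desired attitude (consistent with the desired yaw) are extracted and handed to an inner attitude loop. This is a textbook-style illustration under stated assumptions (world frame with the z-axis up, thrust along the body z-axis, hypothetical gains `kp` and `kd`, hypothetical function name), not the controller of any cited work.

```python
import numpy as np


def outer_position_loop(pos, vel, pos_des, vel_des, acc_des, yaw_des,
                        mass, kp=4.0, kd=3.0, g=9.81):
    """Return (thrust, desired rotation matrix) for the inner attitude loop.

    Model assumed: m * a = -m * g * e3 + thrust * R @ e3, z-axis up.
    """
    e3 = np.array([0.0, 0.0, 1.0])
    # PD position control with acceleration feedforward.
    acc_cmd = acc_des + kp * (pos_des - pos) + kd * (vel_des - vel)
    # Force the thrust vector must provide to realize acc_cmd.
    force = mass * (acc_cmd + g * e3)
    thrust = np.linalg.norm(force)
    # Desired body z-axis points along the required force
    # (singular if the required force is zero).
    z_b = force / thrust
    # Heading direction from the desired yaw
    # (singular if z_b is parallel to x_c).
    x_c = np.array([np.cos(yaw_des), np.sin(yaw_des), 0.0])
    y_b = np.cross(z_b, x_c)
    y_b = y_b / np.linalg.norm(y_b)
    x_b = np.cross(y_b, z_b)
    return thrust, np.column_stack((x_b, y_b, z_b))


zero = np.zeros(3)
hover_thrust, hover_R = outer_position_loop(zero, zero, zero, zero, zero,
                                            yaw_des=0.0, mass=1.5)
move_thrust, move_R = outer_position_loop(zero, zero,
                                          np.array([1.0, 0.0, 0.0]),
                                          zero, zero, yaw_des=0.0, mass=1.5)
```

At hover the sketch returns a thrust equal to the weight and an identity attitude, while a position error along x tilts the desired body z-axis toward that direction. The time-scale separation assumption is that the inner attitude loop tracks the returned rotation much faster than the position dynamics evolve.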
Nowadays, the goal is the development of a new generation of flying service robots capable of supporting human beings in all those activities requiring the ability to interact actively and safely in the air. Challenging fields include inspecting buildings and large infrastructures, sample picking, and remote aerial manipulation. The latter is intended as the grasping, transporting, positioning, assembly and disassembly of mechanical parts, measurement instruments, and any objects performed with aerial vehicles. Indeed, UAVs are currently migrating from passive tasks like inspection, surveillance, monitoring, remote sensing, and so on, to active tasks like grasping and manipulation. UAVs must have the proper tools to accomplish manipulation tasks in the air. The two most adopted solutions are either to mount a gripper or a multi-fingered hand directly on the aerial vehicle, for example, a flying hand, or to equip the UAV with one or more robotic arms, for example, an unmanned aerial manipulator (UAM) as shown in Fig. 4. The UAM could be an efficient solution providing an aerial vehicle capable of performing dexterous manipulation tasks. Surveys regarding aerial manipulation can be found in refs. [Reference Oller, Tognon, Suarez, Lee and Franchi57, Reference Ruggiero, Lippiello and Ollero58].
In the following sections, an overview of the work carried out in aerial vehicle control and aerial manipulation is provided. Table II provides a summary of the recent contributions related to these aspects.
3.1. Control of aerial vehicles
Model-based control of VToL UAVs leverages many simplifications by neglecting several aerodynamic effects whose presence affects the performance of tracking and regulation control problems. Therefore, researchers continually seek robustification techniques to improve tracking and regulation performance.
An estimator of unmodeled dynamics and external wrench acting on the VToL UAV and based on the system’s momentum was employed in [Reference Ruggiero, Cacace, Sadeghian and Lippiello59] to compensate for such disturbances. This estimator can be inserted in standard hierarchical controllers commanding UAVs with a flat propeller configuration. Another estimator, based on a robust extended-state observer, was designed in [Reference Sotos, Cacace, Ruggiero and Lippiello60]. In this case, a UAV with passively tilted propellers was considered. In the case of a UAV with actively tilted propellers, instead, a robust controller is devised in [Reference Sotos, Ruggiero and Lippiello61]. The proposed technique is model-free and based on a hyperbolic controller globally attracting the error signals to an ultimate bound about the origin despite external disturbances.
In the case of a quadrotor, the loss or damage of one propeller can be dramatic for the aerial vehicle’s stable flight. The techniques developed in refs. [Reference Lippiello, Ruggiero and Serra62, Reference Lippiello, Ruggiero and Serra63] can be employed to perform an emergency landing. Both assume that the propeller opposite to the damaged one is turned off, resulting in a bi-rotor configuration in which the yaw is uncontrolled. The former adopts a PID approach, while the latter adopts a backstepping approach to track the emergency landing trajectory in the Cartesian space.
3.2. Aerial manipulation
Four elements mainly constitute a UAM: $i)$ the UAV floating base; $ii)$ the robotic arm(s); $iii)$ the gripper(s) or multi-fingered hand(s) attached at the end-effector of the arm(s); $iv)$ the necessary sensory system. During the flight, the mounted robot arm poses additional issues since its dynamics depend on the actual configuration state of the whole system. There are two approaches to addressing planning and control problems for a UAM. The former is a “centralized” approach in which the UAV and the robotic arm are considered a unique entity. Thus, the planner and the controller are designed from the complete kinematic and dynamic models. The latter approach considers the UAV and the robotic arm as separate independent systems. The effects of the arm on the aerial vehicle can then be considered external disturbances and vice versa [Reference D’Ago, Selvaggio, Suarez, Gañán, Buonocore, Di Castro, Lippiello, Ollero and Ruggiero64, Reference Ruggiero, Trujillo, Cano, Ascorbe, Viguria, Peréz, Lippiello, Ollero and Siciliano65].
Aerial manipulation is now almost a reality in inspection and maintenance applications, particularly non-destructive test (NDT) measurements (see Fig. 4). In this scenario, ultrasonic probes are used to retrieve the wall thickness of a surface to prove the integrity of the material without compromising its internal structure. These tests are performed by placing the inspection probe in fixed contact with the surface under examination. Currently, NDT measurements are performed by humans who must climb high scaffolding or use tools like man-lifts, cranes, or rope-access systems to reach the inspection location. Therefore, improving NDT inspection operations is fundamental to raising human safety and decreasing the economic costs of inspection procedures. The platforms presented in refs. [Reference Cacace, Fontanelli and Lippiello66, Reference Cacace, Silva, Fontanelli and Lippiello67] are possible solutions to address NDT measurements in challenging plants. There, a robotic arm was used for pipe inspection. Besides this, UAMs can interact with humans and help them in daily activities, becoming efficient aerial coworkers, particularly for working at height in inspection and maintenance activities that still require human intervention. Therefore, as the application range of drones increases, the possibility of sharing the human workspace also increases. Hence, it becomes paramount to understand how the interaction between humans and drones is established. The work in [Reference Cuniato, Cacace, Selvaggio, Ruggiero and Lippiello68] went in this direction by implementing a hardware-in-the-loop simulator for human cooperation with an aerial manipulator. The simulator provided the user with realistic haptic feedback for a human-aerial manipulator interaction activity. The forces exchanged between the hardware interface and the human/environment were measured and supplied to a dynamically simulated aerial manipulator.
In turn, the simulated aerial platform fed back its position to the hardware allowing the human to feel and evaluate the interaction effects. Besides human-aerial manipulator cooperation, the simulator contributed to developing and testing autonomous control strategies in aerial manipulation.
Autonomous aerial manipulation tasks can also be accomplished thanks to the use of exteroceptive sensing for an image-based visual impedance control that allows realizing physical interaction of a dual-arm UAM equipped with a camera and a force/torque sensor [Reference Lippiello, Fontanelli and Ruggiero69]. The design of a hierarchical task-composition framework for controlling a UAM, which integrates the main benefits of both image-based and position-based control schemes into a unified hybrid-control framework, was presented in [Reference Lippiello, Cacace, Santamaria-Navarro, Andrade-Cetto, Trujillo, Esteves and Viguria25]. Aerial manipulation tasks enabled by the proposed methods include the autonomous installation of clip bird diverters on high-voltage lines through a drone equipped with a sensorized stick to realize a compliant interaction with the environment [Reference D’Angelo, Pagano, Ruggiero and Lippiello70]. Besides enabling safer human operations, such an application has the significant impact of reducing collisions with wires by $50$ to $90\%$, saving tens of thousands of birds’ lives during their migrations.
4. Physical human-robot interaction
By performing physical actions, robots can help humans in their jobs and daily lives [Reference Selvaggio, Cognetti, Nikolaidis, Ivaldi and Siciliano71]. This is useful in several applications ranging from physical assistance to disabled or elderly people to reduction of risks and fatigue at work. However, an intuitive, safe, and reliable interaction must be established for the robot to become an ideal proximal or remote assistant/collaborator. In the following sections, we review recent work in this direction. Table III provides a summary of the recent contributions in this field.
4.1. Proximal collaborative execution of structured tasks
While collaborative robotic platforms ensuring safe and compliant physical HRI are spreading in service robotics applications, the collaborative execution of structured tasks still poses relevant research challenges [Reference Johannsmeier and Haddadin72]. An effective and fluent human-robot collaboration during the execution of structured activities should support both cognitive and physical interaction. In these settings, operators and robots continuously estimate their reciprocal intentions to decide whether to commit to shared activities, when to switch towards different tasks, or how to regulate compliant interactions during co-manipulation operations. In refs. [Reference Cacace, Caccavale, Finzi and Grieco73, Reference Cacace, Caccavale, Finzi and Lippiello74], we addressed these issues by proposing a human-robot collaborative framework which seamlessly integrates task monitoring, task orchestration, and task-situated interpretation of the human physical guidance (see Fig. 5 (e)) during the joint execution of hierarchically structured manipulation activities. In this setting, task orchestration and adaptation occur simultaneously with the interpretation of the human interventions. Depending on the assigned tasks, the supervisory framework enables potential subtasks, targets, and trajectories, while the human guidance is monitored by LSTM networks that classify the physical interventions of the operator. When the human guidance is assessed as aligned with the planned activities, the robotic system can keep executing the current activities, while suitably adjusting subtasks, targets, or motion trajectories following the corrections provided by the operator. Within this collaborative framework, different modalities of human-robot collaboration (human-guided, task-guided, balanced) were explored and assessed in terms of their effectiveness and user experience during the interaction.
4.2. Remote collaboration via shared control
Physical interactions between humans and robots are exploited to perform common or independent tasks. When the two parts work together to achieve a common goal, the robotic system may integrate some degree of autonomy aimed to help the human in executing the task, ensuring better performance, safety, and ergonomics. We refer to these as shared control or shared autonomy scenarios, with the latter considered as the case in which the autonomy level is possibly varying [Reference Selvaggio, Cognetti, Nikolaidis, Ivaldi and Siciliano71]. Broadly speaking, there is a spectrum of possible interactions between humans and robots, from robots having full autonomy to none at all [Reference Goodrich and Schultz75]. As full autonomy still poses a problem for robotic systems when dealing with unknown or complex tasks in unstructured and uncertain scenarios [Reference Yang, Cambias, Cleary, Daimler, Drake, Dupont, Hata, Kazanzides, Martel, Patel, Santos and Taylor76], shared control becomes useful to improve the task performance while not increasing the human operator workload [Reference Kanda and Ishiguro77]. Research about shared control focuses on the extent of human intervention in the control of artificial systems, splitting the workload between the two [Reference Schilling, Burgard, Muelling, Wrede and Ritter78]. The extent of human intervention, and thus robot autonomy, has usually been classified into discrete levels [Reference Bruemmer, Dudenhoeffer and Marble79–Reference Kortenkamp, Keirn-Schreckenghost and Bonasso81], with fewer studies considering a continuous domain [Reference Anderson, Peters, Iagnemma and Overholt82, Reference Desai and Yanco83]. 
Commonly, some shared control techniques aim to fully or partially replace a function, such as identifying objects in cluttered environments [Reference Pitzer, Styer, Bersch, DuHadway and Becker84], while others start from a fully autonomous robot and give control to the user only in difficult situations [Reference Dias, Kannan, Browning, Jones, Argall, Dias, Zinck, Veloso and Stentz80, Reference Kortenkamp, Keirn-Schreckenghost and Bonasso81, Reference Sellner, Simmons and Singh85]. Some studies assist the operator by predicting their intent while selecting among different targets [Reference Dragan and Srinivasa86, Reference Javdani, Srinivasa and Bagnell87], while others exploit haptic feedback/guidance techniques while moving toward a specific target [Reference Aarno, Ekvall and Kragic88, Reference Crandall and Goodrich89].
Shared control/autonomy may take several forms and make use of a wide spectrum of methodologies depending on the application scenario. For example, when a human has to perform a complex manipulation task in a remote area by means of a dual-arm system, shared control methods may be designed to reduce the number of degrees of freedom controlled by the user while ensuring the task’s feasibility [Reference Selvaggio, Abi-Farraj, Pacchierotti, Giordano and Siciliano90]. In this way, the task execution becomes inherently less demanding both physically and cognitively. With the same aim, the autonomy and the human may be in charge of tasks having different priorities. In these cases, the tasks are usually organized hierarchically in a stack. Here too, since the human controls only one task involving a minimum number of degrees of freedom, controlling the robotic system becomes less fatiguing [Reference Selvaggio, Giordano, Ficuciello and Siciliano91]. In remote applications, the user’s perception and awareness of the environment are usually hindered by the limited field of view provided by the remotely installed vision sensors (see Fig. 6 (a)). For this reason, it is beneficial to exploit additional communication channels (besides the visual one) to convey information about the state of the remote system/environment.
Haptic guidance is usually employed in this case to increase the awareness of the robotic system state by displaying computed forces through a haptic device, which is also used to send commands to the robotic system. Haptic guidance may inform the user about the proximity to the system’s constraints (e.g., joint limits, singularities, collisions, etc.), suggesting motion directions that are free from constraints and safe for the task execution [Reference Selvaggio, Abi-Farraj, Pacchierotti, Giordano and Siciliano90, Reference Selvaggio, Giordano, Ficuciello and Siciliano91]. This may also be used to direct the user towards grasping poses that avoid constraints during post-grasping task trajectories [Reference Selvaggio, A.Ghalamzan, Moccia, Ficuciello and Siciliano94]. In addition to this, haptic guidance in the form of virtual fixtures may be employed when the application requires following paths with high precision, such as in hazardous industrial scenarios [Reference Selvaggio, Notomista, Chen, Gao, Trapani and Caldwell93] (see Fig. 6 (b)) or in surgical dissection scenarios [Reference Selvaggio, Fontanelli, Ficuciello, Villani and Siciliano92] (see Fig. 6 (c)). More recently, we have developed shared control methods for a remote robotic system performing a dynamic non-prehensile object transportation task, where haptic guidance was used to inform the user about proximity to the sliding condition [Reference Selvaggio, Cacace, Pacchierotti, Ruggiero and Giordano19] (see Fig. 6 (d)).
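As a minimal sketch of how such haptic cues might be computed, the snippet below implements a spring-damper guidance virtual fixture that attracts the tool toward the closest point of a piecewise-linear path, together with a repulsive cue that activates inside a margin around the joint limits. The function names, gains, and margin are illustrative assumptions and are not taken from the cited works, which may use different formulations.

```python
import numpy as np

def closest_point_on_path(p, path):
    """Return the point of a piecewise-linear path that is closest to p."""
    p = np.asarray(p, dtype=float)
    path = np.asarray(path, dtype=float)
    best, best_d = None, np.inf
    for a, b in zip(path[:-1], path[1:]):
        ab = b - a
        denom = float(ab @ ab)
        # Projection parameter, clamped so the point stays on segment [a, b]
        t = 0.0 if denom == 0.0 else float(np.clip((p - a) @ ab / denom, 0.0, 1.0))
        q = a + t * ab
        d = np.linalg.norm(p - q)
        if d < best_d:
            best, best_d = q, d
    return best

def guidance_force(p, v, path, k=200.0, b=5.0):
    """Spring-damper force pulling the tool toward the path (hypothetical gains)."""
    q = closest_point_on_path(p, path)
    return k * (q - np.asarray(p, dtype=float)) - b * np.asarray(v, dtype=float)

def joint_limit_cue(q, q_min, q_max, margin=0.2, k=10.0):
    """Cue pushing each joint away from its limits, active only inside the margin."""
    q = np.asarray(q, dtype=float)
    q_min = np.asarray(q_min, dtype=float)
    q_max = np.asarray(q_max, dtype=float)
    low = np.clip(margin - (q - q_min), 0.0, None)    # depth inside the lower margin
    high = np.clip(margin - (q_max - q), 0.0, None)   # depth inside the upper margin
    return k * (low - high)
```

In a teleoperation loop, forces of this kind would be rendered on the haptic device at each control cycle; mapping joint-space cues to the device frame is omitted here.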
5. AI and cognitive robotics
In order for a robot to autonomously or cooperatively perform complex tasks in the real world, its control system should be endowed with cognitive capabilities enabling deliberation, execution, learning, and perception in dynamic, interactive, and unstructured environments [Reference Rodriguez-Guerra, Sorrosal, Cabanes and Calleja95, Reference Schultheis and Cooper96]. Cognitive robotics [Reference Beetz, Beßler, Haidu, Pomarlan, Bozcuoğlu and Bartels97, Reference Lemaignan, Warnier, Sisbot, Clodic and Alami98] is concerned with these issues, proposing architectures and methods for seamlessly integrating sensorimotor, cognitive, and interaction abilities in autonomous/interactive robots. Exploring these topics involves various research areas across AI and robotics. Flexible orchestration, execution, and monitoring of structured tasks is a particularly relevant aspect of robotics [Reference Beßler, Porzel, Pomarlan, Beetz, Malaka and Bateman99, Reference de la Cruz, Piater and Saveriano100]. Current AI and robotics literature mostly relies on integrated planning and execution frameworks to address adaptive execution of complex activities [Reference Carbone, Finzi, Orlandini and Pirri101, Reference Karpas, Levine, Yu and Williams102]. On the other hand, cognitive control models and methods [Reference Botvinick, Braver, Barch, Carter and Cohen103–Reference Cooper and Shallice105] can be deployed to improve robot autonomy as well as HRI performance. In this direction, we are currently investigating these methods to develop a cognitive control framework suitable for human-robot collaboration. Another relevant issue we are concerned with is the combination of symbolic and sub-symbolic approaches to incremental task learning [Reference Petrík, Tapaswi, Laptev and Sivic106, Reference Ramirez-Amaro, Yang and Cheng107] and task and motion planning [Reference Mansouri, Pecora and Schüller108]. 
Table IV provides an overview of recent research activities related to these aspects. These works and results are further described and discussed in the following sections.
5.1. Flexible and collaborative execution of multiple tasks
An autonomous and collaborative robotic system is expected to flexibly execute multiple structured tasks while adeptly handling unexpected events and behaviors. In cognitive psychology and neuroscience, the executive mechanisms needed to support flexible, adaptive responses, and complex goal-directed cognitive processes are associated with the concept of cognitive control [Reference Botvinick, Braver, Barch, Carter and Cohen103]. Despite their relevance in cognitive science, cognitive control models have seldom been integrated into robotic systems. In this regard, we aim at combining classic AI and machine learning methods with cognitive control mechanisms to support flexible and situated adaptive orchestration of robotic activities as well as task planning and learning. In particular, we rely on a supervisory attentional system (SAS) [Reference Cooper and Shallice105, Reference Norman and Shallice122] to orchestrate the execution of hierarchically organized robotic behaviors. This paradigm seems particularly effective for both flexible plan execution and human-robot collaboration, in that it provides attention mechanisms considered as pivotal not only for task switching and regulation but also for human-human communication. Following this approach, we are currently developing a robotic cognitive control framework, based on the SAS paradigm, enabling the orchestration and execution of multiple tasks, collaborative execution of structured tasks, and incremental task learning [Reference Caccavale and Finzi114]. In this direction, we proposed and developed a practical attention-based executive framework (see (a) in Fig. 5), suitable for real-world collaborative robotic systems, which is also compatible with AI methods for planning, execution, learning, and HRI/communication. 
We showed that the proposed framework supports flexible orchestration of multiple concurrent, hierarchically organized tasks [Reference Caccavale and Finzi111, Reference Caccavale and Finzi112] and natural human-robot collaborative execution of structured activities [Reference Caccavale and Finzi114], in that it allows fast and adaptive responses to unexpected events while reducing replanning [Reference Caccavale, Cacace, Fiore, Alami and Finzi110] and supporting task-situated interpretation of the human interventions [Reference Cacace, Caccavale, Finzi and Lippiello74, Reference Caccavale, Leone, Lucignano, Rossi, Staffa and Finzi115] (e.g., human pointing gestures as in (b) in Fig. 5). Attentional mechanisms are also effective in improving users’ situation awareness and interpretation of robot behaviors by regulating or adjusting human-robot communication depending on the executive context [Reference Cacace, Caccavale, Finzi and Lippiello109] or in supporting explainability during human-robot collaboration [Reference Caccavale and Finzi113].
5.2. Task learning and teaching
Attention-based task supervision and execution provide natural and effective support to task teaching and learning from demonstrations [Reference Caccavale and Finzi114]. In [Reference Caccavale, Saveriano, Finzi and Lee117], we proposed a framework enabling kinesthetic teaching of hierarchical tasks starting from abstract/incomplete descriptions: the human physical demonstration (as in (c) in Fig. 5) is segmented into low-level controllers while a supervisory attentional system associates the generated segments to the abstract task structure, providing it with concrete/executable primitives. In this context, attentional manipulation (object or verbal cueing) can be exploited by the human to facilitate the matching between (top-down) proposed tasks/subtasks and (bottom-up) generated segments/models. Such an approach was also extended to the imitation learning of dual-arm structured robotic tasks [Reference Caccavale, Saveriano, Fontanelli, Ficuciello, Lee and Finzi118]. Attentional top-down and bottom-up regulations can also be learned from demonstration. In [Reference Caccavale and Finzi116], robotic task structures are associated with a multi-layered feed-forward neural network whose nodes/edges represent actions/relations to be executed, thus combining neural-based learning and symbolic activities. Multi-robot task learning issues were also explored. In [Reference Caccavale, Ermini, Fedeli, Finzi, Lippiello and Tavano119], a deep Q-learning reinforcement approach was proposed to guide a group of sanitizing robots in cleaning railway stations with dynamic priorities. This approach was also extended to prioritized cleaning with heterogeneous teams of robots [Reference Caccavale, Ermini, Fedeli, Finzi, Lippiello and Tavano120].
5.3. Combined task and motion planning
Task and motion planning in robotics are typically handled by separate methods, with high-level task planners generating abstract actions and motion planners specifying concrete motions. These two planning processes are, however, strictly interdependent, and various approaches have been proposed in the literature to efficiently generate combined plans [Reference Mansouri, Pecora and Schüller108]. Recently, we started to investigate how sampling-based methods such as Rapidly Exploring Random Trees (RRTs), commonly employed for motion planning, can be leveraged to generate task and motion plans within a metric space where both symbolic (task) and sub-symbolic (motion) spaces are represented [Reference Caccavale and Finzi121]. The notion of distance defined in this extended metric space is then exploited to guide the expansion of the RRT to generate plans including both symbolic actions and feasible movements in the configuration space (see (d) in Fig. 5). Empirical results collected in mobile robotics case studies suggest that the approach is feasible in realistic scenarios and that its benefits are more pronounced in complex and cluttered environments.
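To illustrate how a distance over a combined symbolic/geometric space can drive the nearest-node selection of an RRT, a minimal sketch follows. The state encoding (a set of symbolic facts paired with a configuration tuple), the weighted-sum distance, and the names `hybrid_distance` and `nearest` are assumptions made for illustration only; the metric actually adopted in the cited work is not specified here and may differ.

```python
import math

def hybrid_distance(a, b, w_sym=1.0, w_geo=1.0):
    """Distance between two hybrid states, each a pair (facts, config).

    facts:  frozenset of symbolic facts holding in the state.
    config: tuple of floats (robot configuration).
    The weighted sum used below is an assumed, illustrative metric.
    """
    facts_a, conf_a = a
    facts_b, conf_b = b
    sym = len(facts_a ^ facts_b)       # number of facts on which the states differ
    geo = math.dist(conf_a, conf_b)    # Euclidean distance in configuration space
    return w_sym * sym + w_geo * geo

def nearest(tree, sample, **weights):
    """Select the tree node closest to the sample, as in an RRT extension step."""
    return min(tree, key=lambda node: hybrid_distance(node, sample, **weights))

# Usage: two nodes, one differing symbolically from the sample, one only geometrically
tree = [
    (frozenset({"holding(box)"}), (0.0, 0.0)),
    (frozenset(), (3.0, 4.0)),
]
sample = (frozenset(), (3.0, 3.0))
closest = nearest(tree, sample)
```

Under this assumed metric, the weights `w_sym` and `w_geo` trade off how strongly the tree expansion favors symbolic progress over geometric proximity.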
6. Industrial robotics
In industry, logistics aims at optimizing the flow of goods within large-scale distribution. The task of unloading carton cases from a pallet, usually referred to as depalletizing, poses several technological challenges [Reference Echelmeyer, Kirchheim and Wellbrock123] due to the heterogeneous nature of the cases, which can present different dimensions, shapes, weights, and textures. This is the case in supermarkets where the products are stored on mixed pallets, which are pallets made of heterogeneous cases. On the other hand, the literature mainly focuses on the easier task of depalletizing homogeneous pallets, which are pallets made of standardized and equal cases. For instance, AI-enabled depalletizing systems were proposed to address problems of motion planning [Reference Sakamoto, Harada and Wan124] and safety [Reference Jocas, Kurrek, Zoghlami, Gianni and Salehi125]. In [Reference Nakamoto, Eto, Sonoura, Tanaka and Ogawa126], the use of target plane extraction from depth images and package border detection via brightness images was proposed to recognize various packages stacked in complex arrangements. A similar perception system can also be found in [Reference Schwarz, Milan, Periyasamy and Behnke127], where a deep-learning approach that combines object detection and semantic segmentation was applied to pick bins in cluttered warehouse scenarios. In this case, a specific data-reduction method was deployed to reduce the dimension of the dataset, but several images of objects are still needed, impairing its usage by non-expert operators. Moreover, in [Reference Katsoulas and Kosmopoulos128] a system comprising an industrial robot and time-of-flight laser sensors was used to perform the depalletizing task. 
Some examples of specific gripping solutions developed to address both depalletizing and palletizing tasks (the task of loading cases to assemble a pallet) in highly structured industrial environments include: the robotic manipulator proposed in [Reference Krug, Stoyanov, Tincani, Andreasson, Mosberger, Fantoni and Lilienthal129], the suction systems applied on an autonomous robot capable of picking standard boxes from the upper side and placing them on a conveyance line proposed in [Reference Nakamoto, Eto, Sonoura, Tanaka and Ogawa126, Reference Tanaka, Ogawa, Nakamoto, Sonoura and Eto130], as well as the flexible robotic palletizer presented in [Reference Moura and Silva131]. Table V provides an overview of the work done in this field.
6.1. Logistics
A common activity in logistics is to depalletize goods from shipping pallets. This task, which is hard and uncomfortable for human operators, is often performed by robotic depalletizing systems. These automated solutions are very effective in well-structured environments. However, there are more complex situations, such as depalletizing of mixed pallets in supermarkets, which still represent a challenge for robotic systems. In recent years, we studied the problem of depalletizing mixed and randomly organized pallets by proposing a robotic depalletizing system [Reference Caccavale, Arpenti, Paduano, Fontanellli, Lippiello, Villani and Siciliano132] integrating attentional mechanisms from Sec. 5 to flexibly schedule, monitor, and adapt the depalletizing process considering online perceptual information from non-invasive sensors as well as high-level constraints that can be provided by supervising users or management systems.
Such flexible depalletizing processes also require strong perceptive capabilities. To this end, in [Reference Arpenti, Caccavale, Paduano, Fontanelli, Lippiello, Villani and Siciliano133] a single-camera system was proposed, where RGB-D data were used for the detection, recognition, and localization of heterogeneous cases, both textured and untextured, in a mixed pallet. Specifically, a priori information about the content of the pallet (the product barcode, the number of instances of a given product case in the pallet, the dimensions of the cases, and the images of the textured cases) was combined with data from the RGB-D camera, exploiting a pipeline of 2D and 3D model-based computer vision algorithms, as shown in Fig. 7, left. The integration of such a system into logistic chains was simplified by the small dataset required, based only on the images of the cases in the current pallet, and on a single image from a single RGB-D sensor.
In addition to cognitive and perceptual capabilities, depalletizing robotic systems also require a high degree of dexterity to effectively grasp mixed cases with complex shapes. In [Reference Fontanelli, Paduano, Caccavale, Arpenti, Lippiello, Villani and Siciliano134], we proposed a sensorized gripper, designed to be assembled on the end-tip of an industrial robotic arm, that allowed grasping of cases either from above or from the lateral sides and was capable of adapting its shape online to different sizes of products.
7. Medical robotics
Medical robotics is a fast-growing field that integrates the principles of robotics with healthcare to advance medical procedures and enhance patient outcomes. Its primary objective is to develop cutting-edge robotic systems, devices, and technologies that cater to a wide range of medical domains, including surgery, rehabilitation, diagnosis, and patient care. In the realm of medical robotics, surgical robotics stands out as a specialized field dedicated to the development and application of robotic systems in surgical procedures. In this context, prioritizing safety is crucial, especially in robotic systems categorized as critical, where it serves as a fundamental design focus. In the quest for heightened safety and decreased cognitive burden, the shared control paradigm has played a crucial role, notably with the integration of active constraints. This methodology has given rise to specialized applications like Virtual Fixtures (VFs), which have garnered increasing popularity in recent years [Reference Bowyer, Davies and Baena135]. VFs act as virtual overlays, delivering guidance and support to surgeons during procedures and offering a diverse array of functionalities. When integrated with haptic feedback or guidance, the use of VFs in surgical teleoperated robots frequently offers active assistance to the surgeon through force rendering at the master side. As an example, Li et al. introduced an online collision avoidance method for the real-time interactive control of a surgical robot in complex environments, like the sinus cavities [Reference Li, Ishii and Taylor136]. The push for autonomous tasks in surgery stems from a drive to enhance precision and efficiency while relieving surgeons of cognitive workload in minimally invasive procedures. The advancement of surgical robots frequently entails the creation of innovative control laws using constrained optimization techniques [Reference Marinho, Adorno, k. and Mitsuishi137]. 
Ensuring the safety of robots in dynamic environments has been significantly aided by the emergence of the Control Barrier Functions (CBFs) framework, as highlighted in [Reference Ames, Coogan, Egerstedt, Notomista, Sreenath and Tabuada138]. Advances in surgical robotics research extend beyond software applications, encompassing the development of hardware devices designed to streamline surgeons’ tasks and elevate their performance capabilities. A motorized hand offers an ergonomic alternative, and recent sensor designs prioritize force sensing for advantages in robotic surgery, such as injury reduction and palpation empowerment [Reference Kim, Kim, Seok, So and Choi139, Reference Lee, Kim, Gulrez, Yoon, Hannaford and Choi140]. In addition to surgical applications, medical robotic research has also advanced the development of sophisticated devices for artificial limbs. Drawing inspiration from the human hand, robotic hands have incorporated compliance and sensors through various technological solutions to enhance robustness by absorbing external impact and improve capabilities in object grasping and manipulation [Reference Catalano, Grioli, Farnioli, Serio, Piazza and Bicchi141, Reference Piazza, Catalano, Godfrey, Rossi, Grioli, Bianchi, Zhao and Bicchi142]. Table VI provides a classification of the recent contributions to the field.
7.1. Surgical robotics
Surgical robotics transformed surgery, progressing from open to minimally invasive and robot-assisted procedures. While open surgery involves large incisions and minimally invasive surgery uses small incisions, robot-assisted surgery utilizes robotic systems to enhance patient outcomes by reducing trauma, recovery times, and risks. However, there are ongoing constraints in accuracy, speed, dexterity, flexibility, and specialized skills. Research and development efforts are dedicated to overcoming these limitations and expanding the applications of robotic systems. Safety in surgical procedures is paramount, and advanced control systems with active constraints like VFs enhance safety and reduce cognitive load. VFs provide virtual guidance and assistance to surgeons through simulated barriers (Forbidden Regions Virtual Fixtures – FRVFs) and attractive forces (Guidance Virtual Fixtures – GVFs), improving surgical outcomes. A novel approach was employed for the precise dissection of polyps in surgical procedures, ensuring accurate detection of the region of interest and high-precision cutting with safety margins [Reference Moccia, Selvaggio, Villani, Siciliano and Ficuciello143]. The method utilized a control approach based on GVFs to constrain the robot’s motion along the dissection path. VFs were created using computer vision techniques, extracting control points from surgical scene images and dynamically updating them to adapt to environmental changes. The effectiveness of the approach was validated through experiments on the da Vinci Research Kit (dVRK) robot, an open-source platform based on the famous da Vinci® Surgical System. In the context of enhancing the suturing process with the dVRK robot, a similar approach was introduced, leveraging vision-based tracking techniques for precise needle tracking [Reference Selvaggio, A.Ghalamzan, Moccia, Ficuciello and Siciliano94]. 
The system was applied in conjunction with the haptic VF control technique using dVRK, mitigating the risk of joint limits and singularities during suturing. The optimal grasp pose was utilized to calculate force cues that guided the user’s hand through the Master Tool Manipulator. The paper in [Reference Moccia, Iacono, Siciliano and Ficuciello144] presented an example of FRVF application in the form of a surgical tools collision avoidance method. FRVFs were utilized to prevent tool collisions by generating a repulsive force for the surgeon. A marker-less tool tracking method employing a deep neural network architecture for tool segmentation was adopted (see Fig. 8). This work proposed the use of an Extended Kalman Filter for pose estimation to enhance the robustness of VF application on the tool by incorporating both vision and kinematics information. Software applications are moving also toward increasing the autonomy in surgical robotics. For instance, the paper in [Reference Moccia and Ficuciello148] presented an autonomous endoscope control algorithm for the dVRK’s Endoscopic Camera Manipulator in surgical robotics. It employed Image-based Visual Servoing (IBVS) with additional constraints enforced by CBFs to ensure instrument visibility and prevent joint limit violations. Laparoscopic images were used, and deep learning was applied for semantic segmentation. The algorithm configured an IBVS controller and solved a convex optimization problem to satisfy the constraints. The solutions mentioned earlier were tested in a simulated environment using the CoppeliaSim software, with a particular focus on the presentation of the dVRK simulator [Reference Ferro, Brunori, Magistri, Saiella, Selvaggio and Fontanelli149, Reference Fontanelli, Selvaggio, Ferro, Ficuciello, Vendittelli and Siciliano150].
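To make the idea of enforcing constraints through CBFs more concrete, the sketch below shows a minimal safety filter for a single barrier function. It assumes single-integrator dynamics (the command `u` is directly the velocity), a class-K function that is linear with gain `alpha`, and one constraint only, for which the underlying quadratic program admits a closed-form solution. The nominal command `u_nom` stands in for the output of any nominal controller (for example, an IBVS law); the function name and parameter values are illustrative assumptions and do not reproduce the formulation of the cited work.

```python
import numpy as np

def cbf_filter(u_nom, grad_h, h, alpha=1.0):
    """Minimally modify u_nom so that grad_h @ u + alpha * h >= 0.

    Closed-form solution of  min ||u - u_nom||^2  subject to the constraint
    above, valid for a single constraint and dynamics x_dot = u (assumed).
    """
    u_nom = np.asarray(u_nom, dtype=float)
    grad_h = np.asarray(grad_h, dtype=float)
    slack = grad_h @ u_nom + alpha * h
    if slack >= 0.0:
        return u_nom                       # nominal command already satisfies the CBF condition
    norm_sq = grad_h @ grad_h
    if norm_sq == 0.0:
        raise ValueError("constraint is infeasible: zero gradient with h < 0")
    # Project the nominal command onto the constraint boundary along grad_h
    return u_nom - slack * grad_h / norm_sq

# Usage: keep joint 0 below its upper limit, with h(q) = q_max - q_0
q, q_max = np.array([0.9, 0.0]), 1.0
h = q_max - q[0]                   # distance to the limit (positive when safe)
grad_h = np.array([-1.0, 0.0])     # gradient of h with respect to q
u_safe = cbf_filter([1.0, 0.5], grad_h, h, alpha=2.0)
```

With several simultaneous constraints (e.g., visibility and joint limits together), no such closed form is available in general and a numerical QP solver would be needed.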
Research advancements in surgical robotics encompass not only software applications but also the development of hardware devices that aim to facilitate surgeons’ jobs and enhance their performance. The MUSHA Hand II, a multifunctional surgical instrument with underactuated soft fingers [Reference Ghafoor, Dai and Duffy151] and force sensors, was integrated into the da Vinci® robotic platform [Reference Liu, Selvaggio, Ferrentino, Moccia, Pirozzi, Bracale and Ficuciello145–Reference Selvaggio, Fontanelli, Marrazzo, Bracale, Irace, Breglio, Villani, Siciliano and Ficuciello147], shown in Fig. 8. This innovative hand enhances the adaptability and functionality of the surgical system, addressing limitations in force sensing during robot-assisted surgery. Experimental validation was performed on the dVRK robotic testbed. The works in refs. [Reference Fontanelli, Selvaggio, Buonocore, Ficuciello, Villani and Siciliano23, Reference Sallam, Fontanelli, Gallo, La Rocca, Di Spiezio Sardo, Longo and Ficuciello152] introduced a novel single-handed needle driver tool inspired by human hand-rolling abilities, including a working prototype tested with the dVRK surgical system. Robotic solutions are also created to address specific surgical procedures, like prostate cancer biopsy. The paper in [Reference Coevoet, Adagolodjo, Lin, Duriez and Ficuciello153] presented a robotic solution for transrectal prostate biopsy, showcasing a soft-rigid robot manipulator with an integrated probe-needle assembly. The system included manual positioning of the probe and autonomous alignment of the needle, along with MRI-US fusion for improved visualization. Experimental validation was conducted using prostate phantoms.
7.2. Robotic hands and prosthesis
Robotic artificial limbs have played a crucial role in aiding individuals with missing body parts to regain functionality in their daily life activities. The PRISMA Hand II, depicted in Fig. 9, represented a mechanically robust anthropomorphic hand with high underactuation, utilizing three motors to drive 19 joints through elastic tendons. Its distinctive mechanical design facilitated adaptive grasping and in-hand manipulation, complemented by tactile/force sensors embedded in each fingertip. Based on optoelectronic technology, these sensors provided valuable tactile/force feedback during object manipulation, particularly for deformable objects. The paper in [Reference Canbay, Ferrentino, Liu, Moccia, Pirozzi, Siciliano and Ficuciello154] detailed the hand’s mechanical design and sensor technology and proposed a calibration procedure for the tactile/force sensors. It included a comparison of various neural network architectures for sensor calibration, experimental tests to determine the optimal tactile sensing suite, and demonstrations of force regulation effectiveness using calibrated sensors. The paper also introduced a virtual simulator for users to undergo training sessions in controlling the prosthesis. Surface Electromyographic (sEMG) sensors captured muscle signals from the user, which were processed by a recognition algorithm to interpret the patient’s intentions [Reference Leccia, Sallam, Grazioso, Caporaso, Di Gironimo and Ficuciello155].
8. Future Directions
8.1. Dynamic manipulation and locomotion
Manipulation and locomotion represent two research areas that require explicit or implicit control of the interaction forces and the enforcement of the related frictional constraints. Mastering in-contact situations through accurate force regulation will allow legged or service robots of the future to perform several difficult tasks with unprecedented precision and robustness [Reference Gong, Sun, Nair, Bidwai, R., Grezmak, Sartoretti and Daltorio156]. These include dealing with time-varying or switching contacts with the environment and manipulating or locomoting on articulated, foldable, or even continuously deformable surfaces. In both fields, the synthesis of novel mechanisms is always a meaningful aspect [Reference Jia, Huang, Li, Wu, Cao and Guo157, Reference Jia, Huang, Wang and Li158]. Solving complex tasks requiring simultaneous locomotion and manipulation (commonly referred to as loco-manipulation) using, for example, quadruped robots equipped with an arm, is a very active topic of research. Future works should focus on optimizing the robustness of loco-manipulation trajectories against unknown external disturbances or develop control techniques for safe interaction with humans [Reference Bellicoso, Krämer, Stäuble, Sako, Jenelten, Bjelonic and Hutter159, Reference Ferrolho, Ivan, Merkt, Havoutis and Vijayakumar160]. This will raise the need for improving proprioceptive and exteroceptive perception techniques to accurately retrieve the actual state of the robot and the environment in contact. The combined use of multiple vision, force and tactile sensors, and fusion techniques constitute a promising approach in this direction [Reference Costanzo, Natale and Selvaggio161]. Another future research direction includes the development of improved policy representation and learning or planning frameworks to handle difficult tasks. 
In other words, finding mappings from the task requirements and sensor feedback to controller inputs for in-contact tasks remains difficult. The development of an accurate yet fast physics engine to simulate in-contact tasks with constrained environments will facilitate this and allow for better policy transfer, so that difficult tasks can be learned in simulation before being deployed in the real world.
8.2. Aerial robotics
Energy efficiency, safety in interactions with people and objects, accuracy, and reliable decisional autonomy remain significant limitations of aerial systems. Future challenges involve power consumption and short-lived batteries, while uncertified devices prompt safety restrictions. Several roadmaps emphasize the need for aerial devices to function in real-world scenarios, facing inclement weather and requiring proper certifications. Mechatronics is crucial for UAMs. Despite progress, challenges persist in enhancing safety and energy efficiency. Integrating mechanical design and control is essential, yet research on the optimal positioning of grasping tools for UAMs is lacking. Hybrid mechatronic solutions are potential avenues for improvement.
Opportunities come from inspection and maintenance tasks for aerial manipulators, such as replacing human operators in remote locations, handling hazardous tasks, and increasing plant inspections. Achieving these goals requires addressing the issues outlined above and improving environmental performance. While aerial manipulation activities are still carried out primarily in academia, recent European-funded projects like AIRobots, ARCAS, SHERPA, EuRoC, Aeroworks, AEROARMS, AERO-TRAIN, and AERIAL-CORE aim to bridge the gap between academia and industry. The AEROARMS project received the European Commission Innovation Radar Prize, a recognition of the advancements achieved. However, migrating the technology to industry remains a challenging journey.
8.3. Physical human-robot interaction
In future work, the proposed HRI frameworks can be extended to integrate multiple interaction modalities other than physical. For instance, visual and audio feedback may provide additional information about the robot’s state to improve readability, safety, and reliability during the assisted modes. In addition, gesture-based and speech-based interaction modalities may complement physical interaction to enable more natural human-robot communication, while enhancing the robustness of intention estimation.
8.4. AI and cognitive robotics
In our ongoing research activities, we aim to develop an integrated robotic executive framework supporting long-term autonomy in complex operative scenarios. For this purpose, our goal is to investigate incremental task teaching and adaptation methods, progressing from primitive to complex robotic tasks. In this direction, symbolic and sub-symbolic learning methods can be integrated to simultaneously learn hierarchical tasks, sensorimotor processes, and attention regulations through human demonstrations and environmental interaction. In this setting, effective mechanisms are also needed to retrieve and reuse learned tasks depending on the operational and the environmental context. Concerning natural human-robot collaboration, we are currently investigating additional attention mechanisms (e.g., joint attention, active perception, affordances, etc.) that play a crucial role in supporting task teaching and adaptive execution. Regarding combined task and motion planning methods, our aim is to formulate more sophisticated metrics and to address hierarchically structured tasks of mobile manipulation.
8.5. Industrial robotics
As a future research direction, the flexible and adaptive architecture for depalletizing tasks in supermarkets proposed in [Reference Caccavale, Arpenti, Paduano, Fontanelli, Lippiello, Villani and Siciliano132] will also be extended to palletizing tasks or other industrial scenarios, such as packaging [Reference Dai and Caldwell162]. Moreover, more complex environmental conditions along with more sophisticated task structures including safety constraints and fault detection/correction will be investigated. Regarding the vision side, the segmentation accuracy as well as the depalletization speed of the algorithms deployed in the framework [Reference Arpenti, Caccavale, Paduano, Fontanelli, Lippiello, Villani and Siciliano133] will be exhaustively compared with the performance of convolutional neural networks and support vector machines. Besides, multiple images from different perspectives will be exploited in a multi-camera approach to better estimate the poses of the cases. Regarding the gripping tool [Reference Fontanelli, Paduano, Caccavale, Arpenti, Lippiello, Villani and Siciliano134], more compact suction systems will be developed to find the best tradeoff between dimensions, weight, and effectiveness for each type of product.
8.6. Medical robotics
Charting the course for the future of medical robotics, especially in the surgical domain, entails a pivotal shift towards the incorporation of cutting-edge AI techniques. This evolution seeks to broaden the applicability of proposed methodologies to embrace realistic surgical scenarios, effectively navigating challenges posed by tissue deformation and occlusions. Rigorous studies on medical procedures will be conducted to precisely define safety standards, ensuring a meticulous approach to healthcare practices. As a conclusive step, collaborative validation with surgeons will serve as a tangible testament to the effectiveness of the proposed pipelines, affirming their real-world impact in enhancing surgical precision and safety. In the realm of advancing robotic surgical instruments and artificial limbs, future trajectories point towards expanding the capabilities of proposed devices to cater to more specific scenarios. This evolution involves a strategic integration of tailored characteristics, incorporating cutting-edge sensing technologies and intelligent control strategies. Having demonstrated the potential applications of these devices, the ongoing endeavor is to refine their design for optimal performance across an array of surgical tasks. The ultimate objective lies in seamlessly transferring these innovations from the realm of development to practical clinical applications, ushering in a new era of enhanced surgical precision and functional prosthetic applications.
9. Conclusion
In this article, we reviewed the main results achieved by the robotics research carried out at the PRISMA Lab of the University of Naples Federico II during the last decade. After a brief introduction, the key contributions to the six research areas of dynamic manipulation and locomotion, aerial robotics, physical HRI, AI and cognitive robotics, industrial robotics, and medical robotics were reported and discussed together with future research directions. We highlighted the main achievements in each of these areas, categorizing the adopted methodologies and the key contributions in the fields.
Our dream and goal for the future is to make scientific and technological research advancements in all the considered areas more accessible to other people around the world who may be able to use them for their purposes or needs. From this, significant breakthroughs are expected in the future for the industrial, health, educational, economic, and social sectors.
Author contribution
Bruno Siciliano conceived the article. Mario Selvaggio, Rocco Moccia, Pierluigi Arpenti, Riccardo Caccavale, and Fabio Ruggiero wrote the manuscript under the supervision of the rest of the authors who reviewed and edited it.
Financial support
The research leading to these results has been partially supported by the following projects: COWBOT, grant 2020NH7EAZ_002, PRIN 2020; AI-DROW, grant 2022BYSBYX, PRIN 2022 PNRR, European Union – NextGenerationEU; Harmony, grant 101017008, European Union’s Horizon 2020; Inverse, grant 101136067, and euROBIN, grant 101070596, European Union’s Horizon Europe; BRIEF, IC IR0000036, National Recovery and Resilience Plan, Mission 4 Component 2 Investment 3.1 of Italian Ministry of University and Research funded by the European Union – NextGenerationEU.
The views and opinions expressed are only those of the authors and do not necessarily reflect those of the funding agencies.
Competing interests
The authors declare that no competing interests exist.
Ethical approval
None.