1. Introduction
Today, robots are used in a wide range of applications. They show great advantages in industrial settings where repetitive work is performed, owing to their low cost, high efficiency and low scrap rate [1]. In areas where motion trajectories do not repeat, however, robots have difficulty completing tasks. A classic example is handwriting. Handwriting is unavoidable in human life and plays an important role in scenarios such as teaching and business, both of which require signatures [2, 3]. English words are made up of 26 letters arranged in a row, whereas Chinese characters are two-dimensional pictures made up of strokes written in specific positions and superimposed on each other [4]. How to encode Chinese characters is therefore a challenge, and the robotic writing of Chinese characters has been studied by numerous researchers. Moreover, the stroke structure of Chinese characters has evolved over millennia to suit human cognition and provides a well-established way to segment Chinese characters. Consequently, stroke segmentation can also be applied to tasks that require superimposed trajectory images to be segmented into executable trajectories, such as the grinding and cleaning of complex recesses in workpieces. Writing is also a delicate manipulation task that is difficult to accomplish under interference. Therefore, there is an urgent need to design an interference-resistant robotic Chinese character writing system for diverse applications.
The first step in imitating human writing is to decompose Chinese characters into strokes. Currently, three methods are widely applied for stroke extraction: (a) computer font reproduction, (b) imitation of human writing trajectories and (c) Chinese character image decomposition. Computer font reproduction uses the stroke information that comes with a standard font library [5–8] or the handling of word posters [9–13], but these methods are database dependent and cannot imitate the individual writing of a particular person. Imitation of human writing trajectories includes recording the trajectory of the pen tip while a person writes [14, 15], using motion-sensing input devices to capture the gestures and trajectory of the human hand [4], and physical contact demonstrations [16]. However, these methods rely on human teaching and cannot break characters down into strokes from pictures. Chinese character image decomposition analyses the strokes of Chinese characters directly from pictures, without relying on a character library or human teaching. It is the most difficult of the three methods but is widely used, and it has been studied by many researchers.
Consequently, many methods have been developed, such as corner detection [17, 18], the point-to-boundary orientation distance of a triangular mesh [19], character library template matching segmentation [20, 21], B-spline curve matching [22], stroke extraction using ambiguous zone information [18], stroke extraction using optimum paths [11], and stroke speed feature and stroke vector feature segmentation [23]. However, these methods are computationally expensive. There is still a need for a simple picture-based method to extract strokes from Chinese characters.
The robot can write the corresponding Chinese characters according to the extracted strokes. However, to eliminate interference during the writing process, an algorithm that can adapt the writing trajectory to the position of the writing board is required. The dynamic movement primitive (DMP) is an effective method for modelling robot movement behaviour and biological phenomena [24]. It has the advantage of being stable, simple and easy to generalise: a trajectory can be generalised simply by modifying the positions of the start and target points, and the shape of the generalised trajectory is the same as that of the original demonstration. Because of these advantages, DMPs have been widely used, and many studies have improved their learning and generalisation capabilities [25]. In [26], a Gaussian mixture model and Gaussian mixture regression were integrated to enhance the ability of DMPs to learn skills from multiple demonstrations. In [27], the learned skills are split into a series of sub-skills, which improves their generalisation ability. In [28], the DMP framework is extended to accommodate real-time changes during task execution. In [29], a Riemannian-based DMP framework is proposed to learn and generalise multi-space data. In [30], a biomimetic controller is integrated with DMPs to facilitate the learning and adaptation of compliant skills. The authors of [31] proposed a new DMP formulation to solve the problems arising from the spatial scaling of DMPs in Cartesian space.
The authors of [32] used Riemannian metrics to reformulate DMPs such that the resulting DMPs can be directly employed with quantities expressed as Symmetric Positive Definite (SPD) matrices. The study in [33] introduced a deep neural network into DMPs to address the invalidation of the DMP forcing term. The authors of [34] developed a framework for the robot to learn both movement and muscle stiffness features. However, all these methods are difficult to adapt to the case where the start and target points are rotated simultaneously. The writing trajectory of the robot may therefore be distorted when the writing board is rotated and flipped.
At the same time, the contact force between the pen tip and the writing board is an important piece of information during writing, especially when disturbed. Therefore, it is extremely necessary to design a writing system that combines stroke segmentation, trajectory generalisation, visual information and force information.
To address the above issues, the following work has been done. Firstly, inspired by the human mindset of splitting strokes, a direction-based algorithm for extending and stitching stroke components was designed to extract the strokes of a Chinese character picture in a simple and fast way. Subsequently, some improvements were made to the DMPs. The rotation and translation matrices were fused into the framework of the DMPs so that the DMPs could accommodate the simultaneous rotation of the start and target points, i.e., the rotation and flipping of the writing board. Finally, an interference-resistant robotic Chinese character writing system was developed. The system uses segmented strokes as demonstrations and then combines the rotation and translation matrices calculated from visual information with the improved DMPs to generalise the trajectories in real time. At the same time, the position is regulated online using admittance control based on force sensor information. Thus, the robot can write complete characters on the writing board even when a human randomly moves, rotates and flips the board.
The rest of this article is organised as follows. Section 2 describes the proposed method, including stroke division, force-position hybrid control and the designed system. The stroke division algorithm is divided into stroke component extraction and stroke component connection. Modified DMPs and admittance control are introduced under force-position hybrid control. Finally, an interference-resistant robotic writing system is designed based on these algorithms. Section 3 provides experimental examples to demonstrate the effectiveness of the stroke division algorithm, the modified DMPs, the admittance control and the designed system. Section 4 is the conclusion.
2. Proposed method
2.1. Stroke division
Stroke division is one of the difficulties in the study of Chinese characters because it is hard to handle the connections and intersections in Chinese characters. When the endpoints of two strokes are at the same point, as shown in Fig. 1, the point is defined as a connection. Failure to recognise this point correctly tends to cause multiple strokes to be misidentified as a single stroke, as shown in Fig. 2. The intersection is the point formed when two strokes cross, and improper handling of this point results in incorrect recognition of the strokes, as shown in Fig. 3.
The Chinese character strokes are shown in Fig. 4. Analysing the strokes of Chinese characters, we can see that each stroke can be divided into one or more approximately straight curves. These curves can be summarised as dot, horizontal, vertical, left-falling and right-falling, as shown in Fig. 5, and are named stroke components. Thus, each Chinese character stroke can be formed by combining one or more stroke components. For example, as shown in Fig. 6, the stroke “dot” consists of the single stroke component “dot”. The stroke “vertical-turning” can be decomposed into the stroke component “vertical” followed by the stroke component “horizontal”. As another example, the stroke “horizontal-break-hook” can be seen as a combination of a “horizontal”, a “vertical” and a “dot”.
As each stroke component extends in a single direction with no turning points, a stroke component can be extracted once its direction of extension is known. Determining the direction of extension also resolves the issue of connections and intersections. A turning place connects two stroke components, so identifying turns is important in stroke component extraction. An intersection occurs where strokes extend in different directions. Once the writing directions of the different stroke components are determined, each stroke component can be extended across the intersection in its own direction, which prevents misidentification at the intersection. Therefore, the stroke extraction algorithm has two parts: stroke components extraction and stroke components connection.
2.1.1. Stroke components extraction
Firstly, the handwritten Chinese character images were binarised and refined (thinned to a skeleton). The algorithm for stroke components extraction operates on a refined image of a white Chinese character on a black background. The image is traversed to find a white pixel, which is recorded as the starting point $P_{sta}$ and treated as $P_{0}$. As shown in Fig. 7, for the eight-neighbourhood centred on $P_{0}$, we set $P_{0}$ as the origin and the side length of a pixel as the unit length. The pixels in this eight-neighbourhood are therefore named by the coordinates at which they are located. Starting from the pixel above $P_{0}$, the pixels in a clockwise circle are named $P_{n}=(x_n,y_n)$ $(n=1,2,3,\ldots,8)$. The process of extracting a stroke component is shown in Fig. 8. Define a direction vector $\textbf{D} =(\alpha,\beta )$ with an initial value of $(0,0)$. $\textbf{D}$ describes the direction from the start pixel to the end pixel of the stroke component searched so far. The values of $\textbf{D}$ are normalised to obtain the coordinate point $P_{D}$ in the eight-neighbourhood coordinate system.
The distance between $P_{D}=(x_D,y_D)$ and each pixel $P_{n}$ in the eight-neighbourhood is
$$d_n=\sqrt{(x_n-x_D)^2+(y_n-y_D)^2},\quad n=1,2,\ldots,8.$$
Arrange $d_1, d_2, \ldots, d_8$ in ascending order. To ensure that the stroke component extends in the direction indicated by $\textbf{D}$, search in sequence the $P_{n}$ corresponding to the four smallest $d_{n}$. If all four of these $P_{n}$ are black, the search stops. At the first white $P_{n}$ found, extend the stroke component to $P_{n}$ and update the direction vector
$$\textbf{D}_{\textbf{new}}=\textbf{D}+(x_n,y_n).$$
Subsequently, $P_{n}$ is used as the next $P_{0}$ and $\textbf{D}_{\textbf{new}}$ as the next $\textbf{D}$ in order to continue extending the stroke component until the search stops.
However, the starting point $P_{sta}$ is not always an endpoint of the stroke component. So after the extension along direction $\textbf{D}$ is complete, the direction vector $\textbf{D}$ needs to be reversed
$$\textbf{D}_{\textbf{new}}=-\textbf{D}.$$
The stroke component is then extended in the opposite direction from the starting point $P_{sta}$, and finally, the two curves are joined to form a complete stroke component.
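The directional search described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation, and it rests on several assumptions: the refined image is represented as a set of white pixel coordinates with the $y$ axis pointing up, $\textbf{D}$ is normalised to unit Euclidean length, ties in distance keep the clockwise naming order, and the direction vector is updated by accumulating the step taken. The function names `candidates`, `extend` and `trace_component` and the `visited` guard against closed loops are introduced here for illustration only.

```python
import math

# Eight neighbours named clockwise, starting from the pixel above P0 (y axis up).
NEIGHBOURS = [(0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1), (-1, 0), (-1, 1)]


def candidates(direction):
    """Return the four neighbours closest to P_D, the normalised direction D."""
    ax, ay = direction
    norm = math.hypot(ax, ay)
    pd = (ax / norm, ay / norm) if norm > 0 else (0.0, 0.0)
    # sorted() is stable, so equal distances keep the clockwise naming order.
    ranked = sorted(NEIGHBOURS, key=lambda p: math.hypot(p[0] - pd[0], p[1] - pd[1]))
    return ranked[:4]


def extend(white, start, direction):
    """Extend from `start` along `direction` until all four candidates are black."""
    path, visited = [], {start}  # `visited` is an added guard against closed loops
    p0, d = start, direction
    while True:
        for dx, dy in candidates(d):
            nxt = (p0[0] + dx, p0[1] + dy)
            if nxt in white and nxt not in visited:
                path.append(nxt)
                visited.add(nxt)
                d = (d[0] + dx, d[1] + dy)  # assumed update: accumulate the step
                p0 = nxt
                break
        else:
            return path, d


def trace_component(white, start):
    """Extend forwards from P_sta, reverse D, extend backwards, then join both curves."""
    forward, d = extend(white, start, (0, 0))
    backward, _ = extend(white, start, (-d[0], -d[1]))
    return backward[::-1] + [start] + forward


# A cross: the horizontal component is traced straight through the intersection.
plus = {(x, 0) for x in range(-2, 3)} | {(0, y) for y in range(-2, 3)}
horizontal = trace_component(plus, (-2, 0))
```

Because only the four candidates nearest to $P_{D}$ are inspected, the pixel just left behind is never among them once $\textbf{D}$ is non-zero, and at the intersection the pixel straight ahead is tried before the pixels of the crossing component.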
To prevent the same stroke component from being searched repeatedly, each pixel is blacked out after it has been recorded as a point in a stroke component. However, this creates an interruption problem. As shown in Fig. 9, an intersection point is shared by multiple stroke components, and when the intersection is blacked out, any other stroke component passing through it is interrupted there. Therefore, we use two identical refined images $I_{1}$ and $I_{2}$ of white characters on a black background. Image $I_{1}$ is blacked out at the recorded pixels and is used to find the starting point $P_{sta}$. This prevents repeated searching of the same stroke component. Image $I_{2}$ is not blacked out at the recorded pixels and is used to extend the stroke component, so that stroke components are not interrupted at the intersection point.
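The two-image bookkeeping can be illustrated independently of the tracing rule. In the sketch below, `i1` plays the role of $I_{1}$ and `i2` the role of $I_{2}$; the tracer `straight_run` is a deliberately simple stand-in (it follows only horizontal or vertical runs) and is not the direction-based search of the paper. The scan order used to pick $P_{sta}$ and all function names are assumptions made for this example.

```python
def straight_run(white, start):
    """Stand-in tracer: the horizontal run through `start`, or the vertical run
    if `start` has no horizontal neighbour. Hypothetical helper for the demo."""
    x, y = start
    step = (1, 0) if ((x - 1, y) in white or (x + 1, y) in white) else (0, 1)
    run = [start]
    for sign in (1, -1):
        px, py = x + sign * step[0], y + sign * step[1]
        while (px, py) in white:
            run.append((px, py))
            px, py = px + sign * step[0], py + sign * step[1]
    return sorted(run)


def extract_components(white_pixels, trace=straight_run):
    """Pick start points from I1 (blacked out as we go) and extend on I2 (untouched)."""
    i1 = set(white_pixels)        # I1: recorded pixels are removed ("blacked out")
    i2 = frozenset(white_pixels)  # I2: never modified, so intersections stay white
    components = []
    while i1:
        # Assumed scan order: top row first, then left to right (y axis up).
        start = min(i1, key=lambda p: (-p[1], p[0]))
        component = trace(i2, start)
        components.append(component)
        i1.difference_update(component)
        i1.discard(start)  # guard in case a tracer omits its own start point
    return components


# A cross whose two components share the pixel (0, 0).
plus = {(x, 0) for x in range(-2, 3)} | {(0, y) for y in range(-2, 3)}
components = extract_components(plus)
```

With a single image, the horizontal bar would be cut at $(0,0)$ once the vertical bar had been recorded; here both components keep all five pixels, and each is still found only once.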
2.1.2. Stroke components connection
When the end point of stroke component $SC_{1}$ and the start point of stroke component $SC_{2}$ are connected, $SC_{1}$ and $SC_{2}$ will make a stroke. Therefore, the starting and ending points of the stroke components need to be identified, i.e. the points between the connected stroke components. According to the conventional writing rules, the strokes are written from left to right or from top to bottom. The specific sorting algorithm is shown in Algorithm 1.
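Algorithm 1 is not reproduced here, so the following is only a sketch of the ordering rule stated above, under assumptions of my own: a component whose horizontal extent exceeds its vertical extent is ordered left to right, any other component is ordered top to bottom, the $y$ axis points up, and two points count as connected when they lie within one pixel of each other. The names `orient_component`, `connected` and `join_if_connected` are hypothetical.

```python
def orient_component(points):
    """Order a component left-to-right, or top-to-bottom if it is mostly vertical."""
    (x_a, y_a), (x_b, y_b) = points[0], points[-1]
    if abs(x_b - x_a) > abs(y_b - y_a):
        reverse = x_b < x_a  # mostly horizontal: the start must be the leftmost end
    else:
        reverse = y_b > y_a  # mostly vertical: the start must be the topmost end
    return points[::-1] if reverse else list(points)


def connected(p, q, tol=1):
    """Assumed connection test: Chebyshev distance of at most `tol` pixels."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1])) <= tol


def join_if_connected(sc1, sc2):
    """Join SC1 and SC2 into one stroke when the end of SC1 meets the start of SC2."""
    if not connected(sc1[-1], sc2[0]):
        return None
    tail = sc2[1:] if sc2[0] == sc1[-1] else sc2  # drop a duplicated shared pixel
    return sc1 + tail


# A "horizontal" followed by a "vertical" forms one turning stroke.
horizontal = orient_component([(2, 5), (1, 5), (0, 5)])
vertical = orient_component([(2, 3), (2, 4), (2, 5)])
stroke = join_if_connected(horizontal, vertical)
```

A left-falling component that runs from top right to bottom left is treated as mostly vertical under this rule whenever its vertical extent is at least its horizontal extent, so it is still written from the top.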
However, there is a special Chinese stroke called “hook”, as shown in Fig. 10, which is written from bottom to top and from right to left. Similar to the stroke component “dot”, it is much shorter than the other stroke components. It is also connected to other stroke components. Thus, a stroke component is considered to be a “hook” if it satisfies the following conditions:
1. the end point is connected to the end points of the other stroke components;
2. the start point is not connected to the end points of any of the other stroke components;
3. the length is less than one-half of the average stroke component length.
When combining strokes, the points of the “hook” need to be reverse ordered and then connected to other stroke components.
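The three conditions translate directly into a predicate. The sketch below assumes that components have already been ordered left to right or top to bottom, that "end points of the other stroke components" means either extremity of another component, that two points are connected when they lie within one pixel of each other, and that the average length is taken over all components. The helper names are hypothetical.

```python
import math


def connected(p, q, tol=1):
    """Assumed connection test: Chebyshev distance of at most `tol` pixels."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1])) <= tol


def length(component):
    """Polyline length of a component given as an ordered list of pixels."""
    return sum(math.dist(a, b) for a, b in zip(component, component[1:]))


def is_hook(index, components):
    """Check the three hook conditions for components[index]."""
    comp = components[index]
    other_ends = [p for i, c in enumerate(components) if i != index for p in (c[0], c[-1])]
    mean_length = sum(length(c) for c in components) / len(components)
    end_touches = any(connected(comp[-1], p) for p in other_ends)      # condition 1
    start_free = not any(connected(comp[0], p) for p in other_ends)    # condition 2
    is_short = length(comp) < 0.5 * mean_length                        # condition 3
    return end_touches and start_free and is_short


def fix_hooks(components):
    """Reverse the point order of every component identified as a hook."""
    return [c[::-1] if is_hook(i, components) else list(c) for i, c in enumerate(components)]


# A vertical stroke with a short hook at its lower end (y axis up).
vertical = [(5, y) for y in range(10, -1, -1)]
hook = [(3, 2), (4, 1), (5, 0)]  # ordered left to right by the default rule
fixed = fix_hooks([vertical, hook])
```

After reversal the hook starts at the bottom of the vertical component and ends at its free tip, so it can be appended to the vertical component in writing order.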
2.2. Force-position hybrid control
For a robotic writing system, pure position control provides limited performance, e.g. the written characters are not smooth and do not resemble human handwriting. To write smoothly on the writing board, the contact force between the pen tip and the writing board needs to be controlled. Therefore, we use force-position hybrid control. A modified DMP is used to control the position and posture of the robot’s end-effector. At the same time, the contact force between the pen tip and the writing board is controlled using admittance control. As a result, the robot can perform writing tasks.
2.2.1. Modified DMPs
Based on a spring-damper model, DMPs use a non-linear forcing term to achieve the desired point attractor behaviour. They have the properties of a second-order dynamical system, such as convergence to the target point, robustness to disturbances and generalisation in time and space. In addition, the non-linear term can be shaped to produce a continuous smooth trajectory. DMPs are classified as discrete DMPs and periodic DMPs. We use discrete DMPs to model writing skills. The discrete DMP model can be expressed as:
where $\tau$ is the temporal scaling constant, and changing it allows the trajectory to generalise in the time dimension. $\alpha$ and $\beta$ represent the damping factor and the spring constant, respectively, and $\alpha$ can generally be set to $\alpha =\beta ^{2}/4$ . $y_{g}$ is the target position. $y$ is the position of the demonstration. $v$ is the velocity of the demonstration. The forcing term $f(x)$ can be expressed as:
where $\omega _i$ is the weight of the Gaussian kernel function, $y_0$ is the starting point of the trajectory, $c_i$ and $h_i$ are the centre and width of the Gaussian kernel function, respectively, $N$ denotes the number of Gaussian kernel functions, and $\alpha _x$ is a positive gain coefficient.
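As a numerical illustration of how these quantities interact, the sketch below integrates a one-dimensional discrete DMP. It assumes a commonly used form, $\tau \dot{v} = K(y_g - y) - Dv + f(x)$, $\tau \dot{y} = v$, $\tau \dot{x} = -\alpha_x x$, with $f(x)$ a normalised sum of Gaussian kernels scaled by $x(y_g - y_0)$. The gains are named `stiffness` and `damping` here and chosen for critical damping; these names, the numerical values and the Euler integration are assumptions of this example rather than details taken from the paper.

```python
import math


def forcing_term(x, weights, centres, widths, y0, yg):
    """Normalised Gaussian-kernel mixture, scaled by the phase x and by (yg - y0)."""
    psi = [math.exp(-h * (x - c) ** 2) for c, h in zip(centres, widths)]
    total = sum(psi)
    if total == 0.0:
        return 0.0
    return sum(p * w for p, w in zip(psi, weights)) / total * x * (yg - y0)


def rollout(y0, yg, weights, centres, widths, tau=1.0, stiffness=100.0,
            alpha_x=4.0, dt=0.001, duration=3.0):
    """Integrate the DMP with a semi-implicit Euler scheme and return the positions."""
    damping = 2.0 * math.sqrt(stiffness)  # critical damping (assumed choice)
    y, v, x = y0, 0.0, 1.0
    trajectory = [y]
    for _ in range(int(round(duration / dt))):
        f = forcing_term(x, weights, centres, widths, y0, yg)
        v += dt * (stiffness * (yg - y) - damping * v + f) / tau
        y += dt * v / tau
        x += dt * (-alpha_x * x) / tau  # canonical system: the phase decays to zero
        trajectory.append(y)
    return trajectory


# Kernel centres spread along the decaying phase; weights are illustrative only.
centres = [math.exp(-4.0 * i / 4) for i in range(5)]
widths = [25.0] * 5
plain = rollout(0.0, 1.0, [0.0] * 5, centres, widths)
shaped = rollout(0.0, 1.0, [200.0, -150.0, 100.0, -50.0, 0.0], centres, widths)
```

Both rollouts end at the target because the forcing term vanishes with the phase $x$; only the path taken differs, which is what the weights $\omega_i$ encode when they are fitted to a demonstration.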
However, as shown in Fig. 11, when the start and target points are rotated, the original DMP cannot rotate the generalised trajectory in its original shape and the trajectory is distorted. To apply DMPs to the case where both the start and target points are rotated, we have improved the original DMPs, inspired by [31]. The generalised space is obtained by rotating the original space. Once the rotation matrix between the two spaces is known, the corresponding rotation can be applied to the non-linear term. Thus, equation (9) becomes
where $y_{g,1}$ and $y_{g,0}$ are the target points in the generalised space and the original space, respectively. $y_{0,1}$ and $y_{0,0}$ are the starting points in the generalised space and the original space, respectively. $s$ is the scaling constant. $\textbf{R}$ is the rotation matrix and $\textbf{t}$ is the translation matrix. Throughout this article, $\textbf{R}$ and $\textbf{t}$ are calculated from visual information.
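The effect of rotating the non-linear term can be checked numerically. The two-dimensional sketch below is one reading of the idea, not the paper's exact equation: it assumes that start and target points are mapped by $y_{\cdot,1} = s\textbf{R}y_{\cdot,0} + \textbf{t}$, and that the forcing term is evaluated in the original space and then multiplied by $s\textbf{R}$. The DMP form, gains and all function names are assumptions carried over for illustration.

```python
import math


def rotate(R, p):
    """Apply a 2x2 rotation matrix to a 2-D point or vector."""
    return (R[0][0] * p[0] + R[0][1] * p[1], R[1][0] * p[0] + R[1][1] * p[1])


def to_generalised(s, R, t, p):
    """Map a point of the original space into the generalised space: s*R*p + t."""
    q = rotate(R, p)
    return (s * q[0] + t[0], s * q[1] + t[1])


def kernel_mix(x, weights, centres, widths):
    """Normalised Gaussian-kernel mixture times the phase x, for one dimension."""
    psi = [math.exp(-h * (x - c) ** 2) for c, h in zip(centres, widths)]
    total = sum(psi)
    return sum(p * w for p, w in zip(psi, weights)) / total * x if total > 0.0 else 0.0


def rollout(y0, yg, weights, centres, widths, s=1.0, R=((1.0, 0.0), (0.0, 1.0)),
            t=(0.0, 0.0), tau=1.0, stiffness=100.0, alpha_x=4.0, dt=0.001, steps=3000):
    """Roll out a 2-D DMP whose forcing term is rotated and scaled by s*R."""
    damping = 2.0 * math.sqrt(stiffness)
    start, goal = to_generalised(s, R, t, y0), to_generalised(s, R, t, yg)
    y, v, x = list(start), [0.0, 0.0], 1.0
    trajectory = [tuple(y)]
    for _ in range(steps):
        # Forcing term evaluated in the ORIGINAL space, then rotated.
        f0 = tuple(kernel_mix(x, weights[k], centres, widths) * (yg[k] - y0[k])
                   for k in range(2))
        f = rotate(R, f0)
        for k in range(2):
            v[k] += dt * (stiffness * (goal[k] - y[k]) - damping * v[k] + s * f[k]) / tau
            y[k] += dt * v[k] / tau
        x += dt * (-alpha_x * x) / tau
        trajectory.append(tuple(y))
    return trajectory


centres = [math.exp(-4.0 * i / 4) for i in range(5)]
widths = [25.0] * 5
weights = [[200.0, -150.0, 100.0, -50.0, 0.0], [-80.0, 120.0, -60.0, 30.0, 0.0]]
theta = 0.7
R = ((math.cos(theta), -math.sin(theta)), (math.sin(theta), math.cos(theta)))
original = rollout((0.0, 0.0), (1.0, 0.5), weights, centres, widths)
moved = rollout((0.0, 0.0), (1.0, 0.5), weights, centres, widths, s=1.5, R=R, t=(2.0, -1.0))
```

Because the dynamics are linear in position, velocity and forcing term, every point of `moved` coincides with the transformed point of `original`, so the shape is preserved exactly. Scaling the forcing term per axis by the generalised $(y_{g,1} - y_{0,1})$, as the original formulation does, would not have this property under rotation.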
2.2.2. Admittance control
The admittance model is defined as follows:
$$M_{A}\ddot{x}_{A}+D_{A}\dot{x}_{A}+K_{A}(x_{A}-x_{0})=f_{h},$$
where $M_{A}$ is the mass matrix, $D_{A}$ is the damping matrix and $K_{A}$ is the stiffness matrix. $f_h$ is the traction force of the demonstrator. $x_A$ is the pose of the robot end-effector, $\dot{x}_{A}$ is its velocity and $\ddot{x}_{A}$ is its acceleration. $x_0$ is the original point of the pose. The robot end-effector needs to move on the writing board during the writing process, so $x_A=x_0$ and the stiffness term vanishes. The admittance model can be simplified as:
$$M_{A}\ddot{x}_{A}+D_{A}\dot{x}_{A}=f_{h}.$$
Therefore, the acceleration can be calculated from the traction force $f_h$, and the velocity and position can then be obtained by integration to give the trajectory. In this article, $f_h$ is the contact force between the pen tip and the writing board, with an expected value of 2 N.
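The integration step can be illustrated with a one-dimensional simulation along the board normal. The sketch makes several assumptions that are not in the text: the admittance is driven by the difference between the desired 2 N and the measured contact force, the force sensor is replaced by a linear-spring board model, and the mass, damping and board stiffness values are arbitrary. The function name `simulate_admittance` is hypothetical.

```python
def simulate_admittance(z0, desired_force=2.0, mass=1.0, damping=50.0,
                        board_stiffness=1000.0, dt=0.001, duration=3.0):
    """Regulate the contact force by integrating mass*a + damping*v = force error.

    z is the pen-tip penetration into the board in metres (negative: not touching).
    Returns the contact force at every time step.
    """
    z, v = z0, 0.0
    forces = []
    for _ in range(int(round(duration / dt))):
        # Stand-in for the force sensor: a linear spring that acts only in contact.
        contact_force = board_stiffness * max(z, 0.0)
        forces.append(contact_force)
        acceleration = (desired_force - contact_force - damping * v) / mass
        v += dt * acceleration  # integrate acceleration to velocity
        z += dt * v             # integrate velocity to position
    return forces


# Start 5 mm above the board (0 N), and start pressed in at 6 N.
from_free_space = simulate_admittance(-0.005)
from_six_newton = simulate_admittance(0.006)
```

In both runs the contact force settles at the desired 2 N: the pen approaches the board in the first case and backs away from it in the second.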
2.3. Designed system
Based on the above approach, we have designed a multi-sensor-based robotic writing system. As shown in Fig. 12, at the demonstration stage, the strokes are extracted from the demonstrations. During the trajectory generalisation phase, the camera detects changes in the pose of the writing board and calculates the rotation matrix $\textbf{R}$ and the translation matrix $\textbf{t}$. The DMP model calculates the generalised trajectory points based on $\textbf{R}$ and $\textbf{t}$ and feeds the trajectory points into the admittance control model. The admittance control algorithm calculates the execution position based on the given trajectory points and real-time force information and outputs the execution position to the robot. The execution position will differ from the desired position, and in the next step this difference is used to correct the trajectory points in advance to improve accuracy.
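The text does not state how $\textbf{R}$ and $\textbf{t}$ are computed from the camera data, so the following is only one possible sketch. It assumes that the three-dimensional positions of three board corners are available in the camera frame before and after the board moves, builds an orthonormal frame from each triple, and recovers the rigid transform with $p_{\text{cur}} = \textbf{R}p_{\text{ref}} + \textbf{t}$. All function names are hypothetical, and noise handling is omitted.

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])


def unit(a):
    n = math.sqrt(sum(x * x for x in a))
    return tuple(x / n for x in a)


def board_frame(c0, c1, c3):
    """Orthonormal axes of the board: x along c0->c1, z along the normal, y = z cross x."""
    x = unit(sub(c1, c0))
    z = unit(cross(x, sub(c3, c0)))
    y = cross(z, x)
    return (x, y, z)


def pose_change(ref_corners, cur_corners):
    """Return (R, t) such that p_cur = R * p_ref + t for points fixed to the board."""
    a = board_frame(*ref_corners)
    b = board_frame(*cur_corners)
    # R maps each reference axis a[k] onto the current axis b[k].
    R = [[sum(b[k][i] * a[k][j] for k in range(3)) for j in range(3)] for i in range(3)]
    rotated_origin = tuple(sum(R[i][j] * ref_corners[0][j] for j in range(3)) for i in range(3))
    t = sub(cur_corners[0], rotated_origin)
    return R, t


# A 0.3 m by 0.2 m board that is flipped about its x edge and then shifted.
ref = [(0.0, 0.0, 0.0), (0.3, 0.0, 0.0), (0.0, 0.2, 0.0)]
cur = [(0.1, 0.2, 0.3), (0.4, 0.2, 0.3), (0.1, 0.0, 0.3)]
R, t = pose_change(ref, cur)
```

For the flipped board in this example, the recovered rotation is a half turn about the $x$ axis. A fourth corner, where available, could be used to average out measurement noise, which this sketch does not attempt.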
3. Experiment
3.1. Stroke extraction
To verify the accuracy of Chinese character stroke extraction, we randomly selected 100 handwritten Chinese characters with different structures and complexities for experiments. As shown in Fig. 13, the experiments show that the algorithm works well. Statistically, 93% of the strokes of handwritten Chinese characters can be extracted accurately, with an average speed of 0.4 s per image. The characters boxed in Fig. 13 are those with inaccurate stroke recognition. After studying them, we divided them into three categories. Category 1, boxed in red, is broken strokes: a single stroke has been split into multiple strokes or is incompletely recognised. Category 2, boxed in green, is misconnected strokes: a stroke has been incorrectly connected to another stroke. Category 3, boxed in blue, is redundant strokes, which take the form of burrs consisting of just one or two pixels. Analysis of these three categories shows that almost all recognition errors originate in the refinement process. When inaccurate refinement changes the direction of the strokes, deformations appear at the intersection and junction points of the Chinese character’s skeleton, and errors in category 1 are likely to occur. During writing, variations in pen pressure or ink leakage can change the thickness of strokes, which produces jaggies in the refinement process, and errors in category 3 are likely to occur. Errors in category 2 are caused by inaccurate refinement and by poor writing in which multiple interconnected or intersecting stroke components share a consistent direction.
3.2. Modified DMPs
There are two methods to generalise a trajectory in real time in response to disturbances. Method 1 first generalises the new trajectory using the original DMPs and then adjusts the new trajectory using the rotation and translation matrices, $\textbf{R}$ and $\textbf{t}$. Method 2 uses the modified DMPs to generalise the trajectory directly based on $\textbf{R}$ and $\textbf{t}$. We set up a writing track and use visual information to obtain the pose of the writing board and calculate $\textbf{R}$ and $\textbf{t}$. The two methods are then used to make the trajectory follow the desired pose on the writing board. The effects of the two methods are shown in Fig. 14. The trajectory output by Method 1 oscillates more, is not smooth enough and has large errors. Method 2 outputs a smoother trajectory with less oscillation, which is more effective. The trajectories produced by the two methods differ by up to 7.1977 mm in the z-direction. A deviation of this size poses a potential safety hazard in practical applications, as it could result in the inadvertent puncturing of the writing surface. We analysed the reasons for the large differences between the results of the two methods. With Method 1, the outputs of the original DMP have errors in practice, such as distortion of the trajectory and robot movement errors. In addition, $\textbf{R}$ and $\textbf{t}$ are calculated from visual information and themselves contain errors. Because $\textbf{R}$ and $\textbf{t}$ are applied outside the framework of the DMPs, their multiplication with the trajectory points is not constrained by the DMPs, leaving the trajectory with a large offset. Method 2, in contrast, incorporates $\textbf{R}$ and $\textbf{t}$ into the framework of the DMPs, so the generalisation of the DMPs changes with $\textbf{R}$ and $\textbf{t}$, and the trajectory does not produce large offsets owing to the constraints of the DMP framework.
3.3. Admittance control
We measured the contact force between the pen tip and the writing board when a person writes. We find that the best writing performance is achieved when the force sensor reads a contact force of 2 N on the z-axis. If the contact force is not 2 N when the robot arm moves to the specified position, the admittance control algorithm adjusts the end position of the robot arm in time to keep the contact force at 2 N. As shown in Fig. 15, when the initial contact force is 0, i.e. when the end-effector is not touching the writing board, the admittance control moves the end-effector closer to the writing board until the contact force converges to 2 N. When the initial contact force is 6 N, the admittance control policy steers the end-effector away from the writing board until the contact force stabilises at 2 N. The admittance control also adjusts the contact force in time when there is any interference during the adjustment process.
3.4. Application
In this experiment, we verify the effectiveness of the designed interference-resistant robotic Chinese character writing system. A human randomly writes a Chinese character. Then the robot imitates the human handwriting on the writing board. While the robot writes, the human randomly moves, turns and flips the board, and the robot can still follow the posture of the board to adjust its trajectory. As shown in Fig. 16, the experimental platform is composed of an Elite Collaborative Robot EC66, an Intel RealSense Depth Camera D435 and an LCD writing board. The Elite Collaborative Robot EC66 is a lightweight and flexible collaborative robot. Its weight is 17.5 kg and its maximum payload is 6 kg. It runs efficiently and can reach a maximum speed of 2.8 m/s, with a soft and consistent trajectory. As a collaborative robot, it enables safe human–robot interaction with a high degree of reliability, and it can therefore complete different tasks while ensuring the safety of humans. With a range of 10 m and a wide field of view, the Intel RealSense Depth Camera D435 has a wide range of applications in robotics development. With four calibration papers affixed to the four corners of the writing board, the camera is able to estimate the posture of the writing board very well in this experiment, i.e. to recognise human interference. Throughout the writing process, the human randomly and continuously moves, rotates and flips the writing board, and the robot still writes complete Chinese characters on it.
The results of the experiment are shown in Fig. 17. The robot is able to adjust its end trajectory in time to write complete and accurate Chinese characters on the writing board under human interference.
4. Conclusion
In this article, we designed a robotic writing system that allows a robot to imitate human handwriting even with unexpected interruptions. We first designed a new stroke extraction algorithm that extracts stroke components based on the direction of extension, and then connected the stroke components according to the writing direction of the stroke. To cope with random interference, the original DMPs model was improved such that the generalisation function could be adapted to the situations when the writing board was rotated and flipped. The writing system was then designed by combining visual information, admittance control and an improved DMP to enable the robot to accurately imitate human handwriting in the face of interference.
In this article, we have presented an interference-resistant robotic writing system that can also be used in applications such as grinding and cleaning of complex recesses in workpieces. In the future, we will improve the current visual measurement module to make this system more accurate and more widely applicable.
Author contributions
Xian Li, Chenguang Yang, Sheng Xu and Yongsheng Ou proposed the research project; Xian Li conducted the review and wrote the first draft; Xian Li, Chenguang Yang, Sheng Xu and Yongsheng Ou revised the manuscript; and Chenguang Yang, Sheng Xu and Yongsheng Ou provided advice and supervision.
Financial support
This work was supported in part by the National Natural Science Foundation of China (NSFC) under Grant U20A20200 and Major Research Grant No. 92148204, in part by the Guangdong Basic and Applied Basic Research Foundation under Grants 2019B1515120076 and 2020B1515120054, and in part by the Industrial Key Technologies R & D Program of Foshan under Grant 2020001006308 and Grant 2020001006496.
Competing interests
The authors declare that no competing interests exist.
Ethical approval
Not applicable.