
Semantic geometric fusion multi-object tracking and lidar odometry in dynamic environment

Published online by Cambridge University Press:  11 January 2024

Tingchen Ma
Affiliation:
Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
Shenzhen College of Advanced Technology, University of Chinese Academy of Sciences, Shenzhen, 518055, China
Guolai Jiang
Affiliation:
Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
Yongsheng Ou*
Affiliation:
Faculty of Electronic Information and Electrical Engineering, Dalian University of Technology, Dalian, 116024, China
Sheng Xu*
Affiliation:
Guangdong Provincial Key Laboratory of Robotics and Intelligent System, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
Corresponding authors: Yongsheng Ou, Sheng Xu; Emails: [email protected], [email protected]

Abstract

Simultaneous localization and mapping (SLAM) systems built on the rigid-scene assumption cannot achieve reliable positioning and mapping in complex environments containing many moving objects. To solve this problem, this paper proposes a novel dynamic multi-object lidar odometry (MLO) system based on semantic object recognition. The proposed system enables reliable localization of both the robot and semantic objects, as well as the generation of long-term static maps, in complex dynamic scenes. For ego-motion estimation, the system extracts environmental features that satisfy both semantic and geometric consistency constraints, making the filtered features robust to semantically movable and unknown dynamic objects. In addition, we propose a new least-squares estimator that fuses geometric object points with semantic bounding-box planes to accomplish the multi-object tracking (SGF-MOT) task robustly and precisely. In the mapping module, dynamic semantic objects are detected using an absolute-trajectory tracking list. Using static semantic objects and environmental features, the system eliminates accumulated localization error and produces a purely static map. Experiments on the public KITTI dataset show that, compared with existing techniques, the proposed MLO system delivers more accurate and robust object tracking as well as better real-time localization accuracy in complex scenes.
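The abstract describes SGF-MOT only at a high level: a single least-squares problem that fuses geometric object points with semantic bounding-box constraints. As the full text is not reproduced here, the following is a minimal 2-D sketch of that fusion pattern, not the authors' implementation; the residual structure, the `w_box` weight, and the synthetic data are all illustrative assumptions.

```python
import numpy as np
from scipy.optimize import least_squares

def se2(theta, t):
    """2-D rigid transform: rotation by theta, then translation t."""
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])
    return R, np.asarray(t)

def residuals(pose, src_pts, dst_pts, box_center, box_yaw, w_box=5.0):
    """Stack geometric point residuals with semantic box residuals.

    pose       : (tx, ty, yaw) of the tracked object, to be estimated
    src_pts    : object points from the previous scan, object frame (N, 2)
    dst_pts    : associated points in the current scan (N, 2)
    box_center, box_yaw : detector box for the current scan (semantic cue)
    w_box      : relative weight of the semantic term (hypothetical choice)
    """
    tx, ty, yaw = pose
    R, t = se2(yaw, (tx, ty))
    # Geometric term: transformed model points should land on scan points.
    geo = (src_pts @ R.T + t - dst_pts).ravel()
    # Semantic term: the estimated pose should agree with the detected box
    # center and heading (yaw residual wrapped to [-pi, pi]).
    sem = w_box * np.array([tx - box_center[0],
                            ty - box_center[1],
                            np.arctan2(np.sin(yaw - box_yaw),
                                       np.cos(yaw - box_yaw))])
    return np.concatenate([geo, sem])

# Synthetic check: an object translated by (1.0, 0.5) and rotated 0.1 rad.
rng = np.random.default_rng(0)
src = rng.uniform(-2, 2, size=(50, 2))
R_true, t_true = se2(0.1, (1.0, 0.5))
dst = src @ R_true.T + t_true + rng.normal(0, 0.02, size=src.shape)

sol = least_squares(residuals, x0=np.zeros(3),
                    args=(src, dst, t_true, 0.1))
print("estimated (tx, ty, yaw):", sol.x)   # ~ [1.0, 0.5, 0.1]
```

The intuition behind combining the two cues is visible even in this toy version: the semantic box term anchors the estimate when point correspondences are sparse, noisy, or partially occluded, while the geometric term refines the pose beyond the detector's box resolution.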

Type: Research Article
Copyright: © The Author(s), 2024. Published by Cambridge University Press


Supplementary material: Ma et al., Appendix (File, 34.4 KB)