Kaninfradet3D: A Road-side Camera-LiDAR Fusion 3D Perception Model based on Nonlinear Feature Extraction and Intrinsic Correlation
- DOI
- 10.2991/978-94-6463-972-8_16
- Keywords
- Camera-LiDAR Fusion Methods; 3D Object Detection; Kolmogorov-Arnold Network; Roadside Traffic Perception
- Abstract
Advances in AI-assisted driving have produced numerous approaches to ego-vehicle 3D perception, but roadside perception has received little research attention. The roadside perspective is worth developing because it offers a global view and a wider sensing range. LiDAR provides precise 3D spatial data, while cameras provide semantic information, so the two modalities complement each other in 3D detection. Nevertheless, because neither the feature extraction nor the fusion process is sufficiently accurate, adding camera data fails to improve accuracy in some tests. Kolmogorov-Arnold Networks (KANs), which are better suited to high-dimensional, complex data, have recently been proposed as alternatives to MLPs. This study proposes Kaninfradet3D, which improves both the feature extraction and the fusion modules. KAN layers are used to enhance the encoder and fuser modules of the model, and cross-attention is used to improve feature fusion. Visual comparisons show that camera features are merged more effectively, which resolves the problem of abnormally concentrated camera features that had a detrimental effect on fusion. Our method surpasses the benchmark by +1.40 mAP on the infrastructure portion of the TUMTraf V2X Cooperative Perception Dataset and by +9.87 mAP and +10.64 mAP on the two perspectives of the TUMTraf Intersection Dataset. The results demonstrate that Kaninfradet3D can fuse features successfully and highlight the potential of KANs for roadside perception tasks.
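The two ingredients named in the abstract, a KAN-style layer and cross-attention between modalities, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The names `kan_layer` and `cross_attention`, all shapes, the choice of LiDAR tokens as queries, and the Gaussian basis (used here in place of the B-splines of the original KAN formulation) are assumptions made only to keep the example short.

```python
import numpy as np

def kan_layer(x, centers, coeffs, width=1.0):
    """KAN-style layer: y_j = sum_i phi_ij(x_i), each phi_ij a learnable 1-D function.

    Assumption: each phi_ij is a weighted sum of Gaussian bumps, not B-splines.
    x:       (n, d_in) inputs
    centers: (k,) basis centers shared by all edges
    coeffs:  (d_in, d_out, k) per-edge basis coefficients
    """
    # basis[n, i, k] = exp(-((x[n, i] - centers[k]) / width) ** 2)
    basis = np.exp(-((x[:, :, None] - centers[None, None, :]) / width) ** 2)
    # Sum over input dimensions i and basis functions k for each output j.
    return np.einsum("nik,ijk->nj", basis, coeffs)

def softmax(z, axis=-1):
    # Subtract the row maximum for numerical stability.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(lidar_tokens, camera_tokens, w_q, w_k, w_v):
    """Single-head cross-attention (assumed direction: LiDAR queries camera)."""
    q = lidar_tokens @ w_q    # (n_lidar, d)
    k = camera_tokens @ w_k   # (n_camera, d)
    v = camera_tokens @ w_v   # (n_camera, d)
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]), axis=-1)  # (n_lidar, n_camera)
    return attn @ v, attn

# Hypothetical toy features: 5 LiDAR tokens and 7 camera tokens of width 8.
rng = np.random.default_rng(0)
d = 8
lidar = rng.normal(size=(5, d))
camera = rng.normal(size=(7, d))
w_q, w_k, w_v = (rng.normal(size=(d, d)) * 0.1 for _ in range(3))

# Each LiDAR token gathers camera information, then a KAN-style layer fuses both.
attended, attn = cross_attention(lidar, camera, w_q, w_k, w_v)
fused_in = np.concatenate([lidar, attended], axis=1)       # (5, 16)
centers = np.linspace(-2.0, 2.0, 6)
coeffs = rng.normal(size=(2 * d, d, centers.size)) * 0.1   # (16, 8, 6)
fused = kan_layer(fused_in, centers, coeffs)               # (5, 8)
```

In this sketch each row of `attn` is a probability distribution over camera tokens, so every LiDAR token receives a weighted mix of camera features before the nonlinear fusion step. A real model would learn `w_q`, `w_k`, `w_v`, and `coeffs` by gradient descent rather than sampling them randomly.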
- Copyright
- © 2025 The Author(s)
- Open Access
- Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
Cite this article
TY - CONF
AU - Nanfang Zheng
AU - Pei Liu
AU - Yufei Ji
AU - Yiqun Li
AU - Yifan Zhuang
AU - Yinsong Wang
AU - Chengxiang Wang
AU - Ziyuan Pu
PY - 2025
DA - 2025/12/29
TI - Kaninfradet3D: A Road-side Camera-LiDAR Fusion 3D Perception Model based on Nonlinear Feature Extraction and Intrinsic Correlation
BT - Proceedings of the 14th Asia-Pacific Conference on Transportation and the Environment (APTE 2025)
PB - Atlantis Press
SP - 161
EP - 175
SN - 2589-4943
UR - https://doi.org/10.2991/978-94-6463-972-8_16
DO - 10.2991/978-94-6463-972-8_16
ID - Zheng2025
ER -