A Review of Text Recognition in Complex Scene Images Based on Deep Learning
- DOI
- 10.2991/978-94-6463-821-9_75
- Keywords
- Computer vision; Deep learning; Scene text recognition technology; Complex scenarios
- Abstract
With the rapid development of computer vision, Scene Text Recognition (STR) in complex scenarios has become a critical technology in intelligent security, autonomous driving, augmented reality, and related fields. However, traditional Optical Character Recognition (OCR) methods show inherent limitations in complex environments, including poor scene adaptability, reliance on manual feature extraction, and weak resistance to interference. This paper systematically reviews advances in deep learning-based STR for complex scenarios, covering the entire workflow from text detection and text recognition to end-to-end systems. Through practical applications such as license plate recognition, tunnel defect detection, and object sorting, we validate the effectiveness of YOLO and CRNN models in specific scenarios and identify bottlenecks including excessive model parameters and insufficient hardware compatibility. Further analysis indicates persistent challenges: high computational resource consumption, limited model generalization, poor interpretability, inadequate real-time performance, and vulnerability to adversarial samples. Finally, this paper proposes future research directions focusing on multimodal feature fusion, weakly supervised learning paradigms, lightweight model deployment, and decision-centric learning mechanisms. These approaches aim to promote practical STR implementations in complex scenarios through interdisciplinary technological collaboration.
- Copyright
- © 2025 The Author(s)
- Open Access
- This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
Cite this article
TY - CONF
AU - Xuchen Wang
PY - 2025
DA - 2025/08/31
TI - A Review of Text Recognition in Complex Scene Images Based on Deep Learning
BT - Proceedings of the 2025 2nd International Conference on Mechanics, Electronics Engineering and Automation (ICMEEA 2025)
PB - Atlantis Press
SP - 776
EP - 783
SN - 2352-5401
UR - https://doi.org/10.2991/978-94-6463-821-9_75
DO - 10.2991/978-94-6463-821-9_75
ID - Wang2025
ER -