A Review of Text Recognition in Complex Scene Images Based on Deep Learning

Xuchen Wang

doi:10.2991/978-94-6463-821-9_75

Authors

Xuchen Wang¹,*

¹Country College of Science, Southwest Petroleum University, 637001, Nanchong, China

*Corresponding author. Email: 202331095107@stu.swpu.edu.cn

Corresponding Author

Xuchen Wang

Available Online 31 August 2025.

DOI: 10.2991/978-94-6463-821-9_75
Keywords: Computer vision; Deep learning; Scene text recognition technology; Complex scenarios
Abstract: With the rapid development of computer vision, Scene Text Recognition (STR) in complex scenarios has become a critical technology in intelligent security, autonomous driving, augmented reality, and related fields. Traditional Optical Character Recognition (OCR) methods, however, show inherent limitations in complex environments, including poor scene adaptability, reliance on manual feature extraction, and weak resistance to interference. This paper systematically reviews advances in deep learning-based STR for complex scenarios, covering the full workflow from text detection and text recognition to end-to-end systems. Through practical applications such as license plate recognition, tunnel defect detection, and object sorting, we validate the effectiveness of YOLO and CRNN models in specific scenarios and identify bottlenecks, including excessive model parameters and insufficient hardware compatibility. Further analysis points to persistent challenges: high computational resource consumption, limited model generalization, poor interpretability, inadequate real-time performance, and vulnerability to adversarial samples. Finally, this paper proposes future research directions focusing on multimodal feature fusion, weakly supervised learning paradigms, lightweight model deployment, and decision-centric learning mechanisms. These approaches aim to promote practical STR implementations in complex scenarios through interdisciplinary technological collaboration.
Copyright: © 2025 The Author(s)
Open Access: This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

Volume Title: Proceedings of the 2025 2nd International Conference on Mechanics, Electronics Engineering and Automation (ICMEEA 2025)
Series: Advances in Engineering Research
Publication Date: 31 August 2025
ISBN: 978-94-6463-821-9
ISSN: 2352-5401

Cite this article

TY  - CONF
AU  - Xuchen Wang
PY  - 2025
DA  - 2025/08/31
TI  - A Review of Text Recognition in Complex Scene Images Based on Deep Learning
BT  - Proceedings of the 2025 2nd International Conference on Mechanics, Electronics Engineering and Automation (ICMEEA 2025)
PB  - Atlantis Press
SP  - 776
EP  - 783
SN  - 2352-5401
UR  - https://doi.org/10.2991/978-94-6463-821-9_75
DO  - 10.2991/978-94-6463-821-9_75
ID  - Wang2025
ER  -