Proceedings of the 2025 2nd International Conference on Mechanics, Electronics Engineering and Automation (ICMEEA 2025)

A Review of Text Recognition in Complex Scene Images Based on Deep Learning

Authors
Xuchen Wang1, *
1Country College of Science, Southwest Petroleum University, 637001, Nanchong, China
*Corresponding author. Email: 202331095107@stu.swpu.edu.cn
Corresponding Author
Xuchen Wang
Available Online 31 August 2025.
DOI
10.2991/978-94-6463-821-9_75
Keywords
Computer vision; deep learning; scene text recognition; complex scenarios
Abstract

With the rapid development of computer vision, Scene Text Recognition (STR) in complex scenarios has become a critical technology in intelligent security, autonomous driving, augmented reality, and related fields. However, traditional Optical Character Recognition (OCR) methods show inherent limitations in complex environments, including poor scene adaptability, reliance on manual feature extraction, and weak resistance to interference. This paper systematically reviews advances in deep learning-based STR for complex scenarios, covering the entire workflow from text detection and recognition to end-to-end systems. Through practical applications such as license plate recognition, tunnel defect detection, and object sorting, we validate the effectiveness of YOLO and CRNN models in specific scenarios, while identifying bottlenecks including excessive model parameters and insufficient hardware compatibility. Further analysis indicates persistent challenges: high computational resource consumption, limited model generalization, poor interpretability, inadequate real-time performance, and vulnerability to adversarial samples. Finally, this paper proposes future research directions focusing on multimodal feature fusion, weakly supervised learning paradigms, lightweight model deployment, and decision-centric learning mechanisms. These approaches aim to promote practical STR implementations in complex scenarios through interdisciplinary technological collaboration.

Copyright
© 2025 The Author(s)
Open Access
Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

Volume Title
Proceedings of the 2025 2nd International Conference on Mechanics, Electronics Engineering and Automation (ICMEEA 2025)
Series
Advances in Engineering Research
Publication Date
31 August 2025
ISBN
978-94-6463-821-9
ISSN
2352-5401

Cite this article

TY  - CONF
AU  - Xuchen Wang
PY  - 2025
DA  - 2025/08/31
TI  - A Review of Text Recognition in Complex Scene Images Based on Deep Learning
BT  - Proceedings of the 2025 2nd International Conference on Mechanics, Electronics Engineering and Automation (ICMEEA 2025)
PB  - Atlantis Press
SP  - 776
EP  - 783
SN  - 2352-5401
UR  - https://doi.org/10.2991/978-94-6463-821-9_75
DO  - 10.2991/978-94-6463-821-9_75
ID  - Wang2025
ER  -