Advanced Cyberbullying Detection Using PyTesseract and BERT

B. CH. V. Ramana; S. Santhoshi; Y. Jaya Santhoshini Swetha; V. Devi Prasuna; M. Guna Vardhini

doi:10.2991/978-94-6463-858-5_274

Authors

B. CH. V. Ramana¹,*, S. Santhoshi¹, Y. Jaya Santhoshini Swetha¹, V. Devi Prasuna¹, M. Guna Vardhini¹

¹Information Technology, Vignan’s Institute of Engineering for Women (VIEW), Visakhapatnam, Andhra Pradesh, India

*Corresponding author. Email: bvrviits@gmail.com

Corresponding Author

B. CH. V. Ramana

Available Online 4 November 2025.

DOI: 10.2991/978-94-6463-858-5_274
Keywords: Cyberbullying Detection; Social Media Platform; Pytesseract; BERT (Bidirectional Encoder Representations from Transformers); NLP (Natural Language Processing); Sensitive Content Detection; Online Safety
Abstract: Cyberbullying has become a major concern in digital communication. Because social media is available on demand on virtually any device, it has given rise to this new form of bullying, and detecting it promptly is necessary for immediate action. In this work, we propose an automated cyberbullying detection system that uses PyTesseract to extract text from multimedia content and BERT (Bidirectional Encoder Representations from Transformers) for natural language processing-based abuse detection. The system is fully integrated into a sample social media application, where it monitors all user-generated content: text-based posts and comments as well as images containing text. Text captured by PyTesseract is passed to BERT, which checks it for offensive or abusive language. When harmful content is detected, an automated sensitive content warning is sent to the user's registered email address, which alerts the user instantly and prevents them from sharing their details. By integrating optical character recognition (OCR) with deep learning-based analysis, this approach not only identifies the semantic patterns of malicious online behavior but also provides an effective, scalable, and timely detection method for various content types, thus improving online safety.
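The flow described in the abstract (OCR, then classification, then an email warning) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the names `Verdict`, `moderate`, and `build_warning`, the 0.5 threshold, and the warning wording are all assumptions made for this example. The OCR and classifier back ends are passed in as plain callables so the sketch runs with the standard library alone; the closing comments indicate how PyTesseract and a BERT text classifier might be plugged in.

```python
from dataclasses import dataclass
from email.message import EmailMessage
from typing import Callable, Optional


@dataclass
class Verdict:
    text: str      # text that was analyzed (typed and/or OCR-extracted)
    abusive: bool  # True when the score reaches the threshold
    score: float   # abuse probability reported by the classifier


def moderate(
    classify: Callable[[str], float],
    text: str = "",
    image_path: Optional[str] = None,
    extract_text: Optional[Callable[[str], str]] = None,
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the paper
) -> Verdict:
    """Combine typed text with OCR text from an image, then classify it."""
    parts = [text.strip()]
    if image_path is not None and extract_text is not None:
        parts.append(extract_text(image_path).strip())
    combined = " ".join(p for p in parts if p)
    # Nothing to analyze means nothing to flag.
    score = classify(combined) if combined else 0.0
    return Verdict(combined, score >= threshold, score)


def build_warning(verdict: Verdict, recipient: str) -> Optional[EmailMessage]:
    """Return a warning email for flagged content, or None otherwise."""
    if not verdict.abusive:
        return None
    msg = EmailMessage()
    msg["To"] = recipient
    msg["Subject"] = "Sensitive content warning"
    msg.set_content(
        f"Your recent post was flagged as potentially abusive "
        f"(score {verdict.score:.2f})."
    )
    return msg


# Possible real back ends (assumptions; both need third-party packages):
#   OCR:        lambda path: pytesseract.image_to_string(PIL.Image.open(path))
#   Classifier: a transformers "text-classification" pipeline loaded with a
#               BERT checkpoint fine-tuned for abuse detection, wrapped so
#               that it returns a single abuse probability per input string.
# The message could then be sent with smtplib.SMTP(...).send_message(msg).
```

With stub back ends, `moderate(classify=my_classifier, text=post, image_path=path, extract_text=my_ocr)` returns a `Verdict`, and `build_warning(verdict, user_email)` yields a message only when the content was flagged. Injecting the back ends keeps the decision logic testable without an OCR engine or model weights.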
Copyright: © 2025 The Author(s)
Open Access: This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

Volume Title: Proceedings of International Conference on Computer Science and Communication Engineering (ICCSCE 2025)
Series: Advances in Computer Science Research
Publication Date: 4 November 2025
ISBN: 978-94-6463-858-5
ISSN: 2352-538X
DOI: 10.2991/978-94-6463-858-5_274

Cite this article

TY  - CONF
AU  - B. CH. V. Ramana
AU  - S. Santhoshi
AU  - Y. Jaya Santhoshini Swetha
AU  - V. Devi Prasuna
AU  - M. Guna Vardhini
PY  - 2025
DA  - 2025/11/04
TI  - Advanced Cyberbullying Detection Using PyTesseract and BERT
BT  - Proceedings of International Conference on Computer Science and Communication Engineering (ICCSCE 2025)
PB  - Atlantis Press
SP  - 3289
EP  - 3302
SN  - 2352-538X
UR  - https://doi.org/10.2991/978-94-6463-858-5_274
DO  - 10.2991/978-94-6463-858-5_274
ID  - Ramana2025
ER  -
