Human Versus Machine Generated Text Authenticity Detection System

Akella Hima Bala Padmini; G. N. V. G. Sirisha

doi:10.2991/978-94-6463-940-7_27

Authors

Akella Hima Bala Padmini¹,*, G. N. V. G. Sirisha²

¹Computer Science and Engineering Department, Sagi Ramakrishnam Raju Engineering College, Bhimavaram, India

²Computer Science and Engineering Department, Sagi Ramakrishnam Raju Engineering College, Bhimavaram, India

*Corresponding author. Email: himabala@srkrec.ac.in

Corresponding Author

Akella Hima Bala Padmini

Available Online 31 December 2025.

DOI: 10.2991/978-94-6463-940-7_27
Keywords: Text Classification; ANN; DistilBERT; Model Generalization; Transformer Models; NLP Robustness; Tokenization Strategy
Abstract: With the emergence of AI models such as GPT and of deepfake technologies, distinguishing AI-written from human-written text is increasingly difficult. These models can produce highly realistic and consistent content that is hard to detect, creating a growing demand for effective AI-versus-human text authenticity detection systems. Such systems can identify minor deviations in writing style, structure, and language patterns through linguistic analysis and machine learning. They can be applied in areas ranging from education (e.g., detecting AI-written assignments) and media (e.g., verifying human authorship of news headlines) to cybersecurity (e.g., spam filters for AI-generated phishing messages). This research focuses on classifying text as either AI-generated or human-written. It uses five datasets obtained from Hugging Face, Google, and Kaggle, covering a variety of text types such as literature, essays, and news. DistilBERT achieved 96.32% accuracy on internal validation, surpassing the ANN model's 90.78%. However, when tested on unseen data, the ANN generalized better, with 95.2% accuracy, while DistilBERT's accuracy dropped to 49.4%. This indicates that, in basic text classification, simpler models trained carefully can adapt to new data more readily than more complex ones.
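The comparison in the abstract can be read as a generalization gap: validation accuracy minus accuracy on unseen data. The following is a minimal sketch of that arithmetic using only the four accuracies reported above. The dictionary layout and the helper name `generalization_gap` are illustrative assumptions, not part of the authors' code.

```python
# Accuracies (in percent) as reported in the abstract.
# The dictionary layout is an illustrative assumption.
reported = {
    "DistilBERT": {"validation": 96.32, "unseen": 49.4},
    "ANN": {"validation": 90.78, "unseen": 95.2},
}


def generalization_gap(scores):
    """Validation accuracy minus unseen-data accuracy, in percentage points.

    A large positive value means the model did much worse on unseen data
    than on internal validation; a negative value means it did better.
    """
    return round(scores["validation"] - scores["unseen"], 2)


gaps = {name: generalization_gap(s) for name, s in reported.items()}

for name, gap in gaps.items():
    print(f"{name}: gap = {gap:+.2f} percentage points")
```

On these figures the gap for DistilBERT is 46.92 percentage points, while the gap for the ANN is -4.42, meaning the ANN scored higher on unseen data than on internal validation.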
Copyright: © 2025 The Author(s)
Open Access: This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

Volume Title: Proceedings of the Conference on Social and Sustainable Innovation in Technology & Engineering (SASI-ITE 2025)
Series: Advances in Intelligent Systems Research
Publication Date: 31 December 2025
ISBN: 978-94-6463-940-7
ISSN: 1951-6851

Cite this article

TY  - CONF
AU  - Akella Hima Bala Padmini
AU  - G. N. V. G. Sirisha
PY  - 2025
DA  - 2025/12/31
TI  - Human Versus Machine Generated Text Authenticity Detection System
BT  - Proceedings of the Conference on Social and Sustainable Innovation in Technology & Engineering (SASI-ITE 2025)
PB  - Atlantis Press
SP  - 363
EP  - 373
SN  - 1951-6851
UR  - https://doi.org/10.2991/978-94-6463-940-7_27
DO  - 10.2991/978-94-6463-940-7_27
ID  - Padmini2025
ER  -