Proceedings of the International Conference on Smart Systems and Social Management (ICSSSM 2025)

Predictive Analysis of Netflix User Data Using a Hybrid Model of Naive Bayes Classifier and K-Means Clustering

Authors
Parthiv Kashyap1, Bikramaditya Barman2, *, Chani Kakati3
1Gauhati University, Guwahati, Assam, India
2The Assam Royal Global University, Guwahati, Assam, India
3The Assam Royal Global University, Guwahati, Assam, India
*Corresponding author. Email: bbarman1@rgu.ac
Corresponding Author
Bikramaditya Barman
Available Online 29 December 2025.
DOI
10.2991/978-94-6463-950-6_5
Keywords
Netflix; Over-the-Top (OTT); Naive Bayes; K-Means Clustering; Hybrid Model; Recommendation Systems; Entertainment; Supervised Learning Algorithm
Abstract

The rapid proliferation of Over-the-Top (OTT) platforms has transformed how users consume digital entertainment, with Netflix standing as one of the largest global providers. A game-changer in the streaming field, Netflix has digitally transformed content consumption and, as of 2024, offers personalized streaming to over 270 million users globally. This study introduces a hybrid predictive model that integrates the Naive Bayes classifier with the K-Means clustering algorithm to analyze Netflix user data, with the aim of developing more advanced recommendation systems and retention plans. The proposed design first applies K-Means clustering, an unsupervised learning technique, to group users with similar programming, rating, and demographic data into separate clusters. This segmentation helps uncover patterns and identify user groups that were previously unknown in the dataset. The Naive Bayes classifier, a probabilistic supervised learning algorithm, is then applied to each cluster. Training the Naive Bayes model on the clusters already formed yields specialized and precise predictive models for every user segment. This methodology addresses the “cold start” problem and improves overall prediction accuracy compared with using a single algorithm. The model achieved an overall accuracy of 97.43%, indicating that it can reliably classify new data points into their respective clusters. Such high accuracy confirms the effectiveness of integrating unsupervised clustering with supervised classification. The findings show that the hybrid approach is effective both at identifying patterns (through K-Means) and at making predictions (through Naive Bayes). Compared with employing either clustering or classification alone, this combination leads to higher accuracy. The main goal of the combination is to reach a higher level of accuracy and interpretability.
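
The two-stage pipeline described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the feature set, the number of clusters, the Naive Bayes variant, or the dataset. The sketch therefore assumes synthetic data with two hypothetical features (weekly watch hours and mean rating), k = 3, a Gaussian Naive Bayes model, and a deterministic farthest-point initialisation for K-Means. It uses only the Python standard library. K-Means first labels the training users, Naive Bayes is then trained on those cluster labels, and held-out users are scored by how often Naive Bayes agrees with the nearest-centroid assignment.

```python
import math
import random


def make_users(n=300, seed=42):
    """Synthetic (weekly_watch_hours, mean_rating) pairs from three made-up segments."""
    rng = random.Random(seed)
    centers = [(2.0, 2.5), (12.0, 4.5), (25.0, 3.0)]  # hypothetical segment centres
    return [(rng.gauss(centers[i % 3][0], 1.0), rng.gauss(centers[i % 3][1], 0.3))
            for i in range(n)]


def nearest(point, centroids):
    """Index of the centroid closest to `point` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda c: math.dist(point, centroids[c]))


def kmeans(points, k, max_iters=100):
    """Lloyd's algorithm with a deterministic farthest-point initialisation (an assumption)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    labels = None
    for _ in range(max_iters):
        new_labels = [nearest(p, centroids) for p in points]  # assignment step
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        for c in range(k):  # update step: move each centroid to the mean of its members
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, labels


def fit_gaussian_nb(points, labels):
    """Per-cluster prior, feature means, and feature variances."""
    model = {}
    for c in set(labels):
        members = [p for p, l in zip(points, labels) if l == c]
        means = [sum(col) / len(members) for col in zip(*members)]
        variances = [sum((x - m) ** 2 for x in col) / len(members) + 1e-9
                     for col, m in zip(zip(*members), means)]
        model[c] = (len(members) / len(points), means, variances)
    return model


def predict_nb(model, point):
    """Cluster with the highest log-posterior under the feature-independence assumption."""
    def log_posterior(c):
        prior, means, variances = model[c]
        return math.log(prior) + sum(
            -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
            for x, m, v in zip(point, means, variances))
    return max(model, key=log_posterior)


users = make_users()
train, test = users[:240], users[240:]

# Stage 1: unsupervised segmentation of the training users.
centroids, train_labels = kmeans(train, k=3)

# Stage 2: supervised classifier trained on the cluster labels from stage 1.
model = fit_gaussian_nb(train, train_labels)

# Held-out users: compare Naive Bayes predictions with the nearest-centroid assignment.
expected = [nearest(p, centroids) for p in test]
predicted = [predict_nb(model, p) for p in test]
accuracy = sum(e == p for e, p in zip(expected, predicted)) / len(test)
print(f"agreement with K-Means assignment on held-out users: {accuracy:.2%}")
```

Because the classifier is scored against labels produced by the clustering step itself, a high agreement figure on well-separated clusters mainly shows that the two stages are consistent with each other. It says less about downstream recommendation quality, which would need a separate evaluation.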

Copyright
© 2025 The Author(s)
Open Access
Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

Volume Title
Proceedings of the International Conference on Smart Systems and Social Management (ICSSSM 2025)
Series
Advances in Intelligent Systems Research
Publication Date
29 December 2025
ISBN
978-94-6463-950-6
ISSN
1951-6851

Cite this article

TY  - CONF
AU  - Parthiv Kashyap
AU  - Bikramaditya Barman
AU  - Chani Kakati
PY  - 2025
DA  - 2025/12/29
TI  - Predictive Analysis of Netflix User Data Using a Hybrid Model of Naive Bayes Classifier and K-Means Clustering
BT  - Proceedings of the International Conference on Smart Systems and Social Management (ICSSSM 2025)
PB  - Atlantis Press
SP  - 55
EP  - 70
SN  - 1951-6851
UR  - https://doi.org/10.2991/978-94-6463-950-6_5
DO  - 10.2991/978-94-6463-950-6_5
ID  - Kashyap2025
ER  -