Predictive Analysis of Netflix User Data Using a Hybrid Model of Naive Bayes Classifier and K-Means Clustering
- DOI
- 10.2991/978-94-6463-950-6_5How to use a DOI?
- Keywords
- Netflix; Over-the-Top (OTT); Naive Bayes; K-Means Clustering; Hybrid Model; Recommendation Systems; Entertainment; Supervised Learning Algorithm
- Abstract
The swift proliferation of Over-the-Top (OTT) platforms has upended the manner in which users indulge in digital entertainment, In the era of (OTT) platforms with Netflix shrinking as one of the largest global providers. An idea that has been a game-changer in the streaming field, Netflix is a digitally transformed content consumption service that, till 2024, has been able to offer personalized streaming to over 270 million users globally. This study introduces a hybrid predictive model integrating the Naive Bayes classifier and K-Means clustering algorithm to analyze Netflix user data for the development of more progressed recommendation systems and retention plans. The suggested design initially utilizes K-Means clustering, an unsupervised learning technique, to categorize users with similar programming, rating, and demographic data into separate groups. The segmentation process aids in uncovering patterns and identifying user groups that were previously unknown in the dataset. Afterward, the Naive Bayes classifier, a probabilistic supervised learning algorithm, is given the task for each cluster. Upon teaching the Naive Bayes model with the clusters already formed, specialized and precise predictive models for every user segment can be generated. This methodology solves the issue of “cold start” and enhances the total prediction accuracy as compared to when just one algorithm is used. The model achieved an overall accuracy of 97.43%, indicating that New data points can be classified reliably by it to their respective clusters. Such a high accuracy is a confirmation of the effectiveness of the integration of unsupervised clustering with supervised classification. The findings reveal that the hybrid approach is competent when it comes to identifying patterns (through K-Means) as well as making predictions(through Naive Bayes).As compared to employing either clustering or classification singly, this combination leads to the higher accuracy of the set. I mean, the main goal of these combinations is to reach a higher level ofaccuracy and interpretability.
- Copyright
- © 2025 The Author(s)
- Open Access
- Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
Cite this article
TY - CONF AU - Parthiv Kashyap AU - Bikramaditya Barman AU - Chani Kakati PY - 2025 DA - 2025/12/29 TI - Predictive Analysis of Netflix User Data Using a Hybrid Model of Naive Bayes Classifier and K-Means Clustering BT - Proceedings of the International Conference on Smart Systems and Social Management (ICSSSM 2025) PB - Atlantis Press SP - 55 EP - 70 SN - 1951-6851 UR - https://doi.org/10.2991/978-94-6463-950-6_5 DO - 10.2991/978-94-6463-950-6_5 ID - Kashyap2025 ER -