Url Risk Assessment Using Machine Learning
- DOI
- 10.2991/978-94-6463-858-5_258
- Keywords
- Random Forest; HTTPS; XGBoost; Hyperparameter tuning
- Abstract
Cyber threats such as phishing and malware have become increasingly sophisticated, and users need stronger protection against them. Phishing websites often disguise themselves as legitimate platforms and trick users into entering sensitive information such as passwords and banking details. To combat this, our system uses machine learning to analyze features of a URL and determine its legitimacy. Features such as the presence of HTTPS, domain age, URL length, special characters, and subdomains are extracted and fed into machine learning models for classification. Trained on large datasets of both safe and malicious URLs, the system learns to differentiate between the two classes, which improves its detection accuracy.
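As a rough illustration of the feature extraction step described above, the following sketch derives four of the listed features from a URL string using only the Python standard library. The function name `extract_url_features`, the `SPECIAL_CHARS` set, and the subdomain rule (every host label beyond the last two) are assumptions made for this example and are not taken from the paper. Domain age is left out because it requires an external WHOIS lookup.

```python
from urllib.parse import urlsplit

# Assumed set of "special" characters; the paper does not list which ones it counts.
SPECIAL_CHARS = set("@-_=?&%~")


def extract_url_features(url: str) -> dict:
    """Return a small dictionary of lexical features for one URL."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    labels = [label for label in host.split(".") if label]
    return {
        # 1 if the scheme is HTTPS, otherwise 0
        "uses_https": int(parts.scheme == "https"),
        # Total number of characters in the URL
        "url_length": len(url),
        # Number of characters that belong to the assumed special set
        "special_char_count": sum(ch in SPECIAL_CHARS for ch in url),
        # Naive count: treats the last two host labels as the registered domain,
        # so hosts under suffixes such as "co.uk" are overcounted by one
        "subdomain_count": max(len(labels) - 2, 0),
    }


features = extract_url_features("https://login.secure.example.com/account?id=1")
```

Each dictionary can be turned into one row of the feature table that the classifiers are trained on. A production system would likely use a public suffix list instead of the two-label rule.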
Machine learning algorithms such as XGBoost and Random Forest are particularly effective in this context because they can capture complex patterns in data. XGBoost is a gradient-boosting algorithm that builds trees sequentially, with each new tree reducing the errors left by the previous iterations, which makes it well suited to large datasets. Random Forest, in contrast, constructs multiple decision trees and aggregates their outputs, which improves robustness and reduces the risk of overfitting. To further improve accuracy, hyperparameter tuning is applied to key parameters such as the number of estimators, the learning rate, and the tree depth. This optimization is intended to make the models perform well in real-world scenarios by reducing both false positives and false negatives.
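The tuning procedure described above can be sketched with a cross-validated grid search. This is a minimal example under stated assumptions, not the authors' pipeline: the data is synthetic (`make_classification` stands in for a labeled URL feature table), the grid values are arbitrary, and scikit-learn's `GradientBoostingClassifier` is swapped in for XGBoost so that the example depends on a single library. `xgboost.XGBClassifier` exposes the same `n_estimators`, `learning_rate`, and `max_depth` parameters and could be substituted.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for URL features and labels (1 = malicious, 0 = safe).
X, y = make_classification(
    n_samples=400, n_features=8, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Illustrative grids over the parameters named in the abstract.
candidates = {
    "random_forest": (
        RandomForestClassifier(random_state=0),
        {"n_estimators": [50, 100], "max_depth": [4, 8]},
    ),
    "gradient_boosting": (
        GradientBoostingClassifier(random_state=0),
        {"n_estimators": [50, 100], "learning_rate": [0.05, 0.1], "max_depth": [2, 3]},
    ),
}

results = {}
for name, (model, grid) in candidates.items():
    # F1 penalizes both false positives and false negatives.
    search = GridSearchCV(model, grid, cv=3, scoring="f1")
    search.fit(X_train, y_train)
    results[name] = {
        "best_params": search.best_params_,
        "cv_f1": search.best_score_,
        # GridSearchCV.score uses the same scorer (F1) on the held-out split.
        "test_f1": search.score(X_test, y_test),
    }

for name, summary in results.items():
    print(name, summary)
```

F1 is used as the selection metric here because it reflects both error types mentioned in the text. A different metric, such as recall on the malicious class, may be more appropriate depending on the relative cost of each error.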
Once the model is trained and validated, it is integrated into a user-friendly web platform where users can enter any URL to check its safety. The platform immediately analyzes the link and reports whether it is safe or potentially dangerous. With this approach, individuals, businesses, and organizations can strengthen their cybersecurity defenses and reduce the risk of cybercriminals exploiting unsuspecting users.
- Copyright
- © 2025 The Author(s)
- Open Access
- Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
Cite this article
TY  - CONF
AU  - G. Bhavya
AU  - B. Kumaraswamy
AU  - O. Kalyan Ram
AU  - A. Venkatesh
AU  - R. Venkatahema
PY  - 2025
DA  - 2025/11/04
TI  - Url Risk Assessment Using Machine Learning
BT  - Proceedings of International Conference on Computer Science and Communication Engineering (ICCSCE 2025)
PB  - Atlantis Press
SP  - 3090
EP  - 3096
SN  - 2352-538X
UR  - https://doi.org/10.2991/978-94-6463-858-5_258
DO  - 10.2991/978-94-6463-858-5_258
ID  - Bhavya2025
ER  -