Url Risk Assessment Using Machine Learning

G. Bhavya; B. Kumaraswamy; O. Kalyan Ram; A. Venkatesh; R. Venkatahema

doi:10.2991/978-94-6463-858-5_258

Authors

G. Bhavya¹,*, B. Kumaraswamy¹, O. Kalyan Ram¹, A. Venkatesh¹, R. Venkatahema¹

¹Department of Computer Science Engineering (Data Science), CMR Engineering College, Telangana, Hyderabad, India

*Corresponding author. Email: 218r1a6780@gmail.com

Corresponding Author

G. Bhavya

Available Online 4 November 2025.

DOI: 10.2991/978-94-6463-858-5_258
Keywords: Random Forest; HTTPS; XGBoost; Hyperparameter tuning
Abstract: Phishing and malware attacks have become increasingly sophisticated, so users need automated protection against malicious links. Phishing websites often disguise themselves as legitimate platforms, tricking users into entering sensitive information such as passwords and banking details. To combat this, our system uses machine learning to analyze features of a URL and classify it as legitimate or malicious. Features such as the presence of HTTPS, domain age, URL length, special characters, and subdomains are extracted and fed into the classification models. Trained on large datasets of both safe and malicious URLs, the system learns to differentiate between the two classes, which improves its detection accuracy.
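The feature extraction step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function name `extract_features`, the `SPECIAL_CHARS` set, and the way subdomains are counted are all hypothetical choices. Domain age is left out because it needs an external registration lookup (for example WHOIS), which cannot be derived from the URL string alone.

```python
from urllib.parse import urlsplit

# Assumed set of "special" characters; the source does not list which ones it counts.
SPECIAL_CHARS = set("@-_?=&%#~")

def extract_features(url: str) -> dict:
    """Return a few lexical features of a URL (illustrative subset)."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    labels = [label for label in host.split(".") if label]
    return {
        "uses_https": int(parts.scheme == "https"),
        "url_length": len(url),
        "special_char_count": sum(ch in SPECIAL_CHARS for ch in url),
        # Naive estimate: every label beyond "name.tld" counts as a subdomain.
        # Multi-part suffixes such as "co.uk" and IP-address hosts are not handled.
        "subdomain_count": max(len(labels) - 2, 0),
    }

if __name__ == "__main__":
    print(extract_features("https://login.secure.example.com/a?b=1"))
```

Each dictionary would become one row of the feature matrix passed to a classifier.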

Machine learning algorithms such as XGBoost and Random Forest are well suited to this task because they can capture complex, non-linear patterns in the data. XGBoost is a gradient-boosting algorithm that builds trees sequentially, with each new tree correcting the errors of the ensemble built so far, and it scales well to large datasets. Random Forest constructs multiple decision trees and aggregates their outputs, which improves the model's robustness and reduces the risk of overfitting. To further improve accuracy, hyperparameter tuning is applied to key parameters such as the number of estimators, the learning rate, and the depth of the trees. This optimization aims to keep both false positives and false negatives low in real-world use.
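The idea of training many trees and aggregating their outputs, and of tuning the number of estimators, can be illustrated with a standard-library toy. It is a sketch under loud assumptions: it uses one-split decision stumps instead of full trees, omits the random feature subsets of a real Random Forest, does not implement XGBoost, and all function names are hypothetical. In practice a library implementation (for example scikit-learn's `RandomForestClassifier` tuned with `GridSearchCV`) would be the usual choice.

```python
import random
from collections import Counter

def fit_stump(X, y):
    """Find the single-feature threshold rule with the fewest training errors."""
    best = None  # (errors, feature, threshold, label_le, label_gt)
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            for label_le, label_gt in ((0, 1), (1, 0)):
                errors = sum(
                    (label_le if row[f] <= t else label_gt) != target
                    for row, target in zip(X, y)
                )
                if best is None or errors < best[0]:
                    best = (errors, f, t, label_le, label_gt)
    return best[1:]

def predict_stump(stump, row):
    f, t, label_le, label_gt = stump
    return label_le if row[f] <= t else label_gt

def fit_ensemble(X, y, n_estimators=25, seed=0):
    """Train each stump on a bootstrap sample (rows drawn with replacement)."""
    rng = random.Random(seed)
    stumps = []
    for _ in range(n_estimators):
        idx = [rng.randrange(len(X)) for _ in X]
        stumps.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return stumps

def predict_ensemble(stumps, row):
    """Aggregate the individual predictions by majority vote."""
    votes = Counter(predict_stump(s, row) for s in stumps)
    return votes.most_common(1)[0][0]

def accuracy(stumps, X, y):
    return sum(predict_ensemble(stumps, r) == t for r, t in zip(X, y)) / len(y)

def tune_n_estimators(X_train, y_train, X_val, y_val, grid=(1, 5, 25)):
    """Pick the ensemble size with the best validation accuracy (ties: larger size)."""
    scored = [
        (accuracy(fit_ensemble(X_train, y_train, n), X_val, y_val), n) for n in grid
    ]
    return max(scored)[1]

# Made-up toy rows: [uses_https, url_length]; label 1 = malicious, 0 = safe.
X = [[1, 20], [1, 25], [1, 30], [1, 35], [0, 80], [0, 90], [0, 100], [0, 110]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

if __name__ == "__main__":
    model = fit_ensemble(X, y)
    print(predict_ensemble(model, [1, 22]), predict_ensemble(model, [0, 120]))
```

The toy data are invented for illustration only. A real evaluation would hold out a separate validation set rather than reuse the training rows.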

Once the model is trained and validated, it is integrated into a user-friendly web platform where users can enter any URL to check its safety. The platform analyzes the link immediately and reports whether it is safe or potentially dangerous. With this approach, individuals, businesses, and organizations can strengthen their cybersecurity defenses and reduce the risk of cybercriminals exploiting unsuspecting users.
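How such a URL-checking endpoint might behave can be sketched as a plain function, independent of any web framework. Everything here is an assumption made for illustration: the query parameter name `url`, the function name `handle_check`, the JSON response shape, and the `classify` callable, which stands in for the trained model and is expected to return a truthy value for a suspicious link.

```python
import json
from urllib.parse import parse_qs, urlsplit

def handle_check(query_string, classify):
    """Return (status_code, json_body) for a request such as ?url=<link>."""
    values = parse_qs(query_string).get("url", [])
    url = values[0].strip() if values else ""
    # Reject missing input and anything that is not an http(s) link.
    if urlsplit(url).scheme not in ("http", "https"):
        return 400, json.dumps({"error": "expected an http(s) URL in 'url'"})
    verdict = "potentially dangerous" if classify(url) else "safe"
    return 200, json.dumps({"url": url, "verdict": verdict})

if __name__ == "__main__":
    # Stand-in rule used only so the sketch runs; not a real detector.
    stub = lambda u: "login" in u
    print(handle_check("url=https%3A%2F%2Fexample.com%2F", stub))
```

Keeping the handler separate from the model makes it possible to test the input validation without loading a trained classifier.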
Copyright: © 2025 The Author(s)
Open Access: This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

Volume Title: Proceedings of International Conference on Computer Science and Communication Engineering (ICCSCE 2025)
Series: Advances in Computer Science Research
Publication Date: 4 November 2025
ISBN: 978-94-6463-858-5
ISSN: 2352-538X

Cite this article


TY  - CONF
AU  - G. Bhavya
AU  - B. Kumaraswamy
AU  - O. Kalyan Ram
AU  - A. Venkatesh
AU  - R. Venkatahema
PY  - 2025
DA  - 2025/11/04
TI  - Url Risk Assessment Using Machine Learning
BT  - Proceedings of International Conference on Computer Science and Communication Engineering (ICCSCE 2025)
PB  - Atlantis Press
SP  - 3090
EP  - 3096
SN  - 2352-538X
UR  - https://doi.org/10.2991/978-94-6463-858-5_258
DO  - 10.2991/978-94-6463-858-5_258
ID  - Bhavya2025
ER  -
