Comparison Analysis: Logistic Regression, Random Forest, XGBoost, and CatBoost in Credit Scoring
- DOI
- 10.2991/978-94-6463-854-7_16
- Keywords
- CatBoost; Logistic regression; Random forest; XGBoost; SMOTETomek
- Abstract
This research compares four machine learning algorithms (Logistic Regression, Random Forest, XGBoost, and CatBoost) for credit scoring. The models’ performance is assessed using several metrics: accuracy, precision, recall, and the Area Under the Curve (AUC). The study also examines the effect of the SMOTETomek resampling technique on imbalanced datasets. The findings show that ensemble methods, particularly XGBoost and CatBoost, outperform traditional Logistic Regression in predictive accuracy and robustness. The study offers guidance for researchers and practitioners selecting models and data processing techniques for credit scoring tasks that involve imbalanced datasets.
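The four metrics named in the abstract can be illustrated with a minimal, standard-library-only sketch. The labels and scores below are invented toy values, not data or results from the study, and the function names (`classification_metrics`, `roc_auc`) are hypothetical helpers rather than the authors' code. AUC is computed here through its rank interpretation: the probability that a randomly chosen positive case scores higher than a randomly chosen negative one, with ties counted as one half.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, and recall for binary labels (1 = default, 0 = no default)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        # Guard against division by zero when no positives are predicted or present.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy, imbalanced example (illustrative values only): 7 negatives, 3 positives.
y_true = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]
scores = [0.10, 0.20, 0.15, 0.30, 0.60, 0.05, 0.40, 0.70, 0.45, 0.90]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 decision threshold

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, round(auc, 3))
```

On imbalanced data such as this, accuracy alone can look favourable even when positives are missed, which is one reason to report precision, recall, and AUC alongside it.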
- Copyright
- © 2025 The Author(s)
- Open Access
- Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
Cite this article
TY  - CONF
AU  - Aser Heber Ginting
AU  - Rahmat Widia Sembiring
AU  - Elviawaty Muisa Zamzami
PY  - 2025
DA  - 2025/11/11
TI  - Comparison Analysis: Logistic Regression, Random Forest, XGBoost, and CatBoost in Credit Scoring
BT  - Proceedings of the 2024 Brawijaya International Conference (BIC 2024)
PB  - Atlantis Press
SP  - 210
EP  - 219
SN  - 3091-4442
UR  - https://doi.org/10.2991/978-94-6463-854-7_16
DO  - 10.2991/978-94-6463-854-7_16
ID  - Ginting2025
ER  -