Comprehensive Analysis of Insurance Premium Prediction Using Ensemble Machine Learning Approaches
- DOI
- 10.2991/978-94-6239-642-5_34
- Keywords
- Insurance premium prediction; LightGBM; Risk assessment
- Abstract
This study investigates insurance premium prediction using ensemble machine learning. A stacking framework integrates ten predictive models, including CatBoost, LightGBM variants, XGBoost, Random Forest, and Linear Regression, to forecast insurance premium amounts. With feature engineering, target encoding, and cross-validation, the ensemble achieves a root mean squared logarithmic error (RMSLE) of 1.045375 on validation data. The dataset comprises 1.2 million training observations and 800,000 test samples with 20 predictor variables covering demographic, financial, health, and policy-related attributes. The methodology addresses missing value imputation, categorical variable transformation, and model heterogeneity optimization. Results show that combining gradient boosting algorithms with varying hyperparameter configurations yields better predictive performance than individual models, with LightGBM configurations reaching validation errors as low as 1.04583. This research contributes to actuarial science by establishing a premium estimation framework that balances predictive accuracy with computational efficiency, with practical implications for risk assessment and pricing optimization in the insurance industry.
- Copyright
- © 2026 The Author(s)
- Open Access
- This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
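The abstract reports model quality as a root mean squared logarithmic error. As a minimal sketch of how that metric is conventionally defined (the square root of the mean squared difference between log(1 + prediction) and log(1 + target)), the following Python function may help in reading the reported scores. The function name `rmsle` and the sample values are illustrative assumptions, not taken from the paper, and the paper's own evaluation code may differ.

```python
import math


def rmsle(y_true, y_pred):
    """Root mean squared logarithmic error over two equal-length sequences.

    Hypothetical helper for illustration; not the paper's implementation.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("inputs must not be empty")
    if any(v < 0 for v in y_true) or any(v < 0 for v in y_pred):
        raise ValueError("RMSLE is defined here for non-negative values only")

    # log1p(x) computes log(1 + x) accurately, including for small x.
    squared_log_errors = [
        (math.log1p(p) - math.log1p(t)) ** 2
        for t, p in zip(y_true, y_pred)
    ]
    return math.sqrt(sum(squared_log_errors) / len(squared_log_errors))


# Made-up premium amounts, for illustration only.
actual_premiums = [1200.0, 850.0, 430.0]
predicted_premiums = [1100.0, 900.0, 600.0]
score = rmsle(actual_premiums, predicted_premiums)
```

Because the error is taken on a logarithmic scale, the metric penalizes relative rather than absolute deviations, which is a common reason for choosing it when the target spans several orders of magnitude.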
Cite this article
TY - CONF
AU - Ziqin Huang
PY - 2026
DA - 2026/04/29
TI - Comprehensive Analysis of Insurance Premium Prediction Using Ensemble Machine Learning Approaches
BT - Proceedings of the 2026 11th International Conference on Financial Innovation and Economic Development (ICFIED 2026)
PB - Atlantis Press
SP - 328
EP - 336
SN - 2352-5428
UR - https://doi.org/10.2991/978-94-6239-642-5_34
DO - 10.2991/978-94-6239-642-5_34
ID - Huang2026
ER -