Shapley-Based Comparative Analysis of Predictive Models for Life Expectancy Across Multiple Regions
- DOI
- 10.2991/978-94-6463-978-0_38
- Keywords
- Life Expectancy Forecasting; Machine Learning; SHAP; LIME; Model Interpretability; LightGBM; Public Health Policy; Feature Importance; Global and Local Explanations; Socio-economic Planning
- Abstract
Life expectancy forecasting is an important instrument for evidence-based health policy and socio-economic planning because it reflects the intricate interaction among population health, environmental conditions, and the availability of health care. Reliable forecasts allow governments and health authorities to optimize resource planning, design efficient public health interventions, and implement sound social policies.
Herein, we compare nine machine learning models, namely Linear Regression, Neural Network, Support Vector Regression (SVR), Decision Tree, Random Forest, XGBoost, LightGBM, CatBoost, and Huber Gradient Boosting, on a dataset of 3,111 observations covering 2000–2016 across six WHO regions: Europe, Africa, Americas, Eastern Mediterranean, Western Pacific, and South-East Asia. The salient features were child mortality, basic water access, gross national income per capita, and health expenditure. In addition, two-feature models using only child mortality and water access were tested with LightGBM and Elastic Net to assess minimal-feature predictive capability. Models were evaluated with 5-fold group cross-validation, using mean absolute error (MAE) and the coefficient of determination (R²) as performance metrics.
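The evaluation protocol described above can be illustrated with a minimal sketch. It uses only the Python standard library, synthetic data, and a one-feature least-squares line as a stand-in model; it is not the authors' pipeline. The abstract does not say what the groups are, so grouping by country is an assumption here, as are all names (`group_k_fold`, `fit_line`, `country_0`, and so on) and the synthetic coefficients.

```python
import random
from statistics import mean


def group_k_fold(groups, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs; no group appears on both sides."""
    unique = sorted(set(groups))
    random.Random(seed).shuffle(unique)
    # Deal the shuffled groups round-robin into k folds.
    folds = [set(unique[i::k]) for i in range(k)]
    for held_out in folds:
        test = [i for i, g in enumerate(groups) if g in held_out]
        train = [i for i, g in enumerate(groups) if g not in held_out]
        yield train, test


def mae(y_true, y_pred):
    """Mean absolute error."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_line(x, y):
    """One-feature ordinary least squares; returns (slope, intercept)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Synthetic stand-in data (assumption): 20 hypothetical countries x 17 years.
rng = random.Random(42)
countries = [f"country_{c}" for c in range(20) for _ in range(17)]
child_mortality = [rng.uniform(2, 120) for _ in countries]
life_expectancy = [82 - 0.2 * m + rng.gauss(0, 2) for m in child_mortality]

fold_scores = []
for train, test in group_k_fold(countries, k=5):
    slope, intercept = fit_line(
        [child_mortality[i] for i in train],
        [life_expectancy[i] for i in train],
    )
    preds = [slope * child_mortality[i] + intercept for i in test]
    truth = [life_expectancy[i] for i in test]
    fold_scores.append((mae(truth, preds), r2(truth, preds)))

# Average the per-fold metrics, as is common when reporting CV results.
cv_mae = mean(s[0] for s in fold_scores)
cv_r2 = mean(s[1] for s in fold_scores)
print(f"MAE = {cv_mae:.3f} years, R2 = {cv_r2:.3f}")
```

Splitting by group rather than by row keeps all years of one country in the same fold, so the test score is not inflated by the model having seen the same country in neighboring years.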
LightGBM with the full feature set achieved well-balanced performance (R² = 0.905, MAE = 2.017 years) with a minimal number of outliers (150), while also yielding explainable feature contributions through Shapley Additive Explanations (SHAP) and Local Interpretable Model-agnostic Explanations (LIME).
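To make the idea of Shapley-based feature contributions concrete, the sketch below computes exact Shapley values by enumerating feature coalitions for a toy linear model over the four features named in the abstract. Everything in it is hypothetical: the weights, the baseline, and the instance are invented for illustration, and "absent" features are simply replaced by a single baseline value. It does not use the SHAP library and does not reproduce the paper's LightGBM explanations.

```python
from itertools import combinations
from math import factorial

FEATURES = [
    "child_mortality",
    "basic_water_access",
    "gni_per_capita",
    "health_expenditure",
]

# Hypothetical toy model (assumption): a hand-picked linear predictor.
WEIGHTS = {
    "child_mortality": -0.15,
    "basic_water_access": 0.08,
    "gni_per_capita": 0.0002,
    "health_expenditure": 0.3,
}
INTERCEPT = 65.0


def predict(row):
    """Predicted life expectancy (years) for one feature dictionary."""
    return INTERCEPT + sum(WEIGHTS[f] * row[f] for f in FEATURES)


def shapley_values(predict_fn, instance, baseline):
    """Exact Shapley values; features outside a coalition take baseline values."""
    names = list(instance)
    n = len(names)

    def value(coalition):
        row = {f: (instance[f] if f in coalition else baseline[f]) for f in names}
        return predict_fn(row)

    phi = {}
    for f in names:
        others = [g for g in names if g != f]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size: |S|! (n-|S|-1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


# Hypothetical reference point and country-year to explain (assumptions).
baseline = {
    "child_mortality": 40.0,
    "basic_water_access": 80.0,
    "gni_per_capita": 10000.0,
    "health_expenditure": 6.0,
}
instance = {
    "child_mortality": 10.0,
    "basic_water_access": 95.0,
    "gni_per_capita": 30000.0,
    "health_expenditure": 9.0,
}

phi = shapley_values(predict, instance, baseline)
for name, contribution in phi.items():
    print(f"{name}: {contribution:+.2f} years")
```

The contributions sum to the difference between the prediction for the instance and the prediction for the baseline, which is the additivity property that gives these explanations their name. Exact enumeration grows exponentially with the number of features, so it is only practical for very small feature sets like this one.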
The Random Forest algorithm achieved a high R² (0.984). This high accuracy suggests that the model is robust enough for practical implementation. The findings also indicate that machine learning techniques can yield actionable life expectancy predictions. Further, the results suggest that such models can be incorporated into future public health policies and can support data-driven decision-making, thereby helping to reduce health inequities and contributing to the well-being of populations worldwide.
- Copyright
- © 2025 The Author(s)
- Open Access
- Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
Cite this article
TY  - CONF
AU  - Sravya Veda Tadeparti
AU  - Vahini Ramadhenu
AU  - Jaya Prakash Vemuri
PY  - 2025
DA  - 2025/12/31
TI  - Shapley-Based Comparative Analysis of Predictive Models for Life Expectancy Across Multiple Regions
BT  - Proceedings of the 1st Engineering Data Analytics and Management Conference (EAMCON 2025)
PB  - Atlantis Press
SP  - 440
EP  - 448
SN  - 2352-5401
UR  - https://doi.org/10.2991/978-94-6463-978-0_38
DO  - 10.2991/978-94-6463-978-0_38
ID  - Tadeparti2025
ER  -