Hyperparameter-Tuned PPO-Based Federated Deep Reinforcement Learning (FDRL) with Explainability for Efficient V2X Resource Allocation in 5G Networks
- DOI
- 10.2991/978-94-6239-616-6_42
- Keywords
- Vehicle-to-Everything (V2X); Federated Learning (FL); Deep Reinforcement Learning (DRL); Proximal Policy Optimization (PPO); Explainable AI (XAI); Resource Allocation
- Abstract
Vehicle-to-Everything (V2X) communication is an essential application of 5G networks, enabling low-latency, high-throughput data transfer among vehicles, between vehicles and infrastructure, and across the wider network. However, the dynamic mobility of V2X environments creates constantly changing channel conditions, making efficient resource allocation challenging. In this paper, we propose Hyperparameter-Tuned Explainable Federated Deep Reinforcement Learning (X-FDRL), which combines Proximal Policy Optimization (PPO), Federated Learning (FL), and Explainable AI (XAI) to coordinate multiple edge agents in a decentralized, privacy-preserving manner. Each edge agent trains its local PPO model independently, tuning it with Bayesian hyperparameter optimization, and shares only the resulting model weights with the cloud for global aggregation. Hyperparameter tuning of the local PPO models allows the FDRL framework to scale across many edge agents, while SHAP and LIME provide interpretable feedback on the agents' policy decisions. Overall, the X-FDRL model achieves faster and more stable convergence and more explainable decisions in unpredictable, non-stationary V2X environments. We conduct our experiments in a reactive 5G V2X environment built in MATLAB 2021b and demonstrate clear improvements in latency, throughput, and interpretability with the X-FDRL model. The framework thus helps close the gap between learning efficiency and explainability, offering a reactive and trustworthy resource-allocation approach for next-generation vehicular networks.
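The round structure the abstract describes (local PPO training with Bayesian hyperparameter tuning at each edge agent, followed by cloud-side weight aggregation) can be sketched in miniature. This is an illustrative toy, not the paper's implementation: the tuner is a simple random-search stand-in for Bayesian optimization, the PPO update is a placeholder, and all function names (`tune_learning_rate`, `local_ppo_update`, `federated_average`) are assumptions for illustration.

```python
import random

def tune_learning_rate(trials=5):
    """Stand-in for Bayesian hyperparameter tuning: sample candidate
    learning rates and keep the one with the best (simulated) score."""
    best_lr, best_score = None, float("-inf")
    for _ in range(trials):
        lr = 10 ** random.uniform(-5, -2)
        score = -abs(lr - 3e-4)  # toy objective peaking near 3e-4
        if score > best_score:
            best_lr, best_score = lr, score
    return best_lr

def local_ppo_update(weights, lr):
    """Placeholder for local PPO training on one edge agent:
    perturbs the global weights as if taking gradient steps."""
    return [w - lr * random.gauss(0, 1) for w in weights]

def federated_average(local_models):
    """Cloud-side aggregation: element-wise mean of agent weights
    (FedAvg-style, assuming equal weighting of agents)."""
    n = len(local_models)
    return [sum(ws) / n for ws in zip(*local_models)]

# One federated round with three edge agents sharing a global model.
global_weights = [0.0, 0.0, 0.0]
local_models = []
for _ in range(3):
    lr = tune_learning_rate()          # per-agent hyperparameter tuning
    local_models.append(local_ppo_update(global_weights, lr))
global_weights = federated_average(local_models)
print(len(global_weights))  # 3
```

Because only the weight lists (never the agents' local experience) reach `federated_average`, the sketch mirrors the privacy-preserving property the abstract claims for FL.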
- Copyright
- © 2026 The Author(s)
- Open Access
- Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
Cite this article
TY  - CONF
AU  - S. Amudha
AU  - G. Sivaradje
AU  - G. Nagarajan
PY  - 2026
DA  - 2026/03/31
TI  - Hyperparameter-Tuned PPO-Based Federated Deep Reinforcement Learning (FDRL) with Explainability for Efficient V2X Resource Allocation in 5G Networks
BT  - Proceedings of the International Conference on Artificial Intelligence and Secure Data Analytics (ICAISDA 2025)
PB  - Atlantis Press
SP  - 559
EP  - 573
SN  - 1951-6851
UR  - https://doi.org/10.2991/978-94-6239-616-6_42
DO  - 10.2991/978-94-6239-616-6_42
ID  - Amudha2026
ER  -