Software Defect Prediction Using Reinforcement Learning-Based Optimization
- DOI
- 10.2991/978-94-6239-616-6_10
- Keywords
- Software Defect Prediction; Reinforcement Learning; Proximal Policy Optimization; Software Metrics; Deep Learning; Quality Assurance
- Abstract
Software defect prediction (SDP) plays a critical role in ensuring software reliability, yet most existing approaches rely on static machine learning or optimization frameworks that struggle with data imbalance, feature redundancy, and limited adaptability across evolving software systems. To address these shortcomings, this work proposes a reinforcement learning–driven methodology using Proximal Policy Optimization (PPO) to model defect prediction as a sequential decision-making process. The system constructs state representations from software metrics such as complexity, Halstead measures, and lines of code, and learns optimal defect classification actions through reward-guided policy updates that penalize false negatives and reward prediction stability. Experimental results on NASA defect datasets demonstrate a 7.4% improvement in F1-score and consistent gains in precision and recall compared with leading ensemble and deep-learning baselines. The PPO agent exhibits stable reward convergence and dynamic feature relevance estimation, enabling superior generalization across heterogeneous modules. The study concludes that reinforcement learning offers a promising paradigm for next-generation defect prediction by enabling adaptive, policy-driven learning rather than static classification. This work establishes a unified RL-centered framework and highlights its practical significance for software quality assurance, particularly in safety-critical and large-scale development environments.
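As a rough illustration of the formulation described in the abstract, the sketch below builds a state vector from a module's metrics and scores a classification action with a reward that penalizes false negatives more heavily than false positives. It is only a minimal sketch under stated assumptions: the names `ModuleMetrics`, `to_state`, and `reward`, the choice of three metrics, and all numeric weights are hypothetical and are not taken from the paper. The stability term mentioned in the abstract and the PPO policy update itself are omitted.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical reward weights; the abstract gives no numeric values.
CORRECT_REWARD = 1.0   # prediction matches the true label
FP_PENALTY = -1.0      # clean module flagged as defective
FN_PENALTY = -2.0      # defective module missed (penalized more heavily)


@dataclass
class ModuleMetrics:
    """Assumed per-module software metrics used to form the state."""
    loc: float                    # lines of code
    cyclomatic_complexity: float  # complexity measure
    halstead_volume: float        # one example Halstead measure


def to_state(m: ModuleMetrics) -> List[float]:
    """Flatten one module's metrics into a state vector for the agent."""
    return [m.loc, m.cyclomatic_complexity, m.halstead_volume]


def reward(action: int, label: int) -> float:
    """Score one classification step; 1 = defective, 0 = clean."""
    if action == label:
        return CORRECT_REWARD
    if label == 1 and action == 0:
        return FN_PENALTY  # false negative
    return FP_PENALTY      # false positive
```

In a setup like this, each module would be presented to the agent as one step of an episode, and the asymmetric penalties would bias the learned policy toward recall on the minority (defective) class.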
- Copyright
- © 2026 The Author(s)
- Open Access
- This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
Cite this article
TY  - CONF
AU  - V. Padmapriya
AU  - A. Kritika
AU  - C. Swetha
AU  - S. Vineetha
PY  - 2026
DA  - 2026/03/31
TI  - Software Defect Prediction Using Reinforcement Learning-Based Optimization
BT  - Proceedings of the International Conference on Artificial Intelligence and Secure Data Analytics (ICAISDA 2025)
PB  - Atlantis Press
SP  - 118
EP  - 129
SN  - 1951-6851
UR  - https://doi.org/10.2991/978-94-6239-616-6_10
DO  - 10.2991/978-94-6239-616-6_10
ID  - Padmapriya2026
ER  -