| Abstract: |
This work presents a thorough analysis of methods for addressing class imbalance in software failure prediction. Class imbalance is a common problem that strongly affects the performance of machine learning models and frequently leads to biased predictions. To mitigate this problem, a range of strategies has been investigated: ensemble strategies such as Bagging, Boosting, Stacking, and Two-Stage Ensembles; algorithm-level strategies such as Cost-Sensitive Learning; and data-level strategies such as SMOTE and MAHAKIL. The evaluation measures how well these methods perform on several popular datasets, including PROMISE, NASA, and CPDP, using key performance criteria such as accuracy, precision, recall, and stability. Furthermore, hybrid approaches that combine ensemble learning with sampling strategies have shown encouraging results in improving prediction robustness and accuracy. This research aims to clarify the advantages and disadvantages of each strategy in order to help in choosing the most suitable techniques for software failure prediction under class imbalance. [ABSTRACT FROM AUTHOR] |