An Ensemble Machine Learning Approach for Detecting Fraudulent Insurance Claims

Authors

  • Yukti Lnu, KForce

Keywords:

Insurance Fraud Detection, Ensemble Learning, Stacked Generalization, Gradient Boosting, Random Forest, Imbalanced Classification, Financial Risk Analytics

Abstract

Fraudulent insurance claims impose substantial financial losses and operational inefficiencies on the global insurance industry. Detecting such claims is particularly challenging due to severe class imbalance, evolving fraud tactics, and heterogeneous claim characteristics. This study proposes a structured ensemble machine learning framework for detecting fraudulent insurance claims using a stacked generalization approach. The proposed architecture integrates logistic regression, Random Forest, and XGBoost as base learners, while a meta-level classifier optimally combines their predictive outputs. The framework emphasizes robust preprocessing, class-weight adjustment, and cross-validation-based stacking to mitigate information leakage and enhance minority-class detection. Experimental evaluation on an imbalanced insurance claim dataset demonstrates that the stacked ensemble significantly outperforms individual base models in terms of Area Under the Curve (AUC), recall, and F1-score. The ensemble approach effectively reduces false negatives, thereby improving operational fraud sensitivity without substantially increasing false positives. The findings confirm that ensemble learning enhances predictive stability and generalization in insurance fraud detection tasks. The proposed model offers a scalable, interpretable, and deployment-ready solution for modern insurance analytics environments.
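The stacking arrangement described in the abstract can be illustrated with a minimal sketch. This is not the author's implementation: it assumes scikit-learn, uses a synthetic dataset from `make_classification` as a stand-in for real claim data, and substitutes scikit-learn's `GradientBoostingClassifier` for XGBoost so that the example has no extra dependency. The choice of logistic regression as the meta-level classifier, the fold count, and all hyperparameters are likewise assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, recall_score, roc_auc_score
from sklearn.model_selection import StratifiedKFold, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for an imbalanced claims table (roughly 10% positives).
X, y = make_classification(
    n_samples=2000, n_features=20, n_informative=8,
    weights=[0.9, 0.1], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Base learners; class weights counteract the class imbalance.
base_learners = [
    ("logreg", make_pipeline(
        StandardScaler(),
        LogisticRegression(class_weight="balanced", max_iter=1000),
    )),
    ("rf", RandomForestClassifier(
        n_estimators=100, class_weight="balanced", random_state=0
    )),
    # Assumption: gradient boosting from scikit-learn stands in for XGBoost.
    ("gb", GradientBoostingClassifier(random_state=0)),
]

# The meta-classifier is trained on out-of-fold base-model probabilities,
# so it never sees a prediction made on a row the base model was fitted on.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(class_weight="balanced", max_iter=1000),
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
    stack_method="predict_proba",
)
stack.fit(X_train, y_train)

# Evaluate on the held-out split with the metrics named in the abstract.
proba = stack.predict_proba(X_test)[:, 1]
pred = stack.predict(X_test)
metrics = {
    "auc": roc_auc_score(y_test, proba),
    "recall": recall_score(y_test, pred),
    "f1": f1_score(y_test, pred),
}
print(metrics)
```

The stratified folds keep the fraud rate roughly constant across splits, which matters when the minority class is small. The numbers printed for synthetic data say nothing about the results reported in the article.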

Published

21-03-2025

How to Cite

[1]
Yukti Lnu, “An Ensemble Machine Learning Approach for Detecting Fraudulent Insurance Claims”, American J Data Sci Artif Intell Innov, vol. 5, pp. 81–92, Mar. 2025, Accessed: Apr. 23, 2026. [Online]. Available: https://ajdsai.org/index.php/publication/article/view/115