Swamping and masking rate in binomial regression model
Abstract
Detecting outliers in high-dimensional data is challenging because swamping and masking effects prevent the identification of real outliers. This study demonstrates how integrating robust statistical methods improves the detection of fraudulent transactions with binomial regression and Isolation Forest models. Binomial regression, combined with robust variable selection techniques, supports feature selection while mitigating the effects of swamping and masking. Isolation Forest, with carefully tuned subsampling, improves anomaly detection by isolating fraudulent instances effectively. Performance is evaluated using accuracy, precision, recall, F1-score, and AUC-ROC, and compared against traditional outlier detection methods such as K-Means and DBSCAN.
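The abstract does not define the swamping and masking rates it refers to. As a minimal sketch, assuming the common convention that the swamping rate is the share of genuine inliers wrongly flagged as outliers and the masking rate is the share of genuine outliers that go unflagged, both can be computed from true labels and detector flags as follows. The function name `swamping_masking_rates` and the toy labels are hypothetical and are not taken from the paper.

```python
def swamping_masking_rates(y_true, y_flag):
    """Return (swamping_rate, masking_rate).

    y_true: 1 = real outlier, 0 = inlier (ground truth).
    y_flag: 1 = flagged as outlier by the detector, 0 = not flagged.
    Assumed convention, not the paper's stated definition:
      swamping rate = inliers wrongly flagged / all inliers
      masking rate  = outliers missed / all outliers
    """
    if len(y_true) != len(y_flag):
        raise ValueError("y_true and y_flag must have the same length")

    pairs = list(zip(y_true, y_flag))
    n_inliers = sum(1 for t, _ in pairs if t == 0)
    n_outliers = len(pairs) - n_inliers

    swamped = sum(1 for t, f in pairs if t == 0 and f == 1)
    masked = sum(1 for t, f in pairs if t == 1 and f == 0)

    # Guard against empty classes to avoid division by zero.
    swamping_rate = swamped / n_inliers if n_inliers else 0.0
    masking_rate = masked / n_outliers if n_outliers else 0.0
    return swamping_rate, masking_rate


# Toy example: 8 inliers and 2 real outliers (made-up labels).
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_flag = [0, 1, 0, 0, 0, 0, 0, 1, 1, 0]
swamping, masking = swamping_masking_rates(y_true, y_flag)
print(f"swamping rate = {swamping:.2f}, masking rate = {masking:.2f}")
```

Under this convention the swamping rate equals the false positive rate and the masking rate equals one minus recall, so both connect directly to the evaluation metrics listed in the abstract.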
Copyright (c) 2026 Boletim da Sociedade Paranaense de Matemática

This work is licensed under a Creative Commons Attribution 4.0 International License.
When the manuscript is accepted for publication, the authors automatically agree to transfer the copyright to the Sociedade Paranaense de Matemática (SPM).
The journal uses the Creative Commons Attribution license (CC BY 4.0).



