MALWARE DETECTION USING MACHINE LEARNING

Yannam Ravi Sankar; Anumala Alekhya; E. Deena Dayalan; Kattiram Lakshmi Prasanna

doi:10.5269/bspm.80503

Authors

Ravi Sankar Yannam, Department of Electrical and Electronics Engineering, QIS College of Engineering and Technology
Anumala Alekhya
E. Deena Dayalan
Kattiram Lakshmi Prasanna

DOI:

https://doi.org/10.5269/bspm.80503

Abstract

Malware attacks on companies and critical infrastructure are a significant problem in modern
cybersecurity. Malware is malicious software: programs designed to infiltrate, damage, or corrupt a
system such as a computer or a network, which can lead to data theft, financial loss, and system crashes.
Signature-based detection is not effective against the evolving nature of malware, so more adaptive
detection methods are needed. This work presents a machine learning approach that uses supervised
learning to detect and classify different types of malware. The dataset used in the study contains
several classes of samples, including adware, SMS malware, riskware, banking malware, and benign software.
The method involves data preparation, feature selection through Chi-Square and an Extra Trees Classifier,
and classification through Support Vector Machine (SVM), Decision Tree, and Naïve Bayes classifiers. The
dataset is split into a training set and a testing set to evaluate each model. The feature selection
methods identify the features that contribute most to classification accuracy, which improves model
performance. Experimental results showed that the Decision Tree classifier was more accurate than the SVM
and Naïve Bayes classifiers. Accuracy scores and confusion matrices are used to evaluate the trained
models and give a clear picture of their classification capacity. Visualizations such as correlation
heatmaps and feature significance plots are used to improve interpretability for researchers. The trained
model is serialized with pickle and saved so that it can be used in real-time malware detection systems.
SCOPE: The scope of the study is limited to decision tree, support vector machine, Naïve Bayes, feature
selection, cybersecurity, and AI-based threat detection.
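
The feature scoring and evaluation steps named in the abstract can be illustrated with a minimal, standard-library-only sketch. This is not the authors' implementation: the function names, the toy labels, and the two features (`sends_sms`, `has_icon`) are hypothetical, and the chi-square score shown is the textbook contingency-table statistic, which may differ from whatever library routine the study used.

```python
from collections import Counter


def chi_square_score(feature, labels):
    """Chi-square statistic of independence between one categorical
    feature and the class labels (higher means more informative)."""
    n = len(labels)
    observed = Counter(zip(feature, labels))
    feature_totals = Counter(feature)
    label_totals = Counter(labels)
    score = 0.0
    for f_value, f_count in feature_totals.items():
        for c_value, c_count in label_totals.items():
            expected = f_count * c_count / n
            score += (observed[(f_value, c_value)] - expected) ** 2 / expected
    return score


def confusion_matrix(y_true, y_pred, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy data: four samples, two classes, two binary features.
labels = ["adware", "adware", "benign", "benign"]
sends_sms = [1, 1, 0, 0]  # separates the two classes perfectly
has_icon = [1, 0, 1, 0]   # carries no information about the class

sms_score = chi_square_score(sends_sms, labels)
icon_score = chi_square_score(has_icon, labels)

# Hypothetical classifier output, used only to show the evaluation step.
predicted = ["adware", "benign", "benign", "benign"]
matrix = confusion_matrix(labels, predicted, ["adware", "benign"])
acc = accuracy(labels, predicted)
```

On this toy data the informative feature receives a higher chi-square score than the uninformative one, which is the ranking behaviour that feature selection relies on. The confusion matrix shows per-class errors that a single accuracy figure hides.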
