Malware API Calls Detection Using Hybrid Logistic Regression and RNN Model

Behavioral malware analysis is a powerful technique against zero-day and obfuscated malware. Also known as dynamic malware analysis, this approach employs various methods to improve detection. One such method uses machine learning and deep learning algorithms to...

Bibliographic Details
Main Authors: Almaleh, A. (Author), Almushabb, R. (Author), Ogran, R. (Author)
Format: Article
Language: English
Published: MDPI 2023
Subjects: API calls; logistic regression; malware; malware detection; neural network; weight initialization
Online Access: View Fulltext in Publisher
View in Scopus
LEADER 03046nam a2200241Ia 4500
001 10.3390-app13095439
008 230529s2023 CNT 000 0 und d
020 |a 20763417 (ISSN) 
245 1 0 |a Malware API Calls Detection Using Hybrid Logistic Regression and RNN Model 
260 0 |b MDPI  |c 2023 
856 |z View Fulltext in Publisher  |u https://doi.org/10.3390/app13095439 
856 |z View in Scopus  |u https://www.scopus.com/inward/record.uri?eid=2-s2.0-85159258376&doi=10.3390%2fapp13095439&partnerID=40&md5=35bdd1a5c2968476ac43757c7ce29394 
520 3 |a Behavioral malware analysis is a powerful technique against zero-day and obfuscated malware. Also known as dynamic malware analysis, this approach employs various methods to improve detection. One such method uses machine learning and deep learning algorithms to learn from the behavior of malware. However, weight initialization in neural networks remains an active area of research. In this paper, we present a novel hybrid model that uses both machine learning and deep learning algorithms to detect malware across various categories. The proposed model does so by recognizing the malicious functions the malware performs, which can be inferred from its API call sequences. Failure to detect such malware can result in severe cyberattacks that threaten the confidentiality, privacy, and availability of systems. We rely on a secondary dataset containing API call sequences, and we apply logistic regression to obtain the initial weight that serves as input to the neural network. With this hybrid approach, our research aims to address the challenges of traditional weight initialization techniques and to improve the accuracy and efficiency of malware detection based on API calls. Integrating machine learning and deep learning allows the proposed model to capitalize on the strengths of each approach, potentially leading to a more robust and versatile solution to malware detection. Moreover, our research contributes to ongoing efforts in the field of neural networks by offering a novel perspective on weight initialization techniques and their impact on neural network performance in behavioral malware analysis. Experimental results on a balanced dataset showed 83% accuracy and a loss of 0.44, which outperformed the baseline model in terms of minimum loss. On the imbalanced dataset, accuracy was 98% and loss was 0.10, exceeding the accuracy of the state-of-the-art model. This demonstrates how well the proposed model handles malware classification. © 2023 by the authors. 
650 0 4 |a API calls 
650 0 4 |a logistic regression 
650 0 4 |a malware 
650 0 4 |a malware detection 
650 0 4 |a neural network 
650 0 4 |a weight initialization 
700 1 0 |a Almaleh, A.  |e author 
700 1 0 |a Almushabb, R.  |e author 
700 1 0 |a Ogran, R.  |e author 
773 |t Applied Sciences (Switzerland)
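
Illustrative sketch (not part of the catalog record): the abstract describes fitting logistic regression to API call data and using the result as the starting weights of a neural network. The record does not say how the two stages are wired together, so everything below is an assumption made for illustration: the toy sequences and labels, the call-frequency features, the choice to copy the logistic regression coefficients into every row of the input-to-hidden matrix of a small tanh RNN, and all function names (`fit_logistic_regression`, `init_rnn_from_lr`, `rnn_final_state`). It uses only the Python standard library and should not be read as the authors' implementation.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def call_frequencies(seq, vocab_size):
    """Turn a sequence of API call IDs into normalized call counts."""
    counts = [0.0] * vocab_size
    for call_id in seq:
        counts[call_id] += 1.0
    return [c / len(seq) for c in counts]


def fit_logistic_regression(X, y, lr=0.5, epochs=200):
    """Batch gradient descent on the log loss; returns (weights, bias)."""
    n_features = len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def init_rnn_from_lr(lr_weights, hidden_size, noise=0.01, seed=0):
    """Assumed scheme: each hidden unit starts from the LR coefficients
    plus small Gaussian noise; recurrent weights start near zero."""
    rng = random.Random(seed)
    W_xh = [[w + rng.gauss(0.0, noise) for w in lr_weights]
            for _ in range(hidden_size)]
    W_hh = [[rng.gauss(0.0, noise) for _ in range(hidden_size)]
            for _ in range(hidden_size)]
    return W_xh, W_hh


def rnn_final_state(seq, W_xh, W_hh):
    """Forward pass of a tanh RNN over one-hot encoded API call IDs."""
    hidden_size = len(W_hh)
    h = [0.0] * hidden_size
    for call_id in seq:
        # A one-hot input simply selects column `call_id` of W_xh.
        h = [math.tanh(W_xh[i][call_id]
                       + sum(W_hh[i][k] * h[k] for k in range(hidden_size)))
             for i in range(hidden_size)]
    return h


# Toy data (invented): call IDs 0 and 1 are benign-leaning, 2 and 3 malicious-leaning.
VOCAB_SIZE = 4
sequences = [[0, 1, 0, 1], [1, 0, 0, 1], [2, 3, 2, 3], [3, 2, 3, 3]]
labels = [0, 0, 1, 1]

X = [call_frequencies(s, VOCAB_SIZE) for s in sequences]
lr_w, lr_b = fit_logistic_regression(X, labels)
W_xh, W_hh = init_rnn_from_lr(lr_w, hidden_size=3)

h_benign = rnn_final_state(sequences[0], W_xh, W_hh)
h_malicious = rnn_final_state(sequences[2], W_xh, W_hh)
```

In this sketch the RNN is only initialized, not trained; a real pipeline would continue with gradient-based training from these starting weights. The point is that the hidden state already separates the two toy classes before any RNN training, which is the kind of head start an informed initialization is meant to give.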