LSTM-Based Hierarchical Denoising Network for Android Malware Detection

Mobile security is an important issue on the Android platform. Most malware detection methods based on machine learning rely heavily on expert knowledge for manual feature engineering, and the resulting features still struggle to fully describe malware. In this paper, we present the LSTM-based hierarchical denoise...

Bibliographic Details
Main Authors: Jinpei Yan, Yong Qi, Qifan Rao
Format: Article
Language: English
Published: Hindawi-Wiley 2018-01-01
Series: Security and Communication Networks
Online Access: http://dx.doi.org/10.1155/2018/5249190
id doaj-f309d967722b46a79b6d92264a0d01e0
record_format Article
spelling doaj-f309d967722b46a79b6d92264a0d01e0 | 2020-11-25T02:26:49Z | eng | Hindawi-Wiley | Security and Communication Networks | 1939-0114 | 1939-0122 | 2018-01-01 | 2018 | 10.1155/2018/5249190 | 5249190 | LSTM-Based Hierarchical Denoising Network for Android Malware Detection | Jinpei Yan, Yong Qi, Qifan Rao (all: Department of Computer Science and Technology, Xi’an Jiaotong University, Xi’an, Shaanxi, China) | http://dx.doi.org/10.1155/2018/5249190
collection DOAJ
language English
format Article
sources DOAJ
author Jinpei Yan
Yong Qi
Qifan Rao
title LSTM-Based Hierarchical Denoising Network for Android Malware Detection
publisher Hindawi-Wiley
series Security and Communication Networks
issn 1939-0114
1939-0122
publishDate 2018-01-01
description Mobile security is an important issue on the Android platform. Most malware detection methods based on machine learning rely heavily on expert knowledge for manual feature engineering, and the resulting features still struggle to fully describe malware. In this paper, we present the LSTM-based hierarchical denoise network (HDN), a novel static Android malware detection method that uses LSTM to learn directly from the raw opcode sequences extracted from decompiled Android files. However, most opcode sequences are too long for an LSTM to train on because of the vanishing gradient problem. Hence, HDN uses a hierarchical structure: a first-level LSTM computes in parallel on opcode subsequences (which we call method blocks) to learn dense representations, and a second-level LSTM then learns from the method block sequences to detect malware. Because malicious behavior appears only in some sequence segments, HDN uses a method block denoise module (MBDM) to denoise the data through an adaptive gradient scaling strategy based on a loss cache. We evaluate HDN and compare it with recent mainstream research on three datasets. The results show that HDN outperforms these Android malware detection methods, captures longer sequence features, and has better detection efficiency than N-gram-based malware detection, which is similar to our method.
url http://dx.doi.org/10.1155/2018/5249190
_version_ 1724845421217447936
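
The two-level structure described in the abstract can be illustrated with a rough, forward-only sketch. This is not the authors' implementation: the record gives no layer sizes, opcode encoding, or output layer, so the one-hot opcode encoding, the use of the final hidden state as a block's dense representation, the logistic read-out, and every name below (`TinyLSTM`, `hierarchical_score`, `VOCAB`, and so on) are assumptions made for illustration. Weights are random and untrained, and the method block denoise module (MBDM) is omitted because its details are not given here.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyLSTM:
    """Forward-only LSTM with random (untrained) weights, for illustration."""

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        cols = input_size + hidden_size
        # One weight matrix and bias per gate: input, forget, candidate, output.
        self.W = {gate: [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                         for _ in range(hidden_size)] for gate in "ifgo"}
        self.b = {gate: [0.0] * hidden_size for gate in "ifgo"}

    def _gate(self, gate, z):
        return [sum(w * v for w, v in zip(row, z)) + bias
                for row, bias in zip(self.W[gate], self.b[gate])]

    def last_hidden(self, inputs):
        """Run over a sequence of vectors and return the final hidden state."""
        h = [0.0] * self.hidden_size
        c = [0.0] * self.hidden_size
        for x in inputs:
            z = list(x) + h
            i = [sigmoid(v) for v in self._gate("i", z)]
            f = [sigmoid(v) for v in self._gate("f", z)]
            g = [math.tanh(v) for v in self._gate("g", z)]
            o = [sigmoid(v) for v in self._gate("o", z)]
            c = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
            h = [ov * math.tanh(cv) for ov, cv in zip(o, c)]
        return h


def one_hot(index, size):
    return [1.0 if k == index else 0.0 for k in range(size)]


def hierarchical_score(method_blocks, vocab_size, block_lstm, app_lstm, out_w):
    # Level 1: each method block (a short opcode-id sequence) -> one dense vector.
    # The blocks are independent here, which is what allows parallel computation.
    block_vectors = [
        block_lstm.last_hidden([one_hot(op, vocab_size) for op in block])
        for block in method_blocks
    ]
    # Level 2: the (much shorter) sequence of block vectors -> one app-level vector.
    app_vector = app_lstm.last_hidden(block_vectors)
    # Assumed logistic read-out: a score strictly between 0 and 1.
    return sigmoid(sum(w * v for w, v in zip(out_w, app_vector)))


rng = random.Random(0)
VOCAB, BLOCK_DIM, APP_DIM = 8, 4, 3  # hypothetical sizes
block_lstm = TinyLSTM(VOCAB, BLOCK_DIM, rng)
app_lstm = TinyLSTM(BLOCK_DIM, APP_DIM, rng)
out_w = [rng.uniform(-0.1, 0.1) for _ in range(APP_DIM)]

# Three made-up method blocks of opcode ids.
blocks = [[0, 3, 5], [1, 1, 7, 2], [4, 6]]
score = hierarchical_score(blocks, VOCAB, block_lstm, app_lstm, out_w)
print(score)
```

The point of the sketch is the shape of the computation: the second-level LSTM sees one vector per method block rather than one step per opcode, so the sequence it must backpropagate through is far shorter than the raw opcode sequence.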