The Effects of Traditional Anti-Virus Labels on Malware Detection Using Dynamic Runtime Opcodes

The arms race between the distributors of malware and those seeking to provide defenses has so far favored the former. Signature detection methods have been unable to cope with the onslaught of new binaries aided by rapidly developing obfuscation techniques. Recent research has focused on the analys...

Bibliographic Details
Main Authors: Domhnall Carlin, Alexandra Cowan, Philip O'Kane, Sakir Sezer
Format: Article
Language: English
Published: IEEE 2017-01-01
Series: IEEE Access
Subjects: Network security; machine learning; computer security
Online Access: https://ieeexplore.ieee.org/document/8027024/
id doaj-a993b95944be444da841de3e5690fcb4
record_format Article
spelling doaj-a993b95944be444da841de3e5690fcb4
2021-03-29T20:18:08Z
eng
IEEE
IEEE Access
2169-3536
2017-01-01
5
17742
17752
10.1109/ACCESS.2017.2749538
8027024
The Effects of Traditional Anti-Virus Labels on Malware Detection Using Dynamic Runtime Opcodes
Domhnall Carlin (0) https://orcid.org/0000-0002-8424-2757
Alexandra Cowan (1)
Philip O'Kane (2)
Sakir Sezer (3)
(0) Centre for Secure Information Technologies, Queen’s University Belfast, Belfast, U.K.
(1) Institute of Electronics, Communications and Information Technology, Queen’s University Belfast, Belfast, U.K.
(2) Centre for Secure Information Technologies, Queen’s University Belfast, Belfast, U.K.
(3) Centre for Secure Information Technologies, Queen’s University Belfast, Belfast, U.K.
https://ieeexplore.ieee.org/document/8027024/
Network security
machine learning
computer security
collection DOAJ
language English
format Article
sources DOAJ
author Domhnall Carlin
Alexandra Cowan
Philip O'Kane
Sakir Sezer
author_sort Domhnall Carlin
title The Effects of Traditional Anti-Virus Labels on Malware Detection Using Dynamic Runtime Opcodes
title_sort effects of traditional anti-virus labels on malware detection using dynamic runtime opcodes
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2017-01-01
description The arms race between the distributors of malware and those seeking to provide defenses has so far favored the former. Signature detection methods have been unable to cope with the onslaught of new binaries aided by rapidly developing obfuscation techniques. Recent research has focused on the analysis of low-level opcodes, both static and dynamic, as a way to detect malware. Although sometimes successful at detecting malware, static analysis still fails to unravel obfuscated code, whereas dynamic analysis can allow researchers to investigate the revealed code at runtime. Research in the field has been limited by the underpinning data sets; old and inadequately sampled malware can lessen the extrapolation potential of such data sets. The main contribution of this paper is the creation of a new parsed runtime trace data set of over 100 000 labeled samples, which will address these shortcomings, and we offer the data set itself for use by the wider research community. This data set underpins the examination of the run traces using classifiers on count-based and sequence-based data. We find that malware detection rates are lessened when samples are labeled with traditional anti-virus (AV) labels. Neither count-based nor sequence-based algorithms can sufficiently distinguish between AV label classes. Detection increases when malware is re-classed with labels yielded from unsupervised learning. With sequence-based learning, detection exceeds that of labeling as simply “malware” alone. This approach may yield future work, where the triaging of malware can be more effective.
topic Network security
machine learning
computer security
url https://ieeexplore.ieee.org/document/8027024/