A Type-Aware Approach to Message Clustering for Protocol Reverse Engineering

Protocol Reverse Engineering (PRE) is crucial for information security of Internet-of-Things (IoT), and message clustering determines the effectiveness of PRE. However, the quality of services still lags behind the strict requirement of IoT applications as the results of message clustering are often...

Full description

Bibliographic Details
Published in:Sensors
Main Authors: Xin Luo, Dan Chen, Yongjun Wang, Peidai Xie
Format: Article
Language:English
Published: MDPI AG 2019-02-01
Subjects:
Online Access:https://www.mdpi.com/1424-8220/19/3/716
Description
Summary:Protocol Reverse Engineering (PRE) is crucial for information security of Internet-of-Things (IoT), and message clustering determines the effectiveness of PRE. However, the quality of services still lags behind the strict requirement of IoT applications as the results of message clustering are often coarse-grained with the intrinsic type information hidden in messages largely ignored. Aiming at this problem, this study proposes a type-aware approach to message clustering guided by type information. The approach regards a message as a combination of n-grams, and it employs the Latent Dirichlet Allocation (LDA) model to characterize messages with types and n-grams via inferring the type distribution of each message. The type distribution is finally used to measure the similarity of messages. According to this similarity, the approach clusters messages and further extracts message formats. Experimental results of the approach against Netzob in terms of a number of protocols indicate that the correctness and conciseness can be significantly improved, e.g., figures 43.86% and 3.87%, respectively for the CoAP protocol.
ISSN:1424-8220