Pipeline Architecture for Segmented Pattern Matching

Bibliographic Details
Main Authors: Chia-Yi Chu, 朱家毅
Other Authors: Yeim-Kuau Chang
Format: Others
Language: en_US
Published: 2014
Online Access: http://ndltd.ncl.edu.tw/handle/81190732293244439418
Description
Summary: Master's thesis === National Cheng Kung University === Department of Computer Science and Information Engineering === 102 === In recent years, the rapid growth of network traffic has made deep packet inspection (DPI) systems increasingly important for network security. DPI systems rely on pattern matching to inspect packet payloads for possible threats, so pattern-matching performance is key to the performance of a DPI system. Improving it requires a solution that provides stable throughput with a low memory requirement. To make the well-known Aho-Corasick (AC) algorithm work on a pipeline architecture, we divide the patterns into pattern segments, which decreases the number of threads that track the input stream. In this thesis, we propose a technique that finds all the common subpatterns in the pattern set and uses them to divide the patterns. After dividing the patterns, we use the pattern segments to build an AC-DFA from which all failure transitions not needed in the pipeline architecture are eliminated. In addition, the matched subpatterns must be combined back into their original patterns. For this, a tree-like structure is used with the AC-DFA without failure transitions to find the original patterns. Finally, we use the transition tables of these two structures to construct the pipeline architecture. Our implementation results show that on a Xilinx Virtex-7 XC7V2000T, with group threshold K = 5 and handling 58k characters, the design uses 5.9% of the Block RAM, 0.2% of the Slice Registers, and 16% of the Slice LUTs. The memory cost of our scheme is lower than that of Split AC and Bit-split by 29% and 75%, respectively.
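
The idea of matching with an AC automaton that has no failure transitions can be illustrated in software. The following is a minimal sketch, not the thesis's FPGA design: it assumes that, without failure transitions, a new tracking thread is started at every input position and simply dies on a mismatch. The function names `build_trie` and `match_segments` and the sample segments are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple


def build_trie(segments: List[str]) -> Tuple[List[Dict[str, int]], Dict[int, str]]:
    """Build a goto-only trie (no failure transitions) over the segments."""
    goto: List[Dict[str, int]] = [{}]  # goto[state][char] -> next state
    output: Dict[int, str] = {}        # accepting state -> matched segment
    for seg in segments:
        state = 0
        for ch in seg:
            if ch not in goto[state]:
                goto.append({})
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        output[state] = seg
    return goto, output


def match_segments(text: str,
                   goto: List[Dict[str, int]],
                   output: Dict[int, str]) -> List[Tuple[int, str]]:
    """Return (start_offset, segment) for every occurrence in text."""
    hits: List[Tuple[int, str]] = []
    threads: List[Tuple[int, int]] = []  # (start offset, current state)
    for i, ch in enumerate(text):
        threads.append((i, 0))  # assumption: one new thread per input position
        survivors: List[Tuple[int, int]] = []
        for start, state in threads:
            nxt = goto[state].get(ch)
            if nxt is None:
                continue  # mismatch: the thread dies, no failure transition
            if nxt in output:
                hits.append((start, output[nxt]))
            survivors.append((start, nxt))
        threads = survivors
    return hits


# Illustrative segments only; these are not taken from the thesis.
goto, output = build_trie(["he", "she", "hers"])
hits = match_segments("ushers", goto, output)
```

In this sketch a thread can survive for at most as many characters as the longest segment, so shorter segments mean fewer live threads at any moment, which is consistent with the motivation the summary gives for segmenting the patterns.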