Description
High-speed packet content scanning has become extremely important because of its applications in network security, network monitoring, HTTP load balancing, and similar areas. In content scanning, the packet payload is compared against a set of patterns specified as regular expressions. In this paper, we first show that the memory requirements of traditional methods for fast packet scanning are prohibitively high for many patterns used in networking applications. We then propose regular expression rewrite techniques that reduce memory usage. Further, we develop a scheme that compiles the regular expressions into several engines, which dramatically increases regular expression matching speed without significantly increasing memory usage. We implement DFA-based packet scanners. Experimental results on real-world traffic and patterns show that our implementation achieves 3.1 to 4.1 times higher throughput than the ungrouped DFA implementation. Compared to the best NFA-based implementation, our DFA-based packet scanner achieves a speedup of 28 to 1192 times.
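
The memory problem mentioned above can be illustrated with a small sketch. It does not use the patterns or the implementation from the paper. It assumes a textbook pattern over the alphabet {a, b}, "the (n+1)-th symbol from the end is a", whose NFA needs only n + 2 states. The function names `nfa_step` and `count_dfa_states` are hypothetical. The sketch runs the standard subset construction and counts the DFA states it reaches, which grow as 2^(n+1).

```python
def nfa_step(state, symbol, n):
    """Transition function of an assumed NFA with states 0..n+1.

    State 0 loops on every symbol and guesses, on 'a', that this 'a'
    is the (n+1)-th symbol from the end. States 1..n count the
    remaining n symbols. State n+1 is accepting and has no exits.
    """
    if state == 0:
        return {0, 1} if symbol == "a" else {0}
    if state <= n:
        return {state + 1}
    return set()


def count_dfa_states(n):
    """Count the DFA states reachable by subset construction."""
    start = frozenset({0})
    seen = {start}
    worklist = [start]
    while worklist:
        current = worklist.pop()
        for symbol in "ab":
            # A DFA state is the set of NFA states reachable on this symbol.
            nxt = frozenset(
                t for s in current for t in nfa_step(s, symbol, n)
            )
            if nxt not in seen:
                seen.add(nxt)
                worklist.append(nxt)
    return len(seen)


if __name__ == "__main__":
    for n in range(0, 11, 2):
        print(f"n={n:2d}  NFA states={n + 2:2d}  DFA states={count_dfa_states(n)}")
```

In this toy case the DFA has to remember the last n + 1 symbols, so each extra counted position doubles the number of states while the NFA grows by one. This is one well-known source of state explosion; the patterns analyzed in the paper may blow up for other reasons as well.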