This paper presents and experimentally evaluates two parallelization strategies for the popular open-source Snort network intrusion detection system (NIDS). Snort identifies intrusion attempts by processing a ruleset, a file which specifies various protocol-based, string-based, and regular-expression-based signatures associated with known attacks. As attacks proliferate, NIDS becomes increasingly important. However, the computational requirements of intrusion detection are great enough to limit average achievable throughput to 557 Mbps on a commodity server-class PC, just over half the link-level bandwidth. The strategies studied in this paper accelerate the performance of Snort by parallelizing rule processing while still maintaining the shared state information required for correct operation. The conservative version proposed here parallelizes ruleset processing at the level of TCP/IP flows, as any potential inter-packet dependences are confined to a single flow. Any single flow is processed in order by one thread, but the flows are partitioned among threads. This solution provides good performance for 3 of the 5 network packet traces studied, reaching a speedup as high as 3.0 and an inspection rate of 1.7 Gbps when implemented on x86-64 Linux for a server with two dual-core Opteron processors (four cores total). Conservative parallelization allows an average inspection rate of 1.07 Gbps across all 5 traces, nearly twice the serial performance. However, it is too restrictive to achieve good performance if there are not enough concurrent flows in the traffic stream. To handle this case, an optimistic version of Snort is also designed that exploits the observation that not all packets from a flow are actually connected by dependence orders (although these dependences cannot be discovered until deep in packet inspection).
The optimistic version thus allows a single flow to be simultaneously processed by multiple threads, stalling processing only if an actual dependence is found. The optimistic version has additional overheads that reduce speedup by 7–13% for traces that have flow concurrency. However, the benefits of the optimistic approach allow one additional trace to see substantial speedup (2.2 on four cores). The average inspection rate stays nearly unchanged at 1.09 Gbps, but the peak increases to over 2 Gbps. Consequently, this may be a good option for protecting systems and networks with few flows. This work is supported in part by the National Science Foundation under Grant Nos. CCF-0532448 and CNS-0532452.
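
As an illustration of the conservative, flow-level partitioning described in the abstract, the following is a minimal sketch and not the paper's implementation. It assumes packets are represented as tuples of (source IP, source port, destination IP, destination port, protocol, payload); the names `flow_key`, `thread_for_flow`, and `partition` are hypothetical, and the CRC32 hash is an arbitrary choice made for determinism.

```python
import zlib
from collections import defaultdict


def flow_key(src_ip, src_port, dst_ip, dst_port, proto):
    """Build a direction-independent key so both halves of a
    connection map to the same flow (an assumption of this sketch)."""
    lo, hi = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    return (proto, lo, hi)


def thread_for_flow(key, n_threads):
    """Pick a worker index for a flow with a deterministic hash."""
    digest = zlib.crc32(repr(key).encode("utf-8"))
    return digest % n_threads


def partition(packets, n_threads):
    """Assign each packet to a per-thread queue by flow.

    Packets of one flow always land in the same queue and keep
    their arrival order, so a single flow is inspected in order
    by one thread while different flows can proceed in parallel.
    """
    queues = defaultdict(list)
    for pkt in packets:
        key = flow_key(*pkt[:5])
        queues[thread_for_flow(key, n_threads)].append(pkt)
    return queues


packets = [
    ("10.0.0.1", 1234, "10.0.0.2", 80, "tcp", b"a"),
    ("10.0.0.2", 80, "10.0.0.1", 1234, "tcp", b"b"),
    ("10.0.0.3", 5555, "10.0.0.2", 80, "tcp", b"c"),
]
queues = partition(packets, n_threads=4)
```

In a scheme like this, a trace dominated by one or two flows fills only one or two queues and leaves the other threads idle, which is the limitation that motivates the optimistic version.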

Date of this Version

January 2007