Title: Fast and Memory-Efficient Regular Expression Matching for Deep Packet Inspection
1Fast and Memory-Efficient Regular Expression
Matching for Deep Packet Inspection
- Fang Yu
- Microsoft Research, Silicon Valley
- Work was done at UC Berkeley, jointly with
  - Zhifeng Chen (Google Inc.)
  - Yanlei Diao (UMass Amherst)
  - T. V. Lakshman (Bell Labs)
  - Randy H. Katz (UC Berkeley)
2Regular Expressions
- Flexible way to describe pattern
- Example for detecting yahoo messenger traffic
- ^(ymsg|ypns|yhoo).?.?.?.?.?.?.?[lwt].*\xc0\x80
- Used in many payload scanning applications
- L7-filter protocol identifiers
- Bro intrusion patterns
- SNORT
- No regular expression in April 2003
- 1131 out of 4867 intrusion rules contain regular
expressions as of Jan 2006
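As a rough illustration of how such a signature is applied to a packet payload, the sketch below uses Python's re module on bytes. It assumes the Yahoo Messenger signature reads ^(ymsg|ypns|yhoo).?.?.?.?.?.?.?[lwt].*\xc0\x80. The helper name and both payloads are made up for the example and are not real traffic.

```python
import re

# Assumed form of the Yahoo Messenger signature (see the lead-in above).
YAHOO = re.compile(rb"^(ymsg|ypns|yhoo).?.?.?.?.?.?.?[lwt].*\xc0\x80")


def looks_like_yahoo(payload: bytes) -> bool:
    """Return True if the payload matches the assumed signature."""
    return YAHOO.search(payload) is not None


# Hypothetical payloads, invented for illustration only.
yahoo_like = b"ymsg" + bytes(7) + b"w" + b"\x00\x00" + b"\xc0\x80"
http_like = b"GET /index.html HTTP/1.1\r\n"

print(looks_like_yahoo(yahoo_like))
print(looks_like_yahoo(http_like))
```

The pattern combines an anchored alternation, optional wildcards, a character class, and a trailing .* in one expression, which is the kind of mix the talk is concerned with.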
3Challenges
- Features specific to packet scanning applications
- Large set of patterns, order of 100s or 1000s
Metric | Snort | L7-filter | XML filtering
# of regular expressions analyzed | 1555 | 70 | 1,000-100,000
% of patterns with wildcards (.*, +, ?) | 74.9 | 75.7 | 50-100
Average # of wildcards per pattern | 4.7 | 7.0 | 1-2
% of patterns with class [] | 31.6 | 52.8 | 0
Average # of classes per pattern | 8.0 | 4.8 | 0
% of patterns with length restrictions on classes or wildcards | 56.3 | 21.4 | ~0
4Design Space
Automata-based Approaches
- NFA-based
  - A group of states can be activated simultaneously
  - High percentage of wildcards, so NFA-based approaches can be slow, sometimes less than 1 Mb/s
- DFA-based
  - Only one state is activated
  - Repeated Scan
    - Start scanning from one position; if no match, start again at the next position
    - Good for parsers
    - Packets may not contain any patterns
    - No guarantee of high speed
  - One Pass Scan (Space Problem)
    - Scan the input only once
    - Fast and deterministic throughput
    - Add .* before patterns
    - Some patterns generate very large DFA
    - m individual DFAs for m patterns: O(m) processing complexity for each input character
    - One composite DFA for m patterns: O(1) processing complexity for each input character
    - Patterns (AB)C and (AD)E
Contributions
- Selectively group patterns into k groups (e.g., k = 3)
  - Avoid exponential memory growth
  - Further speed up matching process
- Rewrite techniques to reduce memory usage
  - Make DFA-based approach feasible
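The O(m) versus O(1) per-character cost can be made concrete with a small sketch. It is an illustration only, not the scanner from the talk: the patterns are plain literal strings with an implicit leading .*, and all function names are hypothetical. Each literal gets its own table-driven DFA, and a product construction merges them into one composite DFA.

```python
def build_literal_dfa(word, alphabet):
    """Transition table for .*word; state s means 's characters of word matched'."""
    table = []
    for state in range(len(word) + 1):
        row = {}
        for ch in alphabet:
            seen = word[:state] + ch
            k = min(len(seen), len(word))
            # Fall back to the longest suffix of `seen` that is a prefix of word.
            while k > 0 and not seen.endswith(word[:k]):
                k -= 1
            row[ch] = k
        table.append(row)
    return table


def scan_individual(words, alphabet, data):
    """m separate DFAs: every input character costs m table lookups."""
    dfas = [build_literal_dfa(w, alphabet) for w in words]
    states = [0] * len(words)
    lookups, hits = 0, []
    for pos, ch in enumerate(data):
        for i, dfa in enumerate(dfas):
            states[i] = dfa[states[i]][ch]
            lookups += 1
            if states[i] == len(words[i]):
                hits.append((pos, words[i]))
    return hits, lookups


def build_composite(words, alphabet):
    """Product construction: composite states are tuples of per-pattern states."""
    dfas = [build_literal_dfa(w, alphabet) for w in words]
    start = tuple(0 for _ in words)
    index, order = {start: 0}, [start]
    table, accepts = [], []
    head = 0
    while head < len(order):
        combo = order[head]
        head += 1
        row = {}
        for ch in alphabet:
            nxt = tuple(dfa[s][ch] for dfa, s in zip(dfas, combo))
            if nxt not in index:
                index[nxt] = len(order)
                order.append(nxt)
            row[ch] = index[nxt]
        table.append(row)
        accepts.append([w for w, s in zip(words, combo) if s == len(w)])
    return table, accepts


def scan_composite(words, alphabet, data):
    """One composite DFA: every input character costs a single table lookup."""
    table, accepts = build_composite(words, alphabet)
    state, lookups, hits = 0, 0, []
    for pos, ch in enumerate(data):
        state = table[state][ch]
        lookups += 1
        hits.extend((pos, w) for w in accepts[state])
    return hits, lookups


words, alphabet, data = ["AB", "BC", "CA"], "ABC", "ABCAB"
print(scan_individual(words, alphabet, data))
print(scan_composite(words, alphabet, data))
```

Both scans report the same match positions; only the number of table lookups per character differs. Literal strings keep the composite small, which is not true of the wildcard-heavy patterns discussed later.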
5DFA Sizes of Regular Expressions
- Typical patterns in network payload scanning applications
Pattern features | Example | # of states | % of patterns | Average # of states
1) Explicit strings with k characters | ^ABCD, .*ABCD | k+1 | 25.1 | 23.63
2) Wildcards | ^AB.*CD, .*AB.*CD | k+1 | 18.82 | 27.20
3) Patterns with ^, a wildcard, and a length restriction j | ^AB.{j+}CD, ^AB.{0,j}CD, ^AB.{j}CD | O(k*j) | 44.7 | 180.31
4) Patterns with ^, a class of characters that overlaps with the prefix, and a length restriction j | ^A+[A-Z]{j}D | O(k+j^2), j = 370 | 5.11 | 136903
5) Patterns with a length restriction j, where a wildcard or a class of characters overlaps with the prefix | .*AB.{j}CD, .*A[A-Z]{j+}D | O(k+2^j), j = 344 | 6.27 | >2^214
Row 4 is handled by Rewrite Rule 1; row 5 by Rewrite Rule 2, the focus of this talk.
6Design Considerations
- Completeness of matching results for one pattern
- Complete matching
  - Report all the possible matching substrings
  - E.g., for a pattern ab* and an input abbb, there are four possible matches: a, ab, abb, and abbb
- Non-overlapping matching
  - Common practice: left-most longest match or shortest match results
- In most payload scanning applications, reporting the non-overlapping matching result for each pattern is sufficient
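The difference between the two notions of completeness can be sketched in a few lines of Python, assuming the pattern ab* and the input abbb. The helper names are hypothetical. Python's greedy and lazy quantifiers happen to give the longest and shortest match for this particular pattern; that does not hold for every expression.

```python
import re


def complete_matches(pattern, text):
    """All substrings of text that match pattern (complete matching)."""
    compiled = re.compile(pattern)
    return [
        text[i:j]
        for i in range(len(text))
        for j in range(i + 1, len(text) + 1)
        if compiled.fullmatch(text[i:j])
    ]


def non_overlapping_matches(pattern, text):
    """Matches found by a single left-to-right scan without overlaps."""
    return [m.group(0) for m in re.finditer(pattern, text)]


print(complete_matches(r"ab*", "abbb"))          # every matching substring
print(non_overlapping_matches(r"ab*", "abbb"))   # greedy: longest match
print(non_overlapping_matches(r"ab*?", "abbb"))  # lazy: shortest match
```

Either non-overlapping result is enough to decide whether the pattern occurs in the payload, which is all a packet scanner needs.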
7Patterns with Exponential DFA Sizes
- Often used for detecting buffer overflow attempts, e.g., .*AUTH\s[^\n]{100}
- DFA needs to remember all the possible AUTH\s occurrences
  - A second AUTH\s can either match [^\n]{100} or be counted as a new match of the start of the pattern AUTH\s
- Generates a DFA of >100,000 states
- Can't be efficiently processed by an NFA-based approach either
Input: AUTH\sAUTH\s AUTH\s\s AUTH\s\s\s
NFA for .*AUTH\s[^\n]{100}
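The blow-up can be reproduced on a much smaller stand-in pattern. The sketch below assumes the simplified pattern .*A.{j} over a two-symbol alphabet in place of the AUTH example. It runs a plain subset construction without minimization, counts the dead state, and compares the result against the anchored pattern ^A.{j}. The function name is hypothetical.

```python
def count_dfa_states(j, anchored):
    """Count DFA states for ^A.{j} (anchored) or .*A.{j} via subset construction.

    NFA states: 0 is the start state; state i (1 <= i <= j + 1) means
    'an A was read i characters ago'. State j + 1 is accepting.
    """
    alphabet = "Ax"  # 'x' stands for any character other than A

    def step(subset, ch):
        nxt = set()
        for s in subset:
            if s == 0:
                if not anchored:
                    nxt.add(0)  # the leading .* keeps the start state alive
                if ch == "A":
                    nxt.add(1)
            elif s <= j:
                nxt.add(s + 1)  # each '.' in .{j} accepts any character
        return frozenset(nxt)

    start = frozenset({0})
    seen, frontier = {start}, [start]
    while frontier:
        subset = frontier.pop()
        for ch in alphabet:
            nxt = step(subset, ch)
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return len(seen)


for j in (1, 2, 4, 8):
    print(j, count_dfa_states(j, anchored=True), count_dfa_states(j, anchored=False))
```

With the leading .* the DFA has to track which of the last j + 1 characters were an A, so the state count doubles with every increment of j; the anchored version grows linearly.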
8Rewriting Intuition
- Only the first AUTH\s matters
  - If there is a \n within the next 100 bytes, none of the AUTH\s matches the pattern
  - Otherwise, the first AUTH\s and the following characters have already matched the pattern
- Rewrite the pattern to ([^A]|A[^U]|AU[^T]|AUT[^H]|AUTH[^\s]|AUTH\s[^\n]{0,99}\n)*AUTH\s[^\n]{100}, which generates a DFA of only 106 states
- This rewritten pattern
  - Reports different numbers of matches from the original pattern in identifying complete matches
  - Is equivalent in identifying non-overlapping patterns
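A quick spot check of the rewrite can be run with Python's re module. The sketch assumes the original pattern is .*AUTH\s[^\n]{100} and the rewritten one is ([^A]|A[^U]|AU[^T]|AUT[^H]|AUTH[^\s]|AUTH\s[^\n]{0,99}\n)*AUTH\s[^\n]{100}. It compares where the first match ends on four hand-picked inputs. This is not a proof of equivalence, and Python's backtracking engine is not a DFA; the helper names and inputs are made up.

```python
import re

N = 100  # length restriction from the example

# search() plays the role of the leading .* in the original pattern.
ORIGINAL = re.compile(r"AUTH\s[^\n]{%d}" % N)

# Assumed form of the rewritten pattern; match() anchors it at offset 0.
REWRITTEN = re.compile(
    r"(?:[^A]|A[^U]|AU[^T]|AUT[^H]|AUTH[^\s]|AUTH\s[^\n]{0,%d}\n)*"
    r"AUTH\s[^\n]{%d}" % (N - 1, N)
)


def original_end(data):
    """End offset of the first match of the original pattern, or None."""
    m = ORIGINAL.search(data)
    return m.end() if m else None


def rewritten_end(data):
    """End offset of the match of the rewritten pattern, or None."""
    m = REWRITTEN.match(data)
    return m.end() if m else None


# Hypothetical inputs chosen to exercise the cases named on the slide.
SAMPLES = [
    "AUTH " + "x" * 100,                                   # immediate match
    "AUTH " + "x" * 50 + "\n" + "x" * 100,                 # newline too early
    "AUTH AUTH " + "x" * 20 + "\n" + "AUTH " + "y" * 100,  # only third AUTH matches
    "x" * 10 + "AUTH " + "z" * 120,                        # match after a prefix
]

for sample in SAMPLES:
    print(original_end(sample), rewritten_end(sample))
```

On these inputs both patterns agree on whether a match exists and where the first one ends, which is the non-overlapping result the scanner reports.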
9Rewriting Effect on the SNORT Rule Set
Pattern features | Example | # of states | % of patterns | Average # of states
1) Explicit strings with k characters | ^ABCD, .*ABCD | k+1 | 25.1 | 23.63
2) Wildcards | ^AB.*CD, .*AB.*CD | k+1 | 18.82 | 27.20
3) Patterns with ^, a wildcard, and a length restriction j | ^AB.{j+}CD, ^AB.{0,j}CD, ^AB.{j}CD | O(k*j) | 44.7 | 180.31
4) Patterns with ^, a class of characters that overlaps with the prefix, and a length restriction j | ^A+[A-Z]{j}D | O(k+j^2) -> O(k+j) after rewriting | 5.11 | 136903
5) Patterns with a length restriction j, where a wildcard or a class of characters overlaps with the prefix | .*AB.{j}CD, .*A[A-Z]{j+}D | O(k+2^j) -> O(k+j) after rewriting | 6.27 | >2^214
10Rewriting Effect on the SNORT Rule Set
- Created scripts to automatically rewrite patterns
- After rewriting, patterns in SNORT and Bro can be
compiled into DFAs
Type of Rewrite | Rule Set | Number of Patterns | Average Length Restriction | DFA Reduction Rate
Rewrite Rule for Quadratic Case | Snort | 17 | 370 | >98%
Rewrite Rule for Quadratic Case | Bro | 0 | 0 | 0
Rewrite Rule for Exponential Case | Snort | 19 | 344 | >99%
Rewrite Rule for Exponential Case | Bro | 49 | 214.4 | >99%
11Design Choices
Automata-based Approaches
- NFA-based
- DFA-based
  - Repeated Scan
  - One Pass Scan
    - m individual DFAs for m patterns: O(m) processing complexity for each input character
    - One composite DFA for m patterns: O(1) processing complexity for each input character
Contributions
- Selectively group patterns into k groups (e.g., k = 3)
  - Further speed up matching process
  - Avoid exponential memory growth
- Rewrite techniques to reduce memory usage
  - Make DFA-based approach feasible
12State Explosion Problem
- Randomly adding patterns from the L7-filter into one DFA
13Interactions of Regular Expressions
- Some patterns generate DFA of exponential sizes
- E.g., a DFA for patterns .*AB.*CD and .*EF.*GH
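The interaction can be observed by counting DFA states directly. The sketch below assumes the two patterns are .*AB.*CD and .*EF.*GH. It builds a small NFA for each, runs an unminimized subset construction, and compares the composite state count with the individual counts. The function names are hypothetical, and because the DFAs are not minimized the absolute numbers are only indicative.

```python
ANY = None  # label for a transition taken on every character


def build_nfa(first, second, base):
    """Edges of an NFA for .*first.*second, with states numbered from base."""
    edges = {}
    state = base
    for segment in (first, second):
        edges.setdefault(state, []).append((ANY, state))  # the .* self-loop
        for ch in segment:
            edges.setdefault(state, []).append((ch, state + 1))
            state += 1
    return edges, base, state  # edges, start state, accepting state


def count_dfa_states(patterns, alphabet):
    """Subset-construct one DFA for all patterns and count reachable states."""
    edges, starts, base = {}, [], 0
    for first, second in patterns:
        part, start, accept = build_nfa(first, second, base)
        edges.update(part)
        starts.append(start)
        base = accept + 1
    start_set = frozenset(starts)
    seen, frontier = {start_set}, [start_set]
    while frontier:
        subset = frontier.pop()
        for ch in alphabet:
            nxt = frozenset(
                target
                for s in subset
                for label, target in edges.get(s, [])
                if label is ANY or label == ch
            )
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return len(seen)


ALPHABET = "ABCDEFGHx"  # 'x' stands for every other character
single_ab = count_dfa_states([("AB", "CD")], ALPHABET)
single_ef = count_dfa_states([("EF", "GH")], ALPHABET)
combined = count_dfa_states([("AB", "CD"), ("EF", "GH")], ALPHABET)
print(single_ab, single_ef, combined)
```

The composite DFA has to remember the progress of both patterns at once (whether AB has been seen and, independently, whether EF has been seen), so its size exceeds the sum of the two individual DFAs.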
14Grouping Algorithms
- Fixed local memory limitation (NPU or multi-core architectures)
  - Compute pair-wise interaction results and form a graph
  - Keep adding patterns to a group until reaching the limit
  - Pick the pattern with the fewest interactions with the new group
- Fixed total memory limitation (general single-core CPU architecture)
  - First compute the DFAs of individual patterns and compute the leftover memory size
  - Distribute the leftover memory evenly among ungrouped expressions
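The fixed-local-memory variant can be sketched as a greedy loop. This is one reading of the bullets above rather than the authors' implementation. The callbacks interacts and composite_size are hypothetical; in a real system they would come from compiling pairs and groups of patterns into DFAs. The toy cost model at the bottom is invented purely to make the example runnable.

```python
def group_patterns(patterns, interacts, composite_size, limit):
    """Greedy grouping: grow a group until its composite DFA would exceed limit."""
    remaining = list(patterns)
    groups = []
    while remaining:
        group = []
        while remaining:
            # Candidate with the fewest interactions with the current group.
            candidate = min(
                remaining, key=lambda p: sum(interacts(p, q) for q in group)
            )
            # A pattern that is too large on its own still gets its own group.
            if group and composite_size(group + [candidate]) > limit:
                break
            group.append(candidate)
            remaining.remove(candidate)
        groups.append(group)
    return groups


# Toy model (assumption): each pattern alone needs 10 states, and every
# interacting pair inside a group doubles the composite size.
PAIRS = {frozenset({"p1", "p2"}), frozenset({"p3", "p4"})}


def interacts(p, q):
    return frozenset({p, q}) in PAIRS


def composite_size(group):
    pairs = sum(
        interacts(p, q) for i, p in enumerate(group) for q in group[i + 1:]
    )
    return 10 * len(group) * 2 ** pairs


print(group_patterns(["p1", "p2", "p3", "p4"], interacts, composite_size, limit=50))
```

In the toy run the interacting pairs end up in different groups, so neither group pays the doubling penalty.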
15Experimental Setup
- Regular expression pattern sets
  - Linux application layer filter (L7-filter): 70 regular expressions
  - Pattern sets from the Bro intrusion detection system
    - HTTP-related patterns: 648 patterns
    - Payload-related patterns: 223 patterns
- Packet traces
  - MIT dump: with viruses and worms
  - Berkeley dump: normal traffic
- Scanners
  - Generated one-pass scanning DFA scanner
  - An NFA-based scanner: Pcregrep
  - A repeated-scanning DFA parser generated by flex
16Grouping Results for Patterns in L7-filter (70
patterns)
Results of grouping algorithms for fixed total memory
Total DFA State Limit | Groups | Compilation Time (s)
3533 | 70 | -
3533 | 12 | 5.602
4000 | 10 | 7.335
6000 | 8 | 13.189
8000 | 6 | 37.098
10000 | 5 | 37.928
16000 | 4 | 41.870
32000 | 3 | 49.976
- 70 groups: no grouping; 3533 states is the sum of the individual DFAs
- 12 groups: no extra memory cost; 70/12 = 5.83 times less processing per character
- 3 groups: 70/3 = 23.3 times less processing per character; 6.83 MB of memory
17Throughput Analysis
- For Linux L7-filter (70 patterns)
- Using PCs with a 3 GHz single-core CPU and 4 GB of memory
18Comparisons to Other Approaches
Pattern set | Scanner | Throughput, MIT dump (Mb/s) | Throughput, Berkeley dump (Mb/s) | Memory Consumption (KB)
Linux L7-filter (70 patterns) | NFA | 0.98 | 3.4 | 1636
Linux L7-filter (70 patterns) | DFA RP | 16.3 | 34.6 | 7632
Linux L7-filter (70 patterns) | DFA OP, 3 groups | 690.8 | 728.3 | 13596
Bro HTTP (648 patterns) | NFA | 30.4 | 56.1 | 1632
Bro HTTP (648 patterns) | DFA RP | 117.2 | 83.2 | 1624
Bro HTTP (648 patterns) | DFA OP, 1 group | 1458 | 1612.8 | 4264
Bro Payload (223 patterns) | NFA | 5.8 | 14.8 | 1632
Bro Payload (223 patterns) | DFA RP | 17.1 | 25.6 | 7628
Bro Payload (223 patterns) | DFA OP, 4 groups | 566.1 | 568.3 | 4312
NFA: Pcregrep. DFA RP: flex-generated DFA-based repeated scan engine. DFA OP: our DFA one pass scanning engine.
- DFA OP
  - Is 48 to 704 times faster than the NFA implementation
  - Is 12 to 42 times faster than the commonly used DFA-based parser
  - Uses 2.6 to 8.4 times the memory
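The quoted speedup ranges can be recomputed from the MIT-dump throughput column of the table. The sketch below only redoes that arithmetic; the dictionary layout and function name are arbitrary choices for the example.

```python
# Throughput on the MIT dump in Mb/s, copied from the comparison table.
MIT_THROUGHPUT = {
    "Linux L7-filter": {"NFA": 0.98, "DFA RP": 16.3, "DFA OP": 690.8},
    "Bro HTTP": {"NFA": 30.4, "DFA RP": 117.2, "DFA OP": 1458.0},
    "Bro Payload": {"NFA": 5.8, "DFA RP": 17.1, "DFA OP": 566.1},
}


def speedups(baseline):
    """Ratio of DFA OP throughput to the given baseline, per pattern set."""
    return {
        name: row["DFA OP"] / row[baseline]
        for name, row in MIT_THROUGHPUT.items()
    }


for baseline in ("NFA", "DFA RP"):
    ratios = speedups(baseline)
    print(baseline, {name: round(r, 1) for name, r in ratios.items()})
    print("  range:", round(min(ratios.values()), 1), "to", round(max(ratios.values()), 1))
```

The smallest and largest ratios against the NFA scanner come from the Bro HTTP and L7-filter sets respectively, and the same two sets bound the range against the flex-generated scanner.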
19Conclusions
- High speed regular expression matching scheme
- Proposed two rewrite rules
  - DFA-based approach is possible with our rewriting rules
  - Can rewrite complicated patterns from our pattern sets
  - In other pattern sets, there may be patterns not covered by our rewriting rules
- Developed grouping algorithm to selectively group patterns together
- Orders of magnitude faster than existing solutions
- Can be applied to FPGA or ASIC based approaches as well