Title: ECE 526
1ECE 526 Network Processing Systems Design
- Network Security: string matching algorithms
- Chapter 17 George Varghese
2Goal
- Gain basic knowledge of how to improve network security from a network processing system design perspective
3Outline
- Signature-based IDSs
- String matching algorithms
- Boyer-Moore
- Aho-Corasick
- Bloom Filter
- Approximated Searching
- Approximated Searching Based on Bloom Filters
- Summary
4Internet Security
- The Internet lacks security
- Example?
- What is Internet Security
- Confidentiality: keeping data private
- Integrity: data protected from modification or destruction
- Availability: data or service accessible
- What are current approaches
- Engineering?
- non-engineering?
- Intrusion Detection Systems (IDSs)
5Intrusion Detection Systems
- Two types of Intrusion Detection Systems (IDSs)
- Signature detection: based on matching events to the signatures of known attacks
- Anomaly detection: based on statistical or learning theory to identify aberrant events
- Three important tasks
- String matching: searching for suspicious strings in packet payloads
- Traceback: detecting an intruder who uses a forged source address
- Detecting the onset of a new worm without prior knowledge
- The problems of current IDSs
- Very slow
- Have a high false-positive rate
- False positive: answering a membership query positively when the element is not in the set
6Snort Rule Example
- Snort
- A lightweight, open-source intrusion detection system
- www.snort.org
- Snort rule example
- alert tcp BAD 80 -> GOOD 90 \
- (content: "perl.exe"; msg: "detected perl.exe";)
- Looks for the string perl.exe in a TCP packet from IP BAD, port 80 to IP GOOD, port 90
- Upon detection, generates an alert with the message "detected perl.exe"
- Question: when a packet arrives, how do we check it?
- Question: what about multiple rules?
- String matching is the bottleneck
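To make the two questions concrete, here is a minimal sketch of checking one packet against a list of rules. The `Rule` and `Packet` records and the `check_packet` function are hypothetical names used only for illustration; they are not Snort's actual data structures, and BAD/GOOD stand in for real addresses.

```python
from dataclasses import dataclass

@dataclass
class Rule:                      # hypothetical record, not Snort's format
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    content: bytes
    msg: str

@dataclass
class Packet:                    # hypothetical record
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    payload: bytes

def check_packet(packet, rules):
    """Return the alert messages of every rule the packet matches."""
    alerts = []
    for rule in rules:
        header_ok = (
            (packet.src_ip, packet.src_port, packet.dst_ip, packet.dst_port)
            == (rule.src_ip, rule.src_port, rule.dst_ip, rule.dst_port)
        )
        # The payload search is the expensive part: one scan per rule.
        if header_ok and rule.content in packet.payload:
            alerts.append(rule.msg)
    return alerts

rules = [Rule("BAD", 80, "GOOD", 90, b"perl.exe", "detected perl.exe")]
pkt = Packet("BAD", 80, "GOOD", 90, b"GET /cgi-bin/perl.exe HTTP/1.0")
print(check_packet(pkt, rules))  # ['detected perl.exe']
```

With many rules, the payload is rescanned once per rule, which is why string matching dominates the cost.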
7String Searching brute force
- Arbitrary string can be anywhere in the packet
- Naive approach
- Input: string of size m, packet of size n (assuming n > m)
- For i = 0 to n-m do
- For j = 0 to m-1 do
- Compare string[j] with packet[i+j]
- If not equal, exit the inner loop
- Complexity
- worst case O(mn)
- Best case O(n)
- Can we do better?
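The naive loop above can be sketched in Python as follows. The function name `brute_force_search` is an illustrative choice, and returning every match position (rather than stopping at the first) is an assumption.

```python
def brute_force_search(pattern, packet):
    """Return every offset i where pattern occurs in packet (naive scan)."""
    m, n = len(pattern), len(packet)
    matches = []
    for i in range(n - m + 1):          # i = 0 .. n-m
        for j in range(m):              # j = 0 .. m-1
            if pattern[j] != packet[i + j]:
                break                   # mismatch: exit the inner loop
        else:
            matches.append(i)           # inner loop finished: full match
    return matches

print(brute_force_search(b"perl.exe", b"xxperl.exexx"))  # [2]
```

The worst case O(mn) shows up on inputs such as pattern `aaab` against a packet of all `a`, where every inner loop runs almost to the end before failing.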
8Boyer-Moore example
- Improves on brute force by skipping over a larger number of characters and by comparing the last character first
- How to build the skip table?
9Boyer Moore skip table
- How far to skip when the last character does not match.
- For example
- pattern: CAB
- Last: A B C D E
- Skip: 1 - 2 3 3
- Care is needed with repeated letters
- For example
- pattern: ABBA
- Last: A B C D E
- Skip: - 1 4 4 4
- ("-" marks the pattern's own last character: no skip, compare the rest)
- Skip[c] = distance of the last occurrence of c from the end of the pattern
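A short sketch of building the skip table from the definition above. The names `build_skip_table` and `skip_for` are illustrative. Characters that do not occur in the pattern are assumed to skip the full pattern length m, and the pattern's own last character gets distance 0, which is the "match" case.

```python
def build_skip_table(pattern):
    """Map each character to the distance of its last occurrence from the end."""
    m = len(pattern)
    skip = {}
    for idx, ch in enumerate(pattern):
        skip[ch] = m - 1 - idx      # later occurrences overwrite earlier ones
    return skip

def skip_for(skip, ch, m):
    """Characters absent from the pattern skip the whole pattern length."""
    return skip.get(ch, m)

table = build_skip_table("CAB")
print([skip_for(table, c, 3) for c in "ABCDE"])   # [1, 0, 2, 3, 3]

table = build_skip_table("ABBA")
print([skip_for(table, c, 4) for c in "ABCDE"])   # [0, 1, 4, 4, 4]
```

Overwriting in left-to-right order is what handles repeated letters: for ABBA the entry for B ends up as 1 (the rightmost B), not 2.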
10Boyer Moore algorithm
- Input: pattern of size m, packet of size n
- i = 0
- While i <= n-m do
- If pattern[m-1] = packet[i+m-1] then //last character first
- For j = 0 to m-1 do
- Compare pattern[j] with packet[i+j] //one by one sequentially
- i = i+1
- Else i = i + skip[packet[i+m-1]] //skip
- Complexity
- best case O(n/m)
- worst case still O(nm)
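The loop above can be sketched as follows. This is the simplified last-character variant shown on the slide, not the full Boyer-Moore algorithm with its additional shift rules. The function name and the choice to return all match offsets are assumptions.

```python
def boyer_moore_search(pattern, packet):
    """Simplified Boyer-Moore: test the last character first, skip on mismatch."""
    m, n = len(pattern), len(packet)
    if m == 0 or m > n:
        return []
    # Distance of each byte's last occurrence from the end of the pattern.
    skip = {byte: m - 1 - idx for idx, byte in enumerate(pattern)}
    matches = []
    i = 0
    while i <= n - m:
        last = packet[i + m - 1]
        if last == pattern[m - 1]:          # last character first
            if packet[i:i + m] == pattern:  # then compare the whole window
                matches.append(i)
            i += 1
        else:
            i += skip.get(last, m)          # absent bytes skip m positions
    return matches

print(boyer_moore_search(b"CAB", b"DDCABDCAB"))  # [2, 6]
```

On a mismatch the looked-up skip is always at least 1, because a byte that differs from the pattern's last byte cannot have its last occurrence at the final position, so the loop always advances.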
11Aho-Corasick
- Failure pointer
- Prevents restarting at the top of the trie when a failure occurs
- New attempt made by shifting
- How about multiple strings?
- Example string: BABAR
12Multiple String Trie Construction
Example: P = {he, she, his, hers}
13Aho-Corasick Searching
- Scanning the input stream only once
- Complexity: linear time
[Figure: matching strings against the input stream "h x h e r s"]
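A compact sketch of the trie with failure pointers and the single-pass search, using the example set {he, she, his, hers}. The function names and the list-of-dicts representation are illustrative choices, not something prescribed by the slides.

```python
from collections import deque

def build_automaton(patterns):
    """Build trie transitions, failure pointers, and per-state outputs."""
    goto, fail, out = [{}], [0], [[]]
    for pat in patterns:
        state = 0
        for ch in pat:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(pat)
    # Breadth-first pass: depth-1 states keep failure pointer 0 (the root).
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] = out[nxt] + out[fail[nxt]]  # inherit shorter matches
    return goto, fail, out

def search(text, automaton):
    """Scan text once and report (start offset, pattern) for every match."""
    goto, fail, out = automaton
    state, matches = 0, []
    for pos, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]               # follow failure pointers
        state = goto[state].get(ch, 0)
        for pat in out[state]:
            matches.append((pos - len(pat) + 1, pat))
    return matches

ac = build_automaton(["he", "she", "his", "hers"])
print(search("hxhers", ac))   # [(2, 'he'), (2, 'hers')]
print(search("ushers", ac))   # [(1, 'she'), (2, 'he'), (2, 'hers')]
```

The dictionary per state keeps this sketch small. A hardware-oriented design with a full 256-entry pointer array per node trades that memory for constant-time transitions.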
14Aho-Corasick summary
- Pros
- Computation complexity worst case O(n)
- Can scan once and output all matches
- Cons
- Constructing a finite state machine
- Failure pointers needed
- Too big to be on chip
- Each node has maximum 256 pointers
15Hashing
- One efficient set membership query mechanism
- Programming: trivial
- Query complexity: O(n) best case (n: size of packet)
- Query accuracy: possible false positive
- However, to handle collisions
- Each hash entry contains a list of IDs of all elements that share the hash value
- Storage: minimal requirement O(nw); n: number of elements, w: minimal width of each element
- Question: can we trade accuracy for storage requirement using the hashing idea?
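The collision-list idea can be sketched as a small chained hash table. This sketch assumes, for simplicity, that all signatures have the same length `sig_len`; the function names and the bucket count are illustrative.

```python
def build_hash_table(signatures, num_buckets=64):
    """Each bucket holds the list of signatures sharing that hash value."""
    table = [[] for _ in range(num_buckets)]
    for sig in signatures:
        table[hash(sig) % num_buckets].append(sig)
    return table

def scan_packet(packet, table, sig_len):
    """Hash every sig_len-byte window and confirm against the bucket list."""
    hits = []
    for i in range(len(packet) - sig_len + 1):
        window = packet[i:i + sig_len]
        if window in table[hash(window) % len(table)]:
            hits.append((i, window))
    return hits

table = build_hash_table([b"perl.exe", b"cmd.exe!"])
print(scan_packet(b"GET /perl.exe", table, 8))   # [(5, b'perl.exe')]
```

Keeping the full elements in each bucket lets a lookup confirm a match exactly, which is where the O(nw) storage comes from. Dropping the stored elements saves memory but turns every collision into a possible false positive.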
16Bloom Filter
- Data structure proposed by Burton Bloom
- Randomized data structure
- Strings stored using multiple hash functions (programming)
- Check string presence based on multiple bits (querying)
- Membership queries can result in false positives
- Powerful tools for
- Content networks
- Route traceback
- Network measurements
- Intrusion Detection
17Bloom Filter Programming
- Instead of using one hash function, use k independent hash functions
- Instead of requiring nw bits of storage, an m-bit vector is required
- Initially all bits are cleared
- Programming: set one bit per hash function
- A bit remains set if two elements hash to the same position
18Bloom Filter Querying
- Procedure
- String x is hashed by the k hash functions
- Each hash function pinpoints one bit in the m-bit vector
- The values of those k bits are ANDed
- If the result is 0,
- x is not a member
- else
- x is reported as a member (possibly a false positive)
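Programming and querying can be sketched together. The class below is a minimal illustration: deriving the k hash functions from SHA-256 with a one-byte seed prefix is an assumption made for the sketch (it requires k < 256), not something the slides specify.

```python
import hashlib

class BloomFilter:
    def __init__(self, m, k):
        self.m, self.k = m, k
        self.bits = [0] * m              # initially all bits are cleared

    def _positions(self, item):
        # Assumed hash family: SHA-256 of (seed byte + item), reduced mod m.
        for seed in range(self.k):
            digest = hashlib.sha256(bytes([seed]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item):
        """Programming: set the bit chosen by each hash function."""
        for pos in self._positions(item):
            self.bits[pos] = 1

    def query(self, item):
        """Querying: AND the k bits; 0 means definitely not a member."""
        return all(self.bits[pos] for pos in self._positions(item))

bf = BloomFilter(m=1024, k=4)
bf.add(b"perl.exe")
print(bf.query(b"perl.exe"))   # True
```

A `True` answer only means "possibly a member", so a practical detector would confirm a hit with an exact comparison before raising an alert.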
19Bloom Filter false positive rate
- n: number of strings to be stored
- k: number of hash functions
- m: the size of the bit array
- The false positive probability (with the optimal k)
- $f = (1/2)^k$
- Optimal number of hash functions k
- $k = \ln 2 \cdot (m/n) \approx 0.693\, m/n$
- False positive rate decreases exponentially with the number of hash functions, and hence with the memory per stored string (m/n)
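A small worked calculation may help here. It uses the standard approximation $f \approx (1 - e^{-kn/m})^k$ for a general k, which is not stated on the slide but reduces to $(1/2)^k$ at the optimal k. The function names and the sizes n = 1000, m = 10000 are illustrative.

```python
import math

def optimal_k(n, m):
    """Optimal number of hash functions: ln 2 * (m / n)."""
    return math.log(2) * m / n

def false_positive_rate(n, m, k):
    """Standard approximation (1 - e^(-k n / m))^k for a Bloom filter."""
    return (1.0 - math.exp(-k * n / m)) ** k

n, m = 1000, 10000                       # 10 bits of memory per stored string
k_opt = optimal_k(n, m)
print(round(k_opt, 2))                   # 6.93
print(false_positive_rate(n, m, 7))      # roughly 0.008 with k rounded to 7
print(0.5 ** k_opt)                      # (1/2)^k at the optimal k, also about 0.008
```

Doubling m/n doubles the optimal k and therefore squares the false positive rate, which is the exponential decrease the slide refers to.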
20Counting Bloom Filters
- Member deletion
- Deletion of a member requires clearing all the related bits
- A bit once set in the bit vector cannot be deleted easily
- The bit can be set by multiple members
- Solution
- Assuming member deletion is a rare case
- Counting Bloom filter
- Updating a counter when an element is added or deleted
- Bit reset in the m-bit vector when the counter value is 0
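A minimal sketch of the counting variant. Here each position holds a counter, and "bit set" is read as "counter greater than 0". The class name and the SHA-256-based hash family are assumptions made for illustration.

```python
import hashlib

class CountingBloomFilter:
    def __init__(self, m, k):
        self.m, self.k = m, k
        self.counters = [0] * m

    def _positions(self, item):
        # Assumed hash family: SHA-256 of (seed byte + item), reduced mod m.
        for seed in range(self.k):
            digest = hashlib.sha256(bytes([seed]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item):
        for pos in self._positions(item):
            self.counters[pos] += 1

    def remove(self, item):
        # Only meaningful for items that were previously added.
        for pos in self._positions(item):
            if self.counters[pos] > 0:
                self.counters[pos] -= 1

    def query(self, item):
        # A position counts as "set" while its counter is above 0.
        return all(self.counters[pos] > 0 for pos in self._positions(item))

cbf = CountingBloomFilter(m=1024, k=4)
cbf.add(b"perl.exe")
cbf.add(b"cmd.exe")
cbf.remove(b"perl.exe")
print(cbf.query(b"cmd.exe"))   # True: shared positions are not cleared
```

Removing an item that was never added would corrupt the counts of other members, so callers are assumed to delete only known members.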
21Approximate String Searching
22Approximate String Searching
Source: John W. Lockwood et al., "Deep Packet Inspection Using Parallel Bloom Filters"
23Summary
Algorithm | Idea | Computation | Storage | Problem
Brute Force | Naïve | O(mn) | - | Slow
Boyer-Moore | Skip | O(mn) worst, O(n/m) best | 0.1 MB (10K rules) | Shift table needed
Aho-Corasick | Trie | O(n) worst case | 50 MB (1500 rules) | Storage demanding
Bloom Filter | Approximate searching | O(n) | 0.1 MB (10K rules) | False positive
24For Next Class
- Read Comer chapters 6 and 9
- Final Project (Option 1)
- Project group finalized
- 9/19/07: group leader emails me your group members.
- Each group: no more than 3 members.
- Project topic finalized.
- 9/28/07: group leader emails me your topic.
- Paper presentation + Final exam (Option 2)
- 9/19/07: group leader emails me your group members.
- Each group: no more than 2 members.
- Based on one or two assigned papers (< 20 min)