Title: Fast Firewall Implementation for Software and Hardware-based Routers
1Fast Firewall Implementation for Software and
Hardware-based Routers
- Lili Qiu, Microsoft Research
- George Varghese, UCSD
- Subhash Suri, UCSB
- 9th International Conference on Network Protocols
- Riverside, CA, November 2001
2Outline
- Motivation for packet classification
- Performance metrics
- Related work
- Our approaches
- Performance results
- Summary
3Motivation
- Traditionally, routers forward packets based on the destination field only
- Firewall and diff-serv require packet classification
- forward packets based on multiple fields in the packet header
- e.g. source IP address, destination IP address, source port, destination port, protocol, type of service (ToS)
4Problem Specification
- Given a set of filters (or rules), find the least-cost matching filter for each incoming packet
- Each filter specifies
- Some criterion on K fields
- Associated directive
- Cost
- Example
Rule 1: 24.128.0.0/16 4.0.0.0/8 udp deny
Rule 2: 64.248.128.0/20 8.16.192.0/24 tcp permit
Rule N: 24.128.0.0/16 4.16.128.0/20 any permit
Incoming packet: 24.128.34.8, 4.16.128.3, udp
Answer: rule 1
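The problem statement can be made concrete with a brute-force classifier. This is a minimal sketch, not the scheme proposed in the talk: it checks every rule in turn and keeps the least-cost match. The `Rule` type, the `classify` function and the cost values (taken here to be the rule's position in the list, with rule N arbitrarily given cost 3) are assumptions made for illustration.

```python
import ipaddress
from typing import NamedTuple, Optional


class Rule(NamedTuple):
    name: str
    src: str      # source prefix in CIDR notation
    dst: str      # destination prefix in CIDR notation
    proto: str    # "tcp", "udp", or "any"
    action: str   # associated directive
    cost: int     # assumed: position of the rule in the list


def matches(rule: Rule, src: str, dst: str, proto: str) -> bool:
    """True if the packet satisfies the rule's criterion on all three fields."""
    return (
        ipaddress.ip_address(src) in ipaddress.ip_network(rule.src)
        and ipaddress.ip_address(dst) in ipaddress.ip_network(rule.dst)
        and rule.proto in ("any", proto)
    )


def classify(rules, src: str, dst: str, proto: str) -> Optional[Rule]:
    """Linear scan: return the least-cost matching rule, or None."""
    candidates = [r for r in rules if matches(r, src, dst, proto)]
    return min(candidates, key=lambda r: r.cost, default=None)


RULES = [
    Rule("rule 1", "24.128.0.0/16", "4.0.0.0/8", "udp", "deny", 1),
    Rule("rule 2", "64.248.128.0/20", "8.16.192.0/24", "tcp", "permit", 2),
    Rule("rule N", "24.128.0.0/16", "4.16.128.0/20", "any", "permit", 3),
]

answer = classify(RULES, "24.128.34.8", "4.16.128.3", "udp")
```

The example packet matches both rule 1 and rule N; rule 1 wins because its cost is lower. The linear scan costs O(N) per packet, which is what the schemes in the rest of the talk try to avoid.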
5Performance Metrics
- Classification speed
- Wire rate lookup for minimum size (40 byte) packets at OC192 (10 Gbps) speeds
- Memory usage
- Should use memory linear in the number of rules
- Update time
- Slow updates are acceptable
- Impact on search speed should be minimal
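The classification speed target translates into a per-packet time budget. The arithmetic below is a rough sketch that assumes back-to-back 40-byte packets on a 10 Gbps link and ignores any framing overhead; the function name is hypothetical.

```python
def per_packet_budget_ns(packet_bytes: int, link_bps: float) -> float:
    """Time available to classify one packet at wire rate, in nanoseconds."""
    return packet_bytes * 8 / link_bps * 1e9


budget_ns = per_packet_budget_ns(40, 10e9)  # 320 bits at 10 Gbps
rate_mpps = 1e3 / budget_ns                 # packets per second, in millions
```

Under these assumptions the budget is about 32 ns per packet, or roughly 31 million lookups per second, which leaves room for only a handful of memory accesses per packet.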
6Related Work
- Given N rules in K dimensions, the worst-case bounds are
- O(log N) search time, O(N^(K-1)) memory
- O(N) memory, O((log N)^(K-1)) search time
- Tree based
- Grid-of-tries (Srinivasan et al., Sigcomm98)
- Fat Inverted Segment Tree (Feldmann et al., Infocom00)
- Cross-producting (Srinivasan et al., Sigcomm98)
- Bit vector scheme
- Lucent bit vector (Lakshman et al., Sigcomm98)
- Aggregated bit vector scheme (Baboescu et al., Sigcomm01)
- RFC (Gupta et al., Sigcomm99)
- Tuple Space Search (Srinivasan et al., Sigcomm99)
7Backtracking Search
- A trie is a binary branching tree, with each branch labeled 0 or 1
- The prefix associated with a node is the concatenation of all the bits from the root to the node
[Figure: trie with nodes A to E storing the filters F1 = 00 and F2 = 10]
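The trie definition above can be sketched in a few lines. This is an illustrative one-dimensional trie, not code from the talk; the names `TrieNode`, `insert` and `longest_match` are assumptions. It stores the two filters from the slide, F1 = 00 and F2 = 10.

```python
class TrieNode:
    def __init__(self):
        self.children = {}   # maps "0" or "1" to a child TrieNode
        self.filter = None   # name of the filter stored at this node, if any


def insert(root, prefix, name):
    """Walk (and create) one branch per bit of the prefix, then store the filter."""
    node = root
    for bit in prefix:
        node = node.children.setdefault(bit, TrieNode())
    node.filter = name


def longest_match(root, bits):
    """Follow the key's bits from the root and remember the last filter seen."""
    node, best = root, root.filter
    for bit in bits:
        node = node.children.get(bit)
        if node is None:
            break
        if node.filter is not None:
            best = node.filter
    return best


root = TrieNode()
insert(root, "00", "F1")
insert(root, "10", "F2")
```

A key such as 0011 follows the branches 0, 0 and stops at the node holding F1; a key starting with 01 or 11 leaves the trie without reaching a filter.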
8Backtracking Search (Cont.)
- Extend to multiple dimensions
- Standard backtracking
- Depth-first traversal of the tree visiting all the nodes satisfying the given constraints
- Example: search for 00,0,0. Result: F8
- Reason for backtrack
- 00 matches *, 0*, 00*
[Figure: multi-dimensional backtracking trie with nodes A to K and filters F1 to F8]
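A minimal sketch of standard backtracking over a multi-dimensional trie follows. It assumes a trie-of-tries layout in which the node that ends a prefix in one dimension points to the root of a trie for the next dimension; the three filters are made up for illustration and are not the ones in the slide's figure. Every node on the search path in a dimension is a prefix of the packet field, so each of them may lead to a matching filter and has to be explored.

```python
class Node:
    def __init__(self):
        self.children = {}     # "0"/"1" -> Node in the same dimension
        self.next_dim = None   # root of the trie for the next field
        self.filters = []      # (cost, name) pairs, used in the last dimension


def insert(root, prefixes, cost, name):
    """Insert a filter given one bit-string prefix per dimension ("" is a wildcard)."""
    node = root
    for d, prefix in enumerate(prefixes):
        for bit in prefix:
            node = node.children.setdefault(bit, Node())
        if d < len(prefixes) - 1:
            if node.next_dim is None:
                node.next_dim = Node()
            node = node.next_dim
    node.filters.append((cost, name))


def search(node, fields, stats):
    """Depth-first search with backtracking; returns the least-cost (cost, name)."""
    best = None
    bits, rest = fields[0], fields[1:]
    depth = 0
    while node is not None:
        stats["visited"] += 1
        if rest:
            # This prefix matches the field, so descend into the next dimension,
            # then come back and continue with longer prefixes.
            found = search(node.next_dim, rest, stats) if node.next_dim is not None else None
        else:
            found = min(node.filters, default=None)
        if found is not None and (best is None or found < best):
            best = found
        if depth == len(bits):
            break
        node = node.children.get(bits[depth])
        depth += 1
    return best


root = Node()
insert(root, ("00", "0", "1"), 1, "F1")   # hypothetical filters
insert(root, ("0", "", "0"), 2, "F2")
insert(root, ("", "0", "0"), 3, "F3")

stats = {"visited": 0}
result = search(root, ("00", "0", "0"), stats)
```

For the packet 00,0,0 the search finds F3 under the wildcard prefix first, backtracks, and then finds the cheaper F2 under the prefix 0. The `stats` counter records how many nodes were touched, a stand-in for the number of memory accesses.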
9Set Pruning Tries
- Multiplane trie
- Fully specify all search paths so that no backtracking is necessary
- Performance
- O(log N) search time
- O(N^(K-1)) storage
10Set Pruning Tries Conversion
- Converting a backtracking trie to a set pruning
trie is essentially replacing a general filter
with more specific filters
11Set Pruning Tries Example
[Figure: Backtracking Trie and Set Pruning Trie side by side, with nodes A to F and leaves labeled F1, F2, F3, Min(F1,F2) and Min(F2,F3)]
Replace [*,*,*] with [0,0,*], [0,0,0], [0,1,*], [1,0,*], [1,1,*], and [1,1,1].
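The conversion can be sketched as follows. For brevity this sketch uses nested prefix tables instead of real tries, and the three filters are hypothetical; what it keeps is the set pruning idea itself: under every prefix, copy in all filters whose prefix is more general, so that a lookup follows the longest matching prefix in each dimension and never backtracks.

```python
def build(filters, dim=0):
    """filters: list of (prefixes, cost, name). Returns nested dicts keyed by prefix."""
    k = len(filters[0][0])
    table = {}
    for p in {f[0][dim] for f in filters}:
        # Replicate every filter whose prefix in this dimension is a prefix of p.
        subset = [f for f in filters if p.startswith(f[0][dim])]
        if dim == k - 1:
            table[p] = min((cost, name) for _, cost, name in subset)
        else:
            table[p] = build(subset, dim + 1)
    return table


def search(table, fields):
    """Follow the longest matching prefix in each dimension; no backtracking."""
    for bits in fields:
        matching = [p for p in table if bits.startswith(p)]
        if not matching:
            return None
        table = table[max(matching, key=len)]
    return table


def size(table):
    """Number of stored leaf entries, a rough proxy for storage."""
    if not isinstance(table, dict):
        return 1
    return sum(size(v) for v in table.values())


FILTERS = [
    (("00", "0", "1"), 1, "F1"),   # hypothetical filters
    (("0", "", "0"), 2, "F2"),
    (("", "0", "0"), 3, "F3"),
]
pruned = build(FILTERS)
```

Each leaf already holds the minimum-cost filter among those replicated into it, which is what labels such as Min(F1,F2) in the figure denote. The price is storage: the three filters here expand into six leaf entries.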
12Performance Evaluation
- 5 real databases from various sites
- Five dimensions
- src IP, dest IP, src port, dest port, protocol
- Performance metrics
- Total storage
- Total number of nodes in the multiplane trie
- Worst-case lookup time
- Total number of memory accesses in the worst case, assuming 1-bit-at-a-time trie traversal
13Performance Results
Database | Rules | Backtracking lookup time | Backtracking storage | Set pruning tries lookup time | Set pruning tries storage
1 | 67 | 146 | 1848 | 86 | 5541
2 | 158 | 153 | 4914 | 102 | 51785
3 | 183 | 169 | 3949 | 102 | 59180
4 | 279 | 202 | 6785 | 102 | 123951
5 | 266 | 208 | 6555 | 102 | 165920
Backtracking has small storage and affordable
lookup time.
14Major Optimizations
- Trie compression algorithm
- Pipelining the search
- Selective pushing
- Using minimal hardware
15Trie Compression Algorithm
- If a path AB satisfies the Compressible Property
- All nodes on its left point to the same place L
- All nodes on its right point to the same place R
- then we compress the entire branch into 3 edges
- Center edge with value = (AB) pointing to B
- Left edge with value < (AB) pointing to L
- Right edge with value > (AB) pointing to R
- Advantages of compression: saves time and storage
[Figure: an uncompressed trie and its compressed form, whose three edges are labeled branch < 01010, branch = 01010 and branch > 01010, with leaves F1, F2 and F3]
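The effect of compressing a path into three edges can be checked with a small sketch. It assumes, as the Compressible Property requires, that every exit to the left of the path leads to one target and every exit to the right leads to another, and that the key has at least as many bits as the path. The function names and the string targets "B", "L", "R" are illustrative only.

```python
def walk_bit_by_bit(key, path, center, left, right):
    """Uncompressed walk: leave the path at the first bit that differs."""
    for key_bit, path_bit in zip(key, path):
        if key_bit != path_bit:
            # A 0 where the path has a 1 exits to the left, otherwise to the right.
            return left if key_bit == "0" else right
    return center


def walk_compressed(key, path, center, left, right):
    """Compressed walk: one range comparison replaces len(path) single-bit steps."""
    value, pivot = int(key[:len(path)], 2), int(path, 2)
    if value == pivot:
        return center
    return left if value < pivot else right


PATH = "01010"
keys = [format(i, "05b") for i in range(32)]
agree = all(
    walk_bit_by_bit(k, PATH, "B", "L", "R") == walk_compressed(k, PATH, "B", "L", "R")
    for k in keys
)
```

Both walks pick the same target for every 5-bit key, because the first differing bit decides whether the key is numerically smaller or larger than the path value. The compressed form needs one step instead of up to five.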
16Performance Evaluation of Compression
Database | Lookup time, uncompressed | Lookup time, compressed
1 | 146 | 30
2 | 153 | 51
3 | 169 | 49
4 | 202 | 98
5 | 208 | 59
Compression reduces the lookup time by a factor of 2 to 5.
17Pipelining Backtracking
- Use pipelining to speed up backtracking
- Issues
- The amount of register memory passed between pipelining stages needs to be small
- The amount of main memory needs to be small
[Figure: pipeline stage 1, pipeline stage 2, ..., pipeline stage m]
18Pipelining Backtracking: Limit the amount of register memory
- Standard backtracking requires O(KW) state for K-dimensional filters, with each dimension W bits long
- Our approach
- Visit more general filters first, and more specific filters later
- Example
- Search for 00,0,0: A-B-H-J-K-C-D-E-F-G. Result: F8
- Performance
- K+1 32-bit registers
[Figure: multi-dimensional backtracking trie with nodes A to K and filters F1 to F8]
19Pipelining Backtracking: Limit the amount of memory
- Simple approach
- Store an entire backtracking search trie at every pipelining stage
- Storage increases proportionally with the number of pipelining stages
- Our approach
- Have pipeline stage i store only the trie nodes that will be visited in stage i
20Storage Requirement for Pipeline
Storage increases moderately with the number of pipelining stages (i.e. slope < 1).
21Trading Storage for Time
- Smoothly trade off storage for time
- Observations
- Set pruning tries eliminate all backtracking by pushing down all filters → intensive storage
- Eliminate backtracking for filters with large backtracking time
- Selective push
- Push down the filters with large backtracking time
- Iterate until the worst-case backtracking time satisfies our requirement
[Figure: tradeoff spectrum between O((log N)^(K-1)) time (e.g. backtracking) and O(N^(K-1)) space (e.g. set pruning)]
22Example of Selective Pushing
- Goal: worst-case memory accesses ≤ 11
- The filter 0, 0, 000 has 12 memory accesses.
- Push the filter down → reduce lookup time
- Now the search cost of the filter 0,0,001 becomes 12 memory accesses. So we need to push it down. Done!
[Figure: the trie before and after each push, with leaves F1, F2 and F3]
23Performance of Selective Push
[Figure: two panels, Uncompressed Trie and Compressed Trie]
Lookup time is reduced with moderate increase in
storage until we reach the knee of the curve.
24Summary
- Experimentally show that simple trie-based schemes perform much better than the worst-case figures
- Propose optimizations
- Trie compression
- Pipelining the search
- Selective push
25Summary (Cont.)
Approach | Description | Performance Gain
Trie compression algorithm | Effectively exploit redundancy in trie nodes by using range match | Reduce lookup time by a factor of 2 to 5, save storage by a factor of 2.8 to 8.7
Pipelining the search | Split the search into multiple pipelining stages, with each stage responsible for a portion of the search | Increase throughput with marginal increase in memory cost
Selective push | Push down the filters with large backtracking time | Reduce lookup time by 10 to 25% with only marginal increase in storage