Title: E-mail: nfhuang@cs.nthu.edu.tw
1????????????????
????? ??????? ??????????/??????? E-mail
nfhuang_at_cs.nthu.edu.tw
2Agenda
- Introduction of Network Security
- Content Inspection Technologies
- Pattern Matching Algorithms
- Flow Classification by Stateful Mechanism
- Open Issues
3 -- ?????? --
- 2000/3????DDos???????,??Yahoo?Amazon?CNN?eBay
??????? - 2001/7Amazon.com ??? Bibliofind ?????????????
- 2002 ??????
- 2003/1 SQL Slammer ??
- 2003/4 ??????????
- 2003/8 Blaster ??????
- 2003/9 SoBig ??????
- 2003/9 ??????
- 2004/3 Netsky ??????
- 2004/4 Sasser ??????
- 2005/5 ?????????????
- 2005/6 ????????????????????
4???????
- ??????????,????????,??????,??????,???????
- ?????????????,??????????????????
- ???????????????????????????????
- ????????????????
- ???????????????
5????????
6??????
- Denial of Service (DoS), Distributed Denial of Service (DDoS)
- Network Invasion
- Network Scanning
- Network Sniffing
- Trojan Horse and Backdoors
- Worm
7P2P/IM ????
- P2P (Peer-to-Peer) ????
- IM (Instant Messenger) ???
- Spyware ????
- Adware ????
- Tunneling ????
8P2P A new paradigm
- Bottleneck of Server
- Powerful PC
- Flexible, efficient information sharing
- P2P changes the way of Web (Internet)
9P2P???????????
- P2P ???????????,??????????,?? SoftEther ?
Skype??????,????,????,????????? - P2P ????????,??
- ??????????
- ?????????
- ??????
- ?????
- ????????
- ??????????
- ??????,?????
10Famous P2P Examples
- BitTorrent
- eZpeer
- Kuro
- eDonkey
- eMule
- MLdonkey
- Gnutella
- Kazaa/Morpheus
- Shareaza
- Direct-connect
- Soulseek
- Opennap
- Worklink
- Opennext
- Jelawat
- PP???
- SoftEther
- iMESH
- MIB
- WinMix
- WinMule
- Skype
11Instant Messenger (IM)
- MSN
- Yahoo Messenger
- ICQ
- YamQQ
- AIM (AOL IM)
12??????????
- IDP/IPS (Layer-7)
- Application Firewall (Layer-7)
- Network Access Control (NAC)
- Defense-in-Depth/Security Switch
13A Generic Layer-7 Engine
- Packet Normalizer
- Ensures the integrity of incoming packets
- Eliminates the ambiguity
- Decodes URI strings if necessary
- Pattern-Matching Engine
- Policy Engine
- Gathers information from the pattern-matching engine and issues the verdict to allow or drop the packets
14Packet Normalizer
- Integrity Checking
- IP Fragment Reassembly
- TCP Segment Reassembly
- TCP Segments may come out-of-order
- SEQ out of window size
- Segment Overlapping
- URI Decode
- URI hex code obfuscation (a → %61)
- URI unicode/UTF-8 obfuscation
- self-referential directories obfuscation (/././././ → /)
- directories obfuscation (/abc/a/../a/../a/ → /abc/a)
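The URI decoding step above can be sketched in a few lines of Python. This is only an illustration, not the normalizer of any particular product: the function name `normalize_uri_path` is an assumption, and the sketch covers just percent-decoding and directory collapsing.

```python
import posixpath
from urllib.parse import unquote


def normalize_uri_path(path: str) -> str:
    """Decode percent-escapes, then collapse '.', '..' and repeated slashes."""
    decoded = unquote(path)             # "%61" -> "a"; UTF-8 escapes are decoded too
    return posixpath.normpath(decoded)  # "/abc/a/../a/../a/" -> "/abc/a"


print(normalize_uri_path("/%61bc/a/../a/../a/"))  # /abc/a
print(normalize_uri_path("/././././"))            # /
```

The sketch decodes only once, so a doubly encoded sequence such as %2561 would need a second pass, and `posixpath.normpath` keeps exactly two leading slashes unchanged. A real normalizer would have to decide how to treat both cases.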
15Pattern-Matching Engine
- The most computation-intensive task in packet processing. Normally the PM engine needs to process every single byte in the packet payload.
- In Snort, the PM routine accounts for 31% of the total execution time.
16Pattern Matching is Expensive!
- 50 instructions per 1500-byte packet
- 30 instructions per byte, i.e. 45K instructions per 1500-byte packet
Source: Intel Corp.
17Content Inspection Technologies
- Pattern-Matching Algorithms
- Software Based
- Boyer-Moore
- Aho-Corasick (AC)
- Wu-Manber
- Hardware Based
- Bloom-Filter
- Reconfigurable Hardware (FSM)
- TCAM-based
18Pattern Matching Problem Definition
- Given an input text $T = t_0, t_1, \ldots, t_n$ and a finite set of strings $P = \{P_1, P_2, \ldots, P_r\}$, the string matching problem involves locating and identifying the substring of $T$ which is identical to $P_j = p_0 p_1 \cdots p_{m-1}$, $1 \le j \le r$, where
- $t_{s+i} = p_i$, $0 \le i \le m-1$. This equation can also be denoted as
- $t_s \cdots t_{s+m-1} = P_j$
Text: G C A T C G C A G A G A G T A T A C A G T A A G
Pattern: G C A G A G A G
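As a baseline for the definition above, a brute-force matcher can be sketched as follows. It simply tries every shift s for every pattern; the function name `naive_search` is an assumption made for this illustration, and the text and pattern are the ones shown on the slide.

```python
def naive_search(text, patterns):
    """Return (shift, pattern) for every position where a pattern equals text[s:s+m]."""
    hits = []
    for p in patterns:
        m = len(p)
        for s in range(len(text) - m + 1):
            if text[s:s + m] == p:   # t_s ... t_{s+m-1} == P_j
                hits.append((s, p))
    return hits


text = "GCATCGCAGAGAGTATACAGTAAG"
print(naive_search(text, ["GCAGAGAG"]))  # [(5, 'GCAGAGAG')]
```

The cost grows with the number of patterns times the text length, which is why the set-matching algorithms in the following slides matter.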
19Aho-Corasick (AC) Algorithm
- AC is a classic solution to exact set matching. It works in time O(n + m + z), that is, linear in the text length, the total pattern length, and z, the number of pattern occurrences in T.
- AC is based on a refinement of a keyword tree.
- AC is a deterministic algorithm. That is, the search performance is independent of the number of patterns.
20An Example of AC Algorithm
- Example: P = {ab, ba, babb, bb}
21An example of AC Algorithm
[Figure: AC automaton with states 0 to 9 for the patterns hers, his, she. The root loops on any character other than h and s. Output labels shown: he; he, she; his; hers. Dashed lines are fail transitions; those not shown lead to the root.]
22An example of AC Algorithm
[Figure: the automaton scanning the text "h e i s h i s"; a match is reported ("Got a Match!").]
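The keyword tree with fail transitions can be sketched in Python as below. This is a minimal textbook-style sketch, not the implementation used in the talk; the names `build_ac` and `ac_search` and the list-of-dicts state layout are assumptions. It is run on the pattern sets and the text from the two example slides.

```python
from collections import deque


def build_ac(patterns):
    """Build goto, fail and output tables for the pattern set."""
    goto, fail, out = [{}], [0], [[]]
    for p in patterns:                      # phase 1: keyword tree (trie)
        s = 0
        for ch in p:
            if ch not in goto[s]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[s][ch] = len(goto) - 1
            s = goto[s][ch]
        out[s].append(p)
    queue = deque(goto[0].values())         # phase 2: fail links, breadth first
    while queue:
        s = queue.popleft()
        for ch, t in goto[s].items():
            queue.append(t)
            f = fail[s]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(ch, 0)
            out[t] = out[t] + out[fail[t]]  # inherit matches ending at the fail state
    return goto, fail, out


def ac_search(text, automaton):
    """Scan text once and return (start_index, pattern) for every occurrence."""
    goto, fail, out = automaton
    s, hits = 0, []
    for i, ch in enumerate(text):
        while s and ch not in goto[s]:
            s = fail[s]                     # follow fail transitions
        s = goto[s].get(ch, 0)              # unmatched characters stay at the root
        for p in out[s]:
            hits.append((i - len(p) + 1, p))
    return hits


print(ac_search("heishis", build_ac(["hers", "his", "she"])))  # [(4, 'his')]
print(ac_search("babb", build_ac(["ab", "ba", "babb", "bb"])))
```

Each input character is consumed once, and the number of fail transitions taken is bounded by the number of characters read, so the scan time does not depend on how many patterns were loaded.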
23Reconfigurable Hardware (FSM)
- Implement the AC FSM in the configurable Logic Elements (LEs) of an FPGA.
- Achieves multi-gigabit performance (depending on the FPGA model).
- A powerful FPGA is necessary to accommodate thousands of patterns, so this approach is not practical and is rarely seen in the commercial market.
24FPGA-based pattern matching
25Bloom Filter
- Given a string X, the Bloom filter computes k hash functions on it, producing k hash values ranging from 1 to m, and sets the corresponding bits in an m-bit vector. The same procedure is repeated for all the members of the pattern set.
- The input text is verified by generating k hash values in the same way. If at least one of these k bits is found not set, the string is declared impossible to match.
- Patterns of length n are grouped into the Bloom filter Bn.
26Bloom Filter (Cont.)
- False positive probability $f$
- $\min f = (0.5)^k$, when $m = (k \times n) / \ln 2$
- So, total space, $\sum B_i = m \times (w - 1)$
- if $k = 1$, $n = 2048$: $m = 3072$ bits
- if $k = 1$, $n = 3072$: $m = 4608$ bits
- if $k = 4$, $f = 0.0625$
- if $k = 5$, $f = 0.0313$
- if $k = 6$, $f = 0.0156$
k hash functions: H1, H2, ..., Hk
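The two Bloom filter slides can be illustrated with a short software sketch. It is only a sketch under stated assumptions: the class and function names are invented for the example, the k hash functions are derived from SHA-256 with a one-byte salt (so k must stay below 256), and a byte array stands in for the m-bit vector. Hardware designs use much cheaper hashes.

```python
import hashlib
import math


class BloomFilter:
    def __init__(self, m, k):
        self.m, self.k = m, k
        self.bits = bytearray(m)            # one byte per bit, for clarity only

    def _positions(self, item: bytes):
        for i in range(self.k):             # H1 ... Hk, assumed: salted SHA-256
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: bytes):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item: bytes):
        # Any unset bit proves the item was never added (no false negatives).
        return all(self.bits[pos] for pos in self._positions(item))


def optimal_bits(n, k):
    """m = (k * n) / ln 2 for n stored patterns and k hash functions."""
    return k * n / math.log(2)


def min_false_positive(k):
    """f = (0.5)^k at the optimal m."""
    return 0.5 ** k


bf = BloomFilter(m=3072, k=4)
for pattern in (b"hers", b"his", b"she"):
    bf.add(pattern)
print(bf.might_contain(b"his"))             # True
print(min_false_positive(4))                # 0.0625
print(round(optimal_bits(2048, 1), 1))      # 2954.6
```

For k = 1 and n = 2048 the formula gives about 2955 bits, so the 3072 bits quoted on the slide sits slightly above that bound.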
27TCAM fundamental
- TCAM stores data with three logic values: 0, 1, X (don't care)
- Multiple match modes are needed.
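Ternary matching can be modeled in software as a value/mask pair per entry. The sketch below is an assumption-laden illustration of the idea, not a description of any TCAM device: `parse_ternary` and `tcam_lookup` are made-up names, and the lookup returns every matching entry index so that both a first-match and an all-matches mode can be read from the result.

```python
def parse_ternary(pattern):
    """Turn a string such as '10XX' into (value, mask); X bits get mask 0."""
    value = mask = 0
    for ch in pattern:
        value <<= 1
        mask <<= 1
        if ch == "1":
            value |= 1
            mask |= 1
        elif ch == "0":
            mask |= 1
        elif ch not in "xX":
            raise ValueError(f"unexpected symbol {ch!r}")
    return value, mask


def tcam_lookup(entries, key):
    """Return the indices of all entries whose cared-about bits equal the key."""
    return [i for i, (value, mask) in enumerate(entries) if key & mask == value]


entries = [parse_ternary(p) for p in ("10XX", "1X1X", "0000")]
print(tcam_lookup(entries, 0b1011))  # [0, 1]
print(tcam_lookup(entries, 0b1001))  # [0]
```

A real TCAM compares all entries in parallel in one cycle; the loop here only mirrors the matching rule.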
28Policy Engine
- Collect the matching events from the Pattern-Matching Engine.
- Clarify the relationship between matched patterns
- Ordered: A policy may consist of more than one pattern, and the patterns should be matched in order.
- Offset, Depth: The matched position should be within a certain range or location.
- Distance, Within: The distance between two matched patterns should also be taken into consideration.
- Trace Application States
- Some applications are difficult to identify by using only one signature (e.g. P2P). The Policy Engine needs to track the connection state, as in the following diagram.
[Figure: connection state diagram with states S0, S1, S2 and S3 and transitions labeled Msg Exchange, Request File and Data Exchange.]
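The "ordered" and "within" checks can be sketched as a small function over the events reported by the pattern-matching engine. Everything in this sketch is an assumption made for illustration: the event format `(position, pattern)`, the policy format `(pattern, within)`, and the simplified meaning of `within` as the maximum gap after the previous match ends. It does not reproduce the rule semantics of any particular IDS.

```python
def policy_matches(events, policy, start=0, prev_end=None):
    """True if the policy's patterns occur in order, respecting each 'within' gap.

    events: list of (position, pattern), sorted by position.
    policy: list of (pattern, within); within=None means no gap limit.
    """
    if not policy:
        return True
    (pattern, within), rest = policy[0], policy[1:]
    for idx in range(start, len(events)):
        pos, pat = events[idx]
        if pat != pattern:
            continue
        if prev_end is not None:
            if pos < prev_end:                       # must follow the previous match
                continue
            if within is not None and pos - prev_end > within:
                continue                             # too far from the previous match
        # Try this candidate; fall back to a later one if the rest cannot match.
        if policy_matches(events, rest, idx + 1, pos + len(pat)):
            return True
    return False


events = [(0, "USER"), (10, "USER"), (16, "PASS")]
print(policy_matches(events, [("USER", None), ("PASS", 5)]))     # True
print(policy_matches(events, [("PASS", None), ("USER", None)]))  # False
```

The backtracking matters: the first USER event is too far from PASS, but the second one satisfies the gap, so a purely greedy check would wrongly reject this flow.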
29Content Inspection Technologies
- Our Pattern Matching Algorithms
- Hierarchical Matching Algorithm (HMA) for Intrusion Detection Systems (IEEE Globecom 2005)
- A Time and Memory Efficient String Matching Algorithm for Intrusion Detection Systems (IEEE Globecom 2006)
- A Pattern Matching Coprocessor for Deep and Large Signature Set in Network Security System (IEEE Globecom 2005)
- A Fast Pre-filtering Algorithm for Pattern Matching (IEEE Globecom 2006)
- Flow Classification by Stateful Methods
- IM/P2P Classification
30Hierarchical Matching Algorithm (HMA) for
Intrusion Detection Systems
- HMA is a two-tier and cluster-wise matching algorithm
- Reduce the amount of external memory access
- Reduce the access delay
- Reduce the required processing cycle time
- Improve the performance of IDS
- Low memory requirement
- 1.763 times better than the state-of-the-art algorithms
- Enable an efficient and cost-effective real-time IDS
31Cluster-wise String Search
Narrow Searching Domain
Pre-filter Fast Search
32Hierarchical Matching Algorithm (HMA) for
Intrusion Detection Systems
33Pattern Matching Coprocessor for Deep and Large
Signature Set in Network Security System
System Architecture
34Pattern Matching Coprocessor for Deep and Large
Signature Set in Network Security System
Central Control Unit
35Pattern Matching Coprocessor for Deep and Large
Signature Set in Network Security System
Simulation Results
FPGA Implementation Results
Module | Resource Usage
Selector | 530 LEs (1% of total LEs)
PE | 150 × 32 LEs (26% of total LEs)
Pattern Table | 22K bits (9% of memory)
I/O Pin | 210 (50% of total pins)
36Pre-filter Search Filter Model
- All the substrings rejected by the filter are clean and cannot contain any of the defined patterns.
- The substrings passed to the pattern matching algorithm may or may not contain pre-defined patterns.
- Thus, the search filter may generate false positives but not false negatives.
- A false positive here refers to the case where a substring without any pre-defined pattern is falsely accepted as containing one.
- An exact string matching mechanism is essential for finding out which patterns are included in the accepted substring.
37Pre-filter Search Filter Model
38Super-Symbol Filter
- The basic idea of the proposed Super-Symbol Filter (SSF) algorithm is to treat two bytes of data as a super-symbol and to use a bitmap to indicate the occurrence of each super-symbol in the pre-defined patterns.
Match Vector Construction
For example, for 8-bit ASCII code, there are 65536 combinations of two bytes of data, so a bitmap vector of 65536 entries is used.
39Filtering phase in SSF-1 Algorithm
Input string: "ABOD CODING IS FOOD" (below, _ marks a space)
Bitmap (indexed AA to ZZ; entries shown in the figure: AA, AB, CO, DE, FO, OD, OO, ZZ)
Super-symbol | AB | BO | OD | D_ | _C | CO | OD | DI | IN | NG | G_ | _I | IS | S_ | _F | FO | OO | OD
Match bit | 1 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1
Output substrings: AB, OD, COD, FOOD
40SSF-2 Algorithm
- For better accuracy and fewer false positives, the extended SSF-2 algorithm employs two match vectors.
- The First Match Vector (FMV) is used for the super-symbols formed by the first two symbols of each pattern.
- The Rest Match Vector (RMV) is used for the remaining super-symbols in the patterns, except those in the FMV.
41SSF-2 Algorithm
- The algorithm looks up the FMV and RMV and checks whether the corresponding bit of each super-symbol is 1.
- Since AB and OD are not the beginning super-symbol of any pattern (by checking the FMV), the filter algorithm only outputs two substrings, COD and FOOD. Only one substring, COD, is a false positive in this case.
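The SSF-1 and SSF-2 filtering phases can be sketched as follows. This is one reading of the slides, not the authors' code: Python sets stand in for the 65536-entry bitmaps, the function names are invented, and the pattern set {CODE, FOOD, LAB} is a hypothetical one chosen only because it reproduces the match bits of the example text. Patterns are assumed to be at least two bytes long.

```python
def super_symbols(s):
    """All overlapping two-byte super-symbols of s."""
    return [s[i:i + 2] for i in range(len(s) - 1)]


def build_vectors(patterns):
    """Return (MV, FMV, RMV) as sets of super-symbols."""
    mv, fmv, rmv = set(), set(), set()
    for p in patterns:
        syms = super_symbols(p)
        mv.update(syms)        # SSF-1: every super-symbol of every pattern
        fmv.add(syms[0])       # SSF-2: first super-symbol of each pattern
        rmv.update(syms[1:])   # SSF-2: the remaining super-symbols
    return mv, fmv, rmv


def ssf1(text, mv):
    """Output maximal runs of consecutive super-symbols that hit the MV."""
    out, start = [], None
    for i in range(len(text) - 1):
        if text[i:i + 2] in mv:
            if start is None:
                start = i
        elif start is not None:
            out.append(text[start:i + 1])
            start = None
    if start is not None:
        out.append(text[start:])
    return out


def ssf2(text, fmv, rmv):
    """Like ssf1, but a run must start in the FMV and continue in the RMV."""
    out, start = [], None
    for i in range(len(text) - 1):
        ss = text[i:i + 2]
        if start is not None and ss in rmv:
            continue                       # run continues
        if start is not None:
            out.append(text[start:i + 1])  # run ends before this super-symbol
            start = None
        if ss in fmv:
            start = i                      # a new candidate begins here
    if start is not None:
        out.append(text[start:])
    return out


mv, fmv, rmv = build_vectors(["CODE", "FOOD", "LAB"])   # hypothetical patterns
text = "ABOD CODING IS FOOD"
print(ssf1(text, mv))        # ['AB', 'OD', 'COD', 'FOOD']
print(ssf2(text, fmv, rmv))  # ['COD', 'FOOD']
```

Only the substrings that survive the filter are handed to the exact matcher, which then rejects COD and confirms FOOD.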
42Evaluation
- To evaluate the scalability and flexibility, the popular Snort IDS signatures are employed.
- If most bits of the bitmap are set to 1, we can expect the SSF filtering performance to degrade dramatically, since the hit rate will be very high.
- Fortunately, tracking the growth of the Snort rule patterns shows that the percentage of set bits in the MV, FMV, and RMV is still very small (less than 5%). Thus, the proposed approaches have a good chance of keeping up with the fast growth of Snort releases.
Release | Number of Released Patterns | SSF-1 MV bitmap | SSF-2 FMV bitmap | SSF-2 RMV bitmap
Snort-2.0 | 2066 | 3213 | 695 | 3027
Snort-2.1 | 2617 | 3478 | 813 | 3296
Snort-2.2 | 2664 | 3575 | 835 | 3382
Snort-2.3 | 2679 | 3611 | 845 | 3413
Snort-2.4 | 2680 | 3611 | 845 | 3413
43Performance
PBF: Parallel Bloom Filter; IDP: Database Processor
Defcon9 Trace | Filter Algorithm | Passed by Filter <bytes> | Filter-out percentage | Filter cost time <µs> | AC search cost time <µs> | Total cost time <µs> | Throughput <Mbps>
Defcon-1, # of matched patterns: 377,508 times (9,846,572 bytes) | PBF | 1,173,918 | 88% | >10^7 | - | >10^7 | <10
Defcon-1 | IDP | 9,782,654 | 0.7% | 126,439 | 550,468 | 676,907 | 116
Defcon-1 | AC | 9,846,572 | 0% | 0 | 558,079 | 558,079 | 141
Defcon-1 | SSF-1 | 2,916,802 | 70% | 122,841 | 212,307 | 335,148 | 250
Defcon-1 | SSF-2 | 1,917,544 | 81% | 130,809 | 160,872 | 291,681 | 270
Defcon-2, # of matched patterns: 147,843 times (9,849,836 bytes) | PBF | 492,491 | 95% | >10^7 | - | >10^7 | <10
Defcon-2 | IDP | 9,777,406 | 0.8% | 125,901 | 529,602 | 655,503 | 120
Defcon-2 | AC | 9,849,836 | 0% | 0 | 537,297 | 537,297 | 146
Defcon-2 | SSF-1 | 1,868,185 | 81% | 118,264 | 119,343 | 237,607 | 332
Defcon-2 | SSF-2 | 879,353 | 91% | 127,651 | 68,628 | 196,279 | 401
Defcon-3, # of matched patterns: 57,458 times (9,852,342 bytes) | PBF | 197,046 | 98% | >10^7 | - | >10^7 | <10
Defcon-3 | IDP | 9,775,924 | 0.8% | 125,810 | 512,169 | 637,970 | 123
Defcon-3 | AC | 9,852,342 | 0% | 0 | 513,081 | 513,081 | 153
Defcon-3 | SSF-1 | 1,350,541 | 86% | 117,000 | 80,374 | 197,374 | 400
Defcon-3 | SSF-2 | 391,024 | 96% | 126,523 | 29,739 | 156,262 | 504
Test platform: Pentium-4 3.0 GHz personal computer with 1MB level-2 cache, with Intel's VTune tool installed.
44Filter Percentage and Throughput
- The filtering effectiveness of the IDP scheme is poor, and it cannot handle Snort's patterns. This is because the bitmap used in the IDP scheme has only 256 entries, one per one-byte symbol.
- Most of these entries are set to 1 for Snort's patterns.
- Both the PBF and SSF schemes are less sensitive to the growth of patterns and have a filtering percentage of around 80-98%.
45Filter Percentage and Throughput
- The PBF is only suitable for hardware-based implementation; in this software test, the throughput of PBF is lower than that of AC.
- For Defcon-1, the system throughput is nearly double (270 Mbps vs 141 Mbps) that of the original AC algorithm, and for Defcon-3 it is more than three times higher (504 Mbps vs 153 Mbps).
- The proposed SSF schemes consume far less memory (cache-resident).
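The throughput figures quoted above appear to follow from trace size divided by total cost time, since bits per microsecond equals Mbps. The short check below works under that assumption; the helper name `throughput_mbps` is invented, and the byte counts and times are taken from the performance table.

```python
def throughput_mbps(trace_bytes, total_time_us):
    """Bits per microsecond, which is numerically the same as Mbps."""
    return trace_bytes * 8 / total_time_us


ac_1 = throughput_mbps(9_846_572, 558_079)     # Defcon-1, AC only
ssf2_1 = throughput_mbps(9_846_572, 291_681)   # Defcon-1, SSF-2 + AC
ac_3 = throughput_mbps(9_852_342, 513_081)     # Defcon-3, AC only
ssf2_3 = throughput_mbps(9_852_342, 156_262)   # Defcon-3, SSF-2 + AC

print(int(ac_1), int(ssf2_1), round(ssf2_1 / ac_1, 2))  # 141 270 1.91
print(int(ac_3), int(ssf2_3), round(ssf2_3 / ac_3, 2))  # 153 504 3.28
```

The ratios of about 1.9 and 3.3 match the "nearly double" and "more than three times" statements on the slide.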
46The FA Example FTP
Flow Classification Using Stateful Method
47The FAs of BitTorrent protocols.
48The FAs of Yahoo Messenger protocol.
49????????
- DoS/DDoS
- Content Inspection Algorithms
- Zero-day Attacks
- Web Security
- Network Access Control (NAC)
- Wireless Security
50Zero-day Attacks
[Timeline figure: MS WMF 0-day exploits. Dates shown: 28 Dec, 2005; 29 Dec, 2005; 10 Jan, 2006. Events shown: MS WMF exploit publicly released; attack; BroadWeb released pattern update; Microsoft released patch.]
[Timeline figure: Microsoft IE createTextRange() vulnerability. Dates shown: 24 Mar, 2006; 26 Mar, 2006; 11 April, 2006. Events shown: vulnerability was publicly unveiled; attack; BroadWeb released pattern update; Microsoft released patch.]
51MS06-001 WMF 0-day??
- 2005.12.28??
- ??IE????0-day??
- ?????????Win XP SP2?????
- ????????WMF?????,????????,??????????
- Crackz dot ws
- unionseek dot com
- www.tfcco dot com
- Iframeurl dot biz
- beehappyy dot biz
- more ...
52?WMF 0-day??????
53?WMF 0-day??????
54?WMF 0-day??????
????????,????????
55?WMF 0-day??????
?????????,??????????,????????
??IE???0-day?? ????,???????????????????!!
56Web Security
- Security Code
- Buffer Overflow Attack
- Vulnerability Avoidance
57Network Access Control (NAC)
- More than 70% of attacks are launched from inside
- Provide first mile protection
- Network Access Controller
- Security Switches
- Defense-in-Depth
58Wireless Security Open Issues
- AAA issues
- Ad hoc networks (secure routing protocols)
- Sensor Networks Security
- WiFi and WiMAX (IEEE 802.16) networks Security
- Wireless Security Switch
59 Open Issues
- How to identify and manage encrypted protocols, such as Skype 2.0 and Winny?
- Not by signatures (no signatures?)
- Maybe by state machines
- How to design fast content inspection or pattern matching algorithms?
- Modified AC algorithm or others
- Using the cache efficiently
- Pre-filter is good
- Post-filter is also necessary (rules are more complex)
- How to design a fast content inspection co-processor?
- Regular expression support is necessary
- Many commercial products exist already, such as SafeNet 4850, Sensory Networks C-2000, IDT, Cavium, Netlogic, etc.
- Security switches provide first-mile protection
- Wireless security switch as well
- Network Access Control (NAC) is an emerging trend.