Towards Scalable and Robust Distributed Intrusion Alert Fusion with Good Load Balancing


Title: Towards Scalable and Robust Distributed Intrusion Alert Fusion with Good Load Balancing


1
Towards Scalable and Robust Distributed Intrusion
Alert Fusion with Good Load Balancing
  • Zhichun Li, Yan Chen and Aaron Beach

Lab for Internet Security Technology
(LIST) http://list.cs.northwestern.edu
Northwestern University
2
The Spread of CodeRed
3
Distributed IDSes
  • Distributed Intrusion Detection Systems (IDSes)
  • Crucial to identify large-scale attacks early
  • Robust to various scan techniques
  • Locate the attackers/zombies when spoofed
  • E.g., Symantec has 20,000 sensors in 180 countries
  • General architecture
  • IDS nodes
  • Generate the alarms
  • Heterogeneous host- or network- based
  • Sensor fusion centers (SFCs)
  • Fuse the alarms
  • A subset of IDSes or dedicated hosts

4
Desired Features of DIDS Infrastructure
  • Scalability
  • 15 million daily intrusion alerts reported to
    DShield
  • Route only related alarms to the same SFC
  • Over 18,000 vulnerabilities found (CERT)
  • 17,500 Win32 threats and their variants
    (Symantec)
  • Hierarchical fusion cannot scale w/ diverse
    alerts
  • Distributed queries over multiple SFCs
  • Good load balancing
  • Attack resiliency

5
Outline
  • Motivation
  • CDDHT Design
  • Features of CDDHT
  • Evaluation
  • Related Work
  • Conclusion

6
Cyber Disease Distributed Hash Tables (CDDHT)
  • General intrusion alert fusion framework that can
    plug in any alert generation or alert fusion
    algorithm
  • Part of the Router-based Anomaly/Intrusion
    Detection and Mitigation (RAIDM) system in LIST
  • High-speed network measurement with reversible
    sketches (IMC 2004, INFOCOM 2006)
  • Online flow-level anomaly/intrusion detection
    (IEEE ICDCS 2006; IEEE CG&A, Security
    Visualization 06)
  • Router-based polymorphic worm signature
    generation (IEEE Symposium on Security and
    Privacy 2006)

7
CDDHT Design
  • Leverage DHT systems
  • O(log(n)) hops distance, where n is the number of nodes
  • O(log(n)) maintenance overhead for routing
  • Guaranteed success for deterministic routing
  • Fault-tolerant, robust, and DoS attack resilient
  • Becoming increasingly popular for serious use
  • E.g., the eMule P2P system uses Kademlia
  • Primitives of CDDHT
  • Put (disease key, symptom report)
  • Summary report = Get (disease key)
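The two primitives can be pictured with a small in-memory sketch. It is not the CDDHT implementation: real DHT routing is replaced by a hash modulo the number of SFCs, and the class name `ToyCDDHT`, the report fields `ids` and `count`, and the summary layout are all hypothetical choices made for illustration.

```python
import hashlib
from collections import defaultdict


class ToyCDDHT:
    """In-memory stand-in for the DHT: one dict of reports per SFC."""

    def __init__(self, sfc_names):
        self.sfc_names = sorted(sfc_names)
        self.reports = {name: defaultdict(list) for name in self.sfc_names}

    def _responsible_sfc(self, disease_key):
        # Assumption: hash-mod-n replaces real DHT routing in this sketch.
        digest = hashlib.sha1(disease_key.encode("utf-8")).digest()
        index = int.from_bytes(digest, "big") % len(self.sfc_names)
        return self.sfc_names[index]

    def put(self, disease_key, symptom_report):
        """Route a symptom report to the SFC responsible for the key."""
        sfc = self._responsible_sfc(disease_key)
        self.reports[sfc][disease_key].append(symptom_report)
        return sfc

    def get(self, disease_key):
        """Return a summary report fused from all reports for the key."""
        sfc = self._responsible_sfc(disease_key)
        reports = self.reports[sfc].get(disease_key, [])
        return {
            "sfc": sfc,
            "num_reports": len(reports),
            "total_alerts": sum(r["count"] for r in reports),
            "sources": sorted({r["ids"] for r in reports}),
        }
```

Because the SFC is derived from the key alone, two IDSes that report the same disease key reach the same SFC without coordinating with each other.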

8
Architecture of CDDHT
[Figure: CDDHT architecture. Labels: Attack Injected, Internet]
9
Disease Key Design
  • Challenge: fuse the vast, diverse symptoms from
    heterogeneous IDSes with different views
  • Key generation in a decentralized and
    deterministic manner
  • Key idea: generate disease keys which capture
    the uniqueness of certain attacks
  • Focus on popular types of attacks
  • Improve with features
  • Load balancing
  • Attack resilience

10
The Disease Key
Intrusion     | ID  | Characterization Field(s)                                               | Length
DoS Attack    | 000 | Victim IP (subnet)                                                      | 35 bits
Scans         | 001 | 0 (for vertical/block scan), Source IP                                  | 36 bits
Scans         | 001 | 1 (for horizontal/coordinated scan), Destport, Src IP (horizontal scan) | 52 bits
Scans         | 001 | 1 (for horizontal/coordinated scan), Destport, 0 (coordinated scan)     | 52 bits
Viruses/Worms | 010 | 0 (for known), Worm ID (32 bit)                                         | 36 bits
Viruses/Worms | 010 | 1 (for unknown), Dst port                                               | 20 bits
Botnets       | 011 | 00 (for DDNS entry), Botnet ID (32 bit)                                 | 37 bits
Botnets       | 011 | 01 (for URL entry), Botnet ID (32 bit)                                  | 37 bits
  • Currently, model four types of attacks
  • Extensible design
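The key layouts in the table can be sketched as bit strings. The 3-bit intrusion ID, the flag values, and the total lengths come from the table; the 32-bit IPv4 and 16-bit port widths are assumptions inferred from those lengths, and the field order and function names are hypothetical. Subnet masking of the victim IP is not shown.

```python
import ipaddress


def bits(value, width):
    """Render a non-negative integer as a fixed-width bit string."""
    if not 0 <= value < (1 << width):
        raise ValueError(f"{value} does not fit in {width} bits")
    return format(value, f"0{width}b")


def ip_bits(ip):
    """32-bit representation of a dotted-quad IPv4 address."""
    return bits(int(ipaddress.IPv4Address(ip)), 32)


def dos_key(victim_ip):
    return "000" + ip_bits(victim_ip)                            # 35 bits


def vertical_or_block_scan_key(src_ip):
    return "001" + "0" + ip_bits(src_ip)                         # 36 bits


def horizontal_scan_key(dest_port, src_ip):
    return "001" + "1" + bits(dest_port, 16) + ip_bits(src_ip)   # 52 bits


def coordinated_scan_key(dest_port):
    # The source IP field is zeroed so that all sources share one key.
    return "001" + "1" + bits(dest_port, 16) + bits(0, 32)       # 52 bits


def known_worm_key(worm_id):
    return "010" + "0" + bits(worm_id, 32)                       # 36 bits


def unknown_worm_key(dst_port):
    return "010" + "1" + bits(dst_port, 16)                      # 20 bits


def botnet_key(entry_type, botnet_id):
    flag = {"ddns": "00", "url": "01"}[entry_type]
    return "011" + flag + bits(botnet_id, 32)                    # 37 bits
```

Every IDS that observes the same attack feature derives the same key locally, which is what lets key generation stay decentralized and deterministic.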

11
Port Scan Disease Key Design
  • Vertical scan and block scan
  • Source IP
  • Horizontal scan and Coordinated scan
  • Scan port
  • Horizontal scan: source IP

12
Viruses/Worms and Botnets Disease Key Design
  • Viruses/Worms
  • Known worms: hash of the worm name
  • Unknown worms: worm scan port
  • Botnets
  • Assume botnets use centralized C&C
  • IRC-based bots: dynamic DNS
  • Web-based bots: URL
  • Botnet ID: hash of the DDNS name or URL
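A 32-bit worm or botnet ID derived from a name could look like the sketch below. The slides do not say which hash is used; truncated SHA-1 and the lowercase/strip normalization are assumptions, and `id32` is a hypothetical helper name.

```python
import hashlib


def id32(name):
    """Map a worm name, DDNS name, or URL to a 32-bit integer ID."""
    # Assumption: normalize so trivially different spellings agree.
    normalized = name.strip().lower()
    digest = hashlib.sha1(normalized.encode("utf-8")).digest()
    # Keep the first 4 bytes (32 bits) of the digest.
    return int.from_bytes(digest[:4], "big")
```

Any hash works as long as every IDS uses the same one, since the goal is only that independent sensors agree on the ID.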

13
Outline
  • Motivation
  • CDDHT Design
  • Features of CDDHT
  • Evaluation
  • Related Work
  • Conclusion

14
Load Balancing
  • Challenges to load balancing
  • Large key space in DHT
  • Highly skewed alert distribution

[Figure axes: Number of ports picked; Number of subnets picked]
15
Load Balancing II
  • Proactive balancing with stable hot spots
  • Reduce key space of port to 7 bits
  • 64 buckets for the 64 most popular ports
  • Remaining 64 buckets randomly assigned to other
    ports
  • Balancing load of the key space
  • Node migration
  • Virtual node
  • Load-aware bootstrap
  • Balancing load of single hot key
  • IDS alarm rate limiting
  • Aggregation tree for large-scale attacks
  • Alarms received by the final SFC are bounded by
    O(log(n))
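The 7-bit port encoding can be sketched as a bucket function: buckets 0 to 63 are reserved for the hot ports and buckets 64 to 127 are shared by all other ports. The slides say the remaining buckets are assigned randomly; this sketch substitutes a hash so that every IDS computes the same bucket, which is an assumption, as are the function names.

```python
import hashlib


def build_port_bucketer(hot_ports):
    """Return a function mapping a port number to a 7-bit bucket (0..127)."""
    hot = list(dict.fromkeys(hot_ports))  # drop duplicates, keep order
    if len(hot) > 64:
        raise ValueError("at most 64 hot ports fit in the reserved buckets")
    hot_index = {port: i for i, port in enumerate(hot)}

    def bucket(port):
        if port in hot_index:
            return hot_index[port]  # dedicated bucket for a hot port
        # Assumption: a hash stands in for the random assignment.
        digest = hashlib.sha1(port.to_bytes(2, "big")).digest()
        return 64 + digest[0] % 64  # shared bucket in 64..127

    return bucket
```

Giving each hot port its own bucket keeps two heavily scanned ports from landing on the same SFC, while the long tail of rarely scanned ports is spread over the remaining buckets.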

16
Attack Resilience
  • DoS resilience comparison with hierarchical model
  • Proved the average number of alerts unreachable
    to their corresponding SFCs given one node loss
  • Hierarchical DIDS: O(log(n))
  • CDDHT: O(1)
  • More in the paper
  • Authenticity of alarms
  • Dealing with compromised nodes

17
Outline
  • Motivation
  • CDDHT Design
  • Features of CDDHT
  • Evaluation
  • Related Work
  • Conclusion

18
Methodology
  • Implementation
  • Preliminary CDDHT system based on Chord simulator
  • Event-driven simulation
  • Each alarm is an event with a timestamp from
    certain IDSes
  • Datasets
  • DShield firewall logs (Jan. 2004)
  • Results from each day's data are similar
  • Use January 2nd 2004 as illustration
  • 25 million scan logs from 1,417 providers
  • Randomly choose 10 to be SFCs

Scan type  | Vertical | Horizontal | Block | Coordinated
# of scans | 3364     | 8486       | 22    | 25711
19
Evaluation Metrics
  • Fusion effectiveness
  • 100% due to deterministic routing of CDDHT
  • Load balancing
  • Consider number of alerts received at each SFC
  • Maximum vs. mean ratio (MMR)
  • Coefficient of variation (CV)
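Both load-balancing metrics are simple functions of the per-SFC alert counts. The sketch below assumes the population standard deviation for CV; the slides do not say whether the population or sample form was used.

```python
import statistics


def max_mean_ratio(loads):
    """MMR: load of the busiest SFC divided by the mean load."""
    return max(loads) / statistics.fmean(loads)


def coefficient_of_variation(loads):
    """CV: standard deviation of the loads divided by the mean load."""
    # Assumption: population standard deviation (pstdev).
    return statistics.pstdev(loads) / statistics.fmean(loads)
```

A perfectly balanced system has MMR 1 and CV 0; both grow as alerts concentrate on a few SFCs.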

20
Proactive Balancing with Stable Hot Ports
Proactive load balancing can reduce CV by 60% and
reduce MMR by 40%
21
The Load Variation Comparison Between
Hierarchical Scheme and CDDHT
[Figure legend: Hierarchical, CDDHT, CDDHT w/ PB, CDDHT w/ PB+VN]
  • Median, 10th and 90th percentiles of 10 runs
  • CDDHT with proactive balancing (PB) and virtual
    nodes (VN)
  • Compared with Hierarchical schemes, CDDHT
    reduces the MMR by a factor of 5.5 and CV by a
    factor of 5.2

22
Outline
  • Motivation
  • CDDHT Design
  • Features of CDDHT
  • Evaluation
  • Related Work
  • Conclusion

23
Related Works
                          | CDDHT | Centralized/Hierarchical Model | Publish/Subscribe Model | P2P Querying
Failure/attack resilience | High  | Low                            | High                    | High
Fusion overhead           | Low   | Low                            | High                    | Low
Query overhead            | Low   | Low                            | Low                     | High
  • WormShield uses DHT specifically to find popular
    content fingerprints as worm signatures, but does
    not work for polymorphic worms

24
Conclusion
  • The large number and diversity of alerts from many
    distributed IDSes call for efficient fusion of
    these alerts
  • CDDHT: Cyber Disease DHT
  • Efficiently routes alarms of different intrusions to
    different SFCs
  • Highly scalable and robust
  • Good load balancing
  • High attack resilience
  • Future work
  • Disease keys for more types of attacks and
    querying of CDDHT

25
Backup Slides
26
Introduction to DHT
  • DHT (Distributed Hash Table): an infrastructure
    that enables the distribution of an ordinary hash
    table onto a set of cooperating nodes

Key Object
0x2535 Apple
0x2353 Banana
0x3978 Peach
0x9123 Strawberry
0x7234 Grape
0x5942 Watermelon
  • Basic operations
  • Put(Key, Object): use the Key to find the
    corresponding node via DHT routing and store the
    Object on that node
  • Object = Get(Key): use the Key to find the
    corresponding node via DHT routing and retrieve
    the Object from that node
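The two operations can be illustrated with a toy ring in which each key is stored on its successor, the first node whose ID is at or after the key, wrapping around at the end. This is a sketch with global knowledge of all nodes, not distributed routing; the class name and the node IDs in the usage note are hypothetical.

```python
import bisect


class ToyRing:
    """Keys live on the first node whose ID is >= the key (with wrap-around)."""

    def __init__(self, node_ids):
        self.node_ids = sorted(node_ids)
        self.store = {node: {} for node in self.node_ids}

    def successor(self, key):
        """Find the node responsible for a key."""
        i = bisect.bisect_left(self.node_ids, key)
        return self.node_ids[i % len(self.node_ids)]

    def put(self, key, obj):
        """Store the object on the node responsible for the key."""
        self.store[self.successor(key)][key] = obj

    def get(self, key):
        """Retrieve the object from the node responsible for the key."""
        return self.store[self.successor(key)].get(key)
```

With hypothetical nodes `0x2000`, `0x4000`, `0x8000`, and `0xC000`, the key `0x2535` from the table lands on node `0x4000` and `0x9123` lands on node `0xC000`.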

27
Introduction to DHT II
  • Different DHT systems
  • Chord
  • CAN
  • Pastry
  • Tapestry
  • Kademlia
  • Kademlia has been used in eMule P2P software

Chord DHT routing
  • DHT routing
  • Distributed and deterministic routing
  • The max hops to find the node corresponding to a
    key is bounded by O( log (n) )
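The hop bound can be seen in a small simulation of Chord-style finger-table routing. This is a sketch under simplifying assumptions: a static ring, finger tables built with global knowledge, and a 16-bit identifier space (`M = 16`). The class and helper names are hypothetical, and the simulator used in the evaluation is not reproduced here.

```python
import bisect

M = 16  # identifier bits; an assumption for this sketch


def in_open(x, a, b):
    """True if x lies strictly inside the clockwise ring interval (a, b)."""
    return a < x < b if a < b else (x > a or x < b)


def in_half_open(x, a, b):
    """True if x lies in the clockwise ring interval (a, b]."""
    return a < x <= b if a < b else (x > a or x <= b)


class ChordSim:
    def __init__(self, node_ids):
        self.nodes = sorted(set(node_ids))
        # fingers[n][i] is the first node at or after n + 2^i on the ring.
        self.fingers = {
            n: [self.successor((n + (1 << i)) % (1 << M)) for i in range(M)]
            for n in self.nodes
        }

    def successor(self, ident):
        i = bisect.bisect_left(self.nodes, ident)
        return self.nodes[i % len(self.nodes)]

    def lookup(self, start, key):
        """Return (node owning key, number of forwarding hops from start)."""
        node, hops = start, 0
        # Stop once the key falls between this node and its successor.
        while not in_half_open(key, node, self.fingers[node][0]):
            # Forward to the farthest finger that still precedes the key.
            node = next(f for f in reversed(self.fingers[node])
                        if in_open(f, node, key))
            hops += 1
        return self.fingers[node][0], hops
```

Each forwarding step at least halves the remaining ring distance to the key's predecessor, which is where the logarithmic bound comes from.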

28
DoS Attack Disease Key Design
  • Most DoS attacks target specific IP addresses (the
    server) or the subnet (bandwidth-consuming
    attack)
  • But the victim IP (subnet) can be destination or
    source (in backscatter)
  • All other parts can vary

29
Related Works
  • Centralized/Hierarchical Model
  • Publish/subscribe Model
  • O(n^2) communication vs. O(n)
  • P2P Query
  • Scalability with frequent fusion

30
Attack Resilience
  • DoS resilience comparison with hierarchical model
  • Proved the average number of disconnected nodes
    given one node loss
  • in a k-way hierarchical DIDS is O(log (n))
  • but the DHT based is O(1).
  • Authenticity of alarms
  • Validate the source subnets of IDSes using Whois and
    BGP tables
  • Use PKI to verify the messages sent by IDSes/SFCs

31
Attack Resilience II
  • Dealing with compromised nodes
  • IDS nodes
  • Vote on the importance of the results by # of
    IDSes and IP coverage
  • Probability based verification for alarm
    aggregation
  • SFC nodes
  • The trust but verify principle
  • Envision a centralized authority that randomly
    checks the fusion results of the SFCs

32
Proactive Balancing with Stable Hot Ports
Using a 7-bit encoding can reduce MMR by 60% and
reduce CV by 40%
33
Dynamic of Load Variation over Time
  • MMR for CDDHT is much smaller and smoother
  • CV also improves