Title: Northwestern Lab for Internet and Security Technology (LIST)
1. Northwestern Lab for Internet and Security Technology (LIST)
- Yan Chen
- Router-based Anomaly/Intrusion Detection and Mitigation (RAIDM) Systems
- Scalable and Accurate Overlay Network Monitoring and Diagnosis
- Wireless and Ad hoc Networking
2. Northwestern Lab for Internet and Security Technology (LIST)
- Yan Chen
- Department of Computer Science
- Northwestern University
- http://list.cs.northwestern.edu
3. Our Theme
- The Internet is becoming a new infrastructure for service delivery
  - World Wide Web
  - VoIP
  - Email
  - Interactive TV?
- Major challenges for Internet-scale services
  - Scalability: 600M users, 35M Web sites, 2.1 Tb/s
  - Security: viruses, worms, Trojan horses, etc.
  - Mobility: ubiquitous devices in phones, shoes, etc.
  - Agility: dynamic systems/networks, congestion/failures
  - Ossification: extremely hard to deploy new technology in the core
4. Projects at LIST
- Global Router-based Anomaly/Intrusion Detection (GRAID) Systems
- Distributed Information Retrieval Systems
5. Battling Hackers is a Growth Industry!
-- Wall Street Journal (11/10/2004)
- The past decade has seen an explosion in the concern for the security of information
- Internet attacks are increasing in frequency, severity, and sophistication
- Denial of service (DoS) attacks
  - Cost $1.2 billion in 2000
  - Thousands of attacks per week in 2001
  - Yahoo, Amazon, eBay, Microsoft, White House, etc., attacked
6. Battling Hackers is a Growth Industry (cont'd)
- Viruses and worms: faster and more powerful
  - Melissa, Nimda, Code Red, Code Red II, Slammer
  - Caused over $28 billion in economic losses in 2003, growing to over $75 billion by 2007
  - Code Red (2001): infected over 360K machines in 13 hours, $2.4 billion loss
  - Slammer (2003): infected over 75K machines in 10 minutes, $1 billion loss
- Spyware is ubiquitous
  - 80% of Internet computers have spyware installed
7. The Spread of Sapphire/Slammer Worms
8. Current Intrusion Detection Systems (IDS)
- Mostly host-based and not scalable to high-speed networks
  - Slammer worm infected 75,000 machines in under 10 minutes
  - Host-based schemes are inefficient and user-dependent
  - Have to install IDS on all user machines!
- Mostly signature-based
  - Cannot recognize unknown anomalies/intrusions
  - New viruses/worms, polymorphism
- Statistical detection
  - Hard to adapt to traffic pattern changes
  - Unscalable for flow-level detection
  - IDS vulnerable to DoS attacks
  - Detection based on overall traffic: inaccurate, high false positives
9. Current Intrusion Detection Systems (II)
- Cannot differentiate malicious events from unintentional anomalies
  - Anomalies can be caused by network element faults
  - E.g., router misconfiguration, signal interference in wireless networks, etc.
- Isolated or centralized systems
  - Insufficient information on the causes, patterns, and prevalence of global-scale attacks
10. Global Router-based Anomaly/Intrusion Detection (GRAID) Systems
- Online traffic recording and analysis for high-speed networks
  - Leverage sketches for data streaming computation
- Online adaptive flow-level anomaly/intrusion detection and mitigation
  - Leverage statistical learning theory (SLT) to adaptively learn traffic pattern changes
  - E.g., busy vs. idle wireless networks, with different levels of interference, etc.
  - Unsupervised learning without knowing the ground truth
11. GRAID Systems (II)
- Integrated approach for false positive reduction
  - Signature-based detection
  - Network element fault diagnostics
  - Traffic signature matching of emerging applications
- Hardware speedup for real-time detection
  - In collaboration with Gokhan Memik (ECE, NU)
  - Try various hardware platforms: FPGAs, network processors
- Scalable anomaly/intrusion alarm fusion with distributed hash tables (DHT)
  - Automatically distribute alerts with similar symptoms to the same fusion center for analysis
12. GRAID Detection Sensor
- Attached to a router or access point as a black box
- Edge network detection is particularly powerful
[Figure labels: original configuration; monitor each port separately; monitor aggregated traffic from all ports]
13. GRAID Sensor Architecture
[Figure: GRAID sensor architecture. Labels recovered from the diagram:]
- Input: streaming packet data
- Part I, sketch-based monitoring and detection: reversible k-ary sketch monitoring; sketch-based statistical anomaly detection (SSAD); local sketch records; remote aggregated sketch records; sent out for aggregation
- Filtering: keys of suspicious flows; keys of normal flows; suspicious flows; normal flows
- Part II, per-flow monitoring and detection: per-flow monitoring; statistical detection; signature-based detection; network fault detection; traffic profile checking
- Output: intrusion or anomaly alarms to fusion centers
- Legend: modules on the critical path; modules on the non-critical path; data path; control path
14. Scalable Traffic Monitoring and Analysis - Challenge
- Potentially tens of millions of time series!
  - Need to work at a very low aggregation level (e.g., IP level)
  - Changes may be buried inside aggregated traffic
- The Moore's Law of traffic growth?
  - Per-flow analysis is too slow or too expensive
  - Want to work in near real time
- Existing approaches are not directly applicable
  - Mostly focus on heavy hitters
15. Sketch-based Change Detection (ACM SIGCOMM IMC 2003, 2004)
- Input stream: (key, update)
- Summarize the input stream using sketches
- Build forecast models on top of sketches
- Report flows with large forecast errors
16. Sketch
- Probabilistic summary of data streams
- Originated in STOC 1996 [AMS96]
- Widely used in database research to handle massive data streams

           | Space         | Accuracy
Hash table | Per-key state | 100%
Sketch     | Compact       | With probabilistic guarantees (better for larger values)
17. K-ary Sketch
- Array of hash tables $T_j[K]$ ($j = 1, \dots, H$)
- Update $(k, u)$: $T_j[h_j(k)] \leftarrow T_j[h_j(k)] + u$ (for all $j$)
18. K-ary Sketch (cont'd)
- Estimate $v(S, k)$: the sum of updates for key $k$
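The update and estimate operations on slides 17 and 18 can be sketched in Python as below. This is a minimal illustration rather than the code behind the talk. The class name `KarySketch`, the salted BLAKE2 hash functions, and the default sizes `H=5`, `K=1024` are assumptions. The estimator (a per-row unbiased estimate followed by a median across rows) is the one commonly described for k-ary sketches in the change-detection literature; the slide's own formula did not survive extraction.

```python
import hashlib
from statistics import median


class KarySketch:
    """H hash tables of K counters each (hypothetical illustration)."""

    def __init__(self, H=5, K=1024):
        self.H, self.K = H, K
        self.T = [[0.0] * K for _ in range(H)]

    def _h(self, j, key):
        # Row-specific hash h_j: salt the key with the row index j.
        digest = hashlib.blake2b(f"{j}:{key}".encode(), digest_size=8).digest()
        return int.from_bytes(digest, "big") % self.K

    def update(self, key, u):
        # T_j[h_j(k)] += u for every row j.
        for j in range(self.H):
            self.T[j][self._h(j, key)] += u

    def estimate(self, key):
        # Every row holds the same total, so any row can supply it.
        total = sum(self.T[0])
        # Subtract the expected share of colliding keys in each row,
        # then take the median so a few unlucky rows do not dominate.
        per_row = [
            (self.T[j][self._h(j, key)] - total / self.K) / (1 - 1 / self.K)
            for j in range(self.H)
        ]
        return median(per_row)
```

Memory is fixed at H x K counters regardless of how many distinct keys appear in the stream, which is the trade-off against per-key state noted on slide 16.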
19. Forecast Model: EWMA
- Sketches are linear (can combine sketches)
- Compute forecast error sketch $S_{error}$
- Update forecast sketch $S_{forecast}$
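Because sketches are linear, an EWMA forecast can be applied counter by counter without ever expanding to per-key state. The following is a minimal sketch under stated assumptions: sketches are plain lists of lists of numbers, the function names `combine` and `ewma_step` are hypothetical, and the smoothing factor `alpha` is an illustrative parameter whose value the slides do not specify.

```python
def combine(a, sketch_a, b, sketch_b):
    """Linear combination a*A + b*B of two equally sized sketches."""
    return [
        [a * x + b * y for x, y in zip(row_a, row_b)]
        for row_a, row_b in zip(sketch_a, sketch_b)
    ]


def ewma_step(observed, forecast, alpha=0.5):
    """One interval of EWMA forecasting on sketches.

    Returns (error_sketch, next_forecast) where
      error_sketch  = observed - forecast
      next_forecast = alpha * observed + (1 - alpha) * forecast
    """
    error = combine(1.0, observed, -1.0, forecast)
    next_forecast = combine(alpha, observed, 1.0 - alpha, forecast)
    return error, next_forecast
```

Keys whose estimated value in the error sketch is large relative to the overall error are the candidates for "flows with large forecast errors".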
20. Evaluation of Reversible K-ary Sketch
- Evaluated with tier-1 ISP trace and NU traces
- Scalable
  - Can handle tens of millions of time series
- Accurate
  - Provable probabilistic accuracy guarantees
  - Even more accurate on real Internet traces
- Efficient
  - For worst-case traffic (all 40-byte packets)
    - 16 Gbps on a single FPGA board
    - 526 Mbps on a Pentium-IV 2.4 GHz PC
  - Less than 3 MB of memory used
- Patent filed
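To put the worst-case throughput figures in terms of packets, the arithmetic can be worked through as below. This is only a back-of-the-envelope sketch: it assumes the quoted rates count the 40 packet bytes alone and ignores any link-layer framing overhead, and the function name is hypothetical.

```python
def packets_per_second(bits_per_second, packet_bytes=40):
    """Packet rate when every packet has the given size in bytes."""
    return bits_per_second / (packet_bytes * 8)


fpga_pps = packets_per_second(16e9)   # 16 Gbps on the FPGA board
pc_pps = packets_per_second(526e6)    # 526 Mbps on the PC
```

Under these assumptions the FPGA figure corresponds to 50 million packets per second and the PC figure to roughly 1.64 million packets per second.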
21. Remaining Challenges
- Reversible sketch to infer the culprit flows (ACM SIGCOMM IMC 2004)
- Hierarchical and multi-dimensional sketch
- Detecting distributed and insidious attacks with sketches
23. Statistical Anomaly Detection
- Online statistical detection with sketches
- Applying Statistical Learning Theory (SLT)
  - Use Hidden Markov Model (HMM) to adaptively learn the parameters
- Focus on two major intrusions: denial of service (DoS) attacks and port scanning
- Monitor traffic with multiple sketches
  - With different keys
    - (Source IP, Dest IP)
    - (Source IP, Dest port)
    - (Dest IP, Dest port)
  - For each key, record the number of unconnected TCP requests: #SYN - #SYN/ACK
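The per-key bookkeeping described on this slide can be illustrated as follows. This is a simplified sketch, not the GRAID implementation: it uses exact `Counter` objects where the real system would use sketches, and the packet representation (dicts with `src_ip`, `dst_ip`, `src_port`, `dst_port`, `flags`) and all function names are assumptions made for the example.

```python
from collections import Counter


def request_view(pkt):
    """Map a packet to (client_ip, server_ip, server_port, delta)."""
    if pkt["flags"] == "SYN":
        return pkt["src_ip"], pkt["dst_ip"], pkt["dst_port"], +1
    if pkt["flags"] == "SYN/ACK":
        # The reply travels server -> client, so swap the endpoints
        # to land on the same key as the original request.
        return pkt["dst_ip"], pkt["src_ip"], pkt["src_port"], -1
    return None


def unconnected_requests(packets):
    """Count #SYN - #SYN/ACK under the three key types from the slide."""
    counts = {"sip_dip": Counter(), "sip_dport": Counter(), "dip_dport": Counter()}
    for pkt in packets:
        view = request_view(pkt)
        if view is None:
            continue
        sip, dip, dport, delta = view
        counts["sip_dip"][(sip, dip)] += delta
        counts["sip_dport"][(sip, dport)] += delta
        counts["dip_dport"][(dip, dport)] += delta
    return counts
```

A large positive count under (Dest IP, Dest port) is consistent with SYN flooding against one service, while a large count under (Source IP, Dest port) spread over many destinations is consistent with a horizontal scan.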
24. Intrusion Mitigation

Attacks detected | Mitigation
Denial of Service (DoS), e.g., TCP SYN flooding | SYN defender, SYN proxy, or SYN cookie for victim
Port scans and worms | Ingress filtering with attacker IP
Vertical port scan | Quarantine the victim machine
Horizontal port scan | Monitor traffic with the same port for compromised machines
Spyware | Warn the end users being spied on
26. Network Diagnosis and Fault Location
- Infrastructure ossification led to a thrust of overlay applications
- Traceroute gives hop-by-hop round-trip latency
  - Asymmetric routing
  - Can't get hop-by-hop loss rate!
- Network tomography
  - Infer the properties of links from end-to-end measurements
  - Limited measurements lead to an under-constrained system with unidentifiable links
  - Existing work uses various constraints and assumptions
    - Tree-like topology
    - The number of lossy links is small
27. Our Approach: Virtual Links
- Minimal link sequences (path segments) whose loss rates can be uniquely identified
- Locate the faults to certain link(s)
- The first lower bound on the network tomography granularity
- Use an algebraic scheme to find virtual links
- Leverage our work on overlay network monitoring (ACM SIGCOMM IMC 2003, ACM SIGCOMM 2004)
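One way to see what "uniquely identified" means algebraically is the following sketch. It assumes the common linear tomography model, in which each measured path is a 0/1 row over links and the log of a path's delivery rate is the sum of the per-link logs; under that model a path segment's loss rate is determined by the measurements exactly when its 0/1 vector lies in the row space of the path matrix. The function names and the three-link example are hypothetical, and this is an illustration of the identifiability test only, not the algorithm from the cited papers.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix via Gaussian elimination over exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                factor = m[i][col] / m[r][col]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def identifiable(path_matrix, segment):
    """True if the segment vector lies in the row space of the path matrix."""
    return rank(path_matrix + [segment]) == rank(path_matrix)


# Hypothetical topology: path A uses links 1-2, path B uses links 1-2-3.
paths = [[1, 1, 0],
         [1, 1, 1]]
```

In this example `identifiable(paths, [1, 1, 0])` and `identifiable(paths, [0, 0, 1])` hold, but `identifiable(paths, [1, 0, 0])` does not: links 1 and 2 are always traversed together, so only the segment covering both can be resolved.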
29. Intrusion/Anomaly Alarm Fusion
- An individual IDS has poor accuracy due to its limited view
- Crucial to collect information from multiple vantage points: distributed IDS (DIDS)
  - Each IDS generates a local symptom report and sends it to a sensor fusion center (SFC)
  - Helps understand the prevalence, causes, and patterns of global-scale attacks
- Existing DIDS
  - Centralized fusion
  - Distributed fusion with unscalable communication
30. GRAID Sensor Interconnection
- Through Cyber Disease DHT (distributed hash table) for alarm fusion
  - Scalability
  - Load balancing
  - Fault tolerance
  - Intrusion correlation
31. Basic Operations of CDDHT
- put(disease_key, symptom_report)
  - Send report to SFC
- attack_info = get(disease_key)
  - Query about certain attacks from SFC
- Each operation takes only O(log n) hops
  - n is the total number of nodes in CDDHT
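The put/get interface can be illustrated with a toy, single-process stand-in for the DHT. This is a sketch under stated assumptions, not the CDDHT design: it uses a simple consistent-hash ring to decide which node acts as the fusion center for a key, it does not model multi-hop routing at all, and the class name `ToyCDDHT`, the node names, and the use of SHA-1 for ring positions are all hypothetical.

```python
import bisect
import hashlib


def _ring_position(text):
    """Map a string to a point on a 2**32 hash ring."""
    return int.from_bytes(hashlib.sha1(text.encode()).digest()[:4], "big")


class ToyCDDHT:
    """In-memory stand-in for a DHT: each node acts as a fusion center."""

    def __init__(self, node_names):
        self.ring = sorted((_ring_position(n), n) for n in node_names)
        self.store = {name: {} for name in node_names}

    def fusion_center(self, disease_key):
        # The key is owned by the first node clockwise from its position.
        pos = _ring_position(repr(disease_key))
        idx = bisect.bisect_left(self.ring, (pos, "")) % len(self.ring)
        return self.ring[idx][1]

    def put(self, disease_key, symptom_report):
        sfc = self.fusion_center(disease_key)
        self.store[sfc].setdefault(disease_key, []).append(symptom_report)

    def get(self, disease_key):
        sfc = self.fusion_center(disease_key)
        return list(self.store[sfc].get(disease_key, []))
```

Because the fusion center is a pure function of the key, two sensors that report the same `disease_key` end up at the same node without coordinating with each other.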
32. CDDHT Disease Key Design

Intrusion | ID | Characterization field(s)
DoS attack | 0 | Victim IP (subnet)
Scans | 1 | 0 (for vertical or block scan), Source IP address, Destination IP (for vertical scan)
Scans | 1 | 0 (for vertical or block scan), Source IP address, 0 (for block scan)
Scans | 1 | 1 (for horizontal or coordinated scan), Scan port number, Source IP (for horizontal scan)
Scans | 1 | 1 (for horizontal or coordinated scan), Scan port number, 0 (for coordinated scan)
Viruses/Worms | 2 | 0 (for known virus/worm), Worm ID
Viruses/Worms | 2 | 1 (for unknown virus/worm), Destination port number
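The key layout in the table on slide 32 can be written down as small constructor functions. This is an illustrative encoding only: representing keys as Python tuples, the function names, and the example addresses are assumptions, and the slides do not say how the fields are actually serialized.

```python
def dos_key(victim_subnet):
    """DoS attack: ID 0, keyed by the victim IP (subnet)."""
    return (0, victim_subnet)


def vertical_scan_key(source_ip, dest_ip):
    """Scan, ID 1, sub-type 0: one source probing one destination."""
    return (1, 0, source_ip, dest_ip)


def block_scan_key(source_ip):
    """Scan, ID 1, sub-type 0: destination field is 0 for a block scan."""
    return (1, 0, source_ip, 0)


def horizontal_scan_key(port, source_ip):
    """Scan, ID 1, sub-type 1: one source probing one port."""
    return (1, 1, port, source_ip)


def coordinated_scan_key(port):
    """Scan, ID 1, sub-type 1: source field is 0 for a coordinated scan."""
    return (1, 1, port, 0)


def known_worm_key(worm_id):
    """Virus/worm, ID 2, sub-type 0: keyed by a known worm ID."""
    return (2, 0, worm_id)


def unknown_worm_key(dest_port):
    """Virus/worm, ID 2, sub-type 1: keyed by the destination port."""
    return (2, 1, dest_port)
```

The design point is that sensors seeing the same attack derive the same key independently; for instance, two sensors observing a coordinated scan of port 445 from different sources both produce `coordinated_scan_key(445)`.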
33. Other Challenges of CDDHT
- Load balancing
- Supporting complicated queries
  - E.g., aggregate queries
- Attack resilience
  - OK to have some IDS sensors compromised
  - What about SFCs?
34. Research Methodology
- Combination of theory, synthetic/real trace-driven simulation, and real-world implementation and deployment
35. Conclusion for GRAID Systems
- Online traffic recording and analysis on high-speed networks
- Online statistical anomaly detection
- Integrated approach for false positive reduction
  - Signature-based detection
  - Network element fault diagnostics
  - Traffic signature matching of emerging applications
- Hardware speedup for real-time detection
- Scalable anomaly/intrusion alarm fusion with distributed hash tables (DHT)