Northwestern Lab for Internet and Security Technology (LIST) - PowerPoint PPT Presentation

About This Presentation

Title:

Northwestern Lab for Internet and Security Technology (LIST)

Slides: 36

Provided by: yanc8

Learn more at: https://users.cs.northwestern.edu

Transcript and Presenter's Notes

Title: Northwestern Lab for Internet and Security Technology (LIST)

1
Northwestern Lab for Internet and Security
Technology (LIST)

Yan Chen
Router-based Anomaly/Intrusion Detection and
Mitigation (RAIDM) Systems
Scalable and Accurate Overlay Network Monitoring
and Diagnosis
Wireless and Ad hoc Networking

2
Northwestern Lab for Internet and Security
Technology (LIST)

Yan Chen
Department of Computer Science
Northwestern University
http://list.cs.northwestern.edu

3
Our Theme

The Internet is becoming a new infrastructure for service delivery
World Wide Web
VoIP
Email
Interactive TV?
Major challenges for Internet-scale services
Scalability: 600M users, 35M Web sites, 2.1 Tb/s
Security: viruses, worms, Trojan horses, etc.
Mobility: ubiquitous devices in phones, shoes, etc.
Agility: dynamic systems/networks, congestion/failures
Ossification: extremely hard to deploy new technology in the core

4
Projects at LIST

Global Router-based Anomaly/Intrusion Detection
(GRAID) Systems
Distributed Information Retrieval Systems

5
Battling Hackers is a Growth Industry!
--Wall Street Journal (11/10/2004)

The past decade has seen an explosion in the concern for the security of information
Internet attacks are increasing in frequency, severity and sophistication
Denial of service (DoS) attacks
Cost $1.2 billion in 2000
Thousands of attacks per week in 2001
Yahoo, Amazon, eBay, Microsoft, White House, etc., attacked

6
Battling Hackers is a Growth Industry (cont'd)

Viruses and worms: faster and more powerful
Melissa, Nimda, Code Red, Code Red II, Slammer
Over $28 billion in economic losses in 2003, growing to over $75 billion by 2007
Code Red (2001): infected >360K machines in 13 hours, $2.4 billion loss
Slammer (2003): infected >75K machines in 10 minutes, $1 billion loss
Spyware is ubiquitous
80% of Internet computers have spyware installed

7
The Spread of Sapphire/Slammer Worms
8
Current Intrusion Detection Systems (IDS)

Mostly host-based and not scalable to high-speed networks
Slammer worm infected 75,000 machines in <10 mins
Host-based schemes are inefficient and user dependent
Have to install IDS on all user machines!
Mostly signature-based
Cannot recognize unknown anomalies/intrusions
New viruses/worms, polymorphism
Statistical detection
Hard to adapt to traffic pattern changes
Unscalable for flow-level detection
IDS vulnerable to DoS attacks
Overall-traffic-based detection: inaccurate, high false positives

9
Current Intrusion Detection Systems (II)

Cannot differentiate malicious events from unintentional anomalies
Anomalies can be caused by network element faults
E.g., router misconfiguration, signal interference in wireless networks, etc.
Isolated or centralized systems
Insufficient info on the causes, patterns and prevalence of global-scale attacks

10
Global Router-based Anomaly/Intrusion Detection
(GRAID) Systems

Online traffic recording and analysis for high-speed networks
Leverage sketches for data streaming computation
Online adaptive flow-level anomaly/intrusion detection and mitigation
Leverage statistical learning theory (SLT) to adaptively learn the traffic pattern changes
E.g., busy vs. idle wireless networks, with different levels of interference, etc.
Unsupervised learning without knowing the ground truth

11
GRAID Systems (II)

Integrated approach for false positive reduction
Signature-based detection
Network element fault diagnostics
Traffic signature matching of emerging applications
Hardware speedup for real-time detection
In collaboration with Gokhan Memik (ECE, NU)
Try various hardware platforms: FPGAs, network processors
Scalable anomaly/intrusion alarm fusion with distributed hash tables (DHT)
Automatically distribute alerts with similar symptoms to the same fusion center for analysis

12
GRAID Detection Sensor

Attached to a router or access point as a black
box
Edge network detection is particularly powerful

Monitor each port separately
Monitor aggregated traffic from all ports
Original configuration
13
GRAID Sensor Architecture

[Architecture diagram. Labels recovered from the figure:]
Part I: Sketch-based monitoring and detection
Part II: Per-flow monitoring and detection
Modules: reversible k-ary sketch monitoring; sketch-based statistical anomaly detection (SSAD); filtering; per-flow monitoring; statistical detection; signature-based detection; network fault detection; traffic profile checking
Data: streaming packet data; local sketch records; remote aggregated sketch records; sent out for aggregation; normal flows; suspicious flows; keys of normal flows; keys of suspicious flows
Output: intrusion or anomaly alarms to fusion centers
Legend: modules on the critical path; modules on the non-critical path; data path; control path

14
Scalable Traffic Monitoring and Analysis - Challenge

Potentially tens of millions of time series!
Need to work at a very low aggregation level (e.g., IP level)
Changes may be buried inside aggregated traffic
Moore's Law for traffic growth?
Per-flow analysis is too slow or too expensive
Want to work in near real time
Existing approaches are not directly applicable
Mostly focus on heavy hitters

15
Sketch-based Change Detection (ACM SIGCOMM IMC 2003, 2004)

Input stream (key, update)

Summarize input stream using sketches

Build forecast models on top of sketches

Report flows with large forecast errors

16
Sketch

Probabilistic summary of data streams
Originated in STOC 1996 [AMS96]
Widely used in database research to handle
massive data streams

Data structure | Space | Accuracy
Hash table | Per-key state | 100%
Sketch | Compact | With probabilistic guarantees (better for larger values)
17
K-ary Sketch

Array of hash tables $T_j[K]$ ($j = 1, \dots, H$)

Update $(k, u)$: $T_j[h_j(k)] \leftarrow T_j[h_j(k)] + u$ (for all $j$)

18
K-ary Sketch (cont'd)

Estimate $v(S, k)$: the sum of updates for key $k$
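
The update and estimate operations can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the SHA-256-based hash is a stand-in chosen for convenience, the class and method names are hypothetical, and the estimator (subtract the expected collision noise in each row, then take the median across rows) is assumed to follow the k-ary sketch papers cited on the change-detection slide.

```python
import hashlib
from statistics import median


class KarySketch:
    """H hash tables of K counters each (H and K follow the slide's notation)."""

    def __init__(self, H=5, K=1024):
        self.H, self.K = H, K
        self.T = [[0.0] * K for _ in range(H)]

    def _h(self, j, key):
        # Illustrative hash only: salt the key with the row index j.
        digest = hashlib.sha256(f"{j}:{key}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.K

    def update(self, key, u):
        # T_j[h_j(k)] += u for every row j
        for j in range(self.H):
            self.T[j][self._h(j, key)] += u

    def total(self):
        # Each update touches every row exactly once, so any row sums to the total.
        return sum(self.T[0])

    def estimate(self, key):
        # Per-row estimate with the average bucket load removed, then the median.
        s = self.total()
        per_row = [
            (self.T[j][self._h(j, key)] - s / self.K) / (1 - 1 / self.K)
            for j in range(self.H)
        ]
        return median(per_row)


sketch = KarySketch()
sketch.update("10.0.0.1", 500)          # one heavy key
for i in range(200):                    # plus background keys
    sketch.update(f"192.168.1.{i}", 1)
print(sketch.estimate("10.0.0.1"))      # close to 500
```

Taking the median across rows limits the damage from a single unlucky collision, which is why the accuracy guarantee is probabilistic and tighter for large values.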

19
Forecast Model: EWMA

Sketches are linear (sketches can be combined)
Compute the forecast error sketch $S_{error}$

Update the forecast sketch $S_{forecast}$
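
Because sketches are linear, both steps reduce to counter-by-counter arithmetic on the tables. The following is a small sketch under stated assumptions: tables are plain lists of rows, the function names `combine` and `ewma_step` are hypothetical, and the smoothing weight `alpha = 0.5` is an arbitrary illustrative value.

```python
def combine(a, sketch_a, b, sketch_b):
    """Return a*sketch_a + b*sketch_b, counter by counter (sketches are linear)."""
    return [
        [a * x + b * y for x, y in zip(row_a, row_b)]
        for row_a, row_b in zip(sketch_a, sketch_b)
    ]


def ewma_step(observed, forecast, alpha=0.5):
    """One interval of EWMA forecasting on sketch tables.

    Returns (error sketch for this interval, forecast sketch for the next one).
    """
    # S_error = S_observed - S_forecast
    error = combine(1.0, observed, -1.0, forecast)
    # Next S_forecast = alpha * S_observed + (1 - alpha) * S_forecast
    next_forecast = combine(alpha, observed, 1.0 - alpha, forecast)
    return error, next_forecast


# Toy tables with H = 2 rows and K = 3 counters.
forecast = [[10.0, 4.0, 6.0], [8.0, 8.0, 4.0]]
observed = [[10.0, 40.0, 6.0], [44.0, 8.0, 4.0]]  # one bucket per row jumps by 36
error, forecast = ewma_step(observed, forecast, alpha=0.5)
print(error)     # [[0.0, 36.0, 0.0], [36.0, 0.0, 0.0]]
print(forecast)  # [[10.0, 22.0, 6.0], [26.0, 8.0, 4.0]]
```

Keys whose estimate in the error sketch is large are the ones reported as changes.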

20
Evaluation of Reversible K-ary Sketch

Evaluated with tier-1 ISP trace and NU traces
Scalable
Can handle tens of millions of time series
Accurate
Provable probabilistic accuracy guarantees
Even more accurate on real Internet traces
Efficient
For worst-case traffic (all 40-byte packets):
16 Gbps on a single FPGA board
526 Mbps on a Pentium-IV 2.4 GHz PC
Less than 3 MB of memory used
Patent filed

21
Remaining Challenges

Reversible sketch to infer the culprit flows (ACM
SIGCOMM IMC 2004)
Hierarchical and multi-dimensional sketch
Detecting distributed and insidious attacks with
sketch

22
GRAID Sensor Architecture

[Architecture diagram repeated from slide 13]

23
Statistical Anomaly Detection

Online statistical detection with sketches
Applying Statistical Learning Theory (SLT)
Use a Hidden Markov Model (HMM) to adaptively learn the parameters
Focus on two major intrusions: denial of service (DoS) attacks and port scanning
Monitor traffic with multiple sketches
With different keys
(Source IP, Dest IP)
(Source IP, Dest port)
(Dest IP, Dest port)
For each key, record the number of unconnected TCP requests: #SYN - #SYN/ACK
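
The per-key bookkeeping can be illustrated as follows. This is a sketch under stated assumptions: exact `Counter` objects stand in for the sketches, the packet representation (a dict with `src`, `dst`, `sport`, `dport`, `flags`) is hypothetical, and a SYN/ACK is credited to the key of the SYN it answers by swapping the endpoints.

```python
from collections import Counter

# One counter per key type; a Counter is an exact stand-in for a sketch here.
counters = {
    "sip_dip": Counter(),
    "sip_dport": Counter(),
    "dip_dport": Counter(),
}


def record(pkt):
    """Update all three counters for one TCP packet (hypothetical dict format)."""
    if pkt["flags"] == "SYN":
        client, server, port, delta = pkt["src"], pkt["dst"], pkt["dport"], +1
    elif pkt["flags"] == "SYN/ACK":
        # The reply travels server -> client, so swap roles to hit the same keys.
        client, server, port, delta = pkt["dst"], pkt["src"], pkt["sport"], -1
    else:
        return
    counters["sip_dip"][(client, server)] += delta
    counters["sip_dport"][(client, port)] += delta
    counters["dip_dport"][(server, port)] += delta


# A completed handshake nets to zero ...
record({"src": "198.51.100.1", "dst": "10.0.0.5", "sport": 4000, "dport": 80, "flags": "SYN"})
record({"src": "10.0.0.5", "dst": "198.51.100.1", "sport": 80, "dport": 4000, "flags": "SYN/ACK"})
# ... while unanswered SYNs accumulate.
for sport in (5001, 5002, 5003):
    record({"src": "203.0.113.6", "dst": "10.0.0.9", "sport": sport, "dport": 80, "flags": "SYN"})

print(counters["sip_dip"][("198.51.100.1", "10.0.0.5")])  # 0
print(counters["dip_dport"][("10.0.0.9", 80)])            # 3
```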

24
Intrusion Mitigation
Attacks detected | Mitigation
Denial of Service (DoS), e.g., TCP SYN flooding | SYN defender, SYN proxy, or SYN cookie for victim
Port scans and worms | Ingress filtering with attacker IP
Vertical port scan | Quarantine the victim machine
Horizontal port scan | Monitor traffic with the same port for compromised machines
Spyware | Warn the end users being spied on
25
GRAID Sensor Architecture

[Architecture diagram repeated from slide 13]

26
Network Diagnosis and Fault Location

Infrastructure ossification led to a thrust of overlay applications
Traceroute gives hop-by-hop round-trip latency
Asymmetric routing
Can't get hop-by-hop loss rate!
Network tomography
Infer the properties of links from end-to-end measurements
Limited measurements -> under-constrained system, unidentifiable links
Existing work uses various constraints and assumptions
Tree-like topology
The number of lossy links is small
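
One common way to see why limited measurements give an under-constrained system is to write loss tomography as a linear system. The formulation below is a sketch that assumes losses on different links are independent; the symbols $p_i$, $l_j$, $G$, $x$, $b$, $r$ and $s$ are introduced here for illustration and do not appear on the slide.

```latex
% p_i = measured end-to-end loss rate of path i; l_j = unknown loss rate of link j.
1 - p_i = \prod_{j \in \text{path } i} (1 - l_j)
\quad\Longrightarrow\quad
\underbrace{\log(1 - p_i)}_{b_i} = \sum_{j=1}^{s} G_{ij}\, \underbrace{\log(1 - l_j)}_{x_j},
\qquad
G_{ij} = \begin{cases} 1 & \text{if path } i \text{ traverses link } j \\ 0 & \text{otherwise} \end{cases}

% In matrix form, with r measured paths and s links:
G x = b, \qquad G \in \{0,1\}^{r \times s}, \qquad
\operatorname{rank}(G) < s \;\Rightarrow\; x \text{ is not uniquely determined.}
```

When the rank of $G$ is smaller than the number of links, some individual link loss rates cannot be recovered from the path measurements alone, which is what the tree-topology and few-lossy-links assumptions try to work around.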

27
Our Approach: Virtual Links

Minimal link sequences (path segments) whose loss rates can be uniquely identified
Locate the faults to certain link(s)
The first lower bound on the network tomography granularity
Use an algebraic scheme to find virtual links
Leverage our work on overlay network monitoring (ACM SIGCOMM IMC 2003, ACM SIGCOMM 2004)
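
Whether a path segment is identifiable can be checked with basic linear algebra: a segment's total (log) loss is determined by the path measurements exactly when its 0/1 indicator vector lies in the row space of the path-by-link matrix. The code below is a brute-force illustration of that test on a hypothetical three-link topology; it is not the authors' algebraic scheme, and the names `rank`, `identifiable` and `G` are assumptions made for the example.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                factor = m[i][col] / m[r][col]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def identifiable(G, segment):
    """True if the segment's indicator vector lies in the row space of G."""
    return rank(G + [segment]) == rank(G)


# Hypothetical topology: links a, b, c (columns); two measured paths (rows).
G = [
    [1, 1, 0],  # path 1 traverses a, b
    [1, 1, 1],  # path 2 traverses a, b, c
]
print(identifiable(G, [0, 0, 1]))  # True:  c alone (path 2 minus path 1)
print(identifiable(G, [1, 1, 0]))  # True:  a+b as one unit
print(identifiable(G, [1, 0, 0]))  # False: a cannot be separated from b
```

In this toy case the minimal identifiable segments are {a, b} and {c}, which is the kind of granularity the virtual-link idea describes.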

28
GRAID Sensor Architecture

[Architecture diagram repeated from slide 13]

29
Intrusion/anomaly Alarm Fusion

An individual IDS has poor accuracy due to its limited view
Crucial to collect information from multiple vantage points: distributed IDS (DIDS)
Each IDS generates a local symptom report and sends it to a sensor fusion center (SFC)
Helps understand the prevalence, cause and patterns of global-scale attacks
Existing DIDS
Centralized fusion
Distributed fusion with unscalable communication

30
GRAID Sensor Interconnection

Through Cyber Disease DHT (distributed hash table) for alarm fusion
Scalability
Load balancing
Fault-tolerance
Intrusion correlation

31
Basic Operations of CDDHT

put(disease_key, symptom_report)
Send report to SFC
attack_info = get(disease_key)
Query about certain attacks from SFC
Each operation takes only O(log n) hops
n is the total number of nodes in CDDHT
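
The put/get interface can be mimicked with a toy, single-process hash ring. This is only a sketch of the key-to-SFC placement idea: it does not model multi-hop routing or node failures, and the class name `ToyCDDHT`, the SHA-1 ring, and the report fields are all assumptions made for illustration.

```python
import bisect
import hashlib


def _ring_position(value):
    """Map any string to a point on a 2**32 ring (illustrative choice)."""
    return int.from_bytes(hashlib.sha1(value.encode()).digest()[:4], "big")


class ToyCDDHT:
    """Single-process stand-in for a DHT: no real routing, just key placement."""

    def __init__(self, sfc_names):
        self.ring = sorted((_ring_position(n), n) for n in sfc_names)
        self.positions = [p for p, _ in self.ring]
        self.store = {name: {} for name in sfc_names}

    def responsible_sfc(self, disease_key):
        # First SFC clockwise from the key's position, wrapping around the ring.
        i = bisect.bisect_left(self.positions, _ring_position(repr(disease_key)))
        return self.ring[i % len(self.ring)][1]

    def put(self, disease_key, symptom_report):
        sfc = self.responsible_sfc(disease_key)
        self.store[sfc].setdefault(disease_key, []).append(symptom_report)

    def get(self, disease_key):
        sfc = self.responsible_sfc(disease_key)
        return self.store[sfc].get(disease_key, [])


dht = ToyCDDHT(["sfc-a", "sfc-b", "sfc-c"])
key = (0, "192.0.2.0/24")  # hypothetical DoS key: (intrusion ID, victim subnet)
dht.put(key, {"sensor": "router-1", "unanswered_syns": 900})
dht.put(key, {"sensor": "router-7", "unanswered_syns": 1200})
attack_info = dht.get(key)
print(len(attack_info))  # 2
```

Reports from different sensors that share a disease key land at the same SFC, which is what makes correlation possible without a central collector.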

32
CDDHT Disease Key Design
Intrusion | ID | Characterization Field(s)
DoS attack | 0 | Victim IP (subnet)
Scans | 1 | 0 (for vertical or block scan) | Source IP address | Destination IP (for vertical scan)
Scans | 1 | 0 (for vertical or block scan) | Source IP address | 0 (for block scan)
Scans | 1 | 1 (for horizontal or coordinated scan) | Scan port number | Source IP (for horizontal scan)
Scans | 1 | 1 (for horizontal or coordinated scan) | Scan port number | 0 (for coordinated scan)
Viruses/Worms | 2 | 0 (for known virus/worm) | Worm ID
Viruses/Worms | 2 | 1 (for unknown virus/worm) | Destination port number
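
The key layout in the table can be expressed as a few small constructor functions. This is a sketch under stated assumptions: keys are represented as Python tuples, and the function names and argument formats are hypothetical; only the field order and the 0/1/2 codes come from the table.

```python
DOS, SCAN, WORM = 0, 1, 2  # intrusion IDs from the table


def dos_key(victim_subnet):
    return (DOS, victim_subnet)


def scan_key(kind, src_ip=None, dst_ip=None, port=None):
    """kind is one of 'vertical', 'block', 'horizontal', 'coordinated'."""
    if kind == "vertical":
        return (SCAN, 0, src_ip, dst_ip)
    if kind == "block":
        return (SCAN, 0, src_ip, 0)
    if kind == "horizontal":
        return (SCAN, 1, port, src_ip)
    if kind == "coordinated":
        return (SCAN, 1, port, 0)
    raise ValueError(f"unknown scan kind: {kind}")


def worm_key(worm_id=None, dst_port=None):
    # Known worms are keyed by ID, unknown ones by the destination port they target.
    if worm_id is not None:
        return (WORM, 0, worm_id)
    return (WORM, 1, dst_port)


# Two sensors that see a coordinated scan of port 445 build the same key,
# even though each observes a different scanning source.
key_at_sensor_1 = scan_key("coordinated", src_ip="203.0.113.6", port=445)
key_at_sensor_2 = scan_key("coordinated", src_ip="198.51.100.9", port=445)
print(key_at_sensor_1 == key_at_sensor_2)  # True
print(worm_key(dst_port=1434))             # (2, 1, 1434)
```

Leaving the source out of the coordinated-scan key is what lets reports about the same port from many attackers meet at one fusion center.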
33
Other Challenges of CDDHT