Automated Worm Fingerprinting

1
Automated Worm Fingerprinting
  • Sumeet Singh, Cristian Estan, George Varghese,
    and Stefan Savage
  • Manan Sanghi

2
The menace
3
Context
  • Worm Detection
  • Scan detection
  • Honeypots
  • Host based behavioral detection
  • Payload-based ???

4
Context
  • Characterization
  • A priori vulnerability signatures
  • Generally manual
  • Honeycomb
  • Host based
  • Longest common subsequences
  • Autograph
  • Network level automatic signature generation

5
Context
Internet Quarantine
  • Containment
  • Host quarantine
  • String matching
  • Connection throttling

Address Blacklisting Content Filtering
6
Worm behavior
  • Content Invariance
  • Limited polymorphism, e.g. encryption
  • Key portions are invariant, e.g. the decryption
    routine
  • Content Prevalence
  • Invariant portions appear frequently
  • Address Dispersion
  • Number of distinct infected hosts grows over time
  • reflecting different source and dest. addresses

7
Key Idea
  • Detect unknown worms on the basis of
  • A common exploit sequence
  • Range of unique sources and destinations

8
Content Sifting
  • For each string w, maintain
  • prevalence(w): number of times it is found in the
    network traffic
  • sources(w): number of unique sources
    corresponding to it
  • destinations(w): number of unique destinations
    corresponding to it
  • If thresholds are exceeded, then block(w)
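
As a minimal sketch of the per-string bookkeeping on this slide, the naive version below keeps exact counters and exact address sets. The class name, the threshold values and the use of "greater than" for the comparison are assumptions for illustration; the slides that follow explain why exact tables like these are too expensive at line rate.

```python
from collections import defaultdict

# Hypothetical thresholds, chosen only for illustration.
PREVALENCE_THRESHOLD = 3
DISPERSION_THRESHOLD = 2


class ContentSifter:
    """Naive content sifting: exact counters and exact address sets per string."""

    def __init__(self):
        self.prevalence = defaultdict(int)      # w -> number of sightings
        self.sources = defaultdict(set)         # w -> distinct source addresses
        self.destinations = defaultdict(set)    # w -> distinct destination addresses
        self.blocked = set()

    def observe(self, w, src, dst):
        """Record one sighting of string w; return True if w is now blocked."""
        self.prevalence[w] += 1
        self.sources[w].add(src)
        self.destinations[w].add(dst)
        if (self.prevalence[w] > PREVALENCE_THRESHOLD
                and len(self.sources[w]) > DISPERSION_THRESHOLD
                and len(self.destinations[w]) > DISPERSION_THRESHOLD):
            self.blocked.add(w)
        return w in self.blocked
```

A string repeated between one pair of hosts never trips the dispersion test, while the same string seen across several distinct sources and destinations does.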

9
Issues
  • How to compute prevalence(w), sources(w) and
    destinations(w) efficiently?
  • Scalable
  • Low memory and CPU requirements
  • Real time deployment over a Gigabit scale link

10
prevalence(w)
  • w = entire packet
  • Use multi-stage filters (k-ary sketches?)
  • w = small fixed-length substring (length b)
  • Rabin fingerprints
  • Value sampling
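
The multi-stage filter mentioned above for whole-packet prevalence can be sketched roughly as follows. This is the basic form of the data structure (several counter arrays, one hash per array), not necessarily the exact variant used in the system; the class name, array sizes, threshold and the choice of salted BLAKE2b as the hash are assumptions.

```python
import hashlib


class MultistageFilter:
    """Several counter arrays, each indexed by an independent hash of the content."""

    def __init__(self, stages=4, counters_per_stage=1024, threshold=3):
        self.stages = stages
        self.size = counters_per_stage
        self.threshold = threshold
        self.counters = [[0] * counters_per_stage for _ in range(stages)]

    def _index(self, stage, content):
        # Salting the hash with the stage number gives one hash function per stage.
        digest = hashlib.blake2b(content, digest_size=8, salt=bytes([stage])).digest()
        return int.from_bytes(digest, "big") % self.size

    def add(self, content):
        """Count one sighting; return True if every stage's counter reached the threshold."""
        passed = True
        for stage in range(self.stages):
            i = self._index(stage, content)
            self.counters[stage][i] += 1
            if self.counters[stage][i] < self.threshold:
                passed = False
        return passed
```

Content that repeats drives all of its counters up together, while an infrequent item would need to collide with frequent content in every stage to pass, which is why memory stays small without keeping one counter per distinct packet.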

11
Value Sampling
  • The problem: a packet of s bytes has s - b + 1
    substrings of length b
  • Solution: sample
  • But random sampling is not good enough
  • Trick: sample only those substrings for which the
    fingerprint matches a certain pattern
  • Since Rabin fingerprints are randomly distributed,
    $\Pr_{\text{track}}(x) = 1 - e^{-f(x-b+1)}$
    (f: sampled fraction, x: length of the invariant content)
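
A rough sketch of value sampling follows. It uses a polynomial rolling hash modulo a Mersenne prime as a stand-in for true Rabin fingerprints, and the values b = 40 and f = 1/64 are assumed here for illustration. The point it shows is that the sampling decision depends only on the substring's content, so every sensor picks the same substrings of the same worm.

```python
import math

B = 40            # substring length b (assumed value)
SAMPLE_BITS = 6   # keep fingerprints whose low 6 bits are zero, so f = 1/64
BASE = 257
MOD = (1 << 61) - 1


def sampled_fingerprints(payload, b=B, sample_bits=SAMPLE_BITS):
    """Yield (offset, fingerprint) for the value-sampled length-b substrings."""
    if len(payload) < b:
        return
    mask = (1 << sample_bits) - 1
    top = pow(BASE, b - 1, MOD)     # weight of the byte that leaves the window
    fp = 0
    for byte in payload[:b]:
        fp = (fp * BASE + byte) % MOD
    for offset in range(len(payload) - b + 1):
        if fp & mask == 0:          # the "certain pattern" the fingerprint must match
            yield offset, fp
        if offset + b < len(payload):
            # Slide the window one byte: drop payload[offset], append payload[offset + b].
            fp = ((fp - payload[offset] * top) * BASE + payload[offset + b]) % MOD


def p_track(x, b=B, f=1 / 64):
    """Probability that at least one substring of an x-byte invariant is sampled."""
    return 1 - math.exp(-f * (x - b + 1))
```

With these assumed parameters, `p_track(200)` is about 0.92 and `p_track(400)` is about 0.996, so longer invariant content is tracked with near certainty even though only 1 in 64 substrings is processed.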

12
sources(w) destinations(w)
  • Address Dispersion
  • Counting distinct elements vs. repeating elements
  • Simple list or hash table is too expensive
  • Key idea: bitmaps
  • Trick: scaled bitmaps

13
Direct Bitmap
  • Each content source is hashed into a bitmap, the
    corresponding bit is set, and an alarm is raised
    when the number of bits set exceeds a threshold
  • Drawback: loses the estimate of the actual value of
    each counter
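
The direct bitmap described above can be sketched in a few lines. The bitmap width, the alarm threshold and the use of SHA-256 as the hash are assumptions for illustration.

```python
import hashlib


class DirectBitmap:
    """One bit per hash bucket; repeats of the same address set the same bit."""

    def __init__(self, bits=32, alarm_threshold=20):
        self.bits = bits
        self.alarm_threshold = alarm_threshold
        self.bitmap = 0

    def bits_set(self):
        return bin(self.bitmap).count("1")

    def add(self, address):
        """Hash the address into the bitmap; return True if the alarm threshold is exceeded."""
        digest = hashlib.sha256(address.encode()).digest()
        position = int.from_bytes(digest[:8], "big") % self.bits
        self.bitmap |= 1 << position
        return self.bits_set() > self.alarm_threshold
```

Once most bits are set, the bitmap can say only that the threshold was crossed, not how many distinct addresses were seen, which is the drawback noted on the slide.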

14
Scaled Bitmap
  • Idea: subsample the range of the hash space
  • How does it work?
  • Multiple bitmaps, each mapped to progressively
    smaller and smaller portions of the hash space
  • Bitmaps are recycled if necessary

Result: roughly 5 times less memory, plus an actual
estimate of address dispersion
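
The sketch below is a simplified illustration of the scaled-bitmap idea, not the published algorithm: it omits any correction term, and because sources recorded in a bitmap that is later recycled are forgotten, it can undercount. The class name, the parameters, the SHA-256 hash and the linear-counting estimate per bitmap are all assumptions.

```python
import hashlib
import math


class ScaledBitmap:
    """Simplified multi-resolution bitmap for counting distinct addresses."""

    def __init__(self, num_bitmaps=4, bits=256, max_set=192):
        self.n = num_bitmaps
        self.bits = bits
        self.max_set = max_set   # recycle once a bitmap has more bits set than this
        self.base = 0            # hashes with fewer leading zeros than base are ignored
        self.bitmaps = [set() for _ in range(num_bitmaps)]  # positions of set bits

    def add(self, address):
        h = int.from_bytes(hashlib.sha256(address.encode()).digest()[:8], "big")
        leading_zeros = 64 - h.bit_length()
        if leading_zeros < self.base:
            return               # outside the sampled part of the hash space
        # Bitmap i covers hashes with base + i leading zeros; the last one covers the rest.
        which = min(leading_zeros - self.base, self.n - 1)
        self.bitmaps[which].add(h % self.bits)
        if len(self.bitmaps[which]) > self.max_set:
            # Recycle: drop the bitmap covering the largest portion, halve the sampled space.
            self.bitmaps = self.bitmaps[1:] + [set()]
            self.base += 1

    def estimate(self):
        """Linear-counting estimate per bitmap, scaled up by the sampling factor 2**base."""
        sampled = sum(self.bits * math.log(self.bits / (self.bits - len(bm)))
                      for bm in self.bitmaps)
        return sampled * 2 ** self.base
```

With these assumed defaults the structure holds 4 x 256 bits regardless of how many sources arrive, and still returns a numeric estimate rather than only a threshold alarm.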
15
Putting it together
16
Experience
  • System design: sensors and aggregators
  • Sensors sift through traffic on configurable
    address space zones of responsibility
  • The aggregator coordinates real-time updates from the
    sensors, coalesces related signatures, and so on
  • Parameters
  • content prevalence threshold: 3
  • address dispersion threshold: 30
  • garbage collection time: several hours

17
prevalence(w) threshold
18
Address Dispersion threshold
19
Garbage Collection threshold
20
Trace-based False Positives
21
Performance
  • Processing time
  • Memory consumption
  • 4 MB

22
Live Experience
  • Detect known worms: CodeRed, ...
  • Detect new worms: MyDoom, Sasser, Kibvu.B

23
Limitations and Extensions
  • Variant content
  • Network evasion
  • Extension: dealing with slow worms

24
Comparison
Earlybird                                           | Autograph
Infect the system with Network Data (real traces)   | Infect the system with Network Data (real traces)
Rabin fingerprint                                   | Rabin fingerprint
White-list/blacklist                                | White-list/blacklist
No prefiltering                                     | Flow reassembly
Single sensor algorithmics, centralized aggregators | Distributed deployment, active cooperation between multiple sensors
On-line                                             | Off-line
Overlapping, fixed-length chunks                    | Non-overlapping, variable-length chunks
Qinghua Zhang
25
Breather
26
Polygraph: Automatically Generating Signatures
for Polymorphic Worms
  • James Newsome, Brad Karp, Dawn Song

27
The case for polymorphic worms
  • Single substring insufficient
  • Sensitive: should exist in all payloads of a worm
  • Specific: should be long enough to not exist in
    any non-worm payload

28
Examples
29
Signature Classes
  • Signature: a set of tokens
  • Conjunction Signatures
  • Token-subsequence Signatures
  • Bayes Signatures
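
The three signature classes above differ in how the tokens are combined when matching a payload. A minimal sketch, with hypothetical function names and with tokens, scores and thresholds supplied by the caller:

```python
def matches_conjunction(payload, tokens):
    """Conjunction signature: every token must appear, in any order."""
    return all(token in payload for token in tokens)


def matches_subsequence(payload, tokens):
    """Token-subsequence signature: tokens must appear in the given order."""
    position = 0
    for token in tokens:
        found = payload.find(token, position)
        if found < 0:
            return False
        position = found + len(token)   # the next token must start after this one
    return True


def matches_bayes(payload, scored_tokens, threshold):
    """Bayes signature: sum the scores of the tokens present and compare to a threshold."""
    score = sum(s for token, s in scored_tokens.items() if token in payload)
    return score > threshold
```

For example, the tokens `b"GET "` and `b"\xff\xbf"` match a payload as a conjunction in either order, but match as a subsequence only if `b"GET "` comes first.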

30
Problem Formulation
31
Algorithms
  • Preprocessing
  • Distinct substrings of a minimum length l that
    occur in at least k samples in the suspicious pool
  • Generating signatures
  • Conjunction signatures
  • Token Subsequence Signatures
  • Bayes Signatures
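
The preprocessing step can be illustrated with a brute-force sketch that enumerates substrings directly. This is only practical for small samples and is an assumption-laden stand-in for whatever data structure a real implementation would use; the function name and the rule of keeping only maximal tokens are simplifications introduced here.

```python
from collections import Counter


def extract_tokens(samples, min_len, k):
    """Return substrings of length >= min_len that occur in at least k samples."""
    counts = Counter()
    for sample in samples:
        substrings = {sample[i:j]
                      for i in range(len(sample))
                      for j in range(i + min_len, len(sample) + 1)}
        counts.update(substrings)       # each sample votes at most once per substring
    frequent = {s for s, c in counts.items() if c >= k}
    # Simplification: keep only tokens not contained in a longer frequent token.
    return {s for s in frequent
            if not any(s != t and s in t for t in frequent)}
```

With k set to the number of samples, the returned set is the natural candidate for a conjunction signature, since every token in it appears in every suspicious sample.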

32
Wrap Up
  • Automated Worm Fingerprinting (OSDI 2004)
  • Polygraph: Automatically Generating Signatures
    for Polymorphic Worms
  • (IEEE Symposium on Security and Privacy 2005)

Manan Sanghi