Reactive Patching: a viable worm defence strategy

1
Reactive Patching: a viable worm defence strategy?

Milan Vojnovic, Ayalvadi Ganesh
Microsoft Research
Cambridge, United Kingdom

Tutorial, Performance 2005, Juan-les-Pins, France, Oct 4, 2005
2
Who is this tutorial intended for?

Security non-specialists: learn some strategies of worm spread and the effectiveness of some countermeasures
Security specialists: fundamental limits of patching
Who is it not for?
Those interested in the gory details of particular worms and the vulnerabilities they exploit

3
What is a Worm?

Self-replicating malicious code that:
Exploits a known or unknown software vulnerability (e.g. buffer overflow) to gain (partial?) control over the host
Uses the host to propagate copies of itself
Typically does not require human intervention, unlike viruses; this contributes to the speed of spread

4
Buffer overflow vulnerability

Example: Web form

[Figure: memory laid out as Name, Data, Program; a Name input carrying worm data overwrites the program.]
5
Motivation for studying worms

Self-replicating malicious code spreads very quickly
Code Red v2: 360,000 hosts in 24 hours
Slammer: 75,000 hosts in 10 minutes
Causes huge economic damage
Backbone saturation, cleanup
Things could be worse!
Hard or impossible to eliminate all vulnerabilities in code

6
Roadmap

Worm spread strategies
Target discovery mechanisms
SI epidemic model of worm spread
Patch spread strategies
Analysis of patching
Patching
Patching with filtering
Candidate strategies: PUSH, P2P
Summary and conclusions

7
Target discovery (1)

No a priori knowledge of vulnerable hosts
Random scanning
Generate IP address at random
If a vulnerable host is found at that address, infect it
Commonly used in current generation of worms,
e.g., Code Red I, Slammer
Not very smart, but can still be fast
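
As a rough illustration of random scanning, the sketch below draws uniformly random addresses from a toy address space and counts how many land on a vulnerable host. The address-space size, the number of vulnerable hosts and the function name `random_scan_hits` are assumptions made for this example only; real worms scan the 2**32 IPv4 addresses.

```python
import random

def random_scan_hits(vulnerable, address_space_size, num_scans, rng):
    """Count how many uniformly random scans land on a vulnerable address."""
    hits = 0
    for _ in range(num_scans):
        target = rng.randrange(address_space_size)  # generate an address at random
        if target in vulnerable:                    # vulnerable host found there
            hits += 1
    return hits

rng = random.Random(1)
SPACE = 2**16                                    # toy address space (assumption)
vulnerable = set(rng.sample(range(SPACE), 655))  # about 1% of addresses (assumption)
scans = 100_000
hits = random_scan_hits(vulnerable, SPACE, scans, rng)
print("hit fraction:", hits / scans, "vulnerable fraction:", len(vulnerable) / SPACE)
```

The hit fraction should be close to the fraction of the address space occupied by vulnerable hosts, which is why a dumb scanner can still be fast when it scans at a high rate.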

8
Host population types

S (susceptible)

I (infected)
P (patched)
9
Random scanning worm
[Figure: an infected host scans a random IP address; the scan hits a vulnerable host.]
11
Target discovery (2)

Local preference / subnet-biased scanning
Examples: Code Red II, Zotob
Infected hosts split their scanning effort between:
Local subnet (IP addresses sharing the same 1-, 2- or 3-octet prefix)
Global Internet
Why?

12
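Subnet preference: Code Red II

[Figure: scan target drawn from the whole address space with probability 1/8, from the local /8 with probability 1/2, and from the local /16 with probability 3/8.]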
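
A minimal sketch of such a subnet-biased target generator, assuming the split commonly reported for Code Red II: probability 1/2 for an address in the scanner's own /8, 3/8 for its own /16, and 1/8 for a uniformly random address. The function name and the sample host address are hypothetical, and the real worm also skips some address ranges, which this sketch ignores.

```python
import random

def code_red_ii_target(own_ip, rng):
    """Pick a 32-bit scan target with a 1/2, 3/8, 1/8 split (assumed split)."""
    u = rng.random()
    rand32 = rng.getrandbits(32)
    if u < 1 / 2:
        # Same /8: keep the first octet, randomise the remaining 24 bits.
        return (own_ip & 0xFF000000) | (rand32 & 0x00FFFFFF)
    if u < 1 / 2 + 3 / 8:
        # Same /16: keep the first two octets, randomise the remaining 16 bits.
        return (own_ip & 0xFFFF0000) | (rand32 & 0x0000FFFF)
    # Otherwise scan the whole address space.
    return rand32

own = 0xC0A80105  # hypothetical infected host, 192.168.1.5
rng = random.Random(7)
samples = [code_red_ii_target(own, rng) for _ in range(50_000)]
share_8 = sum((t >> 24) == (own >> 24) for t in samples) / len(samples)
share_16 = sum((t >> 16) == (own >> 16) for t in samples) / len(samples)
print("targets in own /8:", share_8, "targets in own /16:", share_16)
```

Biasing scans towards the local prefix pays off when vulnerable hosts are clustered, since a scanner inside a dense subnet finds neighbours much faster than by uniform scanning.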
13
Subnet preference: Zotob

Scans the local /16 address space until
512 consecutive scans miss, or
32 scans miss if there has been no success
Then switches to random scanning of the entire IP address space
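
The switching rule above can be sketched as a small state machine. This is one reading of the slide, not Zotob's actual code: the scanner leaves local mode after 512 consecutive misses, or after 32 misses if it has had no success so far. The function name `local_scan_budget` is hypothetical.

```python
def local_scan_budget(outcomes):
    """Return the number of local scans made before switching to random
    scanning, or None if the switch condition is never met.

    `outcomes` is an iterable of booleans, True for a successful scan.
    """
    consecutive_misses = 0
    any_success = False
    for k, hit in enumerate(outcomes, start=1):
        if hit:
            any_success = True
            consecutive_misses = 0
        else:
            consecutive_misses += 1
        # Give up quickly on a subnet with no vulnerable hosts found yet.
        limit = 512 if any_success else 32
        if consecutive_misses >= limit:
            return k
    return None

print(local_scan_budget([False] * 40))            # empty subnet
print(local_scan_budget([True] + [False] * 600))  # one early success
```

The short limit lets the worm abandon an apparently empty /16 quickly, while the long limit keeps it scanning locally once it has evidence that the subnet contains vulnerable hosts.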

14
Target discovery (3)

Hit lists: worm seeded with a list of vulnerable targets identified in advance
Carried in worm payload, or
Looked up from an external server, e.g., meta-server for games
Objective: accelerate initial spread

15
Target discovery (4)

Topological worms
Target lists obtained from data residing on the host, e.g.:
Local DNS cache
Instant messenger contact lists
Neighbour lists of P2P applications
Potentially very fast, and hard to distinguish from legitimate use

17
Model of random scanning

Address space of size $\Omega = 2^{32}$
$N$ vulnerable hosts, occupying a fraction $N/\Omega$ of the address space
Infected hosts scan addresses randomly at rate $\eta$
Code Red: $\eta \approx 360$ per minute
Slammer (UDP): $\eta \approx 4000$ per second
If a scan locates a vulnerable host, it is infected
18
Random scanning (in pictures)
19
Stochastic epidemic model

Infected hosts scan the IP address space at the points of a Poisson process of rate $\eta$
Independent at distinct hosts
Rate at which scans hit vulnerable hosts: $\beta = \eta N / \Omega$
$I(t)$: number of infected hosts, evolves as a Markov process
High-level model: ignores network congestion, latency
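
A minimal simulation sketch of this Markov process, under the assumption that a scan hitting a vulnerable host causes a new infection only if that host is still susceptible. With I infected hosts out of N, new infections then occur at total rate beta * I * (N - I) / N, where beta is the per-host hit rate. The function name and parameter values are illustrative.

```python
import random

def simulate_si(n_hosts, beta, initial_infected, rng):
    """Simulate the jump times of I(t) until every vulnerable host is infected."""
    infected = initial_infected
    t = 0.0
    times, counts = [t], [infected]
    while infected < n_hosts:
        # Total infection rate: hits at rate beta per infective, a fraction
        # (n_hosts - infected) / n_hosts of which land on susceptible hosts.
        rate = beta * infected * (n_hosts - infected) / n_hosts
        t += rng.expovariate(rate)  # exponential holding time of the Markov chain
        infected += 1
        times.append(t)
        counts.append(infected)
    return times, counts

times, counts = simulate_si(n_hosts=500, beta=0.1, initial_infected=1,
                            rng=random.Random(3))
print("time to infect all hosts:", times[-1])
```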

20
Deterministic epidemic model

Large population limit
$N \to \infty$, $\eta/\Omega$ fixed
$i(t) = I(t)/N$: fraction of hosts infected
$i(t)$: density-dependent Markov chain
Converges to the limit deterministic ODE
$\frac{d}{dt} i(t) = \beta\, i(t)\, (1 - i(t))$
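
The limit ODE is the logistic equation di/dt = beta * i * (1 - i), whose solution is i(t) = i(0) e^{beta t} / (1 - i(0) + i(0) e^{beta t}). The sketch below checks this closed form against a simple Euler integration; the parameter values are illustrative assumptions.

```python
import math

def logistic(t, beta, i0):
    """Closed-form solution of di/dt = beta * i * (1 - i) with i(0) = i0."""
    growth = math.exp(beta * t)
    return i0 * growth / (1.0 - i0 + i0 * growth)

def euler(beta, i0, t_end, dt=1e-3):
    """Forward-Euler integration of the same ODE, for comparison."""
    i = i0
    for _ in range(round(t_end / dt)):
        i += dt * beta * i * (1.0 - i)
    return i

beta, i0 = 0.1, 1e-3  # illustrative values
for t in (10, 50, 100):
    print(t, logistic(t, beta, i0), euler(beta, i0, t))
```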

21
Modelling Code Red
22
Characteristic time scale

Epidemic growth follows a logistic curve
Initially exponential, then saturates
Time to infect most of the susceptible population is a small multiple of $1/\beta$
$1/\beta$: 40 minutes for Code Red, 10 seconds for Slammer
Time scale for network-wide infection is hours for Code Red, minutes for Slammer

23
Not considered: non-constant per-infective scan rate

Some random scanning worms result in a non-constant per-infective scan rate
Example: Slammer (2003)
Plausible cause: bandwidth saturation (see extras)

[Figure: observed scans per unit time, per-infective scan rate, and number of infected hosts, from [Moore04].]
25
Patching strategies

Patching: identify vulnerabilities and develop patches
Filtering: detect and quarantine infected hosts / subnets
Other: ensure code has no vulnerabilities (non-trivial in practice)

26
Current approaches

Vulnerability is found and a patch is developed first
Patch is released
Worm is subsequently reverse engineered from the patch
Patch needs to be installed before the worm is released: hours to days

27
Example: Zotob

Aug 9, 2005: MS05-039 public disclosure
Plug-and-play vulnerability affecting mostly Win2k
Aug 12, 2005: exploit code released
Aug 14, 2005: Zotob worm discovered
Followed by > 10 variants

28
Deficiencies

Zero-day worms: vulnerability not known or patch not yet available
Requires automatic response, involving:
Detection of worm spread
Automatic patch generation
Automatic patch dissemination
Human reaction times too slow

29
Example: Vigilante [Costa05]

Detectors distributed through the network
Detect worms by analysis of stack in code execution
Can be combined with honeypots etc.
Generate self-certifying alerts (SCAs) proving vulnerability
Disseminate to hosts, which verify the SCA and create filters (patches)

30
Problems we address

Architecture for alert dissemination
Vigilante uses structured overlay interconnecting
all end hosts
We propose a hierarchical scheme
Analysis of competing spread of worm and patch
To establish if patching is feasible, and
quantify system requirements

32
Patching

Hierarchical dissemination
Phase 1: among patching servers
Phase 2: patching servers to hosts

33
Network partitioned into subnets
[Figure: address space partitioned into subnets 1, 2, ..., j, ..., J.]
34
Patching server in each subnet
[Figure: one patching server in each of the subnets 1, 2, ..., j, ..., J.]

Patching servers are termed superhosts

35
Superhosts interconnected by an overlay
Alerts or patches disseminate over the overlay
With alerts, the patch is generated at the superhosts
Non-essential in modelling

36
PULL

hosts poll a superhost with unit rate
superhost service rate $\mu$
a poll results in a patched host if the polling host was susceptible

$s(t)$: fraction of susceptible hosts at time $t$

patching rate at time $t$: $\mu\, s(t)$

37
Host population dynamics

patch susceptible hosts only
assumes the worm prevents patching an infected host
plausible assumption for automatic patching (no human intervention)

Patching system

if $\mu = 0$, standard logistic
in general

38
Limit host population

Result
Implication
Tight bound whenever the infection rate $\beta$ is sufficiently small (final fraction of infectives small)
Exponential in the infection-to-patch rate ratio!
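
The exponential dependence on the infection-to-patch rate ratio can be illustrated numerically. The sketch assumes the patching system has the form di/dt = beta * i * s, ds/dt = -beta * i * s - mu * s, with beta the infection rate and mu the patching rate; this form is an assumption consistent with the PULL description, not copied from the slides. Dividing di/dt by the patching rate dp/dt = mu * s gives di/dp = (beta / mu) * i, so i(infinity) = i(0) exp((beta / mu) p(infinity)), which is at most i(0) e^{beta / mu}.

```python
import math

def pull_final_infected(beta, mu, i0, dt=1e-3, t_end=200.0):
    """Integrate the assumed patching system and return the final infected fraction."""
    i, s = i0, 1.0 - i0
    for _ in range(round(t_end / dt)):
        new_infections = s * (1.0 - math.exp(-beta * i * dt))
        i += new_infections
        s -= new_infections
        s *= math.exp(-mu * dt)  # susceptibles patched at rate mu
    return i

beta, mu, i0 = 0.1, 0.05, 1e-3  # illustrative values
final = pull_final_infected(beta, mu, i0)
bound = i0 * math.exp(beta / mu)
print("final infected fraction:", final, "bound:", bound)
```

For small final infected fractions the two numbers nearly coincide, which matches the remark that the bound is tight when the infection rate is small.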

39
Limit host population (example)
10000 vulnerable hosts
$\beta = 0.1$
Dots: Monte Carlo
40
Subnets (contd)
[Figure: subnets 1, 2, ..., j, ..., J, some of them alerted.]

patching with rate $\mu$ only in alerted subnets
$g(t)$: fraction of alerted subnets at time $t$

41
Broadcast curve

Natural candidate: logistic function

$g(t)$: fraction of alerted subnets at time $t$
$T$: broadcast time

[Figure: $g(t)$ plotted against $t$ on a 0-to-1 scale, with the broadcast time $T$ marked.]

Many-superhosts limit for random gossip: $T = O(\log J)$
Same order for standard overlays
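
To get a feel for the logarithmic broadcast time, the sketch below simulates one simple discrete-time push gossip: in every round each alerted superhost alerts one superhost chosen uniformly at random. This particular gossip variant and the function name are assumptions for illustration; the tutorial does not specify the protocol details.

```python
import math
import random

def gossip_rounds(num_superhosts, rng):
    """Run push gossip from one alerted superhost; return (rounds, g per round)."""
    alerted = {0}
    fractions = [len(alerted) / num_superhosts]
    rounds = 0
    while len(alerted) < num_superhosts:
        # Every alerted superhost pushes the alert to one random superhost.
        targets = {rng.randrange(num_superhosts) for _ in alerted}
        alerted |= targets
        rounds += 1
        fractions.append(len(alerted) / num_superhosts)
    return rounds, fractions

J = 1024
rounds, fractions = gossip_rounds(J, random.Random(5))
print("rounds to alert all", J, "superhosts:", rounds, "log2(J) =", math.log2(J))
```

The recorded fractions first roughly double every round and then saturate, which is the S-shaped growth that motivates a logistic broadcast curve.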

42
Broadcast curve (example)

Example:
Pastry overlay of $J$ superhosts
Topology: GaTech
Broadcasting: flooding
Exhibits logistic growth
(Such overlays are randomly constructed, hence locally tree-like)

43
Minimum Broadcast Curve

A curve $m(t)$ such that at any time $t$, the fraction of alerted superhosts is $\ge m(t)$
Comparison: a minimum broadcast curve yields an upper bound on the fraction of infectives
Example: flooding on a Pastry overlay has a logistic minimum broadcast curve

[Figure: minimum broadcast curve $m(t)$.]
44
Host population dynamics

fractions of infectives in alerted subnets
fractions of susceptibles in alerted subnets
45
The migrations

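$g(t)\,(i(t) - i_1(t)) = J\, g(t)\,(1 - g(t)) \cdot \dfrac{i(t) - i_1(t)}{J\,(1 - g(t))}$
Assume at time $t$, $J(1 - g(t)) = 5$
Pick a subnet at random
$M$: infectives in the randomly picked subnet
$E(M) = (I_1 + \cdots + I_5)/5$

[Figure: five non-alerted subnets with $I_1, \ldots, I_5$ infectives.]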
46
Host population dynamics
familiar one-subnet patching system, but with patching rate $\mu\, w(t)$

The last ODE: Riccati
Use substitution $w = 1 - 1/z$ ($w = 1$ is a particular solution)

48
Per-susceptible patching rate
Bottleneck: patching within subnets
Bottleneck: alerts over the overlay
49
Overlays that satisfy a logistic minimum broadcast curve

Fast-overlay asymptotic: small $\mu$, $\beta$, with $\mu / \beta$ fixed
Intuitive: $T$ replaced with $\log(1/g(0))$
Heuristics: take $g(0) = 1/J$, so $\log(1/g(0)) = \log(J)$ (overlay diameter)
50
Known broadcast time T

Result

Implies

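If $T = 0$, then consistent with one-subnet patching
Uses minimum broadcast curve $m(t) = 1_{\{t \ge T\}}$
No patching until all subnets alerted

[Figure: $g(t)$ against $t$, with the step minimum broadcast curve jumping from 0 to 1 at $t = T$.]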
52
Patching with filtering

Assume each subnet applies edge filtering
An alerted subnet blocks scans in & out
=> a scan between two distinct subnets can succeed only if both subnets are non-alerted

[Figure: scans into or out of an alerted subnet are blocked; a scan between non-alerted subnets is successful if it hits a susceptible.]
53
Host population dynamics (1)

Population dynamics in NON-alerted subnets

fractions of infectives in NON-alerted subnets
fractions of susceptibles in NON-alerted subnets
54
Host population dynamics (2)

Result
$u(t) = g(t)/g(0)$
$b = \beta\,(i_0(0) + s_0(0))/(1 - g(0))$

55
Fraction of infected hosts in non alerted subnets

alert rate: 1

56
Further example with random realizations

100 superhosts
Pastry overlay
Broadcast: flooding
Topology: GaTech
$i(0) = 1/1000$
hosts per subnet: 1000
$\beta = 0.1$

57
Patching vs. Patching & Filtering, with fast patching within subnets

Bottleneck is the overlay
Patching within subnets assumed instantaneous

[Figure: curves for patching and for patching & filtering, and their differences.]
58
Continued

For patching & filtering: a closed form
Binomial integral with a series solution
=> Simple bound

diameter of the overlay
59
Example

$i(0) = 10^{-5}$
$i_1(0) = r\, i(0)$
$s_0(0) = 1 - g(0) - (1 - r)\, i(0)$
$g(0) = 10^{-3}$
Note: $i_1(0) + p(0) = g(0)$

61
PUSH

superhost maintains an inventory list of hosts
superhost service rate $\mu$
serves hosts in order

$s(t)$: fraction of susceptible hosts at time $t$

patching rate at time $t$: $\dfrac{\mu}{1 - \mu t}\, s(t)$

62
Host population dynamics
$t < 1/\mu$

By comparison: the patching rate per susceptible is larger, so starting from the same initial value, the fraction of infectives is smaller than with PULL
Claim: no estimate of ultimate infectives, see numerics

63
Patching rate

Address space: $1, 2, \ldots, n$
Susceptibles: $S(1), S(2), \ldots, S(n)$
Probability to pick a susceptible at the $k$-th push:
$k = 1$: $S(1)/n$
$k = 2$: $S(2)/(n - 1)$
...
$k = n$: $S(n)/1$

64
PULL vs. PUSH
$i(0) = 10^{-5}$

Yes, PUSH superior to PULL
But less than an order of magnitude
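
A numerical sketch of this comparison, assuming the same infection dynamics for both schemes (new infections at rate beta * i * s) and the patching rates stated on the PULL and PUSH slides: mu * s(t) for PULL and mu / (1 - mu t) * s(t) for PUSH, with mu the superhost service rate. The parameter values and the function name are illustrative assumptions, not those behind the plotted results.

```python
import math

def final_infected(beta, mu, i0, push, dt=0.01, t_end=400.0):
    """Final infected fraction under PULL (push=False) or PUSH (push=True)."""
    i, s, t = i0, 1.0 - i0, 0.0
    for _ in range(round(t_end / dt)):
        new_infections = s * (1.0 - math.exp(-beta * i * dt))
        i += new_infections
        s -= new_infections
        if push:
            # PUSH: hosts are served in order, so the unserved fraction
            # shrinks linearly and reaches zero at time 1/mu.
            remaining_now = 1.0 - mu * t
            remaining_next = 1.0 - mu * (t + dt)
            s = s * remaining_next / remaining_now if remaining_next > 0 else 0.0
        else:
            # PULL: each susceptible host is patched at constant rate mu.
            s *= math.exp(-mu * dt)
        t += dt
    return i

beta, mu, i0 = 0.1, 0.05, 1e-5  # illustrative values
pull = final_infected(beta, mu, i0, push=False)
push = final_infected(beta, mu, i0, push=True)
print("PULL:", pull, "PUSH:", push, "ratio:", pull / push)
```

With these inputs PUSH ends with fewer infected hosts than PULL, but the gap stays within one order of magnitude.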

65
PULL vs. PUSH (same but wider range)
66
Worm-like patch dissemination (1)
67
Worm-like patch dissemination (2)

Two epidemics
Patch epidemic with larger spread rate $\mu$
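
A hedged sketch of two competing epidemics. The slides do not give the equations here, so the dynamics below are an assumption: infectives convert susceptibles at rate beta and patched hosts convert susceptibles at rate mu, that is di/dt = beta * i * s and dp/dt = mu * p * s with s = 1 - i - p. Under that assumption d(ln i)/beta = d(ln p)/mu, so i(t)/i(0) = (p(t)/p(0))^{beta/mu} throughout.

```python
import math

def worm_vs_patch(beta, mu, i0, p0, dt=0.01, t_end=100.0):
    """Integrate the assumed two-epidemic system; return final (i, p)."""
    i, p = i0, p0
    for _ in range(round(t_end / dt)):
        s = max(0.0, 1.0 - i - p)      # remaining susceptible fraction
        i *= math.exp(beta * s * dt)   # worm epidemic
        p *= math.exp(mu * s * dt)     # patch epidemic
    return i, p

beta, mu, i0, p0 = 0.1, 0.5, 1e-4, 1e-4  # illustrative values, patch spreads faster
i_end, p_end = worm_vs_patch(beta, mu, i0, p0)
print("final infected:", i_end, "final patched:", p_end)
print("predicted infected:", i0 * (p_end / p0) ** (beta / mu))
```

When the patch spreads several times faster than the worm, the final infected fraction grows only as a small power of the initial patched fraction's growth, so the worm is contained at a low level.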

69
Conclusions (1)

Containment of random scanning worms
Patching can be effective only if the patching rate is sufficiently larger than the worm infection rate
Achievable at a reasonable patching rate if scan rates are constrained
No feasibility problems, but largely engineering & security issues
Smarter worms: ?

70
Conclusions (2)

Looking ahead
Worms evolve, so must the network immune system
Analysis is in its infancy
Need a solid theoretical understanding of worm strategies
Informs the design of countermeasures and lets us know their limitations
Beyond dynamics description & numerical solving: analytical estimates

71
This slide deck & related references

http://research.microsoft.com/milanv/immunology.htm

72
References

[Moore04] Inside the Slammer Worm, D. Moore, V. Paxson, S. Savage, C. Shannon, S. Staniford, N. Weaver, IEEE Security & Privacy, 2004.
[Costa05] Vigilante: End-to-End Containment of Internet Worms, M. Costa, J. Crowcroft, M. Castro, A. Rowstron, L. Zhou, L. Zhang, P. Barham, SOSP'05, 2005.
[Kesidis04] Coupled Kermack-McKendrick Models for Randomly Scanning and Bandwidth Saturating Internet Worms, QoS-IP, Feb 2005.
[V-G05a] On the Effectiveness of Automatic Patching, V.-G., WORM'05, Nov 2005.
[V-G05b] On the Race of Worms, Alerts, and Patches, V.-G., Microsoft Research TR TR-2005-13, Feb 2005.
Many others (see some on the website on the previous slide)

73
Extras

Why did the per-infective scan rate of Slammer decrease?
Comparison result for patching system
Patching & filtering

74
Why did the per-infective scan rate of Slammer decrease? (1)

Bandwidth-saturation model [Kesidis04]
$N$ subnets, each with at most $K$ vulnerable hosts
Each subnet with outbound rate $s$
A1: one infective saturates the outbound link
A2: ignore worm scans from the local subnet (okay for many fixed-size subnets)

fraction of subnets with $k$ infectives
per-infective scan rate
75
Why did the per-infective scan rate of Slammer decrease? (2)

Dynamics
Non-linear system, but

76
Why did the per-infective scan rate of Slammer decrease? (3)

Closed-form solution
change of time: $du(t) = (1 - n_0(t))\,dt$
makes it a system of linear ODEs

77
Why did the per-infective scan rate of Slammer decrease? (4)

Special case: $n_1(0) = 1 - n_0(0)$
Number of infectives at time $t$
Per-infective scan rate
Similarly for heterogeneous outbound links, see [Kesidis04]

fit to Slammer data as in [Kesidis04]
78
Comparison for patching system

Result: if $f_1(t) \le f_2(t)$ for all $t$, then starting from the same initial point, the fraction of infectives with $f_2$ is at most that with $f_1$
See [V-G05b] for a precise statement

79
Patching & filtering