15-446 Networked Systems Practicum - PowerPoint PPT Presentation

About This Presentation

Title:

15-446 Networked Systems Practicum

Description:

15-446 Networked Systems Practicum Lecture 14 Worms/Viruses/Botnets* – PowerPoint PPT presentation


Slides: 92

Provided by: Campu177

Learn more at: http://www.cs.cmu.edu


Transcript and Presenter's Notes

Title: 15-446 Networked Systems Practicum

1
15-446 Networked Systems Practicum

Lecture 14 Worms/Viruses/Botnets

2
Outline

Worms
Worm Defense
Botnet/Viruses

3
What is a Computer Worm?

Self replicating network program
Exploit vulnerabilities to infect remote machines
Victim machines continue to propagate infection
Three main stages
Detect new targets
Attempt to infect new targets
Activate code on victim machine
Difference w/ computer virus?
No human intervention required

4
Why Worry About Worms?

Speed
Much faster than viruses
Code Red v2: 14 hours for 359,000 victims
Slammer: 10 minutes for 75,000 victims
Faster than human reaction
Highly malicious payloads
DDoS or data corruption

5
Some Major Worms
Worm       Year  Strategy              Victims  Other Notes
Morris     1988  Topological scanning  6K       First major autonomous worm
Code Red   2001  Random scanning       300K     First recent "fast" worm
Nimda      2001  Local scanning        200K     Local subnet scanning; effective mix of techniques
Slammer    2003  Random scanning       >75K     Spread worldwide in 10 minutes
MyDoom     2004  Topological scanning  <15K     First zero-day worm
Conficker  2008  Random scanning       >15M?    Largest infection; capability of updates
6
Threat Model

Traditional
High-value targets
Insider threats

Worms & Botnets
Automated attack of millions of targets
Value in aggregate, not individual systems
Threats: software vulnerabilities, naïve users

7
... and it's profitable

Botnets used for
Spam (and more spam)
Credit card theft
DDoS extortion
Flourishing exchange market
Spam proxying: 3-10 cents/host/week
25k-host botnets: $40k - $130k/year
Also for stolen accounts, compromised machines,
credit cards, identities, etc. (be worried)

8
Why is this problem hard?

Monoculture: little genetic diversity in hosts
Instantaneous transmission: almost entire
network within 500ms
Slow immune response: human scales (10x-1Mx
slower!)
Poor hygiene: out of date / misconfigured
systems, naïve users
Intelligent designer ... of pathogens
Near-anonymity

9
Code Red I v1

July 12th, 2001
Exploited a known vulnerability in Microsoft's
Internet Information Server (IIS)
Buffer overflow in a rarely used URL decoding
routine, published June 18th
1st-19th of each month: attempts to spread
Random scanning of IP address space
99 propagation threads, 100th defaced pages on
server
Static random number generator seed
Every worm copy scans the same set of addresses
→ Linear growth
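
Why a static seed leads every copy to the same addresses can be illustrated with a small sketch. It only generates numbers and touches no network. The function name `scan_sequence` and the seed values are hypothetical, and Python's `random.Random` stands in for whatever generator the worm actually used.

```python
import random

def scan_sequence(seed, count):
    """Return the first `count` pseudo-random 32-bit values for a given seed."""
    rng = random.Random(seed)
    return [rng.getrandbits(32) for _ in range(count)]

# Two copies sharing one hard-coded seed walk the identical sequence,
# so the second copy adds no new coverage.
static_a = scan_sequence(seed=1234, count=5)
static_b = scan_sequence(seed=1234, count=5)

# Copies seeded differently (illustrative seeds) cover different values.
varied_a = scan_sequence(seed=1, count=5)
varied_b = scan_sequence(seed=2, count=5)

print(static_a == static_b)  # True
print(len(set(static_a + static_b)), "distinct values from two static-seed copies")
print(len(set(varied_a + varied_b)), "distinct values from two varied-seed copies")
```

With a shared seed, adding more infected hosts does not widen the set of probed addresses, which is consistent with the linear growth noted on the slide.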

10
Code Red I v1

20th-28th of each month: attacks
DDoS attack against 198.137.240.91
(www.whitehouse.gov)
Memory resident: rebooting the system removes
the worm
However, could quickly be reinfected

11
Code Red I v2

July 19th, 2001
Largely the same codebase: same author?
Ends website defacements
Fixes random number generator seeding bug
Scanned address space grew exponentially
359,000 hosts infected in 14 hours
Compromised almost all vulnerable IIS servers on
the Internet

12
Analysis of Code Red I v2

Random Constant Spread model
Constants
N: total number of vulnerable machines
K: initial compromise rate, per hour
T: time at which incident happens
Variables
a: proportion of vulnerable machines compromised
t: time in hours

13
Analysis of Code Red I v2

Constants
N: total number of vulnerable machines
K: initial compromise rate, per hour
T: time at which incident happens
Variables
a: proportion of vulnerable machines compromised
t: time in hours

Logistic equation: rate of growth of epidemic in
finite systems when all entities have an equal
likelihood of infecting any other entity
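
The equations themselves did not survive in this transcript. Assuming the standard logistic form of the Random Constant Spread model, da/dt = K a (1 - a), with solution a(t) = e^(K(t-T)) / (1 + e^(K(t-T))), the curve can be sketched as follows. The function names are illustrative, and K = 1.8 and T = 11.9 are the fit values quoted on the plot slide.

```python
import math

def rcs_fraction(t, K, T):
    """Fraction a(t) of vulnerable hosts compromised at time t (hours).

    Uses 1 / (1 + e^(-K(t-T))), which equals e^(K(t-T)) / (1 + e^(K(t-T))).
    """
    return 1.0 / (1.0 + math.exp(-K * (t - T)))

def rcs_rate(a, K):
    """Growth rate da/dt of the logistic model at infected fraction a."""
    return K * a * (1.0 - a)

K, T = 1.8, 11.9  # values quoted on the Code Red I v2 plot slide
for hour in (8, 10, 11.9, 14, 16):
    a = rcs_fraction(hour, K, T)
    print(f"t={hour:>5}h  a={a:.4f}  da/dt={rcs_rate(a, K):.4f}")
```

Under this form, T is the time at which half of the vulnerable population is compromised, and N does not appear because a is a proportion.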
14
Code Red I v2 Plot

K = 1.8
T = 11.9

Hourly probe rate data for inbound port 80 at the
Chemical Abstracts Service during the initial
outbreak of Code Red I on July 19th, 2001.
15
Improvements: Localized scanning

Observation: density of vulnerable hosts in IP
address space is not uniform
Idea: bias scanning towards local network
Used in Code Red II
P = 0.50: choose address from local class-A network
(/8)
P = 0.38: choose address from local class-B network
(/16)
P = 0.12: choose random address
Allows worm to spread more quickly
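
The probability split above can be checked with a short simulation of the address choice alone. This is a sketch for studying the distribution; it sends nothing. `pick_address` and the example local address 10.1.2.3 are hypothetical, and the bit masks assume classful /8 and /16 boundaries as stated on the slide.

```python
import ipaddress
import random

def pick_address(local_ip, rng):
    """Pick an IPv4 address using the slide's 0.50 / 0.38 / 0.12 split."""
    local = int(ipaddress.IPv4Address(local_ip))
    r = rng.random()
    if r < 0.50:
        # Keep the top 8 bits (same /8), randomize the low 24 bits.
        addr = (local & 0xFF000000) | rng.getrandbits(24)
    elif r < 0.88:
        # Keep the top 16 bits (same /16), randomize the low 16 bits.
        addr = (local & 0xFFFF0000) | rng.getrandbits(16)
    else:
        # Fully random 32-bit address.
        addr = rng.getrandbits(32)
    return ipaddress.IPv4Address(addr)

rng = random.Random(0)
picks = [pick_address("10.1.2.3", rng) for _ in range(10_000)]
same_8 = sum(int(p) >> 24 == 10 for p in picks) / len(picks)
same_16 = sum(int(p) >> 16 == (10 << 8 | 1) for p in picks) / len(picks)
print(f"same /8: {same_8:.2f}, same /16: {same_16:.2f}")
```

Note that the /8 share includes the /16 picks, since every address in the local /16 also lies in the local /8.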

16
Code Red II (August 2001)

Began August 4th, 2001
Exploited Microsoft IIS web servers (buffer
overflow)
Named Code Red II because it contained a comment
stating so; however, the codebase was new
Infected IIS on Windows 2000 successfully
but caused system crash on Windows NT
Installed a root backdoor on the infected
machine

17
Improvements: Multi-vector

Idea: use multiple propagation methods
simultaneously
Example: Nimda
IIS vulnerability
Bulk e-mails
Open network shares
Defaced web pages
Code Red II backdoor

18
Better Worms: Hit-list Scanning

Worm takes a long time to get off the ground
Worm author collects a list of, say, 10,000
vulnerable machines
Worm initially attempts to infect these hosts

19
How to Build a Hit-List

Stealthy randomized scan over a number of months
Distributed scanning via botnet
DNS searches: e.g. assemble domain list, search
for IP address of mail server in MX records
Web crawling: spider similar to search engines
Public surveys: e.g. Netcraft
Listening for announcements: e.g. vulnerable IIS
servers during Code Red I

20
Better Worms: Permutation scanning

Problem: many addresses are scanned multiple
times
Idea: generate random permutation of all IP
addresses, scan in order
Hit-list hosts start at their own position in the
permutation
When an infected host is found, restart at a
random point
Can be combined with divide-and-conquer approach
21
Warhol Worm

Simulation shows that, employing the two previous
techniques, a worm can attack 300,000 hosts in
less than 15 minutes
Conventional: 10 scans/sec
Fast scanning: 100 scans/sec
Warhol: 100 scans/sec,
permutation scanning and 10,000-entry hit list

22
More on Warhol worm
23
Flash worms

A flash worm would start with a hit list that
contains most/all vulnerable hosts
Realistic scenario
Complete scan takes 2h with an OC-12
Internet warfare?
Problem: size of the hit list
9 million hosts → 36 MB
Compression works: 7.5 MB
Can be sent over a 256kbps DSL link in 3 seconds
Extremely fast
Full infection in tens of seconds!
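
The size figures above follow from simple arithmetic, sketched here assuming 4 bytes per IPv4 address and decimal megabytes. The helper names are hypothetical, and the 7.5 MB compressed size is taken from the slide rather than computed.

```python
def hit_list_bytes(hosts, bytes_per_address=4):
    """Size of a raw list of IPv4 addresses, 4 bytes each."""
    return hosts * bytes_per_address

def transfer_seconds(size_bytes, link_bits_per_second):
    """Time to push size_bytes through a link, ignoring protocol overhead."""
    return size_bytes * 8 / link_bits_per_second

raw = hit_list_bytes(9_000_000)   # 36,000,000 bytes = 36 MB
compressed = 7_500_000            # 7.5 MB, figure quoted on the slide
dsl = 256_000                     # 256 kbps

print(raw / 1e6, "MB uncompressed")
print(transfer_seconds(compressed, dsl), "s for the full compressed list")
print(3 * dsl / 8, "bytes fit through the link in 3 s")
```

By this arithmetic the full 7.5 MB list would need roughly four minutes at 256 kbps, while three seconds at that rate corresponds to about 96 KB. The 3-second figure on the slide may therefore refer to a smaller piece of the list; the transcript does not say.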

24
Surreptitious worms

Idea: hide worms in inconspicuous traffic to
avoid detection
Leverage P2P systems?
High node degree
Lots of traffic to hide in
Proprietary protocols
Homogeneous software
Immense size (30,000,000 Kazaa downloads!)

25
Example Outbreak: SQL Slammer (2003)

Single, small UDP packet exploit (376 bytes)
First 1 min: classic random scanning
Doubles # of infected hosts every 8.5 sec
(In comparison: Code Red doubled in 40 min)
After 1 min, starts to saturate access bandwidth
Interferes with itself, so it slows down
By this point, was sending 20M pps
Peak of 55 million IP scans/sec at 3 min
90% of Internet scanned in < 10 min
Infected 100k or more hosts

26
Stuxnet Worm

The first worm for control systems
Discovered in June 2010
Attacks SCADA systems using Siemens WinCC/PCS 7
software
Not only spies but also reprograms programmable
logic controllers (PLCs)
Four zero-day attacks used
Infections include Iran (62K) and China (6M?)
Nation-wide support: cyberwarfare?

27
Prevention

Get rid of or permute vulnerabilities
(e.g., address space randomization)
Makes it harder to compromise
Block traffic (firewalls)
Only takes one vulnerable computer wandering
between inside & outside, or multi-homed, etc.
Keep vulnerable hosts off network
Incomplete vulnerability databases & 0-day worms
Slow down scan rate
Allow hosts a limited # of new contacts/sec
Can slow worms down, but they do still spread
Quarantine
Detect worm, block it
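
The "limited # of new contacts/sec" idea can be sketched as a small per-host limiter. This is a minimal illustration, not a specific deployed mechanism: the class name, the one-second sliding window, and the default limit are all assumptions.

```python
from collections import deque

class NewContactLimiter:
    """Allow traffic to known destinations; cap first-time contacts per second."""

    def __init__(self, max_new_per_sec=1):
        self.max_new = max_new_per_sec
        self.known = set()     # destinations already contacted
        self.recent = deque()  # timestamps of recently admitted new contacts

    def allow(self, dest, now):
        if dest in self.known:
            return True  # repeat contact: never throttled
        # Drop admissions that are at least one second old.
        while self.recent and now - self.recent[0] >= 1.0:
            self.recent.popleft()
        if len(self.recent) >= self.max_new:
            return False  # too many new destinations in the last second
        self.recent.append(now)
        self.known.add(dest)
        return True

limiter = NewContactLimiter(max_new_per_sec=2)
print([limiter.allow(d, t) for d, t in
       [("a", 0.0), ("b", 0.1), ("c", 0.2), ("a", 0.3), ("c", 1.05)]])
# [True, True, False, True, True]
```

A host that mostly talks to a stable set of peers is barely affected, while a scanner contacting many fresh addresses is slowed. As the slide notes, this slows a worm without stopping it.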

28
Outline

Worms
Worm Defense
Botnet/Viruses

29
Context

Worm Detection
Scan detection
Honeypots
Host based behavioral detection
Payload-based

30
Worm Countermeasures

Signature-based worm scan filtering
Vulnerable to polymorphic worms
Scan detection
High scanning activity to identify victims
Scanning with high failure rate compared to
legitimate users (DNS)
TCP RST, ICMP Unreachable
Two dimensions: time, space
Rate limiting, rate halting
False positives (index crawlers, NAT, etc.)
Disruption to legitimate services
Not applicable to UDP-based propagation
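
Scan detection by failure rate can be sketched as a per-source counter of connection attempts and failures (for example, attempts answered with TCP RST or ICMP Unreachable). The class name and both thresholds below are hypothetical; real detectors tune them to limit the false positives listed above.

```python
class FailureRateDetector:
    """Flag sources whose connection attempts fail unusually often."""

    def __init__(self, min_attempts=20, max_failure_ratio=0.5):
        self.min_attempts = min_attempts
        self.max_failure_ratio = max_failure_ratio
        self.attempts = {}
        self.failures = {}

    def record(self, src, succeeded):
        """Record one outbound connection attempt by src."""
        self.attempts[src] = self.attempts.get(src, 0) + 1
        if not succeeded:
            self.failures[src] = self.failures.get(src, 0) + 1

    def is_suspect(self, src):
        n = self.attempts.get(src, 0)
        if n < self.min_attempts:
            return False  # too little evidence to judge
        return self.failures.get(src, 0) / n > self.max_failure_ratio

det = FailureRateDetector()
for i in range(30):
    det.record("scanner", succeeded=(i % 10 == 0))  # 3 of 30 succeed
    det.record("client", succeeded=(i % 10 != 0))   # 27 of 30 succeed
print(det.is_suspect("scanner"), det.is_suspect("client"))  # True False
```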

31
Worm behavior

Content Invariance
Limited polymorphism, e.g. encryption
Key portions are invariant, e.g. decryption
routine
Content Prevalence
Invariant portion appears frequently
Address Dispersion
# of infected distinct hosts grows over time,
reflecting different source and dest. addresses

32
Signature Inference

Content prevalence: Autograph, EarlyBird, etc.
Assumes some content invariance
Pretty reasonable for starters.
Goal: identify attack substrings
Maximize detection rate
Minimize false positive rate

33
Content Sifting

For each string w, maintain
prevalence(w): number of times it is found in the
network traffic
sources(w): number of unique sources
corresponding to it
destinations(w): number of unique destinations
corresponding to it
If thresholds exceeded, then block(w)
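
The bookkeeping above can be written down directly with exact counters and sets. This naive sketch is the version the later slides replace with approximate structures because it does not scale. The class name, the short 4-byte substring length, and the small thresholds are chosen only to keep the example readable.

```python
from collections import defaultdict

class ContentSifter:
    """Exact (unscalable) content sifting over fixed-length substrings."""

    def __init__(self, length=4, prevalence_min=3, sources_min=2, dests_min=2):
        self.length = length
        self.prevalence_min = prevalence_min
        self.sources_min = sources_min
        self.dests_min = dests_min
        self.prevalence = defaultdict(int)
        self.sources = defaultdict(set)
        self.destinations = defaultdict(set)
        self.blocked = set()

    def observe(self, payload, src, dst):
        """Update state for every substring; return substrings newly blocked."""
        newly_blocked = set()
        for i in range(len(payload) - self.length + 1):
            w = payload[i:i + self.length]
            self.prevalence[w] += 1
            self.sources[w].add(src)
            self.destinations[w].add(dst)
            if (w not in self.blocked
                    and self.prevalence[w] >= self.prevalence_min
                    and len(self.sources[w]) >= self.sources_min
                    and len(self.destinations[w]) >= self.dests_min):
                self.blocked.add(w)
                newly_blocked.add(w)
        return newly_blocked

sifter = ContentSifter()
print(sifter.observe(b"xxEVILyy", "h1", "h2"))  # set()
print(sifter.observe(b"aaEVILbb", "h2", "h3"))  # set()
print(sifter.observe(b"ccEVILdd", "h3", "h4"))  # {b'EVIL'}
```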

34
Issues

How to compute prevalence(w), sources(w) and
destinations(w) efficiently?
Scalable
Low memory and CPU requirements
Real time deployment over a Gigabit link

35
Estimating Content Prevalence

Table[payload]
1 GB table filled in 10 seconds
Table[hash(payload)]
1 GB table filled in 4 minutes
Tracking millions of ants to track a few
elephants
Collisions... false positives

36
Multistage Filters
stream memory
Array of counters
Hash(Pink)
37
Multistage Filters
packet memory
Array of counters
Hash(Green)
38
Multistage Filters
packet memory
Array of counters
Hash(Green)
39
Multistage Filters
packet memory
40
Multistage Filters
packet memory
Collisions are OK
41
Multistage Filters
Reached threshold
packet memory
packet1 1
Insert
42
Multistage Filters
packet memory
packet1 1
43
Multistage Filters
packet memory
packet1 1
packet2 1
44
Multistage Filters
packet memory
Stage 1
packet1 1
No false negatives! (guaranteed detection)
45
Conservative Updates
Gray: all prior packets
46
Conservative Updates
47
Conservative Updates
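
The multistage filter walked through on the preceding slides, together with the conservative-update refinement, can be sketched as follows. Each stage is an array of counters indexed by an independent hash, and an item is reported only when all of its counters reach the threshold. SHA-256 with a stage prefix is a stand-in for the independent hash functions, and the class name and sizes are assumptions.

```python
import hashlib

class MultistageFilter:
    def __init__(self, stages=4, counters_per_stage=1024, threshold=3,
                 conservative=False):
        self.tables = [[0] * counters_per_stage for _ in range(stages)]
        self.threshold = threshold
        self.conservative = conservative

    def _slots(self, key):
        """Yield (table, index) for the key's counter in each stage."""
        for stage, table in enumerate(self.tables):
            digest = hashlib.sha256(bytes([stage]) + key).digest()
            yield table, int.from_bytes(digest[:8], "big") % len(table)

    def add(self, key):
        """Count one occurrence; True once every stage reaches the threshold."""
        slots = list(self._slots(key))
        if self.conservative:
            # Raise counters only as far as the smallest one plus one.
            target = min(t[i] for t, i in slots) + 1
            for t, i in slots:
                t[i] = max(t[i], target)
        else:
            for t, i in slots:
                t[i] += 1
        return all(t[i] >= self.threshold for t, i in slots)

f = MultistageFilter(threshold=3)
print([f.add(b"worm-substring") for _ in range(3)])  # [False, False, True]
```

Collisions can only inflate counters, so an item seen at least `threshold` times always passes; that is the "no false negatives" property. Conservative update keeps that guarantee while inflating counters less, which reduces false positives.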
48
Value Sampling

The problem: s-β+1 substrings
Solution: sample
But: random sampling is not good enough
Trick: sample only those substrings for which the
fingerprint matches a certain pattern
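
Value sampling can be sketched by keeping a substring only when the low bits of its fingerprint are zero. The decision depends on the content alone, so a given substring is either always tracked or never tracked, whichever packet or offset it appears at. The sketch uses a truncated SHA-256 as the fingerprint for simplicity, which is an assumption rather than the fingerprint function of the original system. The defaults mirror the 1/64 sampling rate and 40-byte length quoted later in the deck.

```python
import hashlib

def fingerprint(substring):
    """64-bit fingerprint of a byte string (SHA-256 stand-in)."""
    return int.from_bytes(hashlib.sha256(substring).digest()[:8], "big")

def sampled_substrings(payload, length=40, sample_bits=6):
    """Return (offset, substring) pairs whose fingerprint ends in zero bits.

    A payload of s bytes has s - length + 1 substrings; on average
    1 / 2**sample_bits of them are kept.
    """
    mask = (1 << sample_bits) - 1
    kept = []
    for i in range(len(payload) - length + 1):
        w = payload[i:i + length]
        if fingerprint(w) & mask == 0:
            kept.append((i, w))
    return kept

payload = bytes(range(256)) * 2
in_place = {w for _, w in sampled_substrings(payload, length=8, sample_bits=3)}
shifted = {w for _, w in sampled_substrings(b"zz" + payload, length=8, sample_bits=3)}
print(in_place <= shifted)  # True: the same content is sampled at any offset
```

Plain random sampling would pick different substrings of the same worm in different packets, so repeated content would rarely accumulate counts.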

49
sources(w) & destinations(w)

Address Dispersion
Counting distinct elements vs. repeating elements
Simple list or hash table is too expensive
Key idea: bitmaps
Trick: scaled bitmaps

50
Bitmap counting: direct bitmap
Set bits in the bitmap using hash of the flow ID
of incoming packets
HASH(green) = 10001001
51
Bitmap counting: direct bitmap
Different flows have different hash values
HASH(blue) = 00100100
52
Bitmap counting: direct bitmap
Packets from the same flow always hash to the
same bit
HASH(green) = 10001001
53
Bitmap counting: direct bitmap
Collisions OK, estimates compensate for them
HASH(violet) = 10010101
54
Bitmap counting: direct bitmap
HASH(orange) = 11110011
55
Bitmap counting: direct bitmap
HASH(pink) = 11100000
56
Bitmap counting: direct bitmap
As the bitmap fills up, estimates get inaccurate
HASH(yellow) = 01100011
57
Bitmap counting: direct bitmap
Solution: use more bits
HASH(green) = 10001001
58
Bitmap counting: direct bitmap
Solution: use more bits
Problem: memory scales with the number of flows
HASH(blue) = 00100100
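
A direct bitmap can be sketched in a few lines. Each flow ID sets one bit chosen by a hash, and the number of distinct flows is estimated from the count of bits still zero using the linear-counting formula n ≈ b · ln(b / z), where b is the bitmap size and z the number of zero bits. This formula is what compensates for collisions. The class name and the SHA-256 stand-in hash are assumptions.

```python
import hashlib
import math

def flow_hash(flow_id):
    """64-bit hash of a flow ID (SHA-256 stand-in)."""
    return int.from_bytes(hashlib.sha256(flow_id).digest()[:8], "big")

class DirectBitmap:
    def __init__(self, bits=1024):
        self.bits = bits
        self.bitmap = [False] * bits

    def add(self, flow_id):
        # Packets of the same flow always set the same bit.
        self.bitmap[flow_hash(flow_id) % self.bits] = True

    def estimate(self):
        zeros = self.bitmap.count(False)
        if zeros == 0:
            return float("inf")  # bitmap saturated, no estimate possible
        return self.bits * math.log(self.bits / zeros)

bm = DirectBitmap()
for _ in range(3):                 # three packets per flow
    for i in range(200):           # 200 distinct flows
        bm.add(f"flow-{i}".encode())
print(round(bm.estimate()))        # close to 200
```

As more bits are set, z shrinks and the estimate becomes noisy, which is the inaccuracy the slides point out before moving to larger and then virtual bitmaps.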
59
Bitmap counting: virtual bitmap
Solution: a) store only a portion of the bitmap
b) multiply estimate by scaling
factor
60
Bitmap counting: virtual bitmap
HASH(pink) = 11100000
61
Bitmap counting: virtual bitmap
Problem: estimate inaccurate when few flows active
HASH(yellow) = 01100011
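
The virtual bitmap can be sketched by keeping only the flows whose hash lands in one slice of the hash space and multiplying the resulting estimate by the scaling factor. The class name, the 1-in-8 slice, and the SHA-256 stand-in hash are assumptions made for illustration.

```python
import hashlib
import math

def flow_hash(flow_id):
    """64-bit hash of a flow ID (SHA-256 stand-in)."""
    return int.from_bytes(hashlib.sha256(flow_id).digest()[:8], "big")

class VirtualBitmap:
    def __init__(self, bits=512, scale=8):
        self.bits = bits
        self.scale = scale  # only 1/scale of the hash space is stored
        self.bitmap = [False] * bits

    def add(self, flow_id):
        h = flow_hash(flow_id)
        if h % self.scale != 0:
            return  # hash falls outside the stored portion
        self.bitmap[(h // self.scale) % self.bits] = True

    def estimate(self):
        zeros = self.bitmap.count(False)
        if zeros == 0:
            return float("inf")
        # Estimate the flows in the stored slice, then scale up.
        return self.scale * self.bits * math.log(self.bits / zeros)

vb = VirtualBitmap()
for i in range(4000):
    vb.add(f"flow-{i}".encode())
print(round(vb.estimate()))  # roughly 4000
```

With only a handful of active flows, few or none land in the stored slice, so the scaled estimate is coarse. That is the weakness noted on the slide and the motivation for using several bitmaps tuned to different ranges.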
62
Bitmap counting: multiple bitmaps
Solution: use many bitmaps, each accurate
for a different range
63
Bitmap counting: multiple bitmaps
HASH(pink) = 11100000
64
Bitmap counting: multiple bitmaps
HASH(yellow) = 01100011
65
Bitmap counting: multiple bitmaps
Use this bitmap to estimate number of flows
66
Bitmap counting: multiple bitmaps
Use this bitmap to estimate number of flows
67
Bitmap counting: multiresolution bitmap
Problem: must update up to three bitmaps
per packet
Solution: combine bitmaps into one
68
Bitmap counting: multiresolution bitmap
HASH(pink) = 11100000
69
Bitmap counting: multiresolution bitmap
HASH(yellow) = 01100011
70
Multiresolution Bitmaps

Still too expensive to scale
Scaled bitmap
Recycles the hash space with too many bits set
Adjusts the scaling factor accordingly

71
Scaled Bitmap

Idea: subsample the range of hash space
How does it work?
Multiple bitmaps, each mapped to progressively
smaller and smaller portions of the hash space
Bitmap recycled if necessary

Result: roughly 5 times less memory, plus actual
estimation of address dispersion
72
Putting It Together
Address Dispersion Table: key | src cnt | dest cnt
Content Prevalence Table: key | cnt
73
Putting It Together

Sample frequency: 1/64
String length: 40
Use 4 hash functions to update prevalence table
Multistage filter reset every 60 seconds

74
Parameter Tuning

Prevalence threshold: 3
Very few signatures repeat
Address dispersion threshold:
30 sources and 30 destinations
Reset every few hours
Reduces the number of reported signatures down to
25,000

75
Parameter Tuning

Tradeoff between speed and accuracy
Can detect Slammer in 1 second as opposed to 5
seconds
With 100x more reported signatures

76
False Negatives in EarlyBird (EB)

False negatives
Very hard to prove...
EarlyBird detected all worm outbreaks reported on
security lists over 8 months
EB detected all worms detected by Snort
(signature-based IDS)
And some that weren't

77
False Positives in EB

Common protocol headers
HTTP, SMTP headers
p2p protocol headers
Non-worm epidemic activity
Spam
BitTorrent (!)
Solution
Small whitelist...

78
Outline

Worms
Worm Defense
Botnet/Viruses

79
... and it's profitable

Botnets used for
Spam (and more spam)
Credit card theft
DDoS extortion
Flourishing exchange market
Spam proxying: 3-10 cents/host/week
25k-host botnets: $40k - $130k/year
Also for stolen accounts, compromised machines,
credit cards, identities, etc. (be worried)

80
Botnet

A group of zombie computers under the remote
control of an attacker via a command and control
(C&C) server

81
Botnet Countermeasure

Detecting new botnets by using honeypots,
analyzing spam pools, capturing group activities
in DNS
Sinkholing or null-routing C&C server connections
and cleaning zombies

82
Outline

Worms
Worm Defense
Botnet/Viruses

83
Malicious Code

Many types of malicious code
Virus, worm, botnet, spyware, spam, etc.
Who writes this and why?
Challenge (for fun)
Fame (for pride)
Business (for money)
Black markets for attacks (DDoS and spam) and
info (credit cards, vulnerabilities)
Ideology (for activism)
Hacktivism, cyberterrorism, cyberwarfare

84
What is a Computer Virus?

Program that spreads itself by infecting
(modifying) an executable file and making copies
of itself

85
Components

Propagation mechanism
Sharing infected file with other computers:
USB drive, email attachment, and shared folders
Executing infected file
→ Infects other computers and spreads infection
Trigger
Time/condition when payload is activated
Payload
Damage existing files
Extort sensitive information
Consume computer's resources

86
Infected File
Before
1 Insert document in fax machine. (Program entry-point)
2 Dial the phone number.
3 Hit the SEND button on the fax.
4 Wait for completion. If a problem occurs, go back to step 1.
5 End task.
After
1 Skip to step 6.
2 Dial the phone number.
3 Hit the SEND button on the fax.
4 Wait for completion. If a problem occurs, go back to step 1.
5 End task.
6 VIRUS instructions
7 Insert document in fax machine and go to step 2.
Nachenberg, Computer Virus-Antivirus Coevolution,
CACM 1997
87
Propagation