Title: Network-level Malware Detection
1Network-level Malware Detection
- Mike McNett, Matthew Spear, Richard Barnes
- CS-851 Malware
- 23 October 2004
2Outline
- Introduction: Design of a System for Real-Time Worm Detection
- Example 1: Detecting Early Worm Propagation through Packet Matching (DEWP)
- Example 2: Fast Detection of Scanning Worm Infections
- Example Application: Therminator
- Conclusions
3Introduction
- Questions Being Considered
- Why network level detection?
- What are the alternatives?
- Are there reasonable solutions?
- What are the limitations, advantages,
disadvantages compared to the alternatives?
4Introduction
- Malware Detection Options?
- Prevention vs. Treatment
- Signature vs. Anomaly
- Host-based containment
- Network containment
- Packet Header vs. Packet Payload
- What are the advantages, disadvantages, and
limitations of the above?
5Network-level Detection
6Design of a System for Real-Time Worm Detection
- Hash
- Count Vector
- Character Filter
- SRAM Analyzer
- Alert Generator
- Periodic Subtraction of Time Averages
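One way the listed pieces could fit together is sketched below as a toy software model. It is only an illustration: the window length, vector size, threshold, and all names are assumptions, the character filter and SRAM analyzer stages are left out, and the real system runs in specialized hardware.

```python
import hashlib

WINDOW = 10  # bytes per hashed window (assumed value)


class CountVectorDetector:
    """Toy software model; names and sizes are hypothetical."""

    def __init__(self, size=4096, threshold=50):
        self.size = size            # number of counters (assumed)
        self.threshold = threshold  # alert level (assumed)
        self.counts = [0] * size

    def _index(self, window):
        # Hash: map a window of payload bytes to one counter.
        digest = hashlib.sha1(window).digest()
        return int.from_bytes(digest[:4], "big") % self.size

    def observe(self, payload):
        """Count every window; return windows whose counter just hit the threshold."""
        alerts = []
        for i in range(len(payload) - WINDOW + 1):
            window = payload[i:i + WINDOW]
            idx = self._index(window)
            self.counts[idx] += 1          # Count Vector
            if self.counts[idx] == self.threshold:
                alerts.append(window)      # Alert Generator
        return alerts

    def subtract_average(self, average):
        """Periodic step: subtract an expected per-period count, never below zero."""
        self.counts = [max(0, c - average) for c in self.counts]
```

Content that repeats across many packets pushes one counter past the threshold, while the periodic subtraction keeps ordinary, slowly accumulating content from ever reaching it.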
7Design of a System for Real-Time Worm Detection
- Scalable to high throughput
- Solution depends on specialized hardware
- Low false positive rate
- What are the problems?
- What are the advantages?
- Are there other, simpler signatures?
- Can similar attacks be detected at the host level?
8Detecting Early Worm Propagation through Packet Matching
- Xuan Chen and John Heidemann
- ISI-TR-2004-585, February 2004
9DEWP
- Router-based system
- Automatically detects and quarantines Internet worm propagation
- Matches destination port numbers between incoming and outgoing connections (automated signature creation)
- Detects and suppresses worms based on their unusual traffic patterns
- Detects worm propagation within about 4 seconds
- Protects > 99% of hosts from random-scanning worms
10DEWP Thesis
- Matches destination port numbers between incoming and outgoing connections. Two observations on worm traffic:
- Worms usually exploit vulnerabilities related to specific network port numbers
- Infected hosts will probe other vulnerable hosts, exploiting the same vulnerability
- So high levels of bi-directional probing traffic with the same destination port number suggest a new worm
- Scalable: matching destination port numbers consumes little computational power
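The port-matching idea can be sketched in a few lines. This is a minimal illustration, not DEWP's implementation: packets are modeled as hypothetical `(dst_addr, dst_port)` tuples, and `min_count` is an assumed knob for what counts as a "high level" of traffic.

```python
from collections import Counter


def matched_ports(incoming, outgoing, min_count=1):
    """Return destination ports seen at least min_count times in BOTH directions.

    incoming / outgoing: iterables of (dst_addr, dst_port) tuples (assumed format).
    """
    in_ports = Counter(port for _, port in incoming)
    out_ports = Counter(port for _, port in outgoing)
    # Counter returns 0 for ports never seen, so missing ports never match.
    return {p for p in in_ports
            if in_ports[p] >= min_count and out_ports[p] >= min_count}


incoming = [("10.0.0.5", 1434), ("10.0.0.6", 80)]
outgoing = [("192.0.2.9", 1434), ("192.0.2.7", 53)]
suspicious = matched_ports(incoming, outgoing)
```

Here only port 1434 appears as a destination port in both directions, so it is the only candidate passed on to the second detection step.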
11DEWP
- Two components of DEWP: worm detector and packet filter
- Two-step detection: destination port matching and destination address counting
- Uses packet filtering to suppress worm spreading
- Can be deployed at different levels of the network
12Worm Containment
- DEWP uses traffic filtering: routers drop packets with the automatically discovered destination port
- Worm containment: protect internal hosts from internal and external threats; notify other networks about attacks
13Design
- Maintains one port-list for each direction (incoming and outgoing); each records the number of connections to different destination ports
- Timer for each entry in the port-lists
- If a port has not been accessed for a certain time interval, reset the corresponding list entry
- Monitor outgoing destination addresses for ports with non-zero entries in both port-lists
- Every T seconds, check the number of unique addresses observed within the last time interval. Worm traffic is detected by a condition on three quantities:
- N, the number of unique addresses observed
- The long-term average of N
- A sensitivity parameter that sets how strongly the system reacts to changes
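The address-counting step might look like the sketch below. The exact detection condition is not shown on the slide, so the form used here, N > (1 + delta) * average, is an assumption, as are the exponentially weighted average, the parameter names, and the default values.

```python
class AddressCounter:
    """Counts unique outgoing destination addresses per interval (hypothetical model)."""

    def __init__(self, delta=1.0, smoothing=0.1):
        self.delta = delta          # sensitivity to changes (assumed meaning)
        self.smoothing = smoothing  # weight of the newest interval (assumed)
        self.average = None         # long-term average of N
        self.current = set()        # addresses seen in the current interval

    def record(self, dst_addr):
        self.current.add(dst_addr)

    def end_interval(self):
        """Call every T seconds; returns True if the interval looks like worm traffic."""
        n = len(self.current)
        self.current = set()
        if self.average is None:    # first interval only seeds the average
            self.average = float(n)
            return False
        alarm = n > (1 + self.delta) * self.average  # ASSUMED condition
        if not alarm:               # keep worm bursts out of the baseline
            self.average += self.smoothing * (n - self.average)
        return alarm
```

A smaller `delta` reacts to smaller jumps in N at the cost of more alarms on ordinary bursts; a shorter T gives the worm less time to spread between checks.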
14Effectiveness of Worm Detection and Quarantine
- Random-scanning worm: DEWP detects worm traffic in 4.8 seconds when fully deployed with a 1-second detection interval.
- Always detects worm probing traffic in 4-5 seconds when deployed at different layers.
- The number of infected hosts in the protected network is primarily determined by the number of probing packets received from outside.
- Can protect almost all hosts from infection when deployed only on the access router.
15Local Scanning
- Local scanning: DEWP can detect worm probing traffic in 3.87 seconds, but almost all vulnerable hosts in the protected network are compromised.
- Deployment has little impact on either detection delay or infection percentage.
- The infection percentage increases as the number of layers with DEWP deployed is reduced. With DEWP only on the access router, all vulnerable hosts are compromised within 10 seconds.
- More frequent detection reduces vulnerability to local-scanning worms.
- DEWP quickly detects worm attacks regardless of probing technique.
- With full deployment, about 9 vulnerable hosts are compromised in the protected network.
- Because local-scanning worms are difficult to quarantine effectively, a very small detection interval and wide deployment are critical to protecting vulnerable hosts.
16Effect of Detection Intervals
- Address-counting runs with an interval of T seconds.
- Different detection intervals affect detection delay and infection percentage.
- Random-scanning worms: detection delay and the number of infected hosts increase with the detection interval.
- Local-scanning worms: 1) no significant difference in detection delay; 2) infection percentage increases dramatically at larger intervals.
- So an automatic system needs to react to worm traffic within small time intervals.
17False Detections
- No false positives
- Discovered 10 suspicious destination ports, including 21 (FTP), 53 (DNS), and 80 (Web)
- Depends on address-counting to reduce false positives
- Worm scan rate C affects false negatives: when worms scan at a low rate, probing traffic has less effect on overall traffic, and DEWP routers have more difficulty distinguishing it from normal traffic.
- With C = 500, worm traffic stands out compared to regular traffic
- DEWP is not able to detect worms with a scanning rate lower than C = 25.
18Conclusions
- Detects and quarantines propagation of Internet worms
- Uses port-matching and address-counting as the signature.
- Detects a worm attack within 4-5 seconds
- By automatically blocking worm traffic, it protects most vulnerable hosts from random-scanning worms.
- The authors believe that an automatic worm detection and containment system should be widely deployed and have very small detection intervals
- It is not realistic to deploy DEWP on all routers; for random-scanning worms it is sufficient to put it on the access router.
19Worm Detection
Fast Detection of Scanning Worm Infection
20Detection Techniques
- Reverse Sequential Hypothesis Testing (TH)
- Detects worms based on the number of failed connection attempts
- Uses probability to determine whether a local host is scanning
- Designed to be tied into a containment system
- Signature-Based Analysis (Early Bird System, EBS)
- Detects worms based on Rabin signatures of content/port
- Used in conjunction with a containment system
21Definitions
l: local host
d: destination address
FCC: first-contact connection
Yi: indicator variable for the i-th FCC (0 = success, 1 = failure)
H0: hypothesis that l is not scanning
H1: hypothesis that l is scanning
θk: Pr(Yi = 0 | Hk), k ∈ {0, 1}
η1: upper bound; reaching it accepts H1
η0: lower bound; reaching it accepts H0
22Definitions
PD: probability of detecting an infected host
PF: probability of flagging a host as infected when it is not
α: upper bound on PF (α ≥ PF)
β: lower bound on PD (β ≤ PD)
Cl: credits for l
23Basic Algorithm
- Maintain separate state for each local host l being monitored: its likelihood ratio, the set of previously contacted hosts (PCH), and an FCC queue (FCCQ) of first-contact attempts that have been issued but not yet recorded as observations.
- When an outgoing packet from l is observed, check whether d is in the PCH of l. If not, add d to the PCH and add the attempt to the FCCQ as PENDING.
- When an incoming packet is sent to l and its source address exists in the FCCQ, update that record to SUCCESS, unless the packet is a TCP RST.
- When the head entry of the FCCQ has status PENDING and has been in the queue longer than a predefined time limit, set its status to FAILURE.
- If the entry at the head of the FCCQ has a status other than PENDING, update the likelihood ratio and compare it to η1.
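The final update-and-compare step can be sketched as follows. The slide does not spell out the update rule, so everything here is an assumption based on the usual likelihood-ratio form of sequential hypothesis testing: each outcome multiplies the ratio by Pr(Yi | H1) / Pr(Yi | H0), the ratio is floored at 1 (taken here as the "reverse" part), and the upper threshold is set to β/α (Wald's approximation). The parameter values are the ones reported for the experiments.

```python
THETA0 = 0.7   # Pr(Yi = 0 | H0): a benign host's first contact succeeds
THETA1 = 0.1   # Pr(Yi = 0 | H1): a scanning host's first contact succeeds
ALPHA = 5e-5   # upper bound on false positive probability
BETA = 0.99    # lower bound on detection probability
ETA1 = BETA / ALPHA  # ASSUMED upper threshold (Wald's approximation)


def update(ratio, failed):
    """Fold one first-contact outcome into the likelihood ratio (assumed rule)."""
    if failed:   # Yi = 1
        phi = (1 - THETA1) / (1 - THETA0)
    else:        # Yi = 0
        phi = THETA1 / THETA0
    return max(1.0, ratio * phi)  # floor at 1: old successes earn no credit


def failures_until_alarm():
    """Consecutive failed first contacts needed before the ratio reaches ETA1."""
    ratio, count = 1.0, 0
    while ratio < ETA1:
        ratio = update(ratio, failed=True)
        count += 1
    return count


print(failures_until_alarm())
```

Under these assumptions each failure triples the ratio, so a host with no successful first contacts crosses the threshold after 10 failures, while a success divides the ratio by 7 without letting it fall below 1.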
24Basic Algorithm
Credit Based Connection Rate Limiting (CBCRL)
- Simple scheme to limit the amount of connections
l can make in a given slot of time by allotting
each l a set number of credits (Cl) that is
modified given events.
Event              Change to Cl
Initial            Cl ← 10
FCC issued by l    Cl ← Cl - 1
FCC succeeds       Cl ← Cl + 2
Every second       Cl ← ... if Cl > 10
Allowance          Cl ← 1 if Cl = 0 for 4 seconds
- Used in conjunction with TH to limit number of
connections a host can make allowing TH time to
determine if a host is infected.
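The credit rules can be sketched as a small per-host state machine. The slide's expression for the every-second update is missing, so the decay used below, max(10, 2/3 of Cl) with integer rounding, is an assumption that should be checked against the paper; the class and method names are hypothetical.

```python
class HostCredits:
    """Credit balance for one local host l (illustrative sketch)."""

    INITIAL = 10

    def __init__(self):
        self.credits = self.INITIAL
        self.seconds_at_zero = 0

    def fcc_issued(self):
        """Spend one credit on a first-contact connection; False means blocked."""
        if self.credits <= 0:
            return False
        self.credits -= 1
        return True

    def fcc_succeeded(self):
        self.credits += 2

    def tick(self):
        """Call once per second."""
        if self.credits > self.INITIAL:
            # ASSUMED decay rule: the slide's expression is not recoverable.
            self.credits = max(self.INITIAL, (2 * self.credits) // 3)
        if self.credits == 0:
            self.seconds_at_zero += 1
            if self.seconds_at_zero >= 4:   # allowance after 4 seconds at zero
                self.credits = 1
                self.seconds_at_zero = 0
        else:
            self.seconds_at_zero = 0
```

A host whose first contacts mostly succeed gains credits faster than it spends them, whereas a scanner with mostly failed contacts runs dry after about ten attempts and is then held to roughly one attempt every four seconds.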
25Experiment
- Conducted two experiments, one in 2003 (isp-03) and one in 2004 (isp-04).
- Worms were identified by comparing traffic to known worm descriptions.
                                      isp-03        isp-04
Date                                  APR 10 2003   JAN 28 2004
Duration                              627 min       66 min
Total outbound connection attempts    1402178       178518
Total active local hosts              404           451
θ0                                    .7            .7
θ1                                    .1            .1
α                                     5E-5          5E-5
β                                     .99           .99
26Results
                          isp-03   isp-04
Worms Detected (Total)    5        6
  CodeRed II              2        0
  Blaster                 0        1
  MyDoom                  0        3
  Minmail.j               0        1
  HTTP (other)            3        1
False Alarms (Total)      0        6
  HTTP                    0        3
  SMTP                    0        3
P2P Detected (Total)      6        11
Total Identified          11       23

      Alarms   Detection   Efficiency   Effectiveness
TH    34       11          .324         .917
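The slide does not define the two ratios. One reading that reproduces the efficiency figure, offered here as an assumption, is efficiency = detections / alarms, with both counts summed over the two traces:

```python
# Counts taken from the results table (isp-03 + isp-04).
alarms = 11 + 23      # "Total Identified" row
detections = 5 + 6    # "Worms Detected (Total)" row

# Assumed definition: fraction of alarms that were real worm infections.
efficiency = detections / alarms
print(f"efficiency = {efficiency:.3f}")  # prints "efficiency = 0.324"
```

Effectiveness (.917) would additionally need the total number of infected hosts, which the slide does not give, so it is not recomputed here.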
27Limitations, Future Work?
Are there any serious flaws in this algorithm? Future work?
- Warhol-type scanning
- Network outages can cause TH to decide that a host is a worm
- Worms could conceivably collaborate to defy detection
- A worm could remember hosts that it can contact and defy detection through them
- Spoofing attack to get an uninfected host blocked
- Interleave scanning with benign activity (e.g. for every scan, visit a website that is known to be running)
- Can be trivially modified to work with the containment strategies discussed earlier
28THERMINATOR!!!
- Science comes to the aid of network-level anomaly
detection
29Network behavior is complicated
- How do we use microscopic packet-level data to make macro network-level decisions?
- Too broad, e.g. keeping track of global traffic patterns.
- Too refined, e.g. looking at individual packets.
- Hmm... who else tries to make sense of the overall behavior of millions of single objects?
- Physicists and chemists!
30Idea
- Given a computer network with > 1000 nodes,
- Want to detect anomalous traffic, without any foreknowledge.
- Idea of THERMINATOR:
- Take advantage of lots of packet-level data.
- Use physical techniques to distill information into relevant statistics: temperature, entropy, etc.
31Data Reduction
- Take the set of hosts and group them into buckets, or conversation groups.
- Observe communication among buckets.
- Calculate physical statistics based on these higher-level communications.
- By virtue of the mathematics, these are guaranteed to be the same as if we'd just looked at hosts.
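The reduction step can be illustrated with a small sketch. It is not THERMINATOR's actual mathematics: the bucketing rule (last address octet modulo the bucket count) and the use of plain Shannon entropy over bucket-to-bucket traffic are stand-ins chosen for illustration, and all names are hypothetical.

```python
import math
from collections import Counter


def bucket_of(host, n_buckets):
    """Assumed grouping rule: last octet of a dotted IPv4 address, modulo n_buckets."""
    return int(host.rsplit(".", 1)[1]) % n_buckets


def bucket_flows(flows, n_buckets):
    """Reduce (src, dst) host pairs to counts of bucket-to-bucket conversations."""
    counts = Counter()
    for src, dst in flows:
        counts[(bucket_of(src, n_buckets), bucket_of(dst, n_buckets))] += 1
    return counts


def entropy(counts):
    """Shannon entropy, in bits, of the distribution of traffic over bucket pairs."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


flows = [("10.0.0.1", "10.0.0.1"), ("10.0.0.1", "10.0.0.2"),
         ("10.0.0.2", "10.0.0.1"), ("10.0.0.2", "10.0.0.2")]
reduced = bucket_flows(flows, n_buckets=2)
print(entropy(reduced))  # 2.0: traffic is spread evenly over four bucket pairs
```

A sudden change in such a statistic, for example a scanning host spreading traffic across many buckets, is the kind of macro-level signal the visualization is meant to expose.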
32Physical Network Visualization
- Based on the reduced data, we know pseudo-physical statistics:
- Bucket size
- Temperature
- Entropy
- Heating rate
- Work rate
- Visualizing these data shows network events.
Image courtesy of DISA
33Network Event Detection
34THERMINATOR Implementation
- Jointly developed by DISA, NSA, and Lancope Inc.
- Uses Lancope's data-collection hardware to provide data to THERMINATOR.
- THERMINATOR reduces data, computes stats, and provides visualization.
- Research tests validated that THERMINATOR detected anomalies that the intrusion detection systems did not capture. -- NSA
35Conclusion
- Combined approaches (host-based, network-based, visualization)?
- Can signatures be automatically generated?
- Can attacks be visualized?
- Potential impacts of false positives (is the medicine worse than the sickness?) and automated containment?
- Need different solutions for local-scanning vs. non-local-scanning worms?
- Are there other scientific areas that malware research can leverage?