Automating Analysis of Large-Scale Botnet Probing Events

About This Presentation

Slides: 28

Provided by: zhich

Learn more at: https://users.cs.northwestern.edu

Transcript and Presenter's Notes

1
Automating Analysis of Large-Scale Botnet Probing
Events

Zhichun Li, Anup Goyal, Yan Chen and Vern Paxson
Lab for Internet and Security Technology (LIST)
Northwestern University
UC Berkeley / ICSI

2
Motivation
[Figure labels: IPv4 Space, Botnets, Enterprise, Administrators]
Administrators: Does this attack specifically
target us?
Can we answer this question with only limited
information observed locally in the enterprise?
3
Motivation

Can we infer the probing strategy used by botnets?
Can we infer whether a botnet probing attack
specifically targets a certain network, or whether
we are just part of a larger, indiscriminate attack?
Can we extrapolate global botnet properties given
limited local information?

4
Agenda

Motivation
Basic framework
Discover the botnet probing strategies
Extrapolate global properties
Evaluation
Conclusions

5
Botnet Probing Events
Big spikes in the number of probers are mainly
caused by botnets
6
System Framework

See the paper for subtle system details.

7
Agenda

Motivation
Basic framework
Discover the botnet probing strategies
Extrapolate global properties
Evaluation
Conclusions

8
Discover the Botnet Probing Strategies

Use statistical tests to understand probing
strategies
Leverage existing statistical tests
Monotonic trend checking: detect whether bots
probe the IP space monotonically
Uniformity checking: detect whether bots scan the
IP range uniformly
Design our own
Hitlist (liveness) checking: detect whether bots
avoid the dark IP space
Dependency checking: do the bots scan
independently, or are they coordinated?

9
Design Space
10
Hitlist Checking

Configure the sensor to be half darknet and half
honeynet
Use the metric: number of sources seen in the
darknet / number of sources seen in the honeynet
Threshold: 0.5
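
The hitlist metric above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, and the direction of the comparison (a ratio below the threshold suggests the bots avoid dark addresses) is an assumption drawn from the stated goal of the check.

```python
def hitlist_ratio(darknet_sources, honeynet_sources):
    """Ratio of distinct sources seen in the darknet half to those seen
    in the honeynet half of the sensor."""
    honeynet = set(honeynet_sources)
    if not honeynet:
        raise ValueError("no sources observed in the honeynet half")
    return len(set(darknet_sources)) / len(honeynet)


def uses_hitlist(darknet_sources, honeynet_sources, threshold=0.5):
    """Assumption: a ratio below the threshold means most bots skip dark
    (unresponsive) addresses, i.e. they probe from a liveness-aware hitlist."""
    return hitlist_ratio(darknet_sources, honeynet_sources) < threshold
```

With a random scanner both halves should see roughly the same set of sources, so the ratio stays near 1; a hitlist-driven scanner rarely touches the dark half.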

11
Agenda

Motivation
Basic framework
Discover the botnet probing strategies
Extrapolate global properties
Global scan scope, total number of bots, total
number of scans, total scan rate of each bot
Evaluation
Conclusions

12
Extrapolate Global Properties: Basic Ideas and
Validation

Observe the packet fields that change with
certain patterns in consecutive probes.
IPID: a field in the IP header used for IP
defragmentation
Ephemeral port number: the source port used by
bots
Increment by a fixed amount per scan
Validation
IPID continuity: all versions of Windows and
MacOS
Ephemeral port number continuity: botnet source
code study
Agobot, Phatbot, Spybot, SDbot, rxBot, etc.
Control experiments with NAT

13
Estimate Global Scan Rate of Each Bot

Count the IPID and ephemeral port changes
Recover from overflow of the IPID and ephemeral
port number
Estimate the rate with linear regression when the
correlation coefficient is > 0.99
To counter overestimation, use the lesser of the
two estimates
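
The rate-estimation steps above might look roughly like the following sketch. All names are hypothetical. It assumes each counter wraps at most once between two consecutive observations, and that the wrap-around range is 65536 for both fields (real ephemeral port ranges are often smaller, so `port_range` is a parameter).

```python
from math import sqrt


def unwrap_counter(values, modulus=65536):
    """Undo wrap-around: whenever a value drops below its predecessor,
    assume exactly one overflow happened and add the modulus."""
    unwrapped, offset, prev = [], 0, None
    for v in values:
        if prev is not None and v < prev:
            offset += modulus
        unwrapped.append(v + offset)
        prev = v
    return unwrapped


def fit_rate(times, counter):
    """Least-squares slope and Pearson correlation of counter vs. time."""
    n = len(times)
    mean_t, mean_c = sum(times) / n, sum(counter) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    syy = sum((c - mean_c) ** 2 for c in counter)
    sxy = sum((t - mean_t) * (c - mean_c) for t, c in zip(times, counter))
    if sxx == 0 or syy == 0:
        return None
    return sxy / sxx, sxy / sqrt(sxx * syy)


def estimate_scan_rate(times, ipids, ports, min_r=0.99, port_range=65536):
    """Return the smaller of the IPID-based and port-based rate estimates,
    using only fits whose correlation exceeds min_r; None if neither does."""
    rates = []
    for values, modulus in ((ipids, 65536), (ports, port_range)):
        fit = fit_rate(times, unwrap_counter(values, modulus))
        if fit is not None and fit[1] > min_r:
            rates.append(fit[0])
    return min(rates) if rates else None
```

Taking the minimum reflects the slide's point about overestimation: either counter can also advance for traffic unrelated to the scan, so the smaller slope is the more conservative estimate.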

14
Extrapolate Global Scan Scope
[Figure labels: IPv4 Space, Botnets, bot i]
bot i: n_i = 100
Total scans from bot i = scan rate R_i × scan
time T_i = 100 × 1000 = 100,000
Local/global ratio
Aggregating multiple bots
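
A worked version of this extrapolation, as a hedged sketch: it assumes n_i is the number of probes from bot i seen at the local sensor, that scanning is uniform over the global scope (so the local/global probe ratio equals the sensor-size/scope-size ratio), and that bots are aggregated by pooling their counts. The slide does not spell out the aggregation rule, so the pooling and all function names here are illustrative.

```python
from math import log2


def local_global_ratio(local_probes, scan_rate, scan_time):
    """Fraction of a bot's total scans (R_i * T_i) that hit the sensor."""
    return local_probes / (scan_rate * scan_time)


def extrapolate_scope(bots, sensor_size):
    """bots: iterable of (n_i, R_i, T_i) tuples.
    Pools all bots into one local/global ratio (an assumption), then
    scales the sensor size by the inverse of that ratio."""
    bots = list(bots)
    total_local = sum(n for n, _, _ in bots)
    total_global = sum(r * t for _, r, t in bots)
    return sensor_size * total_global / total_local


def scope_prefix_length(scope_addresses):
    """Express a scope size in addresses as an equivalent prefix length."""
    return 32 - log2(scope_addresses)
```

With the slide's numbers (n_i = 100, R_i × T_i = 100,000) the ratio is 1/1000, so a sensor of ten /24 networks (2,560 addresses) would extrapolate to a scope of about 2.56 million addresses.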
15
Extrapolate the Global Number of Bots

Idea: similar to mark and recapture
Assumption: all bots have the same global scan
range

Total bots: M = 4000
First half: m1 = 1000
Second half: m2 = 1000
Observed by both: m12 = 250

M = m1 × m2 / m12 = 1000 × 1000 / 250 = 4000
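
The mark-and-recapture idea can be written as a short function. This sketch uses the classic two-sample (Lincoln-Petersen) form that matches the formula on the slide; the function name and the use of sets of bot identifiers are assumptions for illustration.

```python
def estimate_total_bots(first_half_bots, second_half_bots):
    """Estimate the total bot population M = m1 * m2 / m12, where m1 and m2
    are the distinct bots seen in each half and m12 those seen in both."""
    first, second = set(first_half_bots), set(second_half_bots)
    both = first & second
    if not both:
        raise ValueError("no overlap between the two halves; M is undefined")
    return len(first) * len(second) / len(both)
```

The estimate is only as good as the slide's assumption that all bots share the same global scan range, so that every bot is equally likely to appear in either half.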
16
Agenda

Motivation
Basic framework
Discover the botnet probing strategies
Extrapolate global properties
Evaluation
Conclusions

17
Dataset

Based on a honeynet of ten /24 networks at a
national lab (LBNL)
293 GB of packet traces over 24 months (2006-07)
203 botnet probing events observed in total
Average number of observed bots per event: 980
Mainly on SMB/WINRPC, VNC, Symantec, MSSQL, HTTP,
Telnet
Size of the system: 13,900 lines; Bro (6,000),
Python (4,000), C (2,500), R (1,400)

18
Property Checking Results

More than 80% uniform scanning
Validated the results through visualization and
found them to be highly accurate.

19
Extrapolation Results

Most of the extrapolated global scopes are at /8
size, which means the botnets do not specifically
target the enterprise (LBNL).
Validation based on DShield data
DShield: the largest Internet alert repository
Find the /8 prefixes in DShield with sufficient
source (bot) overlap with the honeynet events
Due to the incompleteness of DShield data, 12
events were validated
Calculate the scan scope in each /8 based on the
sensor coverage ratio.

20
Extrapolation Validation

Define the scope factor as
max(DShield/Honeynet, Honeynet/DShield)

75% within 1.35; all within 1.5
[Figure: CDF of the scope factor]
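
The scope factor is a symmetric ratio, so it is at least 1 and equals 1 only when the two scope estimates agree. A minimal sketch, with hypothetical function names, that also computes the kind of "fraction within a bound" summary read off the CDF:

```python
def scope_factor(dshield_scope, honeynet_scope):
    """max(DShield/Honeynet, Honeynet/DShield); always >= 1."""
    if dshield_scope <= 0 or honeynet_scope <= 0:
        raise ValueError("scopes must be positive")
    return max(dshield_scope / honeynet_scope, honeynet_scope / dshield_scope)


def fraction_within(factors, bound):
    """Share of events whose scope factor does not exceed the bound."""
    factors = list(factors)
    return sum(1 for f in factors if f <= bound) / len(factors)
```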
21
Conclusions

Developed a set of statistical approaches to
assess four properties of botnet probing strategies
Designed approaches to extrapolate the global
properties of a scan event based on a limited
local view
Through real-world validation based on DShield,
we show that our scheme is promisingly accurate

22
Backup
23
Event size distribution
24
Extrapolate the scope
Probes observed locally
Local/global ratio
Estimate global probing rate
Probing time window
25
Monotonic trend checking

Goal: detect whether the bots probe the IP space
monotonically
E.g., simple sequential probing
Technique
Mann-Kendall trend test
Intuition: check whether the aggregated sign
value (sign(A_{i+1} - A_i)) falls outside the
range that randomness can achieve.
When most (> 80%) of the senders in an event follow
a trend, we label the event as following a trend
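
For reference, a compact sketch of the standard Mann-Kendall test applied per sender. Note the textbook statistic sums the sign over all ordered pairs, not only consecutive ones as in the slide's shorthand. The significance level `alpha` is an assumed parameter (the slide does not give one), the normal approximation ignores ties, and the function names are hypothetical.

```python
from itertools import combinations
from math import sqrt
from statistics import NormalDist


def mann_kendall_has_trend(addresses, alpha=0.05):
    """Two-sided Mann-Kendall test on one sender's probed addresses,
    taken in arrival order. Uses the no-ties variance formula."""
    n = len(addresses)
    if n < 3:
        return False
    # S = sum of sign(x_j - x_i) over all pairs with i < j
    s = sum((b > a) - (b < a) for a, b in combinations(addresses, 2))
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / sqrt(var_s)  # continuity correction
    elif s < 0:
        z = (s + 1) / sqrt(var_s)
    else:
        z = 0.0
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return p_value < alpha


def event_follows_trend(addresses_by_sender, min_fraction=0.8):
    """Label the event as trending when more than min_fraction of its
    senders individually show a monotonic trend."""
    sequences = list(addresses_by_sender.values())
    if not sequences:
        return False
    trending = sum(mann_kendall_has_trend(seq) for seq in sequences)
    return trending / len(sequences) > min_fraction
```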

26
Uniformity Checking

Goal: detect whether the botnet scans the IP range
uniformly.
Technique
Chi-Square test
Intuition: put addresses into bins. The number of
scans observed in each bin should be similar.
Significance level of 0.5
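
A dependency-free sketch of the binning idea follows. It computes the Pearson chi-square statistic against equal expected counts and compares it with a critical value from the Wilson-Hilferty approximation, which is a substitute chosen here to avoid external libraries rather than anything the slides specify. The function names, the binning scheme, and leaving `alpha` as a required parameter are all assumptions.

```python
from statistics import NormalDist


def chi_square_statistic(counts):
    """Pearson statistic for observed bin counts vs. equal expected counts."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


def chi_square_critical(dof, alpha):
    """Approximate upper-alpha critical value (Wilson-Hilferty)."""
    z = NormalDist().inv_cdf(1.0 - alpha)
    t = 2.0 / (9.0 * dof)
    return dof * (1.0 - t + z * t ** 0.5) ** 3


def looks_uniform(probed_addresses, range_start, range_size, num_bins, alpha):
    """True when the uniformity hypothesis is not rejected at level alpha.
    Addresses are integers inside [range_start, range_start + range_size)."""
    counts = [0] * num_bins
    for addr in probed_addresses:
        offset = addr - range_start
        if not 0 <= offset < range_size:
            raise ValueError("address outside the monitored range")
        counts[offset * num_bins // range_size] += 1
    stat = chi_square_statistic(counts)
    return stat <= chi_square_critical(num_bins - 1, alpha)
```

For example, `looks_uniform(addrs, 0, 2560, 10, alpha=0.05)` would bin probes to a 2,560-address sensor into ten equal bins; the approximation is reasonable when each bin's expected count is not too small.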

27
Dependency Checking

Goal: do the bots try to stay out of each other's
way?
Idea: count the number of addresses that receive
zero scans and compare it with the confidence
interval for the independent random case.
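
One way to make this concrete is the classical occupancy model: if n scans land independently and uniformly on N addresses, the number of untouched addresses has a known mean and variance. The sketch below builds a normal-approximation interval from those moments; the uniformity assumption, the normal approximation, the 95% default, and the function names are choices made for illustration, not details taken from the slides.

```python
from statistics import NormalDist


def zero_scan_interval(num_addresses, num_scans, confidence=0.95):
    """Interval for the count of addresses receiving zero scans when
    num_scans probes hit num_addresses independently and uniformly."""
    n_addr = num_addresses
    p_one_empty = (1 - 1 / n_addr) ** num_scans
    p_two_empty = (1 - 2 / n_addr) ** num_scans
    mean = n_addr * p_one_empty
    # Var = E[X(X-1)] + E[X] - E[X]^2 for the number of empty addresses
    var = n_addr * (n_addr - 1) * p_two_empty + mean - mean ** 2
    std = max(var, 0.0) ** 0.5
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * std, mean + z * std


def scans_look_dependent(scans_per_address, confidence=0.95):
    """True when the observed number of zero-scan addresses falls outside
    the interval expected under independent random scanning."""
    zeros = sum(1 for c in scans_per_address if c == 0)
    low, high = zero_scan_interval(
        len(scans_per_address), sum(scans_per_address), confidence
    )
    return zeros < low or zeros > high
```

Far fewer untouched addresses than the interval allows suggests the bots divide the range among themselves instead of scanning independently.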
