Automating Analysis of Large-Scale Botnet Probing Events

Transcript and Presenter's Notes

Title: Automating Analysis of Large-Scale Botnet Probing Events


1
Automating Analysis of Large-Scale Botnet Probing
Events
  • Zhichun Li, Anup Goyal, Yan Chen and Vern Paxson
  • Lab for Internet and Security Technology (LIST)
    Northwestern University
  • UC Berkeley / ICSI

2
Motivation
  • Administrators: does this attack specifically target us?
  • Can we answer this question with only limited
    information observed locally in the enterprise?
[Diagram labels: IPv4 Space, Botnets, Enterprise, Administrators]
3
Motivation
  • Can we infer the probing strategy used by botnets?
  • Can we infer whether a botnet probing attack
    specifically targets a certain network, or whether
    we are just part of a larger, indiscriminate attack?
  • Can we extrapolate global botnet properties given
    limited local information?

4
Agenda
  • Motivation
  • Basic framework
  • Discover the botnet probing strategies
  • Extrapolate global properties
  • Evaluation
  • Conclusions

5
Botnet Probing Events
Big spikes in the number of probers are mainly
caused by botnets
6
System Framework
  • See the paper for subtle system details.

7
Agenda
  • Motivation
  • Basic framework
  • Discover the botnet probing strategies
  • Extrapolate global properties
  • Evaluation
  • Conclusions

8
Discover the Botnet Probing Strategies
  • Use statistical tests to understand probing
    strategies
  • Leverage existing statistical tests
    • Monotonic trend checking: detect whether bots
      probe the IP space monotonically
    • Uniformity checking: detect whether bots scan the
      IP range uniformly
  • Design our own
    • Hitlist (liveness) checking: detect whether bots
      avoid the dark IP space
    • Dependency checking: do the bots scan
      independently or are they coordinated?

9
Design Space
10
Hitlist Checking
  • Configure the sensor to be half darknet and half
    honeynet
  • Use the metric: (# of sources seen in darknet) /
    (# of sources seen in honeynet).
  • Threshold: 0.5
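
The ratio test above can be sketched as follows. This is a minimal illustration, assuming each probe is recorded as a (source, destination) pair and that the darknet and honeynet halves of the sensor are given as sets of destination addresses. The function names, the input format, and the reading that a ratio below the threshold indicates hitlist use are assumptions, not details taken from the slides.

```python
def hitlist_ratio(probes, darknet, honeynet):
    """Return (# distinct sources seen in darknet) / (# distinct sources seen in honeynet)."""
    dark_sources = {src for src, dst in probes if dst in darknet}
    honey_sources = {src for src, dst in probes if dst in honeynet}
    if not honey_sources:
        raise ValueError("no sources observed in the honeynet half")
    return len(dark_sources) / len(honey_sources)


def looks_like_hitlist(probes, darknet, honeynet, threshold=0.5):
    # Assumption: a ratio below the threshold means the bots mostly avoid dark space.
    return hitlist_ratio(probes, darknet, honeynet) < threshold
```

With four sources probing the honeynet half and only one of them also touching the darknet half, the ratio is 0.25, which this sketch would flag as hitlist-like.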

11
Agenda
  • Motivation
  • Basic framework
  • Discover the botnet probing strategies
  • Extrapolate global properties
    • Global scan scope, total # of bots, total # of
      scans, total scan rate for each bot
  • Evaluation
  • Conclusions

12
Extrapolate Global Properties: Basic Ideas and
Validation
  • Observe packet fields that change in predictable
    patterns across consecutive probes.
    • IPID: a field in the IP header used for
      fragment reassembly
    • Ephemeral port number: the source port used by
      bots
    • Both increment by a fixed amount per scan
  • Validation
    • IPID continuity: all versions of Windows and
      MacOS
    • Ephemeral port number continuity: botnet source
      code study
      • Agobot, Phatbot, Spybot, SDbot, rxBot, etc.
    • Control experiments with NAT

13
Estimate Global Scan Rate of Each Bot
  • Count the IPID and ephemeral port changes
  • Recover from overflow (wrap-around) of the IPID and
    ephemeral port number
  • Estimate the rate with linear regression when the
    correlation coefficient is > 0.99
  • To counter overestimation, use the lesser of the
    two estimates
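
The steps above can be sketched roughly as follows. This is an illustrative outline, not the authors' implementation: it assumes timestamps plus raw IPID and source-port readings for one bot, assumes both counters wrap at most once between consecutive observations, and uses 65536 as the wrap-around modulus for both (the real ephemeral port range is OS-specific, so `port_modulus` is a hypothetical parameter). The returned value is a counter-change rate; converting it to scans per unit time still requires the fixed per-scan increment.

```python
import math


def unwrap(values, modulus=65536):
    """Undo wrap-around: whenever a reading drops, assume the counter overflowed once."""
    out, offset, prev = [], 0, None
    for v in values:
        if prev is not None and v < prev:
            offset += modulus
        out.append(v + offset)
        prev = v
    return out


def fit_rate(times, counts):
    """Least-squares slope and Pearson correlation of counts against times."""
    n = len(times)
    mt, mc = sum(times) / n, sum(counts) / n
    stt = sum((t - mt) ** 2 for t in times)
    scc = sum((c - mc) ** 2 for c in counts)
    stc = sum((t - mt) * (c - mc) for t, c in zip(times, counts))
    if stt == 0 or scc == 0:
        raise ValueError("need varying times and counts")
    return stc / stt, stc / math.sqrt(stt * scc)


def estimate_scan_rate(times, ipids, ports, min_r=0.99, port_modulus=65536):
    """Return the smaller of the accepted slopes, or None if neither fit is linear enough."""
    rates = []
    for series, modulus in ((ipids, 65536), (ports, port_modulus)):
        slope, r = fit_rate(times, unwrap(series, modulus))
        if r > min_r:  # accept only near-perfectly linear growth
            rates.append(slope)
    return min(rates) if rates else None
```

Taking the minimum follows the slide's rule for countering overestimation: a counter that is also advanced by unrelated traffic can only inflate the slope.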

14
Extrapolate Global Scan Scope
[Diagram labels: IPv4 Space, Botnets]
  • bot_i: n_i = 100
  • Total scans from bot_i = scan rate R_i × scan time T_i
    = 100 × 1000 = 100,000
  • Local/global ratio
  • Aggregating multiple bots
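
A worked version of the arithmetic on this slide, as a hedged sketch. It assumes that n_i is the number of probes from bot_i seen locally, that bots are aggregated by pooling their counts, and that probes are spread uniformly over the scanned range so that the sensor size divided by the local/global ratio gives the global scope. The pooling rule, the function names, and the `sensor_size` parameter are assumptions for illustration.

```python
def local_global_ratio(bots):
    """bots: list of (n_i, R_i, T_i) = (probes seen locally, global scan rate, scan time)."""
    local = sum(n for n, _, _ in bots)
    total = sum(r * t for _, r, t in bots)
    return local / total


def extrapolate_scope(bots, sensor_size):
    # Assumption: under uniform scanning, sensor_size / global_scope equals the ratio.
    return sensor_size / local_global_ratio(bots)


# Slide example: n_i = 100, R_i = 100, T_i = 1000, so 100 of 100,000 scans are seen locally.
ratio = local_global_ratio([(100, 100, 1000)])
```

For a hypothetical sensor of 2,560 addresses, that ratio of 0.001 would correspond to a scanned range of roughly 2.56 million addresses.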
15
Extrapolate Global # of Bots
  • Idea: similar to mark and recapture
  • Assumption: all bots have the same global scan
    range
  • Example: total M = 4000 bots
    • First half: m1 = 1000
    • Second half: m2 = 1000
    • Observed by both: m12 = 250
  • Estimate: M = m1 × m2 / m12 = 1000 × 1000 / 250 = 4000
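
The mark-and-recapture estimate can be sketched in a few lines. The example assumes the two samples are the sets of source addresses seen in the first and second halves of an event; that reading of "first half" and "second half", and the function name, are assumptions.

```python
def estimate_total_bots(first_half, second_half):
    """Mark-and-recapture style estimate M = m1 * m2 / m12."""
    first, second = set(first_half), set(second_half)
    both = first & second
    if not both:
        raise ValueError("no overlap between the two samples; estimate is undefined")
    return len(first) * len(second) / len(both)


# Slide example: m1 = 1000, m2 = 1000, m12 = 250 gives M = 4000.
estimate = estimate_total_bots(range(0, 1000), range(750, 1750))
```

The estimate is undefined when the two samples share no sources, so the sketch raises an error rather than dividing by zero.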
16
Agenda
  • Motivation
  • Basic framework
  • Discover the botnet probing strategies
  • Extrapolate global properties
  • Evaluation
  • Conclusions

17
Dataset
  • Based on a honeynet of ten /24 subnets at a national
    lab (LBNL)
  • 293 GB of packet traces over 24 months (2006-07)
  • 203 botnet probing events observed in total
  • The average number of observed bots per event is 980.
  • Mainly on SMB/WINRPC, VNC, Symantec, MSSQL, HTTP,
    Telnet
  • Size of the system: 13,900 lines in total: Bro (6,000),
    Python (4,000), C (2,500), R (1,400)

18
Property Checking Results
  • More than 80% uniform scanning
  • We validated the results through visualization and
    found them to be highly accurate.

19
Extrapolation Results
  • Most extrapolated global scopes are at /8
    size, which means the botnets do not specifically
    target the enterprise (LBNL).
  • Validation based on DShield data
    • DShield: the largest Internet alert repository
    • Find the /8 prefixes in DShield with sufficient
      source (bot) overlap with the honeynet events
    • Due to the incompleteness of DShield data, 12 events
      were validated
    • Calculate the scan scope in each /8 based on the
      sensor coverage ratio.

20
Extrapolation Validation
  • Define the scope factor as
    max(DShield/Honeynet, Honeynet/DShield)
  • 75% within 1.35; all within 1.5

[Figure: CDF of the scope factor]
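
As a small illustration of the definition on this slide, the scope factor is a symmetric ratio that is 1.0 when the two estimates agree and grows as they diverge in either direction. The function and argument names below are hypothetical, and the sample values are made up to show the arithmetic; they are not results from the talk.

```python
def scope_factor(dshield_scope, honeynet_scope):
    """max(DShield/Honeynet, Honeynet/DShield); always >= 1 for positive inputs."""
    if dshield_scope <= 0 or honeynet_scope <= 0:
        raise ValueError("scopes must be positive")
    return max(dshield_scope / honeynet_scope, honeynet_scope / dshield_scope)


def fraction_within(factors, bound):
    """Fraction of scope factors that do not exceed the given bound."""
    return sum(1 for f in factors if f <= bound) / len(factors)
```

Because the larger estimate always ends up in the numerator, over- and under-estimation by the same proportion produce the same factor.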
21
Conclusions
  • Developed a set of statistical approaches to assess
    four properties of botnet probing strategies
  • Designed approaches to extrapolate the global
    properties of a scan event based on a limited local
    view
  • Through real-world validation based on DShield,
    we show that our schemes are promisingly accurate

22
Backup
23
Event size distribution
24
Extrapolate the scope
  • Probes observed locally
  • Local/global ratio
  • Estimate global probing rate
  • Probing time window
25
Monotonic trend checking
  • Goal: detect whether the bots probe the IP space
    monotonically
    • E.g., simple sequential probing
  • Technique
    • Mann-Kendall trend test
    • Intuition: check whether the aggregated sign
      value (sign(A_{i+1} - A_i)) falls outside the range
      that randomness can achieve.
  • When most (> 80%) senders in an event follow a
    trend, we label the event as following a trend
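
A minimal sketch of the trend check, using the standard Mann-Kendall S statistic over all ordered pairs with a normal approximation and no tie correction. The per-sender significance level `alpha` and the function names are assumptions; the slides only state the 80% rule for labeling an event.

```python
import math


def mann_kendall(seq):
    """Return (S, z, two-sided p) for the Mann-Kendall trend test (no tie correction)."""
    n = len(seq)
    s = sum((seq[j] > seq[i]) - (seq[j] < seq[i])
            for i in range(n - 1) for j in range(i + 1, n))
    var = n * (n - 1) * (2 * n + 5) / 18
    if s > 0:
        z = (s - 1) / math.sqrt(var)
    elif s < 0:
        z = (s + 1) / math.sqrt(var)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return s, z, p


def event_follows_trend(sender_sequences, alpha=0.05, min_fraction=0.8):
    # alpha is a hypothetical per-sender significance level.
    trending = sum(1 for seq in sender_sequences if mann_kendall(seq)[2] < alpha)
    return trending / len(sender_sequences) > min_fraction
```

Each sequence here is the ordered list of destination addresses (as integers) probed by one sender; a strictly sequential scanner yields the maximum possible S.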

26
Uniformity Checking
  • Goal: detect whether the botnet scans the IP range
    uniformly.
  • Technique
    • Chi-square test
    • Intuition: put addresses into bins. The number of
      scans observed in each bin should be similar.
    • Significance level of 0.5
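
The binning intuition can be sketched as a chi-square goodness-of-fit statistic against equal expected counts. This example computes only the statistic and leaves the comparison against a critical value to the caller; the bin count, the integer address representation, and the function name are assumptions.

```python
def chi_square_uniformity(targets, lo, hi, bins):
    """Bin integer addresses in [lo, hi) and return (chi-square statistic, bin counts)."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for addr in targets:
        idx = min(int((addr - lo) / width), bins - 1)
        counts[idx] += 1
    expected = len(targets) / bins  # uniform scanning: same expected count per bin
    stat = sum((c - expected) ** 2 / expected for c in counts)
    return stat, counts
```

With 4 bins there are 3 degrees of freedom; for reference, the 5% critical value for 3 degrees of freedom is about 7.815, so a statistic far above that would reject uniformity at that level.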

27
Dependency Checking
  • Goal: do the bots try to stay out of each other's
    way?
  • Idea: count the number of addresses that receive zero
    scans and compare it with the confidence interval for
    the independent random case.
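
One way to sketch this idea is to simulate the independent random case and read the interval off the simulated distribution. The Monte Carlo approach, the uniform-random probing model, the parameter names, and the rule that too few untouched addresses suggests coordination are all assumptions made for illustration.

```python
import random


def zero_hit_interval(num_addresses, num_probes, trials=1000, level=0.95, seed=0):
    """Simulated interval for the # of addresses receiving zero probes under independent probing."""
    rng = random.Random(seed)
    zeros = []
    for _ in range(trials):
        hit = {rng.randrange(num_addresses) for _ in range(num_probes)}
        zeros.append(num_addresses - len(hit))
    zeros.sort()
    tail = (1 - level) / 2
    lo = zeros[int(tail * trials)]
    hi = zeros[min(trials - 1, int((1 - tail) * trials))]
    return lo, hi


def looks_coordinated(observed_zeros, num_addresses, num_probes, **kwargs):
    # Assumption: bots that divide up the range leave fewer untouched addresses than chance.
    lo, _ = zero_hit_interval(num_addresses, num_probes, **kwargs)
    return observed_zeros < lo
```

For 256 probes over 256 addresses, independent random probing leaves roughly 94 addresses untouched on average, whereas perfectly coordinated bots would leave none.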