Balancing Risk and Utility in Flow Trace Anonymization

1
Balancing Risk and Utility in Flow Trace
Anonymization
  • Martin Burkhart, ETH Zurich, burkhart_at_tik.ee.ethz.ch

Joint work with Daniela Brauckhoff, Elisa Boschi, Martin May
2
Motivation
  • Sharing of traffic measurements is crucial
    • Only a limited set of sources available
    • Reproducibility of results
    • Dynamics / variability of traffic
    • Get the big picture (e.g. Internet Storm Center)
    • Keep up with globalized attacks (e.g. botnets)
  • More and more traces are collected but not shared
    • Data protection legislation
    • Security concerns
    • Competitive advantage

3
State-Of-The-Art Anonymization
  • Black marking
  • Truncation
    • E.g. last bits of IP addresses
  • Permutation
    • Random
    • (Partial) prefix-preserving IP address permutation
  • Enumeration
    • E.g. timestamps: keep the logical order of events
  • Categorization
  • Randomization (data mining community)
  • K-anonymity (data mining community)

4
The Tradeoff in Anonymization
  • It's a trade-off
  • RU-Maps
    • t: anonymization strength
    • X-axis: Utility(t)
    • Y-axis: Risk(t)
  • Not quantitatively studied, lack of metrics
  • Strongly dependent on the application / attacker model

[Figure: RU-map for Algorithm X, plotting Risk(t) over Utility(t) with points at t = 0.1, 0.2, 0.4 and 0.7 and the sweet spot marked]
5
A Case Study: IP Address Truncation
  • Techniques that permute IP addresses 1:1 are reversible
    • Characteristic object sizes/frequencies, behavioral profiling, fingerprint active ports, exploit prefix structure
  • Apply IP address truncation and evaluate the risk and utility dimensions
    • Lower risk: hosts are aggregated to subnets
    • Lower utility: resolution of entities is reduced
  • Quantifying the tradeoff: how bad is it in numbers?

IP address       8 bits trunc.   16 bits trunc.
123.45.67.89     123.45.67.0     123.45.0.0
123.45.67.123    123.45.67.0     123.45.0.0
123.45.12.34     123.45.12.0     123.45.0.0
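
The truncation shown in the table can be reproduced with a few lines of Python. This is a minimal sketch, not the authors' tooling: the function name truncate_ip is hypothetical, and it assumes that "truncating x bits" means setting the x least significant bits of an IPv4 address to zero, which is what the table rows suggest.

```python
import ipaddress


def truncate_ip(addr: str, bits: int) -> str:
    """Zero the `bits` least significant bits of an IPv4 address (assumed semantics)."""
    if not 0 <= bits <= 32:
        raise ValueError("bits must be between 0 and 32")
    value = int(ipaddress.IPv4Address(addr))
    # Keep the upper (32 - bits) bits, clear the rest.
    mask = (0xFFFFFFFF << bits) & 0xFFFFFFFF
    return str(ipaddress.IPv4Address(value & mask))


if __name__ == "__main__":
    for addr in ("123.45.67.89", "123.45.67.123", "123.45.12.34"):
        print(addr, truncate_ip(addr, 8), truncate_ip(addr, 16))
```

With 8 truncated bits the first two hosts become indistinguishable, and with 16 bits all three collapse into one prefix, which is the aggregation effect the slide describes.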
6
Internal vs. External Prefixes
  • Asymmetry in prefixes
    • External
    • Internal (AS 559)
  • Is this reflected in
    • Risk reduction?
    • Utility reduction?

[Figure: unique count (log scale) over prefix length (32-x)]
7
Measuring Utility of Truncated Data
  • Specific application: anomaly detection
    • Compare detection quality of scans and (D)DoS attacks in original and truncated data
  • Two IP-based metrics
    • Unique address count
    • Address entropy
  • 3 weeks of NetFlow data
    • 43 billion flows
    • SWITCH network
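
The two metrics can be sketched as follows. The slides do not define them precisely, so this assumes that both are computed over the addresses seen in one time interval and that "address entropy" is the Shannon entropy (in bits) of the empirical address distribution. The function names and the sample addresses are hypothetical.

```python
import math
from collections import Counter


def unique_count(addresses):
    """Number of distinct addresses observed in one time interval."""
    return len(set(addresses))


def address_entropy(addresses):
    """Shannon entropy (bits) of the empirical address distribution (assumed definition)."""
    counts = Counter(addresses)
    total = sum(counts.values())
    # sum of p * log2(1/p) over all observed addresses
    return sum((c / total) * math.log2(total / c) for c in counts.values())


if __name__ == "__main__":
    # Hypothetical source addresses of four flows in one interval.
    flows = ["10.0.0.1", "10.0.0.1", "10.0.0.2", "10.0.0.3"]
    print(unique_count(flows), address_entropy(flows))
```

Feeding the same interval through both functions before and after truncation gives the two versions of the metric time series that the detection comparison needs.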

8
Measuring Detection Quality
  • Ground truth: manual identification of scans/(D)DoS attacks
  • Run a Kalman filter on metric time series
  • Utility measured by AUC (area under the ROC curve)

[Figure: ROC curve, obtained by varying the detection threshold]
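
The AUC computation can be illustrated with a small threshold sweep. This sketch assumes that each time interval has an anomaly score (for example the magnitude of the Kalman filter residual, which is not computed here) and a ground-truth label of 1 for attack and 0 for normal. The names roc_points and auc and the sample data are hypothetical.

```python
def roc_points(scores, labels):
    """Sweep a threshold over the scores and return (FPR, TPR) points of the ROC curve."""
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative label")
    points = [(0.0, 0.0)]
    # An interval is flagged when its score is at or above the threshold.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and not y)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


if __name__ == "__main__":
    scores = [0.9, 0.8, 0.7, 0.1]   # hypothetical anomaly scores per interval
    labels = [1, 0, 1, 0]           # hypothetical ground truth (1 = attack)
    print(auc(roc_points(scores, labels)))
```

An AUC of 1.0 means every attack interval scores above every normal interval, while 0.5 corresponds to random guessing, so comparing the AUC on original and truncated data gives a single number for the utility loss.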
9
Utility of Truncated Data
  • Internal metrics degrade faster than external
    metrics
  • Counts degrade faster than Entropy

10
Approximating Risk of Host Identification
  • In general: truncation of x bits leads to
    • 2^(32-x) prefixes with 2^x addresses per prefix
  • But only a fraction (A) of potential addresses is usually active
  • Hence: on average A·2^x active addresses per prefix

[Figure: host addresses 1 to 255 within the prefix 129.130.80., e.g. A = 10%]
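
The arithmetic behind this approximation can be checked with a short sketch. It only evaluates the counts stated on the slide, namely 2^(32-x) prefixes, 2^x addresses per prefix and A·2^x active addresses on average. The function name truncation_stats and the value A = 0.1 used in the example are illustrative assumptions.

```python
def truncation_stats(x: int, active_fraction: float):
    """Return (prefixes, addresses per prefix, expected active addresses per prefix)."""
    if not 0 <= x <= 32:
        raise ValueError("x must be between 0 and 32")
    if not 0.0 <= active_fraction <= 1.0:
        raise ValueError("active_fraction must be between 0 and 1")
    prefixes = 2 ** (32 - x)            # distinct prefixes left after truncation
    addresses_per_prefix = 2 ** x       # potential hosts hidden behind each prefix
    expected_active = active_fraction * addresses_per_prefix
    return prefixes, addresses_per_prefix, expected_active


if __name__ == "__main__":
    for x in (4, 8, 12, 16):
        print(x, truncation_stats(x, 0.1))  # A = 0.1 is an assumed value
```

For x = 8 and A = 0.1 this gives 2^24 prefixes with 256 potential and about 25.6 active addresses each, so an attacker who knows the prefix still has to choose among roughly 26 candidate hosts.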
11
Risk of Truncated Data
[Figure legend: total 2.2 million; total 4.3 billion]
  • Risk for external addresses is higher due to sparsity!
  • Constant offset

12
The Risk-Utility Tradeoff
[Figure: risk-utility map with points for no truncation and 4, 8, 12 and 16 truncated bits; best tradeoff marked]

Metric             x    Utility   Risk
internal entropy   8    0.94      0.035
internal entropy   12   0.87      0.002
external entropy   16   0.97      0.02
13
Conclusion
  • We made a quantitative evaluation of the
    risk-utility tradeoff in anonymization
  • Entropy is much more resistant to truncation than
    unique counts
  • Risk and utility degrade faster for internal
    addresses
  • For detection of scans and (D)DoS attacks, it is
    possible to get a good tradeoff with high utility
    and low risk

14
Thank You for the Attention