Title: New Packet Sampling Technique for Robust Flow Measurements
1New Packet Sampling Technique for Robust Flow
Measurements
- Shigeo Shioda
- Department of Architecture and Urban Science
- Graduate School of Engineering, Chiba University
2Objectives of traffic measurements
- Short-term monitoring.
- Detecting high-volume traffic patterns
(denial-of-service attacks).
- Detecting unexpected or illegal packets.
- Investigating the origins of such traffic.
- Long-term traffic engineering.
- Rerouting traffic.
- Upgrading selected links.
3Per-flow-base traffic measurement (1)
- Just counting the number of packets or bytes is
not sufficient; per-flow-base traffic measurement
is necessary.
- What is a flow?
- Informally, a set of packets constituting a logical
communication between application processes
running on different hosts.
- Flow-level information can tell us who is
using the Internet now.
4Per-flow-base traffic measurement (2)
[Figure: packets labeled Flow 1 and Flow 2]
5Per-flow-base traffic measurement (3)
- How can we distinguish flows?
- By inspecting packet headers.
- By classifying packets based on IP addresses, port
numbers, and protocol ID.
[Figure: IP header fields (Version, HL, TOS, Total Length, Identification, Flags, Fragment Offset, TTL, Protocol-ID, Header Checksum, Source Address, Destination Address) and TCP header fields (Source Port, Destination Port, Sequence Number, Acknowledgement Number)]
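The classification step can be illustrated with a minimal sketch that reads the five fields named above from a raw packet. It assumes an IPv4 packet carrying TCP or UDP; the names `FlowKey`, `flow_key`, and `example_packet`, and the addresses used, are hypothetical and chosen only for illustration.

```python
import socket
import struct
from typing import NamedTuple


class FlowKey(NamedTuple):
    """Hypothetical 5-tuple used to identify a flow."""
    src_addr: str
    dst_addr: str
    src_port: int
    dst_port: int
    protocol: int


def flow_key(packet: bytes) -> FlowKey:
    """Extract the 5-tuple from a raw IPv4 packet (assumes TCP or UDP payload)."""
    # Low nibble of byte 0 is the header length (HL) in 32-bit words.
    header_len = (packet[0] & 0x0F) * 4
    protocol = packet[9]                      # Protocol-ID field
    src_addr = socket.inet_ntoa(packet[12:16])
    dst_addr = socket.inet_ntoa(packet[16:20])
    # TCP and UDP headers both begin with source port and destination port.
    src_port, dst_port = struct.unpack("!HH", packet[header_len:header_len + 4])
    return FlowKey(src_addr, dst_addr, src_port, dst_port, protocol)


# Hand-built example: a 20-byte IPv4 header (protocol 6 = TCP) plus two ports.
ip_header = struct.pack(
    "!BBHHHBBH4s4s",
    0x45, 0, 40, 0, 0, 64, 6, 0,
    socket.inet_aton("192.0.2.1"),
    socket.inet_aton("198.51.100.7"),
)
example_packet = ip_header + struct.pack("!HH", 49152, 80)
print(flow_key(example_packet))
```

Packets that produce the same key are counted as one flow; a real classifier would also have to handle fragments, IPv6, and protocols without ports.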
6Per-flow-base traffic measurement (4)
- Flow-measurement procedure.
- A router maintains a flow cache containing
flow records.
- When a packet is seen, the router updates the counters
of the corresponding entry in the flow cache.
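[Figure: flow cache holding two counters per flow (number of packets, number of bytes) for Flow 1, Flow 2, and Flow 3; as Flow 1, Flow 2, and Flow 3 packets arrive, the counters of the matching entry increase, for example from 0 packets / 0 bytes to 1 packet / 1500 bytes, then 2 packets / 3000 bytes]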
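The per-packet update can be sketched as a dictionary of counters. This is an illustrative sketch, not a router implementation; the class name `FlowCache`, the flow labels, and the 1500-byte packet size are assumptions taken from the figure's example values.

```python
from collections import defaultdict


class FlowCache:
    """Hypothetical flow cache: one [packet count, byte count] record per flow."""

    def __init__(self):
        self.records = defaultdict(lambda: [0, 0])

    def update(self, key, size_bytes):
        """Update the counters of the entry that matches the packet's flow."""
        record = self.records[key]
        record[0] += 1           # number of packets
        record[1] += size_bytes  # number of bytes


cache = FlowCache()
for key in ["flow1", "flow2", "flow1", "flow1"]:
    cache.update(key, 1500)

print(dict(cache.records))
```

Every packet triggers one lookup and one write, which is the per-packet cost that the scalability discussion is concerned with.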
7Problems of flow measurements
- Lack of scalability
- Due to the rapid increase in line speeds,
the number of concurrent flows is
increasing yearly.
- Updating per-flow counters on a per-packet basis
is already impossible at today's line speeds.
- The gap between DRAM speeds and link speeds is
increasing.
8Packet sampling
- Updating the flow cache only for sampled packets.
- Elephant flows would be detected even under
packet sampling.
- Although many tiny (and unimportant) flows would
be missed under packet sampling, this does not
matter in terms of network management.
- Cisco's Sampled NetFlow.
- How to sample packets?
9Fixed rate sampling
- Definition
- Choosing sampled packets at a fixed rate.
- For example, taking one in every N packets.
- Cisco's Sampled NetFlow uses fixed rate
sampling.
[Figure: fixed rate sampling with N = 5]
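A minimal sketch of one-in-N sampling follows. It assumes deterministic counting (every N-th packet); a random one-in-N choice is an equally valid reading of "fixed rate", and the function name `fixed_rate_sample` is hypothetical.

```python
def fixed_rate_sample(packets, n):
    """Yield every n-th packet of the stream (deterministic 1-in-n sampling)."""
    for index, packet in enumerate(packets, start=1):
        if index % n == 0:
            yield packet


# Packets are represented by their arrival order 1..20; N = 5 as in the figure.
sampled = list(fixed_rate_sample(range(1, 21), 5))
print(sampled)  # [5, 10, 15, 20]
```

The number of samples grows in proportion to the number of packets, so the load on the flow cache follows the traffic load.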
10Shortcomings of the fixed rate sampling
- The size of the memory holding the flow cache
strongly depends on the traffic load.
- When DoS attacks are in progress, the memory
would be rapidly consumed even if the sampling
rate is low.
- However, a low sampling rate would yield large
errors in traffic measurement under normal
load.
- It is a hard decision for network operators to set
a static sampling rate.
11Fixed period sampling
- Definition
- Choosing at most one packet to sample in every
fixed-length period (called a sampling window).
- For example, taking at most one packet every tw seconds.
- Our solution.
[Figure: sampling windows]
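A minimal sketch of the idea is given below. It assumes that windows are aligned to multiples of tw, that timestamps are integers in microseconds, and that the first packet of each window is the one taken; that last choice is only a placeholder, since the definition above leaves open which packet in the window is sampled. The name `fixed_period_sample` is hypothetical.

```python
def fixed_period_sample(arrivals, t_w_us):
    """Yield at most one packet per sampling window of length t_w_us.

    arrivals: iterable of (timestamp_us, packet), sorted by timestamp.
    Assumption: the first packet seen in each window is the one sampled.
    """
    current_window = None
    for timestamp_us, packet in arrivals:
        window = timestamp_us // t_w_us
        if window != current_window:
            current_window = window
            yield timestamp_us, packet


# A burst of 100 packets, one every 1 ms, with tw = 10 ms.
burst = [(i * 1000, i) for i in range(100)]
samples = list(fixed_period_sample(burst, 10_000))
print(len(samples))  # 10
```

However many packets arrive inside a window, only one of them reaches the flow cache.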
12Properties of fixed period sampling
- The number of samplings per second is
bounded by 1/tw.
- The number of entries in the flow cache is also
bounded.
- The sampling interval (tw) is easily determined based
on the memory or CPU available for flow
measurements.
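As a worked example of the first bound, let $n(T)$ be the number of samplings in an interval of length $T$ that starts at a window boundary (the symbol $n(T)$ and the alignment are assumptions made here for illustration). Since each window contributes at most one sample:

```latex
n(T) \le \left\lceil \frac{T}{t_w} \right\rceil ,
\qquad
T = 1\,\text{s},\; t_w = 10\,\text{ms} \;\Rightarrow\; n(T) \le 100 ,
\qquad
T = 1\,\text{s},\; t_w = 1\,\text{ms} \;\Rightarrow\; n(T) \le 1000 .
```

The bound holds regardless of the traffic load, which is why tw can be chosen from the processing budget alone.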
13Number of flow entries
[Figure: number of flow cache entries versus time (s), two panels: Indianapolis-Kansas City link and U.S.-Japan link; N = 1000, tw = 10 ms]
14Number of sampled packets
[Figure: number of sampled packets versus time (s) for Trace 1 and Trace 2; N = 1000, tw = 10 ms]
15Second Packet Sampling (1)
- An arbitrary packet can be chosen as the sample
in each sampling window.
- Which packet should be sampled?
- The simplest (and most natural) rule: first
packet sampling.
- Intuitively, the first packet sampling rule seems
to work well, but it does not.
- We apply second packet sampling.
16First packet sampling and second packet sampling
[Figure: two timelines marked 0, tw, 2 tw, 3 tw, 4 tw, one for first packet sampling and one for second packet sampling]
17Second Packet Sampling (2)
Flow 1: packets arrive periodically.
Flow 2: packets arrive according to a Poisson
process.
We found theoretically that, under the first
packet sampling rule, 63.2% of sampled packets
are of flow 1 (strongly biased), whereas under the
second packet sampling rule, 49.7% of sampled
packets are of flow 1 (almost unbiased).
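The bias can be explored with a small Monte Carlo sketch. It rests on several assumptions not stated in the slides: "second packet sampling" is read as taking the second packet to arrive in each window, both flows have the same mean packet rate, each flow sends roughly ten packets per window, and the period is chosen so that its phase drifts relative to the window boundaries. The function name `periodic_share` and all parameter values are hypothetical.

```python
import random


def periodic_share(which, t_w=1.0, period=0.0997, windows=10_000, seed=1):
    """Fraction of sampled packets that belong to the periodic flow (flow 1).

    which: 1 samples the first packet of each window, 2 the second packet.
    Flow 2 is a Poisson process with the same mean rate as flow 1 (assumption).
    """
    rng = random.Random(seed)
    horizon = windows * t_w
    rate = 1.0 / period

    arrivals = []
    t = rng.uniform(0.0, period)          # flow 1: periodic arrivals
    while t < horizon:
        arrivals.append((t, 1))
        t += period
    t = rng.expovariate(rate)             # flow 2: exponential inter-arrival gaps
    while t < horizon:
        arrivals.append((t, 2))
        t += rng.expovariate(rate)
    arrivals.sort()

    sampled = {1: 0, 2: 0}
    current_window, seen = -1, 0
    for timestamp, flow in arrivals:
        window = int(timestamp // t_w)
        if window != current_window:
            current_window, seen = window, 0
        seen += 1
        if seen == which:                 # take the which-th packet of the window
            sampled[flow] += 1
    return sampled[1] / (sampled[1] + sampled[2])


first_share = periodic_share(1)
second_share = periodic_share(2)
print(f"first packet sampling:  {first_share:.3f}")
print(f"second packet sampling: {second_share:.3f}")
```

Under these assumptions, the first-packet share should come out well above one half and the second-packet share close to one half, in line with the figures quoted above. Intuitively, a window boundary is a random instant for the Poisson flow, but the periodic flow's next packet is never more than one period away.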
18Flow level traffic estimation
- Sampling inevitably misses some information.
- Some inference techniques are required to recover
the statistics of flow-level traffic from the
sampled packets.
- Here, we focus on flow rate estimation.
19Flow rate estimation (1)
- Flow rate
- Informally, the rate at which a flow sends data.
- Formally, the ratio of the total bytes
transferred to the flow duration.
- Flow rate is an index for identifying vital
flows, which often have significant impact on
network performance.
- Flow rate can be estimated from sampled packet
streams.
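The formal definition can be turned into a short computation of actual flow rates from a full (unsampled) trace. The sketch assumes that the flow duration is the time between a flow's first and last packet, and it skips single-packet flows, for which that duration is zero. The function name and the trace values are hypothetical.

```python
def flow_rates_mbps(packets):
    """Return {flow_id: rate in Mbps} = total bytes transferred / flow duration.

    packets: iterable of (timestamp_s, flow_id, size_bytes), sorted by time.
    Assumption: duration = time between the first and last packet of the flow.
    """
    first, last, total = {}, {}, {}
    for timestamp, flow, size in packets:
        first.setdefault(flow, timestamp)
        last[flow] = timestamp
        total[flow] = total.get(flow, 0) + size

    rates = {}
    for flow, total_bytes in total.items():
        duration = last[flow] - first[flow]
        if duration > 0:  # single-packet flows have no defined rate here
            rates[flow] = total_bytes * 8 / duration / 1e6
    return rates


trace = [(0.0, "a", 1500), (0.5, "b", 40), (1.0, "a", 1500), (2.0, "a", 1500)]
rates = flow_rates_mbps(trace)
print(rates)
```

Flow "a" transfers 4500 bytes in 2 s, that is 36000 bits / 2 s = 0.018 Mbps; flow "b" has a single packet and is omitted.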
20Flow rate estimation (2)
- Real trace on a link between Indianapolis and Kansas
City.
[Figure: two scatter plots with axes Actual Flow Rate (Mbps) and Estimated Flow Rate (Mbps); panels: tw = 10 ms (0.15% of packets were sampled) and tw = 1 ms (1.5% of packets were sampled)]
21Flow rate estimation (3)
- Real trace on a U.S.-Japan link.
[Figure: two scatter plots with axes Actual Flow Rate (Mbps) and Estimated Flow Rate (Mbps); panels: tw = 10 ms (1.5% of packets were sampled) and tw = 1 ms (13.4% of packets were sampled)]
22Conclusion
- Sampling techniques are indispensable to today's
traffic measurement in the Internet.
- Fixed period sampling could bypass the problems of
the existing sampling technique (fixed rate
sampling).
- Fixed period sampling should be used together
with second packet sampling.
- Flow rate can be estimated well with fixed
period sampling.
23Thank you.
24Flow rate estimation under first packet sampling
[Figure: two scatter plots with axes Actual Flow Rate (Mbps) and Estimated Flow Rate (Mbps); panels: Indianapolis-Kansas City link and U.S.-Japan link; N = 1000, tw = 10 ms]
25Bayesian Estimates (2)
[Figure: two scatter plots with axes Actual Flow Rate (Mbps) and Estimated Flow Rate (Mbps); panels: Naive Estimator and Bayesian Estimator]
26Bayesian Estimates (1)
[Figure: two scatter plots with axes Actual Flow Rate (Mbps) and Estimated Flow Rate (Mbps); panels: Naive Estimator and Bayesian Estimator]
27Objectives of traffic measurements (2)
- QoS monitoring.
- Measurement of QoS properties.
- Validating service-level agreement.
- Usage-based accounting.
- Input to charging or billing.
28Shortcomings of the fixed rate sampling
- Is there any sampling strategy that works even
under massive DoS attacks?
[Figure: traffic versus time (s); traffic axis from 0 to 350, time axis from 0 to 900 s]
29Existing solutions to the fixed rate sampling
- Sampling rate adaptation
- First, the sampling rate is initialized to the
maximum rate at which the processor can operate.
- Then, the sampling rate is dynamically adjusted
based on the amount of consumed memory.
- Adaptive NetFlow.
- We propose another solution.
30Fixed period sampling (2)
- Timeout transaction
- Under sampled measurement, one cannot
know exactly when flows begin and end.
- (SYN or FIN packets may not be sampled.)
- Thus, flow entries that have not been seen during
the last N samplings are deleted from the flow cache.
- Due to this timeout transaction, the flow cache keeps
only flows whose packets have been detected at
least once during the last N samplings.
31Simulation experiments
- The accuracy of the flow-rate estimation was
investigated using real traffic data.
- Two real traces (traffic data) were used.
- Trace 1: traffic data measured by the PMA Project on a
backbone link between Indianapolis and Kansas City.
- Trace 2: traffic data measured and published by the
WIDE Project on a U.S.-Japan link.
32Flow rate estimation (2)
- Naïve estimation.
- Estimation based on the sampling frequency.
- Bayesian estimation.
- If we know the probability density function of
the flow rate as prior information, we can
apply a Bayesian estimator to improve the
estimation accuracy.
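The two estimators might be written as follows. The slides do not give their exact form, so both expressions are assumptions: $b$ denotes the bytes carried by a flow's sampled packets, $p$ the overall fraction of packets sampled, $T$ the flow duration, and $\pi(r)$ the prior density of the flow rate. The Bayesian estimator is taken here to be the posterior mean, which is one common choice.

```latex
\hat{r}_{\text{naive}} = \frac{b}{p\,T},
\qquad
\hat{r}_{\text{Bayes}} = \mathbb{E}[\,r \mid b\,]
  = \frac{\int r \, P(b \mid r)\, \pi(r)\, dr}{\int P(b \mid r)\, \pi(r)\, dr}.
```

The naive form scales what was sampled by the inverse of the sampling fraction, while the Bayesian form weights each candidate rate by how well it explains the samples and by how plausible it is under the prior.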