Fast, MemoryEfficient Traffic Estimation by Coincidence Counting - PowerPoint PPT Presentation

1 / 20
About This Presentation
Title:

Fast, MemoryEfficient Traffic Estimation by Coincidence Counting

Description:

... size N, memory size M, and k (the length of PT); Run and count the coincidences CC(f) for ... Multiple (i.e., k) comparisons increase the number of membership ... – PowerPoint PPT presentation

Number of Views:40
Avg rating:3.0/5.0
Slides: 21
Provided by: HuiZ5
Category:

less

Transcript and Presenter's Notes

Title: Fast, MemoryEfficient Traffic Estimation by Coincidence Counting


1
Fast, Memory-Efficient Traffic Estimation by
Coincidence Counting
Fang Hao1, Murali Kodialam1, T. V. Lakshman1,
Hui Zhang2, 1Bell Labs, Lucent
Technologies 2University of Southern
California
2
Traffic flow measurement
flows
packets
router
Flow rate statistics
network link
  • Related work
  • Sample and hold Estan et al. 2002, Smart
    Sampling Duffield et al. 2004, RATE Kodialam
    et al. 2004, ACCEL-RATEHao et al. 2004, etc.
  • Flow rate pf the proportion of packets
    belonging to flow f during a certain time period.
  • Arrival rate rf pf ? ? ? - total arrival
    rate (packets / second)

3
Problem definition
  • Whats the traffic flow composition through a
    network link during a certain time period, given
    an estimation accuracy requirement (?, ?)?
  • ? - confidence percentile
  • ? - estimation error.
  • For any given flow f with its rate be pf,
    determine an estimate pf for pf such that
  • pf ? (pf - ?/2, pf ?/2)
  • with probability greater than ?.
  • e.g., ? 0.9975, ? 0.0001.

4
One solution naive counting
  • Directly counting the number of packets for each
    flow.
  • Its simple
  • It estimates the rate of the flows rapidly
  • But it requires frequent access of a large amount
    of high-speed memory.
  • There can be millions of active flows on backbone
    links Duffield et al. 2001

5
Arrival model
  • Flow rates are stationary during the estimation
    period.
  • An arriving packet belongs to flow f with
    probability pf.
  • Packet arrivals are independent.
  • An arriving packet belongs to flow f independent
    of other arrivals.

6
Performance metrics
  • Sample size (estimation time) the number of
    arrivals needed to achieve the specified
    estimation accuracy.
  • Memory size the number of flows that are tracked
    during the estimation.

7
Can we
  • Can we design a scheme which runs as fast as
    naïve counting but only catches interesting
    flows?

8
Problem re-definition
  • Whats the traffic flow composition through a
    network link during a certain time period given
    an estimation accuracy requirement (?, ?, ?, ?)?
  • ? - confidence percentile ? - estimation
    error
  • ? - threshold rate ? - error relaxation
    factor.
  • For any given flow f with its rate be pf,
    determine an estimate pf for pf such that
  • pf? (pf - ?/2, pf ?/2) if pf ? ?
  • pf? (pf - ??/2, pf ??/2) if pf ? ?
  • with probability greater than ?.
  • e.g., ? 0.9975 (Z?3), ? 0.0001, ? 0.01, ?
    10.

9
Our solution CATE
  • Coincidence bAsed Traffic Estimation
  • Its simple
  • It estimates the size of the flows rapidly
  • It requires a small amount of memory.
  • A generalization of RATE scheme Kodialam et al.
    2004

10
CATE scheme description (I)
k-length
11
CATE scheme description (II)
  • Estimation procedure
  • Specify the estimation accuracy requirement (?,
    ?, ?, ?)
  • Calculate sample size N, memory size M, and k
    (the length of PT)
  • Run and count the coincidences CC(f) for each
    flow f
  • At the end of the estimation, output the
    estimated flow rates as
  • For each flow f in CCT, p(f)
  • For the rest of the flows, report 0 as the
    estimated rates.

12
Intuition behind CATE
  • Counting coincidence dramatically amplifies
    (squares) the ratio of catching probability
    between a large flow and a small flow.
  • Good news CATE sample size is still
    for the estimation accuracy requirement (?, ?, ?,
    ?).
  • Multiple (i.e., k) comparisons increase the
    number of membership testing to kN with N
    arrivals.
  • Good news the reduction on estimation variance
    due to increase in testing number is no less than
    the increase on estimation variance due to
    comparison correlation.

13
CATE sample size
  • Given the accuracy requirement (?, ?) and
    k-length predecessor table for CATE,
  • The minimal sample size

14
Theorem 1 CATE sample size
  • Given an estimation accuracy requirement (?, ?,
    ?, ?), if
  • then setting
  • gives
  • the sample size
  • 99.9
  • ? 0.0001
  • 0.01 ? ? 10 ? k 50
  • 0.001 ? ? 20 ? k 500

15
Theorem 2 CATE memory size
  • Given the estimation setting in Theorem 1,
  • the maximum expected memory
  • Totally 100,000 flows, rate range 10-6, 1
  • Z? 3
  • 0.002
  • CATE will catch no more than 1650 flows.
  • Naïve counting will record all flows w.h.p.

16
CATE experiment 1
  • Real IP traces from NLANR
  • A size-1000 buffer between new arrival and PT.

(The memory size 547)
(The memory size 1464)
17
CATE experiment 2 (I)
  • methodology
  • Totally 1 million flows, synthetic traces
  • 5 large flows, rates uniformly distributed
    between 0.1 and 0.2 of the entire traffic
  • 1000 medium-sized flows, rates uniformly
    distributed between 0.0001 and 0.0002 of the
    entire traffic
  • All the rest are small flows, each with rate
    roughly 10-7
  • Deliberately chose a short sample time (1 million
    packet time) to illustrate the impact of k, the
    predecessor table length.

18
CATE experiment 2 (II)
19
CATE experiment 2 (III)
20
Conclusion future work
  • CATE, a memory efficient traffic estimation
    scheme as fast as naïve counting.
  • Future work
  • Extending CATE for byte rate estimation.
  • Extending CATE to minimize the impact of arrival
    dependence without (excessive) additional
    overhead.
Write a Comment
User Comments (0)
About PowerShow.com