Origins of Long Range Dependence: Myths and Legends

1
Origins of Long Range Dependence: Myths and Legends
  • Aleksandar Kuzmanovic
  • 01/08/2001

2
Outline
  • Definitions
  • Why is LRD important?
  • Heavy tails
  • Producing self-similar traffic
  • Physical interpretation in LAN and WAN networks
  • Different hypotheses from around 10 papers

3
On the Self-Similar Nature of Ethernet Traffic,
W. Willinger, 1994
4
Definitions
  • Long range dependent process
  • if its autocorrelation function is nonsummable
  • Self-similar process
  • scaling behavior of finite dimensional
    distributions
  • $X = m^{1-H} X^{(m)}$ in distribution
  • Second order self-similar process
  • aggregated processes possess the same
    non-degenerate AC functions as the original
    process
  • $X$ and $m^{1-H} X^{(m)}$ have the same AC function
  • Self-similar processes have hyperbolically
    decaying autocorrelation functions - LRD can be
    characterized by a single parameter H
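
The aggregation behind these definitions can be sketched in a few lines of Python (standard library only). The function names, the block sizes and the variance-time estimator are illustrative assumptions and do not come from the slides: for a second-order self-similar series, Var(X^(m)) scales like m^(2H - 2), so the slope of log Var(X^(m)) against log m gives a rough estimate of H.

```python
import math
import random
import statistics


def aggregate(x, m):
    """Return X^(m): averages of x over non-overlapping blocks of length m."""
    n_blocks = len(x) // m
    return [sum(x[i * m:(i + 1) * m]) / m for i in range(n_blocks)]


def hurst_variance_time(x, block_sizes=(1, 2, 4, 8, 16, 32, 64)):
    """Estimate H from the slope of log Var(X^(m)) versus log m.

    Assumes Var(X^(m)) ~ m^(2H - 2), so H = 1 + slope / 2.
    """
    log_m = [math.log(m) for m in block_sizes]
    log_var = [math.log(statistics.pvariance(aggregate(x, m)))
               for m in block_sizes]
    mean_m = statistics.fmean(log_m)
    mean_v = statistics.fmean(log_var)
    slope = (sum((a - mean_m) * (b - mean_v) for a, b in zip(log_m, log_var))
             / sum((a - mean_m) ** 2 for a in log_m))
    return 1.0 + slope / 2.0


if __name__ == "__main__":
    rng = random.Random(1)
    white_noise = [rng.gauss(0.0, 1.0) for _ in range(20000)]
    # Independent samples carry no long-range dependence,
    # so the estimate should land near H = 0.5.
    print(round(hurst_variance_time(white_noise), 2))
```

An LRD trace would give an estimate noticeably above 0.5; the variance-time method is only one of several estimators and is known to be rough.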

5
Heavy tails (Noah effect)
  • Heavy-tailed distributions
  • LLCD (log-log complementary distribution) plot
  • Pareto: a typical example
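
As a small illustration of the heavy-tail check mentioned above, the following Python sketch draws Pareto samples and builds the points of an LLCD plot. The function names and the parameter values are assumptions made for this example; for a Pareto tail with P[X > x] = (x_min / x)^alpha the LLCD plot is a straight line with slope -alpha.

```python
import math
import random


def pareto_sample(alpha, x_min, rng):
    """Draw one Pareto(alpha, x_min) value by inverse-transform sampling."""
    u = 1.0 - rng.random()          # u lies in (0, 1], avoids division by zero
    return x_min / u ** (1.0 / alpha)


def llcd_points(samples):
    """Return (log10 x, log10 P[X > x]) pairs from the empirical distribution."""
    xs = sorted(samples)
    n = len(xs)
    # Skip the largest sample, whose empirical tail probability is zero.
    return [(math.log10(x), math.log10((n - i - 1) / n))
            for i, x in enumerate(xs[:-1])]


def llcd_slope(points):
    """Least-squares slope of the LLCD plot; close to -alpha for a Pareto tail."""
    mean_x = sum(p[0] for p in points) / len(points)
    mean_y = sum(p[1] for p in points) / len(points)
    num = sum((x - mean_x) * (y - mean_y) for x, y in points)
    den = sum((x - mean_x) ** 2 for x, _ in points)
    return num / den


if __name__ == "__main__":
    rng = random.Random(7)
    data = [pareto_sample(1.5, 1.0, rng) for _ in range(20000)]
    print(round(llcd_slope(llcd_points(data)), 2))   # roughly -1.5
```

On measured data only the upper tail is expected to be linear, so in practice the fit would be restricted to large x.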

6
Producing Self-Similar Traffic
  • 1. Multiplexing ON/OFF sources that have a fixed
    rate in ON periods and ON/OFF period lengths that
    are heavy tailed.
  • Aggregate traffic is fBm with $H = (3 - \alpha)/2$
  • 2. M/G/∞ queue model
  • implies that multiplexing constant-rate
    connections with Poisson connection arrivals and
    a heavy-tailed distribution for connection
    lifetimes would result in self-similar traffic
  • 3. If inter-arrival packet times are i.i.d. Pareto
    and we then consider the corresponding count process
    (the number of arrivals in consecutive
    intervals), we have pseudo self-similar traffic
    (Paxson, Floyd), or perhaps even self-similar
    traffic (L. Lipsky)?
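
A minimal Python sketch of the first construction (multiplexed ON/OFF sources) follows. The slot-based discretisation, the helper names and the parameter values are assumptions for illustration. The relation H = (3 - alpha)/2 in `hurst_from_alpha` is the standard limit result for this model, quoted from the literature rather than derived here.

```python
import math
import random


def hurst_from_alpha(alpha):
    """Limit result for heavy-tailed ON/OFF sources with 1 < alpha < 2."""
    if not 1.0 < alpha < 2.0:
        raise ValueError("alpha must lie strictly between 1 and 2")
    return (3.0 - alpha) / 2.0


def pareto_slots(alpha, rng):
    """Pareto-distributed period length, rounded up to whole slots (>= 1)."""
    u = 1.0 - rng.random()                  # u lies in (0, 1]
    return math.ceil(1.0 / u ** (1.0 / alpha))


def on_off_source(n_slots, alpha_on, alpha_off, rng):
    """One source: rate 1 per slot while ON (fixed rate), 0 while OFF."""
    trace = []
    on = rng.random() < 0.5                 # random initial state
    while len(trace) < n_slots:
        alpha = alpha_on if on else alpha_off
        # Cap the period so a very long draw cannot exceed the trace length.
        length = min(pareto_slots(alpha, rng), n_slots - len(trace))
        trace.extend([1 if on else 0] * length)
        on = not on
    return trace


def aggregate_traffic(n_sources, n_slots, alpha_on, alpha_off, seed=0):
    """Sum the per-slot rates of independent ON/OFF sources."""
    rng = random.Random(seed)
    total = [0] * n_slots
    for _ in range(n_sources):
        source = on_off_source(n_slots, alpha_on, alpha_off, rng)
        for t, rate in enumerate(source):
            total[t] += rate
    return total


if __name__ == "__main__":
    traffic = aggregate_traffic(n_sources=50, n_slots=2000,
                                alpha_on=1.7, alpha_off=1.2, seed=3)
    print(max(traffic), hurst_from_alpha(1.2))
```

The ON and OFF exponents in the usage lines are the Ethernet source-level values quoted later in the deck; any pair of values between 1 and 2 would serve.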

7
Questions we want to answer
  • What physical activity causes LRD?
  • What is the role of protocols (TCP and MAC layer
    protocols)?
  • What is the role of limited resources (i.e.
    bandwidth)?
  • What model fits best to each of the assumptions?
  • What is the largest time-scale over which the
    correlation is present?
  • Self-similarity vs. pseudo self-similarity and
    relevance

8
Statistical Analysis of Ethernet LAN Traffic at
the Source Level, W. Willinger, 1997, I
9
Statistical Analysis of Ethernet LAN Traffic at
the Source Level, W. Willinger, 1997, II
  • Model 1 (heavy tailed ON/OFF activity at the
    source level) is widely accepted
  • Result proven theoretically
  • Noah effect (heavy-tailed periods)
  • ON periods: alpha = 1.7
  • OFF periods: alpha = 1.2
  • TCP traffic measured most of the time...
  • Higher load - H increases
  • WAN measurements do not fit into this model
  • connections typically do not last long

10
Wide Area Traffic: The Failure of Poisson
Modeling, V. Paxson, S. Floyd, 1995
  • Summary of ways to produce LRD traffic
  • WAN (TCP) traffic for TELNET and FTP applications
  • TELNET connection arrivals appear to be Poisson,
    but packet arrivals are not
  • Single TELNET connection is LRD
  • Model 3: inter-arrival times are i.i.d. Pareto
  • Aggregate is also LRD, but there is no analytical
    proof
  • FTP traffic is also LRD, yet none of the models fits
    because of limited resources.
  • Aggregated traffic is not fBm (a single H is not
    enough)
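
The Model 3 construction (i.i.d. Pareto inter-arrival times, then counting arrivals per interval) can be sketched as below. The function names, the shape and scale values and the use of the index of dispersion as a burstiness measure are assumptions for illustration; the slides do not state which parameter range was used.

```python
import random


def pareto_interarrival(alpha, scale, rng):
    """One Pareto(alpha) inter-arrival time with minimum value `scale`."""
    u = 1.0 - rng.random()                  # u lies in (0, 1]
    return scale / u ** (1.0 / alpha)


def count_process(alpha, scale, interval, n_intervals, seed=0):
    """Number of arrivals in each of n_intervals consecutive intervals."""
    rng = random.Random(seed)
    counts = [0] * n_intervals
    horizon = interval * n_intervals
    t = pareto_interarrival(alpha, scale, rng)
    while t < horizon:
        index = min(int(t // interval), n_intervals - 1)
        counts[index] += 1
        t += pareto_interarrival(alpha, scale, rng)
    return counts


def index_of_dispersion(counts):
    """Variance-to-mean ratio of the counts; equals 1 for a Poisson process."""
    mean = sum(counts) / len(counts)
    var = sum((c - mean) ** 2 for c in counts) / len(counts)
    return var / mean


if __name__ == "__main__":
    counts = count_process(alpha=1.5, scale=0.1, interval=1.0,
                           n_intervals=1000, seed=5)
    print(sum(counts), index_of_dispersion(counts))
```

Recomputing the index of dispersion for longer intervals is one simple way to see whether burstiness persists across time scales, which is the property the slide contrasts with Poisson arrivals.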

11
Explaining WWW Traffic Self-Similarity, M.
Crovella, 1995
  • WWW traffic is self-similar
  • but only when load is high (i.e. in busiest
    hours)
  • Authors force model 1 (ON/OFF model)
  • Heavy-tailed distributions of
  • transfer times (alpha = 1.21)
  • user requests for documents (alpha = 1.06)
  • document sizes available in the Web (alpha =
    1.05)
  • user think times (alpha = 1.5)
  • H increases as the load increases (same as in LAN)

12
On the Relationship between File Sizes, Transport
Protocols, and Self-Similar Network Traffic, M. Crovella, 1996
  • Model 1: the success of this simple model is
    surprising given that it ignores nonlinearities
    arising in real networks
  • Hypothesis
  • Heavy tailed file size distributions together
    with TCP is responsible for LRD
  • if UDP is used, there is little or no LRD
  • Explanation
  • In some sense, the effect of the unaccounted-for
    nonlinearity is reflected back as a stretching-in-time
    effect, thus conforming to the model's
    original suppositions
  • Other interesting material: a mix of Pareto and
    exponential background traffic

13
On the Propagation of LRD in the Internet, A.
Veres, 2000, I
  • Not about roots, but about propagation of
    self-similarity by TCP
  • A(t) = C - B(t)
  • TCP is a linear system beyond a characteristic
    time scale
  • if it adapts well to a background traffic, it
    itself becomes self-similar
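
Reading the relation on this slide as A(t) = C - B(t), where B(t) is the background traffic, C the link capacity and A(t) the TCP flow's rate (this reading and all names below are assumptions for illustration), a short Python sketch shows why such a flow inherits the correlation structure of the background: subtracting from a constant flips the sign of every deviation from the mean, so the autocovariance at every lag is unchanged.

```python
def autocovariance(x, lag):
    """Sample autocovariance of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    return sum((x[t] - mean) * (x[t + lag] - mean)
               for t in range(n - lag)) / n


def tcp_share(background, capacity):
    """Idealised TCP flow that fills whatever the background leaves free."""
    return [capacity - b for b in background]


if __name__ == "__main__":
    background = [3, 7, 2, 8, 5, 9, 1, 6]      # made-up rates, capacity 10
    tcp = tcp_share(background, capacity=10)
    for lag in range(4):
        print(lag, autocovariance(background, lag), autocovariance(tcp, lag))
```

The sketch is idealised: it assumes perfect adaptation at every time step, whereas the slide only claims linear behaviour beyond a characteristic time scale.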

14
On the Propagation of LRD in the Internet, A.
Veres, 2000, II
  • Experimental proof
  • NY-Budapest file transfer: source is not LRD, yet
    traffic is LRD (H = 0.76)
  • Max time scale = 8 min
  • Also, if there are a number of on-off TCP
    connections, they can spread LRD
  • W. Willinger obviously does not like this paper
  • This is a fraud and has no relevance for LRD
    observed on link level...
  • Protocols have no impact on LRD, they just have
    to send the data generated by applications...

15
TCP Congestion Control and Heavy-Tails, M.
Crovella, 2000, I
  • Switch to Model 3 (Heavy-tailed inter-packet
    arrivals)
  • Although heavy-tailed flow lengths are commonly
    associated with heavy-tailed file sizes, there is
    no strong correlation between file sizes and
    transmission times
  • It has been shown that TCP can show heavy-tailed
    inter-arrival times under some conditions
  • Because most of the connections are short-lived (!),
    only slow start and exp. back-off were considered

16
TCP Congestion Control and Heavy-Tails, M.
Crovella, 2000, II
  • Simple Markov chain model for exp. backoff and
    slow start, parameterized by the loss probability p
  • State probability with different loss rates
  • For alpha to be between 1 and 2, p has to be
    between 1/8 and 1/4
  • ...but for a different model
  • p increases => H increases

17
TCP Congestion Control and Heavy-Tails, M.
Crovella, 2000, III
  • Pathological TCP connections: 15 packets
  • Analytical model not that good (bounds are
    loose)
  • For this set-up, correlation up to 1000 sec
  • For larger file sizes, up to 200-300 sec
  • Under certain conditions, heavy tailed
    transmission times can occur even in the absence
    of any variability in file sizes
  • Future work: consider the variability in
    round-trip time estimation

18
On the Autocorrelation Structure of TCP Traffic,
Don Towsley, 2000, I
  • Answer to previous two papers
  • TCP can create self-similarity but over finite
    range of time scales - pseudo self similarity
  • but everything in nature is finite (thus
    pseudo)
  • Also criticizes the pathological model of the previous
    paper, but the authors themselves use a pathological model
    of a different kind (a source that always has packets to send)
  • Separate Markovian models for Congestion
    Avoidance (CA) and Timeout (TO)
  • Simulated these two models with different loss
    probability parameters

19
On the Autocorrelation Structure of TCP Traffic,
Don Towsley, 2000, II
  • Range of time scales observed from the simulation
    ($2^6$ RTT (2.5 to 10)) => $2^9$ RTT
  • Explanation on why aggregate is self-similar
  • independent bottlenecks (at the edge)
  • aggregate of independent pseudo-self-similar
    flows should be self-similar itself

20
On the Autocorrelation Structure of TCP Traffic,
Don Towsley, 2000, III
  • About the Veres paper
  • compute loss probability (0.08 to 0.14)
  • TO model predicts H = 0.69-0.72 (actually measured:
    0.74)
  • Time scale goes up to $2^6$ RTO (also near the measured
    value)
  • Experiments (file transfers)
  • North-South America
  • Measurements: p = 0.13, H = 0.77, ts = ($2^7$ to
    $2^8$) RTT
  • TO model: p = 0.12, H = 0.72, ts = ($2^7$ to
    $2^9$) RTT
  • East - West Coast
  • Measurements: p = 0.018, H = 0.86, ts = $2^6$ RTT
  • CA model: p = 0.018, H = 0.75, ts =
    $2^4$ RTT
  • One should be careful when attributing the origin
    of traffic characteristics to a specific cause

21
Protocols Can Make Traffic Appear Self-Similar,
Jon Peha, 1997. I
  • How a basic retransmission mechanism can cause
    self-similarity
  • No model, only experimental investigation
  • Simple single queue (bottleneck) model
  • Input traffic is Poisson; retransmissions are
    bursty
  • As time-scale gets larger, burstiness from
    original Poisson traffic decreases, but
    burstiness from retransmissions stays the same!
  • It is unlikely that the retransmission
    mechanism causes truly self-similar traffic;
    rather, it causes pseudo self-similarity

22
Protocols Can Make Traffic Appear Self-Similar,
Jon Peha, 1997. II
  • Pictorial proof

23
Protocols Can Make Traffic Appear Self-Similar,
Jon Peha, 1997. III
  • Cut-off time scales observed
  • 150 Mbps link rate, 500-bit packets, RTT = 60 msec
  • TS = 5 minutes
  • 10 Mbps Ethernet, no. of retransmissions = 5, To = 125
  • TS in range of minutes
  • For larger To, it is possible to reach time
    scales measured at Bellcore
  • I have computed the cut-off time scale for the Veres
    paper
  • 128 Kbps, Tout = 10 RTT = 2 sec, TS = 8 min
  • If this effect is found to be as strong in more
    complex models, this could be a significant cause

24
The Second-order Characteristics of TCP,
J.Y.Boudec, 1996, I
  • Pseudo self-similarity (TS = 20-30 sec)
  • Minimum bottleneck bandwidth 34 Mbps (?)
  • Two main reasons (both heavy-tailed)
  • Burst length arrivals
  • Round trip time
  • Real network measurements
  • Figure - missing

25
The Second-order Characteristics of TCP,
J.Y.Boudec, 1996, II
  • Even for a 34 Mbps link and utilization of 25%, the
    arrival bursts are eliminated and the inter-packet
    times are dependent on the round trip
    times
  • The aggregate of TCP connections has the same H
    as a single TCP connection
  • It seems likely that the heavy-tailed
    distributions observed in Willinger's work were a
    result of, among other things, the heavy-tailed
    distribution of the round trip time

26
More on RTTs
  • Why are round trip times heavy-tailed?
  • Because of TCP congestion control?
  • Because of retransmissions?
  • Because of variety of destinations?
  • It can be heavy-tailed even without any
    congestion protocol or different destinations!
  • Measurement and Analysis of LRD Behavior of
    Internet Packet Delay, M. Borella, Infocom 97
  • Constant UDP transmissions - LRD response
  • Is cross-traffic heavy-tailed?
  • Or multiple bottlenecks assumption?
  • Simple example (not through bandwidth adaptation,
    but through RTT adaptation)

27
Summary
  • Heavy-tailed parameters
  • File sizes
  • Connection life-times
  • Inter-arrival packet times
  • Document sizes available in the web
  • User think times
  • TELNET packet arrivals
  • Round trip times
  • Pseudo self-similarity
  • it should be clear that the range of time scales
    covered is far beyond dominant time scales, and
    as far as packet loss is concerned, this is
    relevant

28
Conclusions
  • One should be careful when attributing the origin
    of traffic characteristics to a specific cause
  • There is more than one physical activity causing
    LRD
  • The influence of protocols (TCP) is more than relevant
  • Time scales covered are relevant in all of the
    generation, time-stretching and propagation
    hypotheses
  • Model 3 (inter-arrival times i.i.d. Pareto) plus
    heavy-tailed file sizes (introducing congestion)
    is promising
  • Analytical proof for aggregate is missing
    (simulation proof reported in 3 papers)
  • Round-trip times hypothesis might be promising -
    supports Veres's idea in a slightly different way