Title: QoS is Quite often Stupid Why is traffic so hard to understand
1QoS is Quite often Stupid! Why is traffic so hard to understand?
LanMan 2005 Chania, Crete
- Jim Roberts, France Telecom
2What's stupid and what's not: the example of Intserv
- Guaranteed Service is "stupid"
- token bucket is not a useful traffic descriptor
- deterministic worst case guarantees are unreasonably pessimistic
- but the RFC drafters were not stupid...
- they looked at QoS from a different perspective...
- ...and deterministic "network calculus" is not stupid
- brilliant mathematics...
- ... but not appropriate for networking
- though, of course, we are all "quite often stupid"!
3Talk outline
4MPLS Diffserv-aware Traffic Engineering: 3 quotes
- MPLS TE
- "For the purpose of bandwidth allocation, a single canonical value of bandwidth requirements can be computed from a traffic trunk's traffic parameters" (RFC 2702, Sept 1999)
- Diffserv
- "Diffserv deployment is simple in that it lets an SP support different service levels merely by using different under- or overprovisioning ratios per class" (Cisco, IEEE Internet Computing, Feb 2005)
- Diffserv TE
- "MPLS Diffserv-aware TE combines the advantages of both Diffserv and MPLS. The result is the ability to give strict QoS guarantees while optimizing use of network resource" (Juniper White paper, 2004)
5MPLS Diffserv-aware Traffic Engineering: coloured pipes
6Diffserv from traffic descriptor to SLS?
- principle of Diffserv
- based on a per-class traffic descriptor,
- satisfy the terms of a per-class SLS
7From traffic descriptor to SLS?
- principle of Diffserv
- based on a per-class traffic descriptor,
- satisfy the terms of a per-class SLS
- but how?
- fit a leaky bucket and make worst case traffic assumptions...
- or "merely use different under- and over-provisioning ratios per class"
8Traffic and performance
[Figure: demand, capacity and performance]
Delay < 5 ms, Jitter < 1 ms, Loss < 0.001
- bandwidth
- how it is shared
- packet latency
- response time
9Traffic and performance
- e.g., an M/M/1 queue
- $E[\text{delay}] = \tau \rho / (1 - \rho)$, $\tau$ = packet time, $\rho$ = link load
- very little scope for service differentiation
- quality of service is "good" or "bad"
- a need for overload control (when $\rho$ reaches 1)
- e.g., finite buffer, admission control
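To see why the M/M/1 model leaves so little room between "good" and "bad", the mean delay (packet time × load / (1 − load)) can be evaluated at a few loads. This is only a sketch: the 12 µs packet time (a 1500-byte packet at 1 Gbps) and the function name are assumptions made for illustration, not figures from the talk.

```python
def mm1_mean_wait(packet_time, load):
    """Mean M/M/1 waiting time: packet_time * load / (1 - load)."""
    if not 0 <= load < 1:
        raise ValueError("the queue is unstable unless 0 <= load < 1")
    return packet_time * load / (1 - load)


PACKET_TIME = 12e-6  # assumed: one 1500-byte packet at 1 Gbps

for load in (0.5, 0.9, 0.99):
    wait_us = mm1_mean_wait(PACKET_TIME, load) * 1e6
    print(f"load {load:.2f}: mean wait {wait_us:.0f} us")
```

With these assumed values the mean wait is about 12 µs at load 0.5 and 108 µs at load 0.9, and only passes 1 ms at load 0.99, just before the formula diverges.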
10Flow level characterization of Internet traffic
- traffic is composed of flows
- an instance of some application
- (same identifier, minimum packet spacing)
- flows are "streaming" or "elastic"
- streaming SLS = "conserve the signal"
- elastic SLS = "transfer as fast as possible"
- an essential characteristic: the flow peak rate
- streaming peak rate = coding rate
- elastic peak rate = exogenous rate limit (access line,...)
[Figure: peak rate of a streaming flow and of an elastic flow]
11Understanding traffic: 2 performance models
- 1. bufferless statistical multiplexing
- for streaming flows
- 2. statistical bandwidth sharing
- for elastic flows
13Bufferless statistical multiplexing
- $\Pr[\text{input rate} > \text{output rate}] < \epsilon$ ⇒ controlled packet level performance
- $\text{loss} \approx E[(\text{input} - \text{output})^+] / E[\text{input}]$
- delay: a modulated Poisson arrival process, M/M/1 results apply
- e.g., $\Pr[\text{queue} > 83 \text{ pkts}] \approx 10^{-4}$ at load 90% (83 pkts ≈ 1 ms at 1 Gbps)
- acceptable load depends on flow peak rate
- low rate ⇒ low variance ⇒ high utilization
- high rate ⇒ high variance ⇒ low utilization
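The 83-packet example can be checked with the M/M/1 tail formula, Pr[more than n packets in the system] = load^(n+1), together with the time needed to transmit 83 packets at 1 Gbps. This is a minimal sketch; the 1500-byte packet size is an assumption, since the slide does not state one.

```python
def mm1_tail(load, n):
    """Pr[more than n packets in an M/M/1 system] = load ** (n + 1)."""
    return load ** (n + 1)


def drain_time(n_packets, packet_bytes, link_bps):
    """Seconds needed to transmit n_packets back to back on the link."""
    return n_packets * packet_bytes * 8 / link_bps


p = mm1_tail(0.9, 83)
t = drain_time(83, 1500, 1e9)  # 1500-byte packets are an assumption
print(f"Pr[> 83 pkts] = {p:.1e}, 83 pkts = {t * 1e3:.3f} ms")
```

Under these assumptions the tail probability is about 1.4 × 10^-4 and the backlog is worth just under 1 ms, in line with the orders of magnitude quoted on the slide.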
14Bufferless statistical multiplexing for EF?
- performance depends on load and flow peak rates
- i.e., mean and variance of EF rate
- what SLS guarantees by traffic conditioning at the edge?
- no control on flow paths (on egress, in particular)
- no control of flow peak rates
- but there may be pragmatic guarantees
- giving priority to EF packets
- ensuring load is very small ("overprovisioning")
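The dependence of acceptable load on flow peak rate can be illustrated with a simple on-off model: each flow independently sends at its peak rate with some probability and is otherwise silent, and the link overflows when the active peaks sum to more than its capacity. The model, the parameter values and the function names below are all assumptions chosen for illustration; they are not taken from the talk.

```python
from math import comb


def overflow_prob(n_flows, p_on, peak, capacity):
    """Pr[sum of active peak rates > capacity] for independent on-off flows."""
    k_max = min(int(capacity // peak), n_flows)  # most flows that fit at once
    cdf = sum(comb(n_flows, k) * p_on**k * (1 - p_on) ** (n_flows - k)
              for k in range(k_max + 1))
    return max(0.0, 1.0 - cdf)


def max_utilization(p_on, peak, capacity, eps):
    """Highest mean load (fraction of capacity) with overflow_prob below eps."""
    n = 0
    while overflow_prob(n + 1, p_on, peak, capacity) < eps:
        n += 1
    return n * p_on * peak / capacity


# Assumed values: capacity 100 units, flows active 10% of the time.
for peak in (1, 10):
    util = max_utilization(p_on=0.1, peak=peak, capacity=100, eps=1e-4)
    print(f"peak = {peak:>2} units: acceptable load = {util:.2f}")
```

Flows whose peak is 1% of the link rate can be multiplexed to a much higher load than flows whose peak is 10% of the link rate, which is the mean-and-variance effect the slide refers to.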
15Statistical bandwidth sharing
- fair sharing of bandwidth between elastic flows
- realized approximately by TCP congestion control
- performance measured by flow response times
- depends on changing population of active flows
16Performance of fair sharing
- a fluid simulation
- Poisson flow arrivals
- no exogenous peak rate limit ⇒ flows are all bottlenecked
- load = 0.5 (arrival rate × size / capacity)
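A fluid simulation of this kind takes only a few lines. The sketch below assumes exponentially distributed flow sizes, unit capacity and a fixed random seed; none of these choices comes from the talk. Flows arrive as a Poisson process and all flows in progress share the link equally.

```python
import random


def simulate_ps(load, capacity=1.0, mean_size=1.0, n_arrivals=20000, seed=1):
    """Time-average number of flows in progress on a fairly shared link."""
    rng = random.Random(seed)
    arrival_rate = load * capacity / mean_size
    remaining = []   # remaining volume of each flow in progress
    now = 0.0
    area = 0.0       # time integral of the number of flows in progress
    next_arrival = rng.expovariate(arrival_rate)
    arrivals = 0
    while arrivals < n_arrivals:
        n = len(remaining)
        if n:
            # The flow with least remaining volume finishes first.
            i_min = min(range(n), key=remaining.__getitem__)
            next_departure = now + remaining[i_min] * n / capacity
        else:
            next_departure = float("inf")
        t_next = min(next_arrival, next_departure)
        dt = t_next - now
        area += n * dt
        if n:
            served = dt * capacity / n   # each flow gets an equal share
            remaining = [r - served for r in remaining]
        now = t_next
        if next_departure <= next_arrival:
            remaining.pop(i_min)
        else:
            remaining.append(rng.expovariate(1.0 / mean_size))
            arrivals += 1
            next_arrival = now + rng.expovariate(arrival_rate)
    return area / now


for load in (0.5, 0.9):
    print(f"load {load}: mean flows in progress = {simulate_ps(load):.2f}")
```

Replacing the exponential size draw with another distribution of the same mean is a simple way to experiment with how the size distribution affects the result.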
17The process of flows in progress depends on link
load
[Figure: flows in progress, load = 0.5]
18The process of flows in progress depends on link
load
[Figure: flows in progress (axis 0 to 30), load = 0.9]
19Insensitivity of processor sharing
- link sharing behaves like an M/M/1 queue
- for any size distribution, correlated arrival process
- performance depends on load, $\rho$ = (arrival rate × size) / capacity
- $E[\text{flows in progress}] = \rho / (1 - \rho)$, $E[\text{throughput}] = C (1 - \rho)$
- small number of bottlenecked flows in normal load ($\rho \ll 1$)
- but, in practice, $\rho < 0.5$ and $E[\text{flows in progress}] = O(10^4)$!
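The two processor sharing formulas (mean flows in progress = load / (1 − load), mean throughput = capacity × (1 − load)) are easy to tabulate. In this sketch the 1000 Mbps capacity and the function name are assumptions made for illustration.

```python
def ps_link(load, capacity):
    """Return (mean flows in progress, mean flow throughput) for a PS link."""
    if not 0 <= load < 1:
        raise ValueError("the model is only stable for 0 <= load < 1")
    return load / (1 - load), capacity * (1 - load)


CAPACITY_MBPS = 1000.0  # assumed link rate

for load in (0.3, 0.5, 0.9, 0.99):
    flows, throughput = ps_link(load, CAPACITY_MBPS)
    print(f"load {load:.2f}: {flows:6.1f} flows, {throughput:6.1f} Mbps each")
```

Under the model only a handful of flows are in progress at moderate load. The tens of thousands seen in practice are flows limited by their own peak rate rather than by this link.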
20Trace data
- an Abilene link (Indianapolis-Cleveland) from NLANR
- OC 48, utilization 16%
- flow rates ∈ (10 Kb/s, 10 Mb/s)
- 7000 flows in progress at any time
- the link is "transparent"
- packet level performance of bufferless
multiplexing
21Insensitivity of processor sharing: a miracle of queueing theory!
- link sharing behaves like an M/M/1 queue
- for any size distribution, correlated arrival process
- performance depends on load, $\rho$ = (arrival rate × size) / capacity
- $E[\text{flows in progress}] = \rho / (1 - \rho)$, $E[\text{throughput}] = C (1 - \rho)$
- except when flow rate $\ll C$
- rapid performance degradation as $\rho \to 1$
no scope for differentiation
22Number of flows in progress depends on link load
[Figure: flows in progress (axis 0 to 30), load = 1.1]
23Overload control by flow blocking
- according to the model, throughput is zero if demand ≥ C!
- with or without rate limit
- in fact, stability ensured by flow aborts with the result
- ineffective bandwidth utilisation,
- low throughput, decreasing with increasing patience and perseverance
- an alternative: flow-level admission control
- reject new demands when a saturation is detected
- to preserve the throughput of flows in progress
25Statistical bandwidth sharing for AF?
- no scope for throughput differentiation in the core
- peak rate throughput, even at high load
- but severe degradation in overload: throughput → 0
- differentiation can protect premium traffic in overload
- priority to premium flow packets (PQ as good as any other scheduling)
- maintains premium throughput... but best effort throughput → 0
- an alternative: differentiation by admission control
- reject best effort flows at onset of congestion
26MPLS Traffic Engineering
- RFC 2702. Requirements for Traffic Engineering over MPLS
- "For the purpose of bandwidth allocation, a single canonical value of bandwidth requirements can be computed from a traffic trunk's traffic parameters. Techniques for performing these computations are well known. One example of this is the theory of effective bandwidth."
- traffic engineering is based on the assumption that traffic trunks can be represented as a constant bandwidth deduced from traffic parameters
- in routing traffic trunks and LSPs (constrained routing)
- in managing priorities, preemptions,...
e.g., Russian dolls...
... or is this stupid?
27Notes on effective bandwidth (Kelly 1996)
- let X(t) be the amount of data arriving in the interval (0,t)
- the effective bandwidth is a function $\alpha(s,t)$
- examples
- an exponential on-off source
28Notes on effective bandwidth (Kelly 1996)
- let X(t) be the amount of data arriving in the interval (0,t)
- the effective bandwidth is a function $\alpha(s,t)$
- examples
- a periodic on-off source
[Figure: $\alpha(s,t)$ plotted against s and t]
30Notes on effective bandwidth (Kelly 1996)
- let X(t) be the amount of data arriving in the interval (0,t)
- the effective bandwidth is a function $\alpha(s,t)$
- "The effective bandwidth of a source depends sensitively upon the statistical properties of the source yet these properties may not be known with certainty either to the user responsible for the source or to the network"
- "The appropriate choice of space and time scale will depend upon the characteristics of the resource such as its capacity, buffer size, traffic mix and scheduling policy."
- conclusion: the effective bandwidth is not a canonical value
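Kelly's definition is $\alpha(s,t) = \frac{1}{st} \log E[e^{sX(t)}]$, where s is the space scale and t the time scale. The sketch below evaluates it for a deliberately simple source, assumed here for illustration only: over an interval of length t the source either sends at its peak rate throughout (with probability p_on) or stays silent. All parameter values are hypothetical.

```python
import math


def effective_bandwidth(s, t, p_on, peak):
    """alpha(s, t) = log E[exp(s * X)] / (s * t), X = peak * t w.p. p_on, else 0."""
    mgf = 1 - p_on + p_on * math.exp(s * peak * t)
    return math.log(mgf) / (s * t)


P_ON, PEAK, T = 0.1, 1.0, 1.0  # illustrative values, not from the talk

for s in (0.01, 1.0, 10.0, 50.0):
    alpha = effective_bandwidth(s, T, P_ON, PEAK)
    print(f"s = {s:>5}: alpha = {alpha:.3f}")
```

For this source the value rises from close to the mean rate (0.1) at small s towards the peak rate (1.0) at large s, so the same source has no single bandwidth figure independent of the link it is offered to.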
31"A single canonical value"?
- "...deduced from a traffic trunk's traffic parameters"
- e.g., from the parameters of a leaky bucket?
- or derived by measurements...
- the mean rate is sufficient for elastic traffic
- but can it be measured on every traffic trunk?
- a fundamental problem: the representation of variable rate traffic
- no satisfactory solution...
- better to avoid it by flow-aware networking!
32Consequences of "stupid" QoS
- no relation between traffic descriptor and resource allocation
- the leaky bucket is not a satisfactory traffic descriptor
- SLS cannot be guaranteed "merely" by using different under- or overprovisioning ratios for different classes
- performance depends on flow rates
- limited protection against overload
- no guarantees for lowest priority flows
- limited protection against "misbehaviour"
- only aggregate policing, relies on TCP-friendliness
- traffic engineering is as if traffic trunks are constant bit rate
- but there is no "canonical bandwidth value" for variable rate flows
- though, of course, the above are not visible in an overprovisioned network!
33A fundamental relationship for engineering
[Figure: stress, strength, "it works"]
39An alternative to MPLS Diffserv-aware TE
Flow-aware networking
40Streaming flows and elastic flows
- an essential traffic characteristic: the flow peak rate
- streaming flows: peak determined by coding
- elastic flows: peak determined by other links on path
41Three link operating regimes
"congested"
low throughput, significant loss
needs differentiation
needs overload control
FIFO sufficient
42Flow-aware networking: enhanced best effort
- network elements are aware of user-defined flows
- identifying flows "on the fly"
- enforced per-flow fair sharing of link bandwidth
- instead of relying on TCP...
- per-flow admission control
- to maintain performance in overload
- realizes implicit streaming/elastic differentiation
- low latency for streaming flows
- fair shares for elastic flows
43User-defined flows
- defined by flow id...
- e.g., IPv6 flow label + IP addresses
- ... and inter-packet time < T
- e.g., T = 2 s
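Identifying flows "on the fly" from a flow id and an inter-packet timeout needs very little state. The following is a minimal sketch; the class name, the tuple used as flow id and the example addresses are hypothetical.

```python
class FlowTable:
    """Tracks flows defined by a flow id and an inter-packet timeout T."""

    def __init__(self, timeout=2.0):
        self.timeout = timeout   # T, in seconds
        self.last_seen = {}      # flow id -> arrival time of its last packet

    def observe(self, flow_id, now):
        """Record a packet arrival; return True if it starts a new flow."""
        last = self.last_seen.get(flow_id)
        is_new = last is None or now - last >= self.timeout
        self.last_seen[flow_id] = now
        return is_new


table = FlowTable(timeout=2.0)
flow = ("2001:db8::1", "2001:db8::2", 7)   # (source, destination, flow label)
for t in (0.0, 1.5, 4.0):
    print(t, table.observe(flow, t))
```

A packet that arrives after a silence of T or more starts a new flow even if it carries an identifier seen before.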
44Per-flow fair queueing
- schedulers share link bandwidth equally between active flows
- e.g., Deficit Round Robin, Self-Clocked Fair Queueing,...
- realizing max-min fair sharing
- as fair as possible accounting for all path constraints
- many advantages
- performance not vulnerable to misbehaviour
- not restricted to TCP friendly transport
- implicit streaming/elastic differentiation,...
[Figure: incoming rate]
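On a single link, max-min fair sharing can be computed by "water-filling": flows limited elsewhere keep their peak rate, and the others split what is left equally. The sketch below assumes each flow's constraints on the rest of its path are summarised by one peak rate; the function name and example values are illustrative.

```python
def max_min_shares(capacity, peak_rates):
    """Max-min fair rates on one link for flows limited by their peak rates."""
    shares = [0.0] * len(peak_rates)
    remaining = float(capacity)
    # Serve flows in order of increasing peak rate (water-filling).
    order = sorted(range(len(peak_rates)), key=lambda i: peak_rates[i])
    for position, i in enumerate(order):
        fair = remaining / (len(order) - position)  # equal split of the rest
        shares[i] = min(peak_rates[i], fair)
        remaining -= shares[i]
    return shares


print(max_min_shares(10, [1, 2, 20, 20]))
```

In this example the two low-rate flows get their full peak rate, as a streaming flow would, while the two unconstrained flows share the remaining 7 units equally. This is the implicit streaming/elastic differentiation listed among the advantages.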
45Example: Deficit Round Robin
- ActiveList: a table of flows to be scheduled
- FlowID, DeficitCounter, Quantum, position in round
- each flow transmits up to Quantum bytes per round
- DeficitCounter accounts for variable packet sizes
- feasibility depends on ActiveList size
- flow in ActiveList ⇔ packet in queue
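The scheduler state listed on this slide maps directly onto a small data structure. The sketch below follows the standard Deficit Round Robin algorithm with one Quantum shared by all flows, which is a simplification assumed here; the class and method names are hypothetical.

```python
from collections import deque


class DeficitRoundRobin:
    def __init__(self, quantum):
        self.quantum = quantum   # bytes each flow may add per round
        self.queues = {}         # flow id -> deque of queued packet sizes
        self.deficit = {}        # flow id -> DeficitCounter
        self.active = deque()    # ActiveList: flows with a packet in queue

    def enqueue(self, flow_id, size):
        if flow_id not in self.queues:
            self.queues[flow_id] = deque()
            self.deficit[flow_id] = 0
            self.active.append(flow_id)
        self.queues[flow_id].append(size)

    def serve_next(self):
        """Serve the flow at the head of ActiveList; return the packets sent."""
        if not self.active:
            return []
        flow_id = self.active.popleft()
        queue = self.queues[flow_id]
        self.deficit[flow_id] += self.quantum
        sent = []
        while queue and queue[0] <= self.deficit[flow_id]:
            size = queue.popleft()
            self.deficit[flow_id] -= size
            sent.append((flow_id, size))
        if queue:
            self.active.append(flow_id)   # still backlogged: rejoin the round
        else:
            del self.queues[flow_id]      # idle flows leave ActiveList
            del self.deficit[flow_id]
        return sent


drr = DeficitRoundRobin(quantum=500)
for flow_id, size in (("a", 300), ("a", 300), ("b", 700)):
    drr.enqueue(flow_id, size)
while drr.active:
    print(drr.serve_next())
```

State is held only for flows that currently have a packet queued, so the memory needed tracks the ActiveList size rather than the number of flows in progress.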
46Priority Deficit Round Robin
- priority to packets of flows of rate < fair rate
- includes streaming flows
- realized easily by adapting DRR
- new flows send first quantum to priority queue
- enter ActiveList schedule only if rate ≥ fair rate
47Fair queueing is scalable
- distinguish flows in progress and active flows
- number of active flows depends on link load
- and relative flow peak rates ... but not on link rate
- active ⇔ flow has one or more packets in queue
- some flows are bottlenecked but only a small number
- most flows are not bottlenecked and only need scheduling when they have a packet in the queue
48ActiveList distribution at load 0.9
[Figure: $\Pr[\text{ActiveList size} > n]$ against number of flows (n)]
49Measurement-based admission control
- admission control to preserve performance in overload
- keep fair rate > threshold1 ⇒ throughput guarantees
- keep priority queue load < threshold2 ⇒ latency guarantees
- implicit admission control
- protected flow table: identifier and epoch of last packet
- soft state: time out if no packet in T seconds (e.g., T = 2)
- reject packets of new flows in congestion
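The implicit admission control described here can be sketched as a per-packet decision against a protected flow table. The sketch assumes the fair rate and priority queue load are measured elsewhere (for example by the scheduler) and passed in; the class name, threshold values and units are hypothetical.

```python
class ImplicitAdmissionControl:
    def __init__(self, min_fair_rate, max_priority_load, timeout=2.0):
        self.min_fair_rate = min_fair_rate          # threshold1
        self.max_priority_load = max_priority_load  # threshold2
        self.timeout = timeout                      # soft-state time out T
        self.protected = {}   # flow id -> epoch of last packet

    def admit(self, flow_id, now, fair_rate, priority_load):
        """Return True to forward the packet, False to reject it."""
        last = self.protected.get(flow_id)
        in_progress = last is not None and now - last < self.timeout
        congested = (fair_rate < self.min_fair_rate
                     or priority_load > self.max_priority_load)
        if congested and not in_progress:
            self.protected.pop(flow_id, None)   # drop an expired entry, if any
            return False
        self.protected[flow_id] = now           # refresh the soft state
        return True


ac = ImplicitAdmissionControl(min_fair_rate=1.0, max_priority_load=0.7)
print(ac.admit("f1", now=0.0, fair_rate=5.0, priority_load=0.1))  # no congestion
print(ac.admit("f1", now=1.0, fair_rate=0.5, priority_load=0.1))  # protected flow
print(ac.admit("f2", now=1.0, fair_rate=0.5, priority_load=0.1))  # new flow
```

Only flows that are not yet in the table are rejected during congestion, so flows already in progress keep their throughput and no signalling is needed.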
50Performance in overload: impact of admission control
- synthetic traffic: TCP (80%) + UDP (20%)
- rate 10 Mbps, offered traffic 11 Mbps
[Figure: two panels over 0 to 400 sec, flows in ActiveList (axis 0 to 400, max size marked) and load of priority queue (axis 0 to 1.0), each shown without and with admission control]
51Implementing flow-aware networking
- flow-aware: an enhanced best effort network
- controlled over provisioning
- needs no new standards
- mechanisms are technologically feasible
- incremental implementation
- starting with critical links (e.g., peering links)
- but routers don't implement fair queueing or admission control!
- and vendors do business with MPLS Diffserv-aware TE,...
Cross-protect router
52A related vision: flow state-aware routing (cf. L. Roberts)
- flow-awareness is economical
- to maintain flow state (100 bytes/flow) is easy
- per-flow route caching saves costly address look ups
- based on "QoS signalling" (cf. proposed ITU standard)
- in-band signalling packets
- per-class rate allocations (max rate, available rate,...)
- flow-awareness is efficient
- controlled link utilization (flow by flow)
- avoids TCP inefficiency
- flow-awareness has additional benefits
- enhanced security (DoS detection,...)
- traffic data,...
54Conclusion
- MPLS Diffserv-aware TE: the Emperor's new clothes
- "anyone who couldn't see his clothes was either stupid or incompetent".
- no useful traffic descriptors, no canonical values, no magic provisioning ratios
- Flow-aware networking: understanding the traffic-performance relation
- feasible router mechanisms: fair queueing, admission control
Get flow-aware !