Title: 048866: Packet Switch Architectures
1. 048866: Packet Switch Architectures
- Output-Queued Switches
- Deterministic Queueing Analysis
- Fairness and Delay Guarantees
- Dr. Isaac Keslassy
- Electrical Engineering, Technion
- isaac_at_ee.technion.ac.il
- http://comnet.technion.ac.il/isaac/
2. Outline
- Output Queued Switches
- Terminology: Queues and Arrival Processes
- Deterministic Queueing Analysis
- Output Link Scheduling
3. Generic Router Architecture
[Figure: N linecards (1..N), each with a packet buffer memory and queue; the shared datapath runs at N times the line rate.]
4. Simple Output-Queued (OQ) Switch Model
[Figure: OQ switch with four links, each at link rate R; arriving packets are placed directly in queues at their output links.]
5. How an OQ Switch Works
Output Queued (OQ) Switch
6. OQ Switch Characteristics
- Arriving packets are immediately written into the output queue, without intermediate buffering.
- The flow of packets to one output does not affect the flow to another output.
7. OQ Switch Characteristics
- An OQ switch is work-conserving: an output line is always busy when there is a packet in the switch for it.
- OQ switches have the highest throughput and lowest average delay.
- We will also see that the rate of individual flows and the delay of packets can be controlled.
8. The Shared-Memory Switch
A single, physical memory device
[Figure: links 1..N, each with an ingress and an egress at rate R, all sharing one physical memory.]
9. OQ vs. Shared-Memory
- Memory Bandwidth
- Buffer Size
10. Memory Bandwidth (OQ)
Total: (N+1)R
11. Memory Bandwidth
- Basic OQ switch
  - Consider an OQ switch with N different physical memories, and all links operating at rate R bits/s.
  - In the worst case, packets may arrive continuously from all inputs, destined to just one output.
  - Maximum memory bandwidth requirement for each memory is (N+1)R bits/s.
- Shared-memory switch
  - Maximum memory bandwidth requirement for the memory is 2NR bits/s.
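To make the comparison concrete, a quick sketch (N = 32 ports and R = 10 Gb/s are illustrative values, not taken from the slides):

```python
# Worst-case memory bandwidth: per-output-memory OQ vs. one shared memory.
# N = 32 and R = 10 Gb/s are illustrative values, not from the slides.
N = 32          # number of ports
R = 10e9        # line rate, bits/s

oq_per_memory = (N + 1) * R   # N simultaneous writes + 1 read per memory
shared_total = 2 * N * R      # N writes + N reads into the single memory

print(oq_per_memory / 1e9)    # 330.0  (Gb/s per output memory)
print(shared_total / 1e9)     # 640.0  (Gb/s for the shared memory)
```

The shared memory needs roughly twice the aggregate bandwidth of any single OQ memory, but there are N of the latter.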
12. OQ vs. Shared-Memory
- Memory Bandwidth
- Buffer Size
13. Buffer Size
- In an OQ switch, let Qi(t) be the length of the queue for output i at time t.
- Let M be the total buffer size in the shared-memory switch.
- Is a shared-memory switch more buffer-efficient than an OQ switch?
14. Buffer Size
- Answer: depends on the buffer management policy
  - Static queues
    - Same as OQ switch
    - For no loss, needs Qi(t) ≤ M/N for all i
  - Dynamic queues
    - Better than OQ switch (multiplexing effects)
    - For no loss, only needs Q1(t) + ... + QN(t) ≤ M
15. How fast can we make a centralized shared memory switch?
[Figure: N ports (1..N) connected over a 200-byte bus to a shared memory built from 5 ns SRAM.]
- 5 ns per memory operation
- Two memory operations per packet
- Therefore, an upper bound of 160 Gb/s total throughput
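The arithmetic behind the bound, using the numbers on the slide (two 5 ns operations per packet across a 200-byte bus):

```python
# Throughput bound for a centralized shared memory (values from the slide).
t_op_ns = 5            # 5 ns per memory operation
ops_per_packet = 2     # one write on arrival + one read on departure
bus_bits = 200 * 8     # 200-byte-wide memory access

# bits served per ns equals Gb/s
throughput_gbps = bus_bits / (ops_per_packet * t_op_ns)
print(throughput_gbps)  # 160.0, i.e. aggregate N*R cannot exceed 160 Gb/s
```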
16. Outline
- Output Queued Switches
- Terminology: Queues and Arrival Processes
- Deterministic Queueing Analysis
- Output Link Scheduling
17. Queue Terminology
- Arrival process, A(t)
  - In continuous time, usually the cumulative number of arrivals in [0, t].
  - In discrete time, usually an indicator function as to whether or not an arrival occurred at time t = nT.
  - λ is the arrival rate: the expected number of arriving packets (or bits) per second.
- Queue occupancy, Q(t)
  - Number of packets (or bits) in the queue at time t.
18. Queue Terminology
- Service discipline, S
  - Indicates the sequence of departures, e.g. FIFO/FCFS, LIFO, ...
- Service distribution
  - Indicates the time taken to process each packet, e.g. deterministic or exponentially distributed service time.
  - μ is the service rate: the expected number of served packets (or bits) per second.
- Departure process, D(t)
  - In continuous time, usually the cumulative number of departures in [0, t].
  - In discrete time, usually an indicator function as to whether or not a departure occurred at time t = nT.
19. More terminology
- Customer
  - Queueing theory usually refers to queued entities as customers. In class, customers will usually be packets or bits.
- Work
  - Each customer is assumed to bring some work which affects its service time. For example, packets may have different lengths, and their service time might be a function of their length.
- Waiting time
  - Time that a customer waits in the queue before beginning service.
- Delay
  - Time from when a customer arrives until it has departed.
20. Arrival Processes
- Deterministic arrival processes
  - E.g. 1 arrival every second, or a burst of 4 packets every other second.
  - A deterministic sequence may be designed to be adversarial, to expose some weakness of the system.
- Random arrival processes
  - (Discrete time) Bernoulli i.i.d. arrival process
    - Let A(t) = 1 if an arrival occurs at time t, where t = nT, n = 0, 1, ...
    - A(t) = 1 w.p. p, and 0 w.p. 1-p.
    - Series of independent coin tosses with a p-coin.
  - (Continuous time) Poisson arrival process
    - Exponentially distributed interarrival times.
21. Adversarial Arrival Process: Example for the Knockout Switch
Memory write bandwidth = k·R < N·R
[Figure: knockout switch with inputs 1..N, each link at rate R.]
- If our design goal was to not drop packets, then a simple discrete-time adversarial arrival process is one in which:
  - A1(t) = A2(t) = ... = A_{k+1}(t) = 1, and
  - all packets are destined to output t mod N.
22. Bernoulli arrival process
Memory write bandwidth = N·R
[Figure: switch with inputs 1..N, arrival processes A1(t)..AN(t), each link at rate R.]
- Assume Ai(t) = 1 w.p. p, else 0.
- Assume each arrival picks an output independently, uniformly and at random.
- Some simple results follow:
  1. The probability that at time t a packet arrives to input i destined to output j is p/N.
  2. The probability that two consecutive packets arrive to input i = the probability that packets arrive to inputs i and j simultaneously = p².
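Result 1 is easy to check by Monte Carlo; a small sketch (p, N, the seed and the slot count are arbitrary choices, not from the slides):

```python
import random

# Check: with Bernoulli(p) arrivals and a uniform output choice,
# P(arrival at input i destined to output j) = p/N.
p, N, T = 0.5, 4, 200_000
random.seed(1)

hits = 0
for _ in range(T):
    if random.random() < p:          # arrival at input i in this slot
        if random.randrange(N) == 0: # destined to output j = 0
            hits += 1

# Estimate should be close to p/N = 0.125
print(abs(hits / T - p / N) < 0.01)  # True
```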
23. Outline
- Output Queued Switches
- Terminology: Queues and Arrival Processes
- Deterministic Queueing Analysis
- Output Link Scheduling
24. Simple Deterministic Model
[Figure: cumulative number of bits vs. time. A(t) is the cumulative number of bits that arrived up until time t; D(t) is the cumulative number of bits that departed up until time t; the service process drains the queue at rate R; the vertical gap between the curves is Q(t).]
- Properties of A(t), D(t):
  - A(t), D(t) are non-decreasing
  - A(t) ≥ D(t)
25. Simple Deterministic Model
[Figure: A(t) and D(t) vs. time; the vertical gap is Q(t), the horizontal gap is the delay d(t).]
Queue occupancy: Q(t) = A(t) - D(t).
Queueing delay, d(t): time spent in the queue by a bit that arrived at time t (assuming that the queue is served FCFS/FIFO).
26. Discrete-Time Queueing Model
- Discrete time: at each time-slot n, first a(n) arrivals, then d(n) departures.
- Cumulative arrivals: A(n) = a(1) + ... + a(n)
- Cumulative departures: D(n) = d(1) + ... + d(n)
- Queue size at end of time-slot n: Q(n) = A(n) - D(n)
27. Work-Conserving Queue
28. Work-Conserving Queue
29. Work-Conserving Queue
- We saw that an output queue in an OQ switch is work-conserving: it is always busy when there is a packet for it.
- Let A(n), D(n) and Q(n) denote the arrivals, departures and queue size of some output queue.
- Let R be the queue departure rate (the amount of traffic that can depart at each time-slot).
- After arrivals at the start of time-slot n, this output link contains Q(n-1) + a(n) amount of traffic.
30. Work-Conserving Output Link
- Case 1: Q(n-1) + a(n) ≤ R
  - ⇒ everything is serviced, nothing is left in the queue.
- Case 2: Q(n-1) + a(n) > R
  - ⇒ exactly R amount of traffic is serviced, and Q(n) = Q(n-1) + a(n) - R.
- Lindley's equation: Q(n) = max(Q(n-1) + a(n) - R, 0) = (Q(n-1) + a(n) - R)⁺
- Note: to find cumulative departures, use D(n) = A(n) - Q(n)
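Lindley's equation translates directly into a simulation loop; a minimal sketch (the arrival sequence and R are made up for illustration):

```python
# Simulate a work-conserving output link via Lindley's equation:
# Q(n) = max(Q(n-1) + a(n) - R, 0), and D(n) = A(n) - Q(n).
def simulate(arrivals, R):
    Q, A, history = 0, 0, []
    for a in arrivals:
        A += a                     # cumulative arrivals
        Q = max(Q + a - R, 0)      # Lindley's equation
        D = A - Q                  # cumulative departures
        history.append((Q, D))
    return history

# A burst of 5, then idle: the queue drains at R = 2 per slot.
print(simulate([5, 0, 0, 0], R=2))
# [(3, 2), (1, 4), (0, 5), (0, 5)]
```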
31. Outline
- Output Queued Switches
- Terminology: Queues and Arrival Processes
- Deterministic Queueing Analysis
- Output Link Scheduling
32. The problems caused by FIFO output-link scheduling
- Fairness: a FIFO queue does not take fairness into account ⇒ it is unfair. (A source has an incentive to maximize the rate at which it transmits.)
- Delay guarantees: it is hard to control the delay of packets through a network of FIFO queues.
33. Fairness
[Figure: flows A and B (e.g. an http flow with a given (IP SA, IP DA, TCP SP, TCP DP)) enter router R1 over 10 Mb/s and 100 Mb/s links and share a 1.1 Mb/s output link toward C; each flow is labeled 0.55 Mb/s.]
What is the fair allocation: (0.55 Mb/s, 0.55 Mb/s) or (0.1 Mb/s, 1 Mb/s)?
34. Fairness
[Figure: flows A, B and C enter router R1 (access links at 10 Mb/s, 100 Mb/s and 0.2 Mb/s) and share a 1.1 Mb/s output link toward D.]
What is the fair allocation?
35. Max-Min Fairness: A common way to allocate flows
- N flows share a link of rate C. Flow f wishes to send at rate W(f), and is allocated rate R(f).
  1. Pick the flow, f, with the smallest requested rate.
  2. If W(f) ≤ C/N, then set R(f) = W(f).
  3. If W(f) > C/N, then set R(f) = C/N.
  4. Set N = N - 1, C = C - R(f).
  5. If N > 0, go to 1.
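The algorithm above can be sketched in a few lines (flow names are illustrative):

```python
# Max-min fair allocation, following the progressive-filling steps above:
# repeatedly give the smallest demand either what it asked for or an
# equal share of the remaining capacity, then remove it.
def max_min(demands, C):
    alloc = {}
    remaining = dict(demands)
    while remaining:
        share = C / len(remaining)              # current C/N
        f = min(remaining, key=remaining.get)   # smallest requested rate
        alloc[f] = min(remaining[f], share)     # W(f) if it fits, else C/N
        C -= alloc[f]
        del remaining[f]
    return alloc

# The example from the next slide (link of rate C = 1):
result = max_min({'f1': 0.1, 'f2': 0.5, 'f3': 10, 'f4': 5}, C=1.0)
print({f: round(x, 3) for f, x in result.items()})
# {'f1': 0.1, 'f2': 0.3, 'f4': 0.3, 'f3': 0.3}
```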
36. Max-Min Fairness: An example
[Figure: four flows with W(f1) = 0.1, W(f2) = 0.5, W(f3) = 10, W(f4) = 5 share a link of rate C = 1 at router R1.]
- Round 1: Set R(f1) = 0.1
- Round 2: Set R(f2) = 0.9/3 = 0.3
- Round 3: Set R(f4) = 0.6/2 = 0.3
- Round 4: Set R(f3) = 0.3/1 = 0.3
37. Water-Filling Analogy
[Figure: bar chart of resource requested vs. allocated; customers sorted by requested amount (0.1, 0.5, 5, 10) receive allocations 0.1, 0.3, 0.3, 0.3.]
38. Max-Min Fairness
- How can an Internet router allocate different rates to different flows?
- First, let's see how a router can allocate the same rate to different flows.
39. Fair Queueing
- Packets belonging to a flow are placed in a FIFO. This is called per-flow queueing.
- FIFOs are scheduled one bit at a time, in a round-robin fashion.
- This is called Bit-by-Bit Fair Queueing.
[Figure: classification places arriving packets into per-flow FIFOs (Flow 1 .. Flow N); scheduling serves them bit-by-bit round robin.]
40. Bit-by-Bit Weighted Fair Queueing (WFQ)
- Likewise, flows can be allocated different rates by servicing a different number of bits for each flow during each round.
- Also called Generalized Processor Sharing (GPS) (with infinitesimal amounts of flow instead of bits).
41. GPS Guarantees
- An output link implements GPS with k sessions, allocated rates R(f1), ..., R(fk).
- Assume session i is continually backlogged.
- For all j, let Sj(t1,t2) be the amount of service received by session j between times t1 and t2. Then:
  - Si(t1,t2) ≥ R(fi)·(t2-t1)
  - For all j ≠ i, Si(t1,t2)/Sj(t1,t2) ≥ R(fi)/R(fj)
42. Packetized Weighted Fair Queueing (WFQ)
- Problem: we need to serve a whole packet at a time.
- Solution:
  - Determine at what time a packet p would complete if we served flows bit-by-bit. Call this the packet's finishing time, Fp.
  - Serve packets in order of increasing finishing time.
Also called Packetized Generalized Processor Sharing (PGPS).
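For the special case where all packets are queued at time 0 and every flow stays backlogged, the bit-by-bit finishing times can be computed per flow and PGPS simply sorts by them. A sketch (flow names, weights and packet sizes are made up for illustration; the general case needs a virtual clock):

```python
# GPS finishing tags when all packets are present at time 0 and every
# flow stays backlogged: each flow's tag advances by L/w per packet,
# and packetized WFQ serves packets in increasing tag order.
def pgps_order(flows):
    # flows: {name: (weight, [packet lengths in bits])}
    tagged = []
    for name, (w, pkts) in flows.items():
        F = 0.0
        for i, L in enumerate(pkts):
            F += L / w                    # finishing "round" of packet i
            tagged.append((F, name, i))
    return [(name, i) for F, name, i in sorted(tagged)]

# Flow B's short packets overtake flow A's long one:
print(pgps_order({'A': (1, [6]), 'B': (1, [1, 2])}))
# [('B', 0), ('B', 1), ('A', 0)]
```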
43. Understanding Bit-by-Bit WFQ: 4 queues sharing 4 bits/sec of bandwidth, equal weights
44. Understanding Bit-by-Bit WFQ: 4 queues sharing 4 bits/sec of bandwidth, equal weights
[Figure: round-by-round bit service; Round 3 shown.]
45. Understanding Bit-by-Bit WFQ: 4 queues sharing 4 bits/sec of bandwidth, weights 3:2:2:1
46. Understanding Bit-by-Bit WFQ: 4 queues sharing 4 bits/sec of bandwidth, weights 3:2:2:1
[Figure: Rounds 1 and 2 shown; compare with weights 1:1:1:1.]
47. WFQ is complex
- There may be hundreds to millions of flows; the linecard needs to manage a FIFO per flow.
- The finishing time must be calculated for each arriving packet.
- Packets must be sorted by their departure time. Naively, with m packets, the sorting time is O(log m).
- In practice, this can be made to be O(log N), for N active flows.
[Figure: egress linecard with N per-flow queues; for each packet arriving to the egress linecard, calculate Fp; the scheduler finds the smallest Fp to pick the departing packet.]
48. Deficit Round Robin (DRR) [Shreedhar & Varghese '95]: An O(1) approximation to WFQ
[Figure: Step 1 of a DRR example with quantum size 200; active packet queues hold head packets of various sizes (e.g. 200, 600, 400 bytes), each with a per-queue deficit counter.]
- It is easy to implement Weighted DRR using a different quantum size for each queue.
- Often-adopted solution in practice.
49. The problems caused by FIFO output-link scheduling
- Fairness: a FIFO queue does not take fairness into account ⇒ it is unfair. (A source has an incentive to maximize the rate at which it transmits.)
- Delay guarantees: it is hard to control the delay of packets through a network of FIFO queues.
50. Deterministic analysis of a router queue
[Figure: cumulative bytes vs. time; arrivals A(t) and departures D(t) at service rate μ; the vertical gap is Q(t), the horizontal gap is the FIFO delay d(t).]
51. So how can we control the delay of packets?
- Assume continuous-time, bit-by-bit flows for a moment.
- Let's say we know the arrival process, Af(t), of flow f to a router.
- Let's say we know the rate, R(f), that is allocated to flow f.
- Then, in the usual way, we can determine the delay of packets in f, and the buffer occupancy.
52. WFQ Scheduler
[Figure: flows 1..N arrive with processes A1(t)..AN(t), are classified into per-flow queues, and the WFQ scheduler serves flow fi at rate R(fi), producing departures Di(t).]
53. WFQ Scheduler
[Figure: cumulative bytes vs. time; arrivals Af(t) and departures Df(t), with the departure curve following a line of slope R(f).]
- We know the allocated rate R(f) ⇒ if we knew the arrival process, we would know the packet delay.
- Key idea: constrain the arrival process.
54. Let's say we can bound the arrival process
[Figure: cumulative bytes vs. time; A1(t) stays below a line with intercept σ and slope ρ.]
The number of bytes that can arrive in any period of length t is bounded by σ + ρt. This is called (σ,ρ) regulation.
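Written out, with the standard single-queue consequence (a sketch; this bound is not stated on the slide but follows from the envelope when the service rate covers the token rate):

```latex
% (sigma, rho) regulation: for all t_1 \le t_2,
A(t_2) - A(t_1) \;\le\; \sigma + \rho\,(t_2 - t_1).
% If the flow is served at a guaranteed rate R(f) \ge \rho,
% the backlog never exceeds sigma, so the FIFO delay obeys
Q(t) \;\le\; \sigma, \qquad d(t) \;\le\; \frac{\sigma}{R(f)}.
```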
55. (σ,ρ) Constrained Arrivals and Minimum Service Rate
[Figure: A1(t) bounded above by the σ + ρt envelope; D1(t) bounded below by a line of slope R(f1).]
Theorem [Parekh, Gallager '93]: If flows are leaky-bucket constrained, and routers use WFQ, then end-to-end delay guarantees are possible.
56. The leaky bucket (σ,ρ) regulator
[Figure: tokens arrive at rate ρ into a token bucket of size σ; packets wait in a packet buffer and depart by consuming one token per byte (or per packet).]
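A minimal discrete-event sketch of the regulator (FIFO packets, fluid tokens at one token per byte; the function name and packet values are illustrative):

```python
# Leaky-bucket (sigma, rho) regulator: tokens accumulate at rate rho
# up to sigma; a packet departs once enough tokens exist, consuming
# one token per byte. Packets are released in FIFO order.
def regulate(packets, sigma, rho):
    # packets: list of (arrival_time, size_bytes), in FIFO order
    tokens, last, departures = sigma, 0.0, []
    for arrival, size in packets:
        t = max(arrival, last)                    # cannot depart early
        tokens = min(sigma, tokens + (t - last) * rho)
        if tokens < size:                         # wait for enough tokens
            t += (size - tokens) / rho
            tokens = size
        tokens -= size
        last = t
        departures.append(t)
    return departures

# Bucket of 100 bytes at 50 B/s: a 100-byte burst passes at t=0,
# but the next 100-byte packet must wait for tokens to refill.
print(regulate([(0.0, 100), (0.0, 100)], sigma=100, rho=50))  # [0.0, 2.0]
```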
57. Making the flow conform to (σ,ρ) regulation: Leaky bucket as a shaper
[Figure: a variable bit-rate source (compressed at peak rate C) passes through a leaky-bucket shaper (token rate ρ, bucket size σ) before entering the network; cumulative-byte plots show the traffic before and after shaping.]
58. Checking up on the flow: Leaky bucket as a policer
[Figure: at the router, traffic arriving from the network is policed by a leaky bucket (token rate ρ, bucket size σ); cumulative-byte plots compare the traffic (peak rate C) against the ρ envelope.]
59. QoS Router
[Figure: QoS router with per-flow queues feeding a scheduler at each output.]
Remember: these results assume that it is an OQ switch!
60. References
- [GPS] A. K. Parekh and R. G. Gallager, "A Generalized Processor Sharing Approach to Flow Control in Integrated Services Networks: The Single-Node Case," IEEE/ACM Transactions on Networking, June 1993.
- [DRR] M. Shreedhar and G. Varghese, "Efficient Fair Queueing Using Deficit Round Robin," ACM SIGCOMM, 1995.
61. Questions
- Do packets always finish in Packetized WFQ earlier than (or as late as) in Bit-by-Bit WFQ?
- Is DRR with quantum size 1 equal to Packetized WFQ?
62. Answer: NO. Example: 2 queues, equal weights, 1 b/s link
[Figure: timeline 0..6; flow A holds a 6-bit packet, flow B a 1-bit packet.]
- In Packetized WFQ: A is serviced from 0 to 6 (packets cannot be broken), then B is serviced from 6 to 7.
- In bit-by-bit WFQ: A is serviced from 0 to 2, then B from 2 to 3, then A from 3 to 7.
63. Outline
- Do packets always finish in Packetized WFQ earlier than (or as late as) in Bit-by-Bit WFQ?
- Is DRR with quantum size 1 equal to Packetized WFQ?
64. Answer: NO
Quantum size 1, with n flows
65. Answer: NO
[Figure: the scheduler starts serving the first flow, but then continues to service the other flows.]
- Conclusion:
  - DRR: the packet is serviced after > 100(n-2) slots
  - Packetized WFQ: the packet is serviced immediately