EE384x: Packet Switch Architectures I - PowerPoint PPT Presentation

Slides: 25
Provided by: nickmc


1
EE384x Packet Switch Architectures I
  • Parallel Packet Buffers

Nick McKeown, Professor of Electrical Engineering
and Computer Science, Stanford University
nickm_at_stanford.edu
http://www.stanford.edu/nickm
2
The Problem
  • All packet switches (e.g. Internet routers, ATM
    switches) require packet buffers for periods of
    congestion.
  • Size: For TCP to work well, the buffers need to
    hold one RTT (about 0.25s) of data.
  • Speed: Clearly, the buffer needs to store
    (retrieve) packets as fast as they arrive
    (depart).

[Figure: N buffer memories, numbered 1 to N, each
with line rate R in and line rate R out.]
3
An Example: Packet buffers for a 40Gb/s router
linecard
[Figure: a buffer manager attached to a 10Gbit
buffer memory.]
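
The 10Gbit figure follows from the sizing rule on the previous slide: one RTT of data at the line rate. A minimal sketch of that arithmetic, assuming the 0.25s RTT quoted there (the variable names are illustrative, not from the slides):

```python
# Rule of thumb from the slides: buffer size = line rate x round-trip time.
LINE_RATE_BPS = 40e9   # 40Gb/s linecard
RTT_SECONDS = 0.25     # RTT quoted on the previous slide

buffer_bits = LINE_RATE_BPS * RTT_SECONDS
print(f"buffer = {buffer_bits / 1e9:.0f} Gbits")
```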
4
Memory Technology
  • Use SRAM? The random access time is fast enough,
    but the density is too low to store 10Gbits of
    data.
  • Use DRAM? The density is high enough to store the
    data, but it can't meet the random access time.

5
Can't we just use lots of DRAMs in parallel?
[Figure: the buffer manager receives one 40B packet
every 8ns (write rate R) and reads/writes a 320B
block every 32ns across eight buffer memories in
parallel. Each memory holds one 40-byte slice of the
block: bytes 0-39, 40-79, ..., 280-319.]
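
The striping in the figure can be sketched as follows. This is an illustration under assumptions, not the deck's implementation: the function names are hypothetical, and the reading of the 32ns figure (one block write plus one block read in the 64ns it takes eight 40B packets to arrive) is an interpretation of the slide.

```python
BLOCK_BYTES = 320                            # block size from the slide
NUM_MEMORIES = 8                             # eight buffer memories in the figure
SLICE_BYTES = BLOCK_BYTES // NUM_MEMORIES    # 40 bytes per memory

def stripe(block):
    """Split one 320B block into eight 40B slices, one per memory."""
    if len(block) != BLOCK_BYTES:
        raise ValueError("expected a full 320B block")
    return [block[i * SLICE_BYTES:(i + 1) * SLICE_BYTES]
            for i in range(NUM_MEMORIES)]

def unstripe(slices):
    """Reassemble the block by reading the same address in every memory."""
    return b"".join(slices)

# Timing: bits divided by Gb/s gives nanoseconds.
LINE_RATE_GBPS = 40
packet_time_ns = 40 * 8 / LINE_RATE_GBPS              # one 40B packet
block_fill_ns = BLOCK_BYTES * 8 / LINE_RATE_GBPS      # eight packets arrive
# Assumed: the memory must write one block and read one block per fill time.
access_time_ns = block_fill_ns / 2

print(packet_time_ns, block_fill_ns, access_time_ns)
```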
6
Works fine if there is only one FIFO
[Figure: a single FIFO. 40B packets arrive one every
8ns (write rate R); the buffer manager writes them to
the buffer memory as 320B blocks (bytes 0-39, 40-79,
..., 280-319), reads 320B blocks back, and sends 40B
packets out one every 8ns (read rate R).]
7
Works fine if there is only one FIFO
Variable Length Packets
[Figure: the same single FIFO with packets of
arbitrary length (?B). The buffer manager still
writes and reads the buffer memory in 320B blocks
(bytes 0-39, 40-79, ..., 280-319) at write rate R and
read rate R.]
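
With a single FIFO, variable-length packets can be treated as one byte stream that is cut into 320B blocks. A minimal sketch of that idea, assuming packet boundaries are tracked elsewhere; `BlockPacker` is a hypothetical name, not something from the slides:

```python
BLOCK_BYTES = 320  # block size from the slides

class BlockPacker:
    """Accumulate packets of any length and emit full 320B blocks."""

    def __init__(self):
        self.pending = bytearray()   # bytes not yet forming a full block

    def push(self, packet):
        """Append a packet; return the full blocks that are now ready."""
        self.pending += packet
        blocks = []
        while len(self.pending) >= BLOCK_BYTES:
            blocks.append(bytes(self.pending[:BLOCK_BYTES]))
            del self.pending[:BLOCK_BYTES]
        return blocks

packer = BlockPacker()
ready = []
for length in (100, 300, 500):       # three packets of different lengths
    ready += packer.push(bytes(length))
print(len(ready), "blocks written,", len(packer.pending), "bytes waiting")
```

This only works because every byte belongs to the same queue, so consecutive blocks are always read back in the order they were written.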
8
In practice, buffer holds many FIFOs
  • e.g. in an IP router, Q might be 200.
  • In an ATM switch, Q might be 10^6.

How can we write multiple variable-length packets
into different queues?
[Figure: Q FIFOs, numbered 1, 2, ..., Q, each made up
of 320B blocks striped across the memories (bytes
0-39, 40-79, ..., 280-319).]
9
Problems
  • A 320B block will contain packets for different
    queues, which can't be written to, or read from,
    the same location.
  • If instead a different address is used for each
    memory, and packets in the 320B block are written
    to different locations, how do we know the memory
    will be available for reading when we need to
    retrieve the packet?

10
Hybrid Memory Hierarchy
11
Some Thoughts
  • What is the minimum SRAM needed to guarantee that
    a byte is always available in SRAM when
    requested?
  • What algorithm should we use to manage the
    replenishment of the SRAM cache memory?

12
An Example: Q = 5, w = 9, b = 6
13
An Example: Q = 5, w = 9, b = 6


14
The size of the SRAM cache
[Figure: the SRAM cache drawn as Q FIFOs, each w
bytes deep.]
  • Necessity
  • How large does the SRAM cache need to be under
    any memory management algorithm (MMA)?
  • Theorem: wQ > Q(b - 1)(2 + ln Q)
  • Sufficiency
  • For a specific MMA, and for any pattern of
    arrivals, what is the smallest SRAM cache needed
    so that a byte is always available when
    requested?
  • For one particular algorithm: wQ = Qb(2 + ln Q)
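
To get a feel for the two expressions, the sketch below evaluates them for Q = 1000 and b = 10, the values used for the plot at the end of this deck. The function names are illustrative; the formulas are the ones stated above, with sizes in bytes.

```python
import math

def necessity_bound(Q, b):
    """Lower bound on the SRAM cache size under any MMA: Q(b - 1)(2 + ln Q)."""
    return Q * (b - 1) * (2 + math.log(Q))

def sufficient_size(Q, b):
    """Size that suffices for one particular algorithm: Qb(2 + ln Q)."""
    return Q * b * (2 + math.log(Q))

Q, b = 1000, 10
print(f"necessary:  more than {necessity_bound(Q, b):,.0f} bytes")
print(f"sufficient: {sufficient_size(Q, b):,.0f} bytes")
```

The two figures differ only by the factor b/(b - 1), so the sufficient size is close to the lower bound.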

15
Some Definitions
  • Occupancy X(q,t): the number of bytes in FIFO q
    (in SRAM) at time t.
  • Deficit: D(q,t) = w - X(q,t)

[Figure: Q FIFOs of depth w, with the occupancy and
deficit of one FIFO marked.]
16
Smallest SRAM cache: Necessity
17
Smallest SRAM cache: Necessity
  • In addition, each queue needs to hold (b - 1)
    bytes in case it is replenished with b bytes when
    only 1 byte has been removed.
  • Therefore, SRAM size must be at least
    Qw > Q(b - 1)(2 + ln Q).

18
Most Deficit Queue First MMA: Sufficiency
  • Algorithm: Every b timeslots, MDQF-MMA
    replenishes the queue with the largest deficit.
  • Theorem: With MDQF-MMA, an SRAM cache of size
    Qw > Qb(2 + ln Q) is sufficient.
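
The replenishment rule can be tried out in a toy simulation. The sketch below is a simplification under stated assumptions, not the deck's model: one byte is requested every timeslot, DRAM always has data for every queue, a replenishment of b bytes is instantaneous, and a negative deficit simply means the queue holds more than its nominal w bytes. `simulate_mdqf` is a hypothetical name.

```python
import math
import random

def simulate_mdqf(Q, b, requests):
    """Return the largest per-queue deficit seen under MDQF.

    requests: iterable of queue indices, one requested byte per timeslot.
    """
    deficit = [0] * Q            # D(q,t) = w - X(q,t), starting from a full cache
    worst = 0
    for t, q in enumerate(requests):
        deficit[q] += 1          # a byte leaves queue q
        worst = max(worst, deficit[q])
        if t % b == b - 1:       # every b timeslots ...
            j = max(range(Q), key=deficit.__getitem__)
            deficit[j] -= b      # ... refill the most-deficit queue with b bytes
    return worst

Q, b = 5, 6
rng = random.Random(1)
pattern = [rng.randrange(Q) for _ in range(20000)]
worst = simulate_mdqf(Q, b, pattern)
print("largest deficit:", worst, "| b(2 + ln Q) =", round(b * (2 + math.log(Q)), 1))
```

Under these assumptions the per-queue deficit stays below b(2 + ln Q), which is the per-queue depth w implied by the theorem.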

19
Reducing the size of the SRAM
  • Intuition
  • If we use a lookahead buffer to peek at the
    requests in advance, we can replenish the SRAM
    cache only when needed.
  • This increases the latency from when a request is
    made until the byte is available.
  • But because it is a pipeline, the issue rate is
    the same.

20
The ECQF-MMA Algorithm
  • Lookahead: The next Q(b - 1) + 1 arbiter requests
    are known.
  • Replenish: Fetch b bytes for the troubled queue.

[Figure: Q queues in the SRAM cache, each b - 1
bytes deep.]
21
Example of ECQF-MMA: Q = 4, b = 4
22
Theorem
  • Patient Arbiter: An SRAM cache of size Q(b - 1)
    bytes is sufficient to guarantee that a requested
    byte is available within Q(b - 1) + 1 request
    times. The algorithm is called ECQF-MMA (Earliest
    Critical Queue First).
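
A toy simulation of the lookahead idea is sketched below. It rests on assumptions that are not in the slides: a request arrives every timeslot, DRAM always has data, a fetch of b bytes is instantaneous, the replenishment decision is taken in the last slot of each group of b slots and before that slot's head request is served, and the cache occupancy is sampled at the end of each slot. A queue is treated as critical when the requests queued for it in the lookahead outnumber its bytes in SRAM. The function names are hypothetical.

```python
import random
from collections import deque

def earliest_critical(lookahead, occupancy):
    """Return the first queue that would run dry if the lookahead were served."""
    remaining = list(occupancy)
    for q in lookahead:
        remaining[q] -= 1
        if remaining[q] < 0:
            return q
    return None                      # no queue is critical

def simulate_ecqf(Q, b, requests):
    """Return (misses, peak SRAM occupancy in bytes) for a request pattern."""
    depth = Q * (b - 1) + 1          # lookahead length from the slides
    occupancy = [0] * Q              # bytes per queue currently in SRAM
    lookahead = deque()
    misses = 0
    peak = 0
    for t, q in enumerate(requests):
        lookahead.append(q)          # a new request enters the pipeline
        if t % b == b - 1:           # one DRAM access every b timeslots
            critical = earliest_critical(lookahead, occupancy)
            if critical is not None:
                occupancy[critical] += b
        if len(lookahead) == depth:  # the oldest request is due now
            head = lookahead.popleft()
            if occupancy[head] == 0:
                misses += 1
            else:
                occupancy[head] -= 1
        peak = max(peak, sum(occupancy))
    return misses, peak

Q, b = 4, 4
rng = random.Random(2)
misses, peak = simulate_ecqf(Q, b, [rng.randrange(Q) for _ in range(20000)])
print("misses:", misses, "| peak occupancy:", peak, "| Q(b - 1) =", Q * (b - 1))
```

Each request waits Q(b - 1) slots in the lookahead before it is served, which is the latency paid for the smaller cache.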

23
Maximum Deficit Queue First with Latency
(MDQFL-MMA)
  • What if the application can only tolerate a
    latency lmax < Q(b - 1) + 1 timeslots?
  • Algorithm: Maximum Deficit Queue First with
    Latency (MDQFL-MMA) services a queue once every
    b timeslots, in the following order:
  • If there is an earliest critical queue, replenish
    it.
  • If not, then replenish the queue that will have
    the most deficit lmax timeslots in the future.

24
Queue length vs. pipeline depth: Q = 1000, b = 10
[Figure: plot of SRAM size against pipeline latency,
x, marking the queue length for zero latency and the
queue length for maximum latency.]