Algorithmic-Architecture Trade-offs in Network Processor Design


1
Algorithmic-Architecture Trade-offs in Network Processor Design
  • Introduction
  • IP packet processing
  • Requirements and existing solutions
  • Algorithm-architecture trade-offs
  • Exploitation of RAM resources
  • Conclusion

2
Network processor: definition
  • An integrated circuit
  • software-programmable
  • optimized for conducting fast processing tasks on packetized data streams at wire speed
  • offloads path-management and control tasks to other components
  • From Grey Bird, NC State Univ, USA

main() { for (i = 1; i < 100; i++) { … } }
3
Network Processing layers
OSI/ISO (*)

  Layer 7  Application   Application protocols, user interface
  Layer 6  Presentation  Application-specific format transfer
  Layer 5  Session       Connection to process, billing
  Layer 4  Transport     Flow control, point to point      <- NP
  Layer 3  Network       Connection, switching of links    <- NP
  Layer 2  Data link     Signaling, block transfer
  Layer 1  Physical      Transmission, coding, modulation

(*) OSI/ISO: Open Systems Interconnection from the International Organization for Standardization
4
Network Processing layers
(figure: position of the network processor within the layer stack)
5
Network Processing: packets

(figure: a bit stream of packets 1-4 enters the NP, which passes, drops, or
stores each packet and records per-packet statistics such as "from" and "to"
addresses)

Network instruction: access control list (ACL) for packet classification
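The ACL-based pass/drop decision sketched in the figure can be illustrated as a first-match walk over a rule list. This is a minimal sketch, not the slide's actual mechanism; the rule fields and actions below are invented for illustration:

```python
# Hypothetical first-match ACL classification sketch.
# Field names ("src", "dst", "proto") and actions are illustrative assumptions.

def classify(packet, acl):
    """Return the action of the first ACL rule whose fields all match the packet."""
    for rule in acl:
        if all(packet.get(field) == value for field, value in rule["match"].items()):
            return rule["action"]
    return "drop"  # default deny when no rule matches

acl = [
    {"match": {"src": "10.0.0.1"}, "action": "pass"},
    {"match": {"dst": "10.0.0.9", "proto": "tcp"}, "action": "store"},
]

print(classify({"src": "10.0.0.1", "dst": "10.0.0.2", "proto": "udp"}, acl))  # pass
```

A real NP would implement this lookup in dedicated hardware (e.g., a TCAM) rather than a sequential scan, since wire-speed operation forbids per-rule iteration.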
6
Network Processing: packets

(figure: packets from Channel 1, OC192 (2.5 Gb/s), Channel A, OC12 (600 Mb/s),
and a channel-B-specific 1 Gb/s link pass through firewall, redundancy-removal,
and encryption stages backed by a network database)

  • Needs fast look-up tables
  • Needs encryption instructions
  • Must adapt to mixed protocols
  • Must be able to remove redundancy
7
Performances

  µP    Microprocessor
  FPGA  Field-Programmable Gate Array
  DSP   Digital Signal Processor
  NP    Network Processor

An NP is 10x faster than a µP.
8
Network Processing: where is the difference?

  Network Processor           Microprocessor
  Focused on packets          General purpose
  Decision pipeline           Cache-based pipeline
  Network instruction set     Wide instruction set
  Fast binary decisions       Various mathematics
  Real-time                   Multi-task
11
Filtering / Classification
  • The classifier determines the flow an incoming packet belongs to by
    examining one or more fields of the packet header.
  • The classification problem can be solved by several search approaches:
  • bitmap intersection
  • fat inverted segment trees
  • heap-on-trie data structures

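Of the search approaches listed above, bitmap intersection is the easiest to sketch: each header field maps a value to a bitmap of the rules it satisfies, and the matching rule set is the AND of the per-field bitmaps. The two-field rule set below is an invented example, not one from the presentation:

```python
# Bitmap-intersection classification sketch (illustrative two-field rules).
# rules[i] = (src, dst); None acts as a wildcard. Lower index = higher priority.

rules = [
    ("A", None),   # rule 0
    (None, "X"),   # rule 1
    ("B", "X"),    # rule 2
]

def field_bitmap(field_index, value):
    """Bitmap of all rules whose predicate on this field matches `value`."""
    bm = 0
    for i, rule in enumerate(rules):
        if rule[field_index] is None or rule[field_index] == value:
            bm |= 1 << i
    return bm

def classify(src, dst):
    bm = field_bitmap(0, src) & field_bitmap(1, dst)
    if bm == 0:
        return None                        # no rule matches
    return (bm & -bm).bit_length() - 1     # lowest-index (highest-priority) match

print(classify("B", "X"))  # rule 1 (wildcard src, dst "X") outranks rule 2 → 1
```

In hardware the per-field bitmaps are precomputed and stored in wide memories, so classification costs one lookup per field plus a bitwise AND, independent of the rule count.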
12
Link Scheduling
  • A link scheduler is an arbiter that decides which of the buffered packets
    will be transferred next over an outgoing link of a networking node.
  • Schedulers may be distinguished by several features:
  • Fairness
  • Efficiency
  • Worst-case behavior
  • Quality-of-service guarantees
  • Utilization

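One scheduler the presentation evaluates later is deficit round-robin, which trades some fairness precision for very low per-packet cost. A minimal sketch, with invented flow contents and quantum:

```python
from collections import deque

# Minimal deficit round-robin (DRR) sketch. The quantum and packet lengths are
# illustrative, not parameters taken from the presentation.

def drr(flows, quantum):
    """flows: list of deques of packet lengths. Yields (flow_id, pkt_len) in service order."""
    deficits = [0] * len(flows)
    while any(flows):
        for i, q in enumerate(flows):
            if not q:
                deficits[i] = 0          # idle flows do not accumulate credit
                continue
            deficits[i] += quantum       # grant one quantum of bytes per round
            while q and q[0] <= deficits[i]:
                pkt = q.popleft()
                deficits[i] -= pkt       # spend credit on the transmitted packet
                yield (i, pkt)

order = list(drr([deque([300, 300]), deque([600])], quantum=300))
print(order)  # [(0, 300), (0, 300), (1, 600)]
```

Each flow receives roughly `quantum` bytes of service per round regardless of its packet sizes, which is why DRR is attractive for an NP: it needs only O(1) work per packet, unlike WFQ-style virtual-time schedulers.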
14
Queuing
  • After a packet has been admitted for possible transmission, it must be
    buffered in the system until it is either chosen by the link scheduler for
    transmission or discarded because the link is congested.
  • To balance the separation of flows against the number of flows that can be
    managed, different approaches are possible:
  • Single queue
  • Separate queues
  • Shared memory

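The shared-memory approach can be sketched as per-flow queues drawing on a common buffer budget, with tail drop once the budget is exhausted. The class name and capacity below are illustrative assumptions:

```python
from collections import deque

# Sketch of a shared-memory queue manager with per-flow queues and tail drop.
# Capacity and flow names are invented for illustration.

class SharedMemoryQM:
    def __init__(self, capacity):
        self.capacity = capacity   # total buffer budget shared by all flows
        self.used = 0
        self.queues = {}           # flow id -> deque of packet lengths

    def enqueue(self, flow, length):
        if self.used + length > self.capacity:
            return False           # tail drop: no shared buffer space left
        self.queues.setdefault(flow, deque()).append(length)
        self.used += length
        return True

    def dequeue(self, flow):
        q = self.queues.get(flow)
        if not q:
            return None
        length = q.popleft()
        self.used -= length
        return length

qm = SharedMemoryQM(capacity=1000)
print(qm.enqueue("f1", 600), qm.enqueue("f2", 600))  # True False
```

Sharing one budget maximizes the number of flows a fixed memory can hold, but a single aggressive flow can starve the others; the per-flow ("separate queues") approach avoids that at the cost of stranded capacity.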
17
Node Architecture
18
Evaluation Models
  • The algorithm and hardware blocks, as well as the traffic traces, used to
    evaluate options for QoS preservation in multi-service access networks are
    discussed.
  • Required models:
  • Algorithm models
  • Architecture models
  • Traffic-generation models

19
Algorithm Models
  • Reproduce the behavior of algorithms for packet-processing tasks.
  • Behavior is not bounded by assuming any properties of the computing
    resources.

20
Architecture Models
  • Imitate the timing behavior of hardware building blocks that can be used to
    implement a network processor for multi-service access networks.
  • The timing, together with the statistics generated by the algorithm models,
    is used to estimate the load on hardware resources.

21
Traffic Generation Models
  • Network traffic must be modeled to stimulate the network processor.
  • The inter-arrival time of packets determines the frequency of
    packet-processing events.
  • The packet length and other packet-header information determine the QoS a
    packet will receive.

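A statistical source model of the kind listed on the later "Stimuli" slide can be sketched with exponential inter-arrival times and a bimodal packet-length mix. The rate and the 64/1500-byte split are assumptions for illustration, not parameters from the presentation:

```python
import random

# Illustrative statistical traffic source: Poisson arrivals and a bimodal
# packet-length distribution. All parameters are invented for the sketch.

def generate_packets(n, rate_pps, seed=0):
    rng = random.Random(seed)
    t = 0.0
    packets = []
    for _ in range(n):
        t += rng.expovariate(rate_pps)   # exponential gap -> Poisson arrivals
        length = rng.choice([64, 1500])  # short control vs. full-size data frames
        packets.append((t, length))
    return packets

trace = generate_packets(5, rate_pps=1000.0)
print(trace)
```

Feeding such a trace into the algorithm models exercises both dimensions the slide names: the timestamps drive the event frequency, and the lengths (plus any attached header fields) drive the QoS treatment.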
22
Performance Models of Algorithms
  • It is assumed that packets have passed a header parser as well as filter,
    forwarding, and classifier stages when they enter the policer.
  • These stages are not modeled for the evaluation since:
  • their outcome (header fields, next-hop address/link, and a QoS class
    identifier) is constant independently of the chosen algorithms;
  • they do not affect the QoS-preservation behavior of the packet processor in
    terms of packet delay and buffer space;
  • they are candidates for special hardware blocks.

23
Performance Models of Algorithms
  • Policer
  • Nested token-bucket policer for green profiles and a single yellow profile
  • Link scheduler
  • WFQ-based scheduling: SCFQ, SPFQ, MD-SCFQ
  • Deficit round-robin packet scheduling
  • Queue manager
  • CYQ: QM with central yellow queue and tail drop
  • CYQ-enh.: enhanced QM with central yellow queue and tail drop
  • CYQ-RED: CYQ-enh. with RED congestion avoidance for yellow traffic
  • YQ-Fair: fair QM with per-flow yellow queues

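The nested token-bucket policer above can be sketched as two buckets checked in order: packets conforming to the inner (green) profile are marked green, packets conforming only to the outer (yellow) profile are marked yellow, and the rest are dropped. The rates and depths below are invented, and the presentation's actual nesting of multiple green profiles is simplified to one:

```python
# Two-level nested token-bucket policer sketch. Rates/depths are illustrative
# assumptions; the presentation nests several green profiles inside one yellow.

class TokenBucket:
    def __init__(self, rate, depth):
        self.rate, self.depth = rate, depth   # tokens/sec, max tokens (bytes)
        self.tokens, self.last = depth, 0.0

    def conform(self, t, length):
        # Refill proportionally to elapsed time, capped at the bucket depth.
        self.tokens = min(self.depth, self.tokens + (t - self.last) * self.rate)
        self.last = t
        if length <= self.tokens:
            self.tokens -= length
            return True
        return False

class NestedPolicer:
    def __init__(self):
        self.green = TokenBucket(rate=1000.0, depth=1500)    # committed profile
        self.yellow = TokenBucket(rate=2000.0, depth=3000)   # excess profile

    def police(self, t, length):
        if self.green.conform(t, length):
            return "green"
        if self.yellow.conform(t, length):
            return "yellow"
        return "drop"

p = NestedPolicer()
print([p.police(0.0, 1000) for _ in range(4)])  # ['green', 'yellow', 'yellow', 'yellow']
```

A back-to-back burst drains the green bucket first, then spills into yellow, and is dropped once both are empty, which is exactly the green/yellow/drop coloring the queue managers above act on.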
24
Performance Models of Algorithms: Statistics
  • Information output by the algorithm models
  • Information about the operations performed:
  • CPU-like instructions: branch, register copy, address-offset calculations
  • Memory accesses
  • Priority-queue operations
  • Dynamic memory allocation

25
Performance Models of Algorithms: Statistics
  • Assess the QoS properties of the system:
  • Queue lengths
  • Dropped packets
  • Bucket levels
  • Virtual-time evolution
  • Delay due to queuing and scheduling

26
Counting methodology
  • Off-line counting of operations: for a given elementary packet-processing
    task (dequeue, enqueue, policing, etc.) specified in a programming language:
  • Detect the basic blocks.
  • Determine the control flow between the basic blocks.
  • Beginning from the entry point of the control flow of the overall task,
    determine the sets of active variables at the entry point and at the branch
    point of every basic block.
  • For every basic block:
  • Extract code dealing with priority queues and dynamic memory management;
    count priority-queue and dynamic-memory-allocation operations.
  • In the remaining code fragment:
  • Detect variables and constants which belong to the context information and
    which are not active.
  • Count the required memory accesses and address-offset calculations to read
    these variables from memory.
  • Count the required CPU operations.
  • Detect the context variables which have been set in the basic block.
  • Count the required write accesses and address-offset calculations to write
    these variables back to memory at the end of the basic block.
  • Assign the determined counter values to the basic block.

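The final tallying step above can be illustrated by representing a basic block as a list of abstract operations and bucketing them into the counter categories the statistics slides use (CPU operations, context-memory accesses, priority-queue and allocator calls). The operation names and the example block are invented, not the presentation's notation:

```python
from collections import Counter

# Illustrative per-basic-block operation counting. Operation and category names
# are assumptions chosen to mirror the statistics categories, not a real IR.

CATEGORIES = {
    "branch": "cpu", "add": "cpu", "copy": "cpu",
    "load_ctx": "mem_read", "store_ctx": "mem_write",
    "pq_insert": "pq", "pq_extract": "pq",
    "malloc": "alloc", "free": "alloc",
}

def count_block(ops):
    """Tally counter values for one basic block's abstract operation list."""
    return dict(Counter(CATEGORIES[op] for op in ops))

# Hypothetical basic block from an enqueue task.
enqueue_block = ["load_ctx", "add", "branch", "pq_insert", "store_ctx"]
print(count_block(enqueue_block))
```

Multiplying such per-block counters by the execution frequency of each block (obtained from the control flow under a traffic trace) yields the resource-load estimates the architecture models consume.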
27
Architecture Models
  • CPU Timing Model
  • RAM timing model
  • Priority Queue model
  • Dynamic Memory Allocation Model

28
Exploitation of RAM Resources
  • Specific characteristics and application areas of different RAM types
  • Impact of the memory controller on overall system performance
  • Influences on current DRAM performance:
  • Throughput of the interface to the processor
  • Delay properties of the underlying memory core

29
Performance Bottlenecks of RAMs
  • Delay of read/write memory access
  • SRAM:
  • Type of access
  • DRAM:
  • State of the RAM
  • Placement of data in RAM
  • Order of accesses

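Why DRAM delay depends on the RAM's state and on data placement can be sketched with a row-buffer model: an access to the currently open row pays only the column latency, while a different row additionally pays precharge and activate. The cycle counts are illustrative, not real device timings:

```python
# Row-buffer timing sketch for one DRAM bank. Cycle counts are invented
# placeholders, not datasheet values.

T_CAS, T_RCD, T_RP = 3, 3, 3   # column access, row activate, precharge (cycles)

class DramBank:
    def __init__(self):
        self.open_row = None

    def access(self, row):
        if row == self.open_row:
            return T_CAS                  # row-buffer hit: column access only
        cost = T_CAS + T_RCD              # row miss: activate the new row
        if self.open_row is not None:
            cost += T_RP                  # plus precharge of the open row
        self.open_row = row
        return cost

bank = DramBank()
print([bank.access(r) for r in (5, 5, 7)])  # [6, 3, 9]
```

This is why the order of accesses matters: a memory controller that reorders requests to group same-row accesses (and interleaves banks) can hide much of the activate/precharge cost, which is exactly the lever the memory-controller slides discuss.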
30
DRAM Timing
32
SRAM: Organization and Operation
  • Same operation modes as SDRAM
  • Optimized for speed
  • Used to implement caches
  • Increased pin count
  • More silicon area
  • Compared to DRAM:
  • higher worst-case power dissipation
  • more expensive

33
Synchronous DRAMs
  • VC SDRAM
  • Enhanced SDRAM
  • Synchronous Graphic RAM
  • DDR SDRAM and SGRAM
  • Direct Rambus DRAM

34
Memory Controller
  • Other design choices:
  • Integrated memory controller
  • Stand-alone controller
  • Synthesizable block

35
Memory Modelling
36
Stimuli
  • Choice of traffic patterns:
  • Public traffic traces from the Internet
  • Real traffic sources
  • Statistical source models

37
System for Evaluation