Title: Algorithmic Architecture Tradeoffs in Network Processor Design
1 Algorithmic-Architecture Trade-offs in Network Processor Design
- Introduction
- IP Packet processing
- Requirements and existing solutions
- Algorithm-architecture trade-offs
- Exploitation of RAM resources
- Conclusion
2 Network Processing: definition
- An integrated circuit
- software programmable
- optimized for conducting fast processing tasks on data streams in packets at wire speed
- offloads path management and control tasks to other components
- From Grey Bird, NC State Univ, USA
[Figure: code fragment, Main() with a loop for (i=1; i<100; i++)]
3 Network Processing: layers
OSI/ISO (*)
Layer    Title         Description
Layer 7  Application   Appl. protocols, user interface
Layer 6  Presentation  Appl.-specific format transfer
Layer 5  Session       Connection to process, billing
Layer 4  Transport     Flow control, point to point
Layer 3  Network       Connection, switching of links
Layer 2  Data link     Signaling, block transfer
Layer 1  Physical      Transmission, coding, modulation
NP (marked on the slide between Layer 4 and Layer 3)
(*) OSI/ISO: Open Systems Interconnection, from the International Organization for Standardization
4 Network Processing: layers
[Figure: NP]
5 Network Processing: packets
[Figure: a bit stream of packets (Packet 1, Packet 2, Packet 3, Packet 4) enters the NP; the decisions shown are Pass, Pass, Pass, Drop, and Store; Packets 4, 3, 4, and 1 appear on the output side.]
- Network instruction: access control list (ACL) for packet classification
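A first-match access control list can be sketched as follows. Only the Pass, Drop, and Store actions come from the slide; the rule fields, the addresses, and the default action are illustrative assumptions.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass(frozen=True)
class Rule:
    src: str                  # source prefix in CIDR notation (hypothetical field)
    dst_port: Optional[int]   # None matches any destination port
    action: str               # "pass", "drop", or "store"


def classify(rules, src_ip, dst_port, default="drop"):
    """Return the action of the first rule that matches the packet header."""
    addr = ip_address(src_ip)
    for rule in rules:
        if addr in ip_network(rule.src) and rule.dst_port in (None, dst_port):
            return rule.action
    return default  # assumed policy when no rule matches


# Hypothetical rule set, ordered by priority.
ACL = [
    Rule("10.0.0.0/8", 80, "pass"),
    Rule("10.0.0.0/8", None, "store"),
    Rule("0.0.0.0/0", 23, "drop"),
]
```

The linear scan is the simplest possible lookup; a network processor would typically replace it with a dedicated fast look-up structure.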
6 Network Processing: packets
[Figure: Network Data Base; Channel 1 OC192 (2.5 Gb/s) carrying Packets 1 to 4; processing steps Firewall, Redundant, Encrypt; packet fragments 1a, 1b, 4a, 4b; Channel A OC12 (600 Mb/s); Channel B specific (1 Gb/s).]
- Needs a fast look-up table
- Encryption instruction
- Must adapt to mixed protocols
- Must be able to remove redundancy
7 Performance
- µP: Microprocessor
- FPGA: Field Programmable Gate Array
- DSP: Digital Signal Processor
- NP: Network Processor
- NP is 10x faster than µP
8 Network Processing: where is the difference?
- Network Processor
  - Focused on packets
  - Decision pipeline
  - Network instruction set
  - Fast binary decision
  - Real-time
- Microprocessor
  - General purpose
  - Cache-based pipeline
  - Wide instruction set
  - Various mathematics
  - Multi-task
11 Filtering and Classification
- The classifier determines the flow an incoming packet belongs to by looking at one or more fields of the packet header.
- The classification problem can be solved by several search approaches:
  - bitmap intersection
  - fat inverted segment trees
  - heap-on-trie data structures
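Of the approaches listed, bitmap intersection is the easiest to illustrate. The sketch below is a simplified variant that handles only exact values and wildcards (the general scheme works on ranges); the rule set and field layout are assumptions made for the example.

```python
def build_bitmaps(rules, num_fields):
    """For each field, map every concrete value to a bitmap of matching rules.

    Bit i is set when rule i accepts that value. Wildcards (None) are kept in
    a separate per-field bitmap that is OR-ed into every lookup.
    """
    exact = [dict() for _ in range(num_fields)]
    wild = [0] * num_fields
    for i, rule in enumerate(rules):
        for f, value in enumerate(rule):
            if value is None:
                wild[f] |= 1 << i
            else:
                exact[f][value] = exact[f].get(value, 0) | (1 << i)
    return exact, wild


def classify(exact, wild, header):
    """Intersect the per-field bitmaps; the lowest set bit is the best rule."""
    result = -1  # all bits set: every rule is still a candidate
    for f, value in enumerate(header):
        result &= exact[f].get(value, 0) | wild[f]
    if result == 0:
        return None
    return (result & -result).bit_length() - 1


# Hypothetical rules over (protocol, destination port); lower index wins.
RULES = [("tcp", 80), ("tcp", None), (None, 53)]
EXACT, WILD = build_bitmaps(RULES, num_fields=2)
```

Each field is looked up independently, so the per-field lookups can run in parallel in hardware and only the final AND is sequential.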
12 Link Scheduling
- A link scheduler is a kind of arbiter that must decide which of the buffered packets will be transferred next through an outgoing link of a networking node.
- There are several features by which schedulers may be distinguished:
  - Fairness
  - Efficiency
  - Worst-case behavior
  - Quality of service guarantees
  - Utilization
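As one concrete arbiter, the sketch below implements deficit round-robin, a scheduler that this deck names among its performance models. The quantum, the packet lengths, and the fixed number of rounds are assumptions chosen for illustration.

```python
from collections import deque


def drr_schedule(queues, quantum, rounds):
    """Deficit round-robin over per-flow queues of packet lengths in bytes.

    Returns the transmission order as (flow_index, packet_length) tuples.
    """
    deficit = [0] * len(queues)
    sent = []
    for _ in range(rounds):
        for i, q in enumerate(queues):
            if not q:
                deficit[i] = 0  # an idle flow does not accumulate credit
                continue
            deficit[i] += quantum
            # Send head-of-line packets while the flow has enough credit.
            while q and q[0] <= deficit[i]:
                length = q.popleft()
                deficit[i] -= length
                sent.append((i, length))
            if not q:
                deficit[i] = 0
    return sent
```

With a quantum of 200 bytes, a flow of 300-byte packets has to wait one round to collect enough credit, while a flow of 100-byte packets sends two packets per round, so both receive roughly the same byte share.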
14 Queuing
- After a packet has been admitted for possible transmission, it must be buffered in the system until it is either chosen by the link scheduler for transmission or discarded in case of a congested link.
- In order to balance the separation of flows and the number of flows that can be managed, different approaches are possible:
  - Single queue
  - Separate queues
  - Shared memory
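The shared-memory option can be pictured as per-flow queues that draw from one common buffer budget. The following is a minimal sketch under that assumption, with tail-drop when the shared budget is exhausted; the class name and capacity are hypothetical.

```python
from collections import deque


class SharedBuffer:
    """Per-flow FIFO queues that draw from one shared byte budget."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.queues = {}

    def enqueue(self, flow, length):
        """Admit the packet if it fits; otherwise tail-drop it."""
        if self.used + length > self.capacity:
            return False
        self.queues.setdefault(flow, deque()).append(length)
        self.used += length
        return True

    def dequeue(self, flow):
        """Remove the head packet of a flow and release its buffer space."""
        q = self.queues.get(flow)
        if not q:
            return None
        length = q.popleft()
        self.used -= length
        return length
```

Flows stay separated logically, but a single greedy flow can still occupy the whole budget, which is the trade-off against fully separate queues.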
17 Node Architecture
18 Evaluation Models
- This part discusses the algorithm and hardware blocks, as well as the traffic traces, that are used to evaluate options for QoS preservation in multi-service access networks.
- Required models:
  - Algorithm models
  - Architecture models
  - Traffic generation models
19 Algorithm Models
- Reproduce the behavior of algorithms for packet processing tasks.
- Behavior is not bounded by assuming any properties of computing resources.
20 Architecture Models
- Imitate the timing behavior of hardware building blocks which can be used to implement a network processor for multi-service access networks.
- The timing, together with the statistics generated by the algorithm models, is used to estimate the load of hardware resources.
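One way to read the load estimate: multiply each operation count reported by an algorithm model by the cost that the architecture model assigns to that operation, and relate the sum to the cycles available in the observation interval. The operation names, cycle costs, and clock rate below are assumptions, not values from the slides.

```python
def resource_load(op_counts, cycles_per_op, clock_hz, interval_s):
    """Fraction of the available cycles consumed during the interval."""
    busy_cycles = sum(count * cycles_per_op[op] for op, count in op_counts.items())
    return busy_cycles / (clock_hz * interval_s)


# Hypothetical statistics from an algorithm model over a 0.1 s interval.
counts = {"cpu": 4_000_000, "mem_read": 1_000_000, "mem_write": 500_000}
# Hypothetical timing from an architecture model, in cycles per operation.
costs = {"cpu": 1, "mem_read": 4, "mem_write": 4}

load = resource_load(counts, costs, clock_hz=200_000_000, interval_s=0.1)
```

A load above 1.0 would indicate that the modeled resource cannot keep up with the offered traffic.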
21 Traffic Generation Models
- Network traffic must be modeled to stimulate the network processor.
- The inter-arrival time of packets determines the frequency of packet processing events.
- The length of the packet and other packet header information decide the QoS a packet will receive.
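A statistical source can be sketched with exponentially distributed inter-arrival times and a small set of packet lengths. Both choices are assumptions made for the example; the slides do not prescribe a particular distribution.

```python
import random


def generate_packets(rate_pps, count, lengths=(64, 576, 1500), seed=0):
    """Yield (arrival_time_s, length_bytes) pairs for a Poisson-like source."""
    rng = random.Random(seed)  # fixed seed keeps the trace reproducible
    t = 0.0
    for _ in range(count):
        t += rng.expovariate(rate_pps)  # exponential inter-arrival time
        yield t, rng.choice(lengths)


trace = list(generate_packets(rate_pps=1000.0, count=100))
```

Each arrival time would trigger one packet processing event in the simulator, and the length feeds the policer and scheduler models.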
22 Performance models of Algorithms
- It is assumed that packets have passed a header parser as well as filter, forwarding, and classifier stages when they enter the policer.
- These stages are not modeled for the evaluation, since
  - their outcome (header fields, next-hop address/link, and a QoS class identifier) is constant, independent of the chosen algorithms;
  - they do not affect the QoS preservation behavior of the packet processor in terms of packet delay and buffer space.
- Candidates for special hardware blocks
23 Performance models of Algorithms
- Policer
  - Nested token bucket policer for green profiles and a single yellow profile
- Link Scheduler
  - WFQ-based scheduling: SCFQ, SPFQ, MD-SCFQ
  - Deficit Round-Robin packet scheduling
- Queue Manager
  - CYQ: QM with central yellow queue and tail-drop
  - CYQ-enh.: enhanced QM with central yellow queue and tail-drop
  - CYQ-RED: CYQ-enh. with RED congestion avoidance for yellow traffic
  - YQ-Fair: fair QM with per-flow yellow queues
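The policer idea can be illustrated with two plain token buckets: a packet that conforms to the green profile is marked green, otherwise it is tested against the yellow profile, otherwise it is dropped. This is a simplified stand-in, not the nested policer evaluated in the deck; the rates and bucket depths are assumptions.

```python
class TokenBucket:
    def __init__(self, rate_bps, depth_bytes):
        self.rate = rate_bps / 8.0   # refill rate in bytes per second
        self.depth = depth_bytes
        self.level = depth_bytes     # start with a full bucket
        self.last = 0.0

    def conforms(self, now, length):
        """Refill for the elapsed time, then try to take `length` bytes."""
        self.level = min(self.depth, self.level + (now - self.last) * self.rate)
        self.last = now
        if length <= self.level:
            self.level -= length
            return True
        return False


def police(green, yellow, now, length):
    """Mark a packet green or yellow, or drop it if neither profile fits."""
    if green.conforms(now, length):
        return "green"
    if yellow.conforms(now, length):
        return "yellow"
    return "drop"
```

The bucket levels tracked here correspond to the "bucket levels" statistic that the algorithm models report.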
24 Performance models of Algorithms: Statistics
- Information output by algorithm models
  - Information about operations performed
    - CPU-like operations
    - Other CPU-like instructions: branch, register copy, address offset calculations
    - Memory accesses
    - Priority queue operations
    - Dynamic memory allocation
25 Performance models of Algorithms: Statistics
- Assess QoS properties of the system
  - Queue lengths
  - Dropped packets
  - Bucket levels
  - Virtual time evolution
  - Delay due to queuing and scheduling
26 Counting methodology
- Off-line counting of operations: for a given elementary packet processing task (dequeue, enqueue, policing, etc.) specified in a programming language
  - Detect the basic blocks.
  - Determine the control flow between the basic blocks.
  - Beginning from the entry point of the control flow of the overall task, determine the sets of active variables at the entry point and at the branch point of every basic block.
  - For every basic block:
    - Extract code dealing with priority queues and dynamic memory management. Count priority queue and dynamic memory allocation operations.
    - In the remaining code fragment:
      - Detect variables and constants which belong to the context information and which are not active.
      - Count the required memory accesses and address offset calculations to read these variables from memory.
      - Count the required CPU operations.
      - Detect the context variables which have been set in the basic block.
      - Count the required write accesses and address offset calculations to write these variables back to memory at the end of the basic block.
    - Assign the determined counter values to the basic block.
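Once counter values have been assigned to the basic blocks, the operations of one task execution follow by summing the counters along the executed path. The block names and counter values below are invented for illustration; only the counter categories follow the statistics listed in this deck.

```python
from collections import Counter

# Hypothetical per-block counters for an "enqueue" task, determined off-line.
BLOCK_COUNTS = {
    "entry": Counter(mem_read=2, addr_calc=2, cpu=3),
    "queue_full": Counter(cpu=1, mem_write=1, addr_calc=1),
    "insert": Counter(pq_op=1, malloc=1, cpu=2, mem_write=2, addr_calc=2),
}


def count_operations(path, block_counts=BLOCK_COUNTS):
    """Sum the pre-assigned counters of every basic block on the path."""
    total = Counter()
    for block in path:
        total += block_counts[block]
    return total


# One execution that takes the entry block and then inserts the packet.
ops = count_operations(["entry", "insert"])
```

Because the counters are fixed per block, the simulation only has to record which blocks were executed, not re-analyze the code for every packet.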
27 Architecture Models
- CPU Timing Model
- RAM timing model
- Priority Queue model
- Dynamic Memory Allocation Model
28 Exploitation of RAM Resources
- Specific characteristics and application areas for different RAM types
- Impact of the memory controller on overall system performance
- Current DRAM performance is influenced by:
  - Throughput of the interface to the processor
  - Delay properties of the underlying memory core
29 Performance Bottlenecks of RAMs
- Delay of read and write memory accesses
  - SRAM
    - Type of access
  - DRAM
    - State of the RAM
    - Placement of data in the RAM
    - Order of accesses
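The dependence of DRAM delay on state, placement, and access order can be shown with a single-bank, open-row model. The timing parameters (in cycles) and the address split are assumptions for illustration and do not describe a specific device.

```python
def dram_access_cycles(addresses, col_bits=10, t_cas=2, t_rcd=2, t_rp=2):
    """Total cycles for a sequence of accesses to one DRAM bank.

    The low `col_bits` of an address select the column, the rest the row.
    The bank keeps the most recently used row open between accesses.
    """
    open_row = None
    cycles = 0
    for addr in addresses:
        row = addr >> col_bits
        if row == open_row:
            cycles += t_cas                  # row hit: column access only
        elif open_row is None:
            cycles += t_rcd + t_cas          # idle bank: activate, then access
        else:
            cycles += t_rp + t_rcd + t_cas   # row conflict: precharge first
        open_row = row
    return cycles


grouped = dram_access_cycles([0, 1, 1024, 1025])      # two accesses per row
interleaved = dram_access_cycles([0, 1024, 1, 1025])  # rows alternate
```

The same four addresses cost fewer cycles when accesses to the same row are grouped, which is why data placement and access ordering matter to the memory controller.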
30 DRAM Timing
32 SRAM Organization and Operation
- Same operation modes as SDRAM
- Optimized for speed
- Implementation of caches
- Increased pin count
- More silicon area
- Compared to DRAM:
  - Worst-case power dissipation higher
  - More expensive
33 Synchronous DRAMs
- VC SDRAM
- Enhanced SDRAM
- Synchronous Graphics RAM (SGRAM)
- DDR SDRAM and SGRAM
- Direct Rambus DRAM
34 Memory Controller
- Other design choices
- Integrated memory controller
- Stand Alone Controller
- Synthesizable Block
35 Memory Modelling
36 Stimuli
- Choice of traffic patterns
  - Public traffic traces from the Internet
  - Real traffic sources
  - Statistical source models
37 System for Evaluation