Title: CSC457 Seminar
About Network Processors
CSC457 Seminar, YongKang Zhu, December 6th, 2001
Outline
- 1. What an NP is, why we need it, and its features
- 2. Benchmarks for NP evaluation
- 3. Several issues in NP design
- (a). Processing unit architecture
- (b). Handling I/O events
- (c). Memory (buffer) organization and management
What is a network processor?
A network processor is a highly programmable processor suitable for performing intelligent and flexible packet processing and traffic management functions at line speed in various networking devices, such as routers and switches.
A typical router architecture
Why NPs and their features?
- Fast growth in transmission technology
- Advanced packet processing functions
- Traditional methods using ASICs or off-the-shelf CPUs
- Performance
- Programmability, flexibility
- Design and implementation complexity
- Value proposition
Benchmarks for NP evaluation
- Major metrics include:
  - Throughput: bps, pps, connections per second, transactions per second
  - Latency: time for a packet to pass through the NP
  - Jitter: variation in latency
  - Loss rate: ratio of lost packets (computing these from a trace is sketched below)
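As a rough illustration (not from the slides), the C sketch below computes these four metrics from a hypothetical per-packet trace. The pkt_record fields and the trace values are made up for the example, and jitter is taken here as max minus min latency, which is only one of several common definitions.

#include <stdio.h>

/* Hypothetical per-packet record: send/receive timestamps in seconds,
 * packet size in bytes, and whether the packet made it through the NP. */
struct pkt_record {
    double t_send;
    double t_recv;
    int    bytes;
    int    delivered;   /* 0 = lost */
};

/* Assumes at least one delivered packet in the trace. */
static void report_metrics(const struct pkt_record *p, int n)
{
    double first = p[0].t_send, last = p[0].t_send;
    double lat_sum = 0.0, lat_min = 1e30, lat_max = 0.0;
    long   bits = 0;
    int    delivered = 0;

    for (int i = 0; i < n; i++) {
        if (p[i].t_send < first) first = p[i].t_send;
        if (!p[i].delivered) continue;
        double lat = p[i].t_recv - p[i].t_send;   /* per-packet latency */
        if (p[i].t_recv > last) last = p[i].t_recv;
        lat_sum += lat;
        if (lat < lat_min) lat_min = lat;
        if (lat > lat_max) lat_max = lat;
        bits += 8L * p[i].bytes;
        delivered++;
    }

    double span = last - first;
    printf("throughput: %.0f bps, %.0f pps\n", bits / span, delivered / span);
    printf("avg latency: %.6f s\n", lat_sum / delivered);
    printf("jitter (max - min latency): %.6f s\n", lat_max - lat_min);
    printf("loss rate: %.2f%%\n", 100.0 * (n - delivered) / n);
}

int main(void)
{
    struct pkt_record trace[] = {
        { 0.000, 0.00012, 64,   1 },
        { 0.001, 0.00135, 1500, 1 },
        { 0.002, 0.0,     64,   0 },   /* dropped packet */
        { 0.003, 0.00310, 512,  1 },
    };
    report_metrics(trace, 4);
    return 0;
}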
CommBench, by Mark Franklin
- 1. Two categories of typical applications
  - Header-processing applications: RTR, FRAG, DRR, TCP
  - Payload-processing applications: CAST, REED, ZIP, JPEG
- 2. Selecting an appropriate input mix to represent different workloads and traffic patterns (a weighted-mix sketch follows this list)
- 3. Design implications (computational complexity)
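To make point 2 concrete, here is a hypothetical weighted-mix calculation in C. The instructions-per-byte figures and the mix fractions are placeholders, not measured CommBench data; the point is only that shifting the mix toward payload applications changes the aggregate workload sharply.

#include <stdio.h>

/* Hypothetical workload model: each CommBench-style application gets an
 * assumed instructions-per-byte cost and a share of the input traffic. */
struct app {
    const char *name;
    double instr_per_byte;   /* placeholder complexity, not measured data */
    double mix_fraction;     /* fraction of traffic fed to this app */
};

int main(void)
{
    /* Header-processing apps tend to cost little per byte; payload apps
     * (encryption, coding, compression) cost much more. */
    struct app mix[] = {
        { "RTR",    5.0, 0.40 },
        { "FRAG",   4.0, 0.20 },
        { "CAST", 100.0, 0.25 },
        { "ZIP",  150.0, 0.15 },
    };

    double aggregate = 0.0;
    for (unsigned i = 0; i < sizeof mix / sizeof mix[0]; i++)
        aggregate += mix[i].instr_per_byte * mix[i].mix_fraction;

    /* A different mix (e.g. more payload traffic) moves this number a lot,
     * which is why the chosen input mix matters for NP evaluation. */
    printf("aggregate cost: %.1f instructions per byte\n", aggregate);
    return 0;
}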
Importance of selecting the input mix
Some issues in NP design
- Processing unit architecture
- Fast handling of I/O events
- Memory organization and management
Processing unit architecture
Four architectures reviewed:
1. a superscalar microprocessor (SS)
2. a fine-grained multithreading microprocessor (FGMT)
3. a chip multiprocessor (CMP)
4. a simultaneous multithreading (SMT) microprocessor
Comparison among the four architectures
1. CMP and SMT can exploit more instruction-level parallelism and packet-level parallelism (a toy packet-level parallelism sketch follows).
2. However, other problems are introduced, such as how to efficiently handle cache coherence and memory consistency.
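As a rough illustration of packet-level parallelism in the CMP spirit (a toy model, not one of the evaluated architectures), the C/pthreads sketch below dispatches independent packets to independent cores; the core count, packet count, and static partitioning are arbitrary choices for the demo.

#include <pthread.h>
#include <stdio.h>

#define CORES   4      /* assumed number of processing cores */
#define PACKETS 16

/* Independent packets go to independent cores, so the chip stays busy
 * without relying on instruction-level parallelism within one packet. */
struct core_arg {
    int id;
    int first, last;   /* range of packet indices handled by this core */
};

static void *process_packets(void *p)
{
    struct core_arg *c = p;
    for (int i = c->first; i < c->last; i++)
        printf("core %d processed packet %d\n", c->id, i);
    return NULL;
}

int main(void)
{
    pthread_t tid[CORES];
    struct core_arg arg[CORES];
    int per_core = PACKETS / CORES;

    for (int c = 0; c < CORES; c++) {
        arg[c].id    = c;
        arg[c].first = c * per_core;
        arg[c].last  = (c + 1) * per_core;
        pthread_create(&tid[c], NULL, process_packets, &arg[c]);
    }
    for (int c = 0; c < CORES; c++)
        pthread_join(tid[c], NULL);
    return 0;
}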
Handling I/O
- Make equal-sized internal flits (see the segmentation sketch after this list)
- Higher-level (task) pipeline for packet processing
- Using coprocessors
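A minimal C sketch of the first idea, splitting a variable-length packet into equal-sized internal flits; the 64-byte flit size and the zero-padding of the last flit are assumptions for illustration.

#include <stdio.h>
#include <string.h>

#define FLIT_BYTES 64   /* assumed internal flit size */

/* Split one variable-length packet into fixed-size flits so the internal
 * fabric only ever handles equal-sized units; the last flit is padded. */
static int segment_packet(const unsigned char *pkt, int len,
                          unsigned char flits[][FLIT_BYTES], int max_flits)
{
    int nflits = (len + FLIT_BYTES - 1) / FLIT_BYTES;
    if (nflits > max_flits)
        return -1;                       /* packet too large for the buffer */

    for (int i = 0; i < nflits; i++) {
        int off   = i * FLIT_BYTES;
        int chunk = (len - off < FLIT_BYTES) ? (len - off) : FLIT_BYTES;
        memcpy(flits[i], pkt + off, chunk);
        memset(flits[i] + chunk, 0, FLIT_BYTES - chunk);   /* pad last flit */
    }
    return nflits;
}

int main(void)
{
    unsigned char pkt[150];
    unsigned char flits[8][FLIT_BYTES];
    memset(pkt, 0xAB, sizeof pkt);

    int n = segment_packet(pkt, sizeof pkt, flits, 8);
    printf("150-byte packet -> %d flits of %d bytes\n", n, FLIT_BYTES);
    return 0;
}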
Higher (task) level pipeline
Memory organization and management
- 1. Using novel DRAM architectures
  - Page-mode DRAM
  - Synchronous DRAM
  - Direct Rambus DRAM
- 2. Using slow DRAM in parallel
  - Ping-pong buffering
  - ECQF-MMA (earliest critical queue first)
15Ping-pong buffering
(Figures: buffer usage and buffer organization; a toy sketch of the idea follows.)
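Since the figures are not reproduced here, the C sketch below shows the general double-buffer idea as I read it: cells are written into one bank while the other, already full, bank is read out, and the banks swap roles when the write bank fills. The bank size and the sequential drain are simplifications; in hardware the two banks work in parallel.

#include <stdio.h>

#define BANK_CELLS 4   /* assumed capacity of each bank, in cells */

/* Two memory banks used in ping-pong fashion, so each bank only has to
 * keep up with part of the traffic. */
struct pingpong {
    int bank[2][BANK_CELLS];
    int count[2];
    int wr;                      /* bank currently accepting writes */
};

static void drain_bank(struct pingpong *pp, int b)
{
    for (int i = 0; i < pp->count[b]; i++)
        printf("read cell %d from bank %d\n", pp->bank[b][i], b);
    pp->count[b] = 0;
}

static void store_cell(struct pingpong *pp, int cell)
{
    if (pp->count[pp->wr] == BANK_CELLS) {
        int full = pp->wr;
        pp->wr ^= 1;             /* new writes go to the other bank ...  */
        drain_bank(pp, full);    /* ... while the full bank is read out  */
    }
    pp->bank[pp->wr][pp->count[pp->wr]++] = cell;
}

int main(void)
{
    struct pingpong pp = { .count = { 0, 0 }, .wr = 0 };

    for (int cell = 0; cell < 10; cell++)
        store_cell(&pp, cell);
    drain_bank(&pp, pp.wr);      /* flush whatever is left at the end */
    return 0;
}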
ECQF-MMA (earliest critical queue first)
- Uses slow DRAM and fast SRAM to organize the buffer structure
- Q FIFO queues in total
- Memory bus width is b cells
- Memory random access time is 2T
- The size of each SRAM is bounded by Q(b - 1) cells
- An arbiter selects which cells from which FIFO queue will depart in the future
- Requests to DRAM for replenishing the SRAM FIFOs are sent only after they accumulate to a certain amount
- Guarantees a maximum latency experienced by each cell (a simplified sketch of the replenishment choice follows)
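A simplified interpretation of the replenishment decision, not the exact algorithm from the paper: walk the known departure schedule, find the queue whose head-SRAM copy would run dry first, and refill it with b cells in one DRAM access. The queue count, bus width, lookahead depth, and schedule below are made-up values for the demo.

#include <stdio.h>

#define Q 4            /* number of FIFO queues (kept small for the demo) */
#define B 3            /* memory bus width: b cells per DRAM access */
#define LOOKAHEAD 8    /* how far ahead the departure schedule is known */

/* sram[i]    : cells currently held in the head-SRAM FIFO of queue i
 * schedule[] : queue index of each upcoming departure, in order        */
static int earliest_critical_queue(const int sram[Q],
                                   const int schedule[], int n)
{
    int remaining[Q];
    for (int i = 0; i < Q; i++)
        remaining[i] = sram[i];

    /* Walk the future departures; the first queue whose SRAM copy would
     * be exhausted is the earliest critical queue. */
    for (int t = 0; t < n; t++) {
        int q = schedule[t];
        if (remaining[q] == 0)
            return q;
        remaining[q]--;
    }
    return schedule[0];   /* no queue becomes critical: default choice */
}

int main(void)
{
    int sram[Q] = { 2, 1, 3, 0 };
    int schedule[LOOKAHEAD] = { 0, 1, 3, 0, 2, 1, 0, 2 };

    int q = earliest_critical_queue(sram, schedule, LOOKAHEAD);
    printf("replenish queue %d with %d cells from DRAM\n", q, B);
    sram[q] += B;   /* one DRAM access delivers b cells into that SRAM FIFO */
    return 0;
}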
Intel's IXP1200
- 1 StrongARM core and 6 RISC microengines
- can manage up to 24 independent threads (a toy latency-hiding simulation follows this list)
- two interfaces: IX bus and PCI
  - IX bus for connecting MAC ports
  - PCI bus for connecting the master processor
- register files replicated in each microengine
- on-chip scratch SRAM and I/O buffers
- two sets of register files in each microengine
  - 128 GPRs and 128 transfer registers
- instruction set architecture
  - a specified field for context switch
  - a specified instruction for reading the on-chip scratch SRAM
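The microengines hide memory latency by switching among their hardware threads whenever one issues a memory reference. The toy simulation below is plain C, not IXP1200 microcode; the 4 threads per microengine match the list above, but the 20-cycle memory latency and the one-cycle compute phase are assumptions chosen only to show that other threads keep computing while one waits.

#include <stdio.h>

#define THREADS     4     /* hardware contexts per microengine */
#define MEM_LATENCY 20    /* assumed memory access latency, in cycles */
#define SIM_CYCLES  200

struct ctx {
    int ready_at;     /* cycle at which the pending memory reference returns */
    int work_done;    /* compute phases completed by this thread */
};

int main(void)
{
    struct ctx th[THREADS] = { { 0, 0 } };
    int idle_cycles = 0;

    for (int cycle = 0; cycle < SIM_CYCLES; cycle++) {
        int ran = 0;
        /* Swap to any context whose memory reference has completed, so
         * compute is never stalled as long as some thread is runnable. */
        for (int t = 0; t < THREADS; t++) {
            if (th[t].ready_at <= cycle) {
                th[t].work_done++;                      /* one compute phase */
                th[t].ready_at = cycle + MEM_LATENCY;   /* then go to memory */
                ran = 1;
                break;
            }
        }
        if (!ran)
            idle_cycles++;    /* every thread is waiting on memory */
    }

    int total = 0;
    for (int t = 0; t < THREADS; t++)
        total += th[t].work_done;
    printf("compute phases: %d, idle cycles: %d of %d\n",
           total, idle_cycles, SIM_CYCLES);
    return 0;
}

With a single thread the same loop would complete roughly a quarter of the work, which is the point of the multithreaded microengines.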
One application of Intel's IXP1200
Conclusions
1. what an NP is, why we need it, and its features
2. benchmarks
3. processing unit architectures: CMP or SMT
4. fast handling of I/O: task pipeline, coprocessors
5. memory architectures
-- only a small part of a huge design space