Programming Model for Network Processing on FPGAs

1
Programming Model for Network Processing on FPGAs
  • Eric Keller
  • October 8, 2004
  • M.S. Thesis Defense

2
Abstract
  • Programming model for implementing network
    processing applications on an FPGA
  • Present an API to higher level tools
      • Programming language: presents an abstraction
        in terms of resources more suitable to the
        networking domain
      • Compiler: generates hardware from this
        description
  • Demonstrate through four applications
      • Aurora to GigE Bridge, RPC, IP Router, NAT

3
Outline of Talk
  • Background
  • Design Flow
  • User Interface
  • Compilation to Hardware
  • High Level Tools
  • Experiments/Results
  • Conclusions

4
Outline of Talk
  • Background
  • Design Flow
  • User Interface
  • Compilation to Hardware
  • High Level Tools
  • Experiments/Results
  • Conclusions

5
Tools for FPGAs
  • Hardware Description Languages
      • Verilog, VHDL
  • Structural High-Level Languages
      • JHDL, JBits
  • Behavioral High-Level Languages
      • Handel-C, Forge
  • Domain Specific Languages
      • Cliff, Snort, Ponder

6
Cliff
  • Maps Click to Xilinx FPGAs
  • Click is a domain specific language for
    networking
      • Modular router on Linux
      • Elements of common operations
        (e.g. Decrement TTL)
  • Elements written in Verilog
  • Script to put system together

[Figure: Click element pipeline: Input, Lookup, Simple op, Queue, Output]

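As an illustration of what a "Decrement TTL" element does, here is a minimal software sketch. It is not the Verilog element from Cliff. It assumes a plain IPv4 header (TTL at byte 8, header checksum at bytes 10-11), and the function names `ipv4_checksum` and `dec_ttl` are made up for this example.

```python
import struct
from typing import Optional


def ipv4_checksum(header: bytes) -> int:
    """One's-complement sum of the header's 16-bit words, inverted."""
    total = 0
    for (word,) in struct.iter_unpack("!H", header):
        total += word
    while total >> 16:                      # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def dec_ttl(header: bytes) -> Optional[bytes]:
    """Return the header with TTL decremented and checksum refreshed.

    Returns None when the TTL has expired (TTL <= 1), which models the
    element dropping the packet.
    """
    hdr = bytearray(header)
    if hdr[8] <= 1:
        return None
    hdr[8] -= 1
    hdr[10:12] = b"\x00\x00"                # checksum field is zero while summing
    struct.pack_into("!H", hdr, 10, ipv4_checksum(bytes(hdr)))
    return bytes(hdr)
```

A hardware element would typically patch the checksum incrementally rather than re-sum the header. The full recomputation here just keeps the sketch short.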
7
Networking on FPGAs
  • Routing and Switching
      • MIR, IP Lookup, Crossbar Switch
  • Protocol Boosters
      • Error coding, encryption, compression
  • Security
      • Virus Scanning, Firewall
  • Web Server
      • TCP/IP in Hardware
  • 50-300x speedup over Sun/Intel based workstations

8
Outline of Talk
  • Background
  • Design Flow
  • User Interface
  • Compilation to Hardware
  • High Level Tools
  • Experiments/Results
  • Conclusions

9
Motivation
  • Goal: create a design environment that allows
    networking experts to use FPGAs
  • Several point solutions have shown FPGAs to be a
    good solution
  • Domain specific languages
  • There is not a standard high-level tool
  • Use MIR as a starting framework
  • Collaborating threads processing a message
  • Flexible architecture for memory and communication

10
Design API
  • Present an API to higher level tools
      • No leading high-level design entry for
        networking domain
      • Presents an abstraction in terms of resources
        suitable to the networking domain
        (e.g. threads)
      • Allow specification of architecture as well
        as functionality
  • Generate hardware from this description
      • Generate VHDL
      • Rely on existing back-end tools for mapping
        to FPGA
  • Present an intermediate textual format
      • XML

11
Design Hierarchy
[Figure: design hierarchy, top to bottom: high level tools (Teja, Click, Novalit, ...), programming interface, soft architecture mapping, back-end tools, platform FPGAs]

12
Design Flow
  • Main focus: XML to VHDL to bitstream

[Figure: design flow. XML description (programming language) → API (compiler) → hardware description → back-end tools → configuration bitstream]

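The first step of this flow is reading the XML description. The slides do not show the actual schema, so the tags and attributes below (`system`, `interface`, `memory`, `thread`, `name`, `type`) are hypothetical. The sketch only shows how a front end might take stock of the declared components before generating hardware.

```python
import xml.etree.ElementTree as ET

# Hypothetical system description; the real schema is not given in the slides.
SYSTEM_XML = """
<system name="bridge">
  <interface name="gmac" type="GigabitEthernet"/>
  <interface name="aurora" type="Aurora"/>
  <memory name="buf0" type="PutGet"/>
  <thread name="gmac_rx"/>
  <thread name="aurora_tx"/>
</system>
"""


def inventory(xml_text):
    """Return (system name, {component kind: [component names]})."""
    root = ET.fromstring(xml_text.strip())
    components = {}
    for child in root:
        components.setdefault(child.tag, []).append(child.get("name"))
    return root.get("name"), components


if __name__ == "__main__":
    name, components = inventory(SYSTEM_XML)
    print(name, components)
```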
13
Outline of Talk
  • Background
  • Design Flow
  • User Interface
  • Compilation to Hardware
  • High Level Tools
  • Experiments/Results
  • Conclusions

14
Abstraction Primitives
[Figure: abstraction primitives: interface to external system, threads (linked by communication and synchronization), memory, intellectual property]

15
Threads
  • Micro-engines with instruction level parallelism
  • Instruction set and conditionals used to program
      • User defined variables
  • Implemented as custom hardware
      • Not a microprocessor with fetch, decode,
        execute
  • Synchronization
      • Activate, Deactivate
  • Communication
      • Lightweight, channels

16
Intellectual Property
  • Allow users to make use of pre-designed
    intellectual property (also called cores)
  • Not all algorithms are best expressed as a finite
    state machine
      • e.g. encryption, compression
  • User must:
      • Define the interface
      • Instantiate using an include-type statement
      • Associate with a thread

17
Interfaces
  • Perimeter of the defined system
      • System can be whole FPGA or part of larger
        design
  • Exists as pre-defined netlist
      • Gigabit Ethernet, Aurora
  • Interface includes
      • Grouping of signals into ports
      • Extra functionality
        (e.g. perform framing and error detection)
      • Protocol to get the message
  • Threads interact with the interface
  • Instantiation involves an include-type statement

18
Memory
  • Provide buffering of messages, tables for lookup,
    storage of state
  • Parameterizable
  • Selection of different memories
      • Exist as pre-defined netlists (for now)
      • Each possibly being parameterizable
  • Instantiate through include-type statement
  • Associate a memory port with a thread

19
Memory (contd)
  • FIFO
  • PutGet
      • Queue of objects, commit mechanism
  • SharedMemory
      • Single memory shared by multiple accessors
      • Locking mechanism via the BRAM READ_FIRST mode
  • DPMem
      • Multiple memories shared by multiple accessors
      • Allocation mechanism
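The slide gives no detail on how READ_FIRST yields a lock. One plausible reading, offered here as an assumption, is a test-and-set. In read-first mode a block RAM port outputs the old contents of an address in the same cycle it writes new data, so a thread can write "locked" and learn from the read-back value whether the lock was free. The class and function names below are made up for this behavioral sketch.

```python
class ReadFirstRam:
    """Behavioral model of one RAM port in read-first mode:
    a write also returns the value stored before the write."""

    def __init__(self, depth):
        self.cells = [0] * depth

    def write(self, addr, value):
        old = self.cells[addr]      # read happens first...
        self.cells[addr] = value    # ...then the write lands
        return old


def try_lock(ram, addr):
    """Atomically set the lock word; True if it was free (0) before."""
    return ram.write(addr, 1) == 0


def unlock(ram, addr):
    """Release the lock word."""
    ram.write(addr, 0)
```

In this model a second `try_lock` on the same address fails until `unlock` is called, because the write and the read of the old value happen in a single port access.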

20
Outline of Talk
  • Background
  • Design Flow
  • User Interface
  • Compilation to Hardware
  • High Level Tools
  • Experiments/Results
  • Conclusions

21
Hardware Generation
  • Process of mapping system resources to the
    hardware
  • Generate VHDL
  • One module per thread
  • Top level module hooking all components together
  • Memories, interfaces, channels exist as
    predefined netlists
  • Rely on back-end tools to create bitstream
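To make "one module per thread" concrete, here is a small sketch of emitting a VHDL entity declaration per thread from a list of ports. The generator function, the thread names, and the port lists are all hypothetical. The thesis compiler's actual output format is not shown in the slides.

```python
def thread_entity(name, ports):
    """Build a VHDL entity declaration.

    ports is a list of (port_name, direction, vhdl_type) tuples.
    """
    decls = [f"    {p} : {d} {t}" for p, d, t in ports]
    lines = [
        f"entity {name} is",
        "  port (",
        ";\n".join(decls),          # no semicolon after the last port
        "  );",
        f"end {name};",
    ]
    return "\n".join(lines)


def generate(threads):
    """Map each thread to a file name and its entity text."""
    return {f"{name.lower()}.vhd": thread_entity(name, ports)
            for name, ports in threads.items()}


if __name__ == "__main__":
    files = generate({
        "RX_THREAD": [("clk", "in", "std_logic"),
                      ("data", "out", "std_logic_vector(7 downto 0)")],
    })
    print(files["rx_thread.vhd"])
```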

22
Top Level
entity SYSTEM is
  port (
    -- interface
  );
end SYSTEM;

architecture struct of SYSTEM is
  -- signals
begin
  -- synchronization logic
  -- instantiate each component
  -- (interfaces, memories, threads, externally defined IP, channels)
end struct;

23
Clocks
  • Interfaces determine clock domains

24
Thread
entity THREAD is
  port (
    -- interface
  );
end THREAD;

architecture behavioral of THREAD is
  -- signals
begin
  -- control logic
  -- combinatorial process
  -- synchronous process
  -- special circuitry for memory reads and channel gets
end behavioral;

25
Special Case Circuitry
  • Memory
      • READ(var, address)
      • User wants to work with var, not the memory
        signals
      • Need extra circuitry to enable this
  • Channels
      • CHAN_GET(var, address)
      • Extra conditional testing to see when address
        matches
  • START(thread, offset)
      • Extra circuitry to align the data
      • e.g. Ethernet header is 14 bytes
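The alignment issue can be seen in a short software model. On a 4-byte-wide datapath (an assumption: the slides do not state the word width), a 14-byte Ethernet header ends 2 bytes into the fourth word, so the payload has to be shifted to start on a word boundary. The function name `align` is made up for this sketch.

```python
def align(words, offset, width=4):
    """Drop `offset` leading bytes from a stream of fixed-width words
    and repack the rest into words of the same width.

    The final word is zero-padded if the remaining data does not fill it.
    """
    stream = b"".join(words)[offset:]
    out = [stream[i:i + width] for i in range(0, len(stream), width)]
    if out and len(out[-1]) < width:
        out[-1] = out[-1].ljust(width, b"\x00")
    return out


if __name__ == "__main__":
    frame = bytes(range(20))                         # 14 header bytes + 6 payload bytes
    words = [frame[i:i + 4] for i in range(0, 20, 4)]
    print(align(words, 14))                          # payload now starts at byte 0 of word 0
```

In hardware the same effect needs a register holding the leftover bytes of the previous word plus a multiplexer selected by `offset % width`, which is the "extra circuitry" the slide refers to.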

26
Outline of Talk
  • Background
  • Design Flow
  • User Interface
  • Compilation to Hardware
  • High Level Tools
  • Experiments/Results
  • Conclusions

27
Click
  • Click is a language for creating modular software
    routers
  • CLIFF is a tool that will map to FPGAs
      • Using XML instead
  • Create a base system
      • Each element is a thread
      • Each thread connects to one port of a DPMem
      • Each thread can have state storage through
        SharedMemory memory element
  • Series of optimizations
      • Some pre-base system, some post-base system

28
Click (contd)
[Figure: Click to XML flow. Labels: Click graph (.clk files), sub-graph match and replace, split paths, move elements, create base system, run elements in parallel, merge elements, library of elements (XML), system.xml]

29
Teja
  • Teja is a development environment for NPUs
  • SW Lib: define constructs
      • Events, Data Structures, Components (state
        machine)
  • SW Arch: instantiate constructs
  • HW Arch: define the hardware resources
      • Import for fixed, pre-defined targets (like
        NPUs)
      • Create new one for FPGA target
  • HW Mapping
      • Map constructs from SW Arch to resources in
        HW Arch

30
Teja (contd)
[Figure: Teja software flow. Labels: data structure library (XML), state machine GUI (C code), thread library (XML), compile, software architecture GUI, software architecture file (internal format). Continues on the next slide.]

31
Teja (contd)
[Figure: Teja hardware flow, continued from the previous slide. Labels: Thread, DPMem, Aurora, etc.; hardware architecture GUI; hardware architecture file (internal format); hardware mapping GUI; map; insert library code; System.xml]

32
Outline of Talk
  • Background
  • Design Flow
  • User Interface
  • Compilation to Hardware
  • High Level Tools
  • Experiments/Results
  • Conclusions

33
Gigabit Ethernet to Aurora Bridge
  • Two flows that will convert a frame from one
    protocol to the other
  • Ethernet
      • Broadcast protocol (needs addressing)
      • Coarse grain flow control
  • Aurora
      • Xilinx proprietary protocol for point to point
        communication over multi-gigabit transceivers
      • Fine grain flow control

34
Bridge Architecture
[Figure: bridge architecture. Aurora RX → Aurora RX thread → Put16Get8 memory → GMAC TX thread → GMAC TX; GMAC RX → GMAC RX thread → Put8Get16 memory → Aurora TX thread → Aurora TX]

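The memory names suggest a width-converting queue: the Aurora side puts 16-bit words and the GMAC side gets 8-bit bytes (and the reverse for Put8Get16). The following behavioral sketch models the Put16Get8 direction. The high-byte-first ordering and the method names are assumptions, not taken from the slides.

```python
from collections import deque


class Put16Get8:
    """Queue written 16 bits at a time and read 8 bits at a time."""

    def __init__(self):
        self._bytes = deque()

    def put16(self, word):
        if not 0 <= word <= 0xFFFF:
            raise ValueError("word must fit in 16 bits")
        self._bytes.append(word >> 8)      # assumed order: high byte first
        self._bytes.append(word & 0xFF)

    def get8(self):
        return self._bytes.popleft()       # raises IndexError when empty

    def __len__(self):
        return len(self._bytes)


if __name__ == "__main__":
    mem = Put16Get8()
    mem.put16(0xABCD)
    print(hex(mem.get8()), hex(mem.get8()))
```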
35
Bridge Test Setup
36
Bridge Results
  • Compared result to VHDL code from XAPP777
  • Latency: time from last bit received to first
    bit sent

37
Remote Procedure Call
  • Mechanism to invoke a procedure on a remote
    computer
      • Used in NFS
      • Almost exclusive to workstations
  • Message with the parameters to the function as
    well as information about the function being
    called
  • Implement an RPC server with the functions
    add(x,y) and mult(x,y)
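As a rough picture of such a message, the sketch below encodes and serves a call in the ONC RPC format (the RPC used by NFS), assuming that is the variant implemented here. The fixed fields follow the ONC RPC call layout with empty AUTH_NONE credentials. The program number, the procedure numbers for add and mult, and the function names are hypothetical.

```python
import struct

CALL = 0                 # msg_type for a call
RPC_VERSION = 2          # ONC RPC protocol version
AUTH_NONE = 0            # null authentication flavor
PROG = 0x20000001        # hypothetical program number (user-defined range)
PROC_ADD, PROC_MULT = 1, 2   # hypothetical procedure numbers


def encode_call(xid, prog, vers, proc, x, y):
    """Build a call message: header, empty cred/verf, two int arguments."""
    header = struct.pack("!6I", xid, CALL, RPC_VERSION, prog, vers, proc)
    cred = struct.pack("!2I", AUTH_NONE, 0)   # flavor, zero-length body
    verf = struct.pack("!2I", AUTH_NONE, 0)
    args = struct.pack("!2i", x, y)           # XDR signed 32-bit ints
    return header + cred + verf + args


def decode_args(msg):
    """Extract (proc, x, y); assumes empty AUTH_NONE cred and verf."""
    (proc,) = struct.unpack_from("!I", msg, 20)
    x, y = struct.unpack_from("!2i", msg, 40)
    return proc, x, y


def serve(msg):
    """Dispatch the call to add or mult and return the result."""
    proc, x, y = decode_args(msg)
    if proc == PROC_ADD:
        return x + y
    if proc == PROC_MULT:
        return x * y
    raise ValueError("unknown procedure")
```

With fixed-size fields like these, each protocol layer only needs to look at a few known offsets, which is part of what makes the call a reasonable fit for a chain of hardware threads.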

38
RPC Architecture
[Figure: RPC architecture. Labels: GMAC (RX, TX), RX thread, TX thread, broadcast thread, ETH thread, IP thread, UDP thread, RPC thread, ADD, MULT, Put/Get memories]

39
RPC Test Setup
[Figure: two test configurations: workstation to workstation, and workstation to FPGA]

40
FPGA vs Workstation
  • Perform several RPC calls to each from client
    workstation
  • Each server system connected directly to the
    client through an optical gigabit Ethernet cable

41
Click Based Applications
[Figure: Click graphs for the IP Router (2-port version shown, 16-port version not shown) and NAT. Element labels: From Device, IPFilter, Drop, IPaddr rewriter, queue, To Device]

42
Click Results
43
Outline of Talk
  • Background
  • Design Flow
  • User Interface
  • Compilation to Hardware
  • High Level Tools
  • Experiments/Results
  • Conclusions

44
Conclusions
  • Presented a programming model for mapping
    networking applications to FPGAs
  • An API of abstractions (user interface)
  • Generate VHDL from the description (compiler)
  • Summary
  • Domain specific languages as a target design
    entry
  • FPGAs as a target for implementation
  • Platform based on threads and flexible memory
    architecture
  • MIR as a starting framework
  • Demonstrate efficient mappings/designs through
    four application examples