TCP-Splitter: A Reconfigurable Hardware Based TCP Flow Monitor
1
TCP-Splitter: A Reconfigurable Hardware Based
TCP Flow Monitor
  • David V. Schuehler
  • dvs1@arl.wustl.edu

2
Outline
  • Motivation
  • Design
  • Results
  • Simulation
  • Execution on FPX

3
MOTIVATION
4
Why work with TCP?
  • Over 85% of internet traffic is TCP-based
  • Internet is growing
  • TCP is a proven reliable transport for data
    delivery
  • Provide high-speed active networks with the
    ability to work with TCP flows

5
Why not implement a full TCP stack in hardware?
  • Complex protocol stack
  • Several interactions on client interface
    (sockets?)
  • Difficult to achieve high performance
  • Large memories required for reassembly
  • Limited number of simultaneous connections

6
Solution
  • Develop a TCP flow monitor: TCP-Splitter
  • Utilize existing hardware infrastructure (FPX)
  • Expand upon Layered Protocol Wrappers

7
DESIGN
8
Goals
  • High Speed Design
  • Small FPGA Footprint
  • Simple Client Interface
  • Support Large Number of Flows

9
Challenges
  • Dealing with dropped frames
  • Packet reordering
  • Maintaining state for large number of flows
  • Developing an efficient implementation
  • Processing data at line rates
  • Minimizing resource requirements

10
Assumptions/Limitations
  • All frames must flow through the switch
  • Frames traversing in the opposite direction are
    handled as a separate flow
  • In-order processing of frames for each flow

11
TCP-Splitter Data Flow
12
Input Processing
  • Flow Classification
  • TCP Checksum Engine
  • Input State Machine
  • Control FIFO
  • Frame FIFO
  • Output State Machine
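The TCP Checksum Engine in this pipeline validates each segment's checksum over the IPv4 pseudo-header plus the TCP segment. A minimal software sketch of that check, for reference only (the helper name and interface are ours, not identifiers from the TCP-Splitter VHDL):

```python
# Sketch of the validation the TCP Checksum Engine performs in hardware.
# Assumes the segment bytes (header + payload, checksum field included)
# and the IPv4 addresses have already been extracted from the frame.
def tcp_checksum_ok(src_ip: bytes, dst_ip: bytes, segment: bytes) -> bool:
    # IPv4 pseudo-header: src addr, dst addr, zero byte, protocol (6), TCP length.
    pseudo = src_ip + dst_ip + bytes([0, 6]) + len(segment).to_bytes(2, "big")
    data = pseudo + segment
    if len(data) % 2:                 # pad odd-length data with a zero byte
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):  # sum 16-bit big-endian words
        total += int.from_bytes(data[i:i + 2], "big")
    while total >> 16:                # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    # A segment whose checksum field is valid sums to 0xFFFF.
    return total == 0xFFFF
```

The hardware engine computes the same one's-complement sum incrementally as 32-bit words stream through, rather than buffering the segment as this sketch does.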

13
Layout
14
Packet Routing Decisions
  • Forward to outbound IP stack only
  • Forward to both Client App and outbound IP stack
  • Discard packet

15
Packet Routing
  • Non-TCP packets → IP stack
  • Invalid TCP checksum → drop
  • TCP SYN packets → IP stack
  • (Seq < Expected Seq) → IP stack
  • (Seq > Expected Seq) → drop
  • Else → client AND IP stack
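The routing table above can be restated as a simple decision function. This is an illustrative sketch only; the names and signature are ours, not identifiers from the TCP-Splitter VHDL:

```python
from enum import Enum

class Route(Enum):
    IP_STACK = "forward to outbound IP stack only"
    CLIENT_AND_IP = "forward to client app and outbound IP stack"
    DROP = "discard packet"

def route(is_tcp: bool, checksum_ok: bool, is_syn: bool,
          seq: int, expected_seq: int) -> Route:
    if not is_tcp:
        return Route.IP_STACK       # non-TCP packets bypass the client
    if not checksum_ok:
        return Route.DROP           # invalid TCP checksum
    if is_syn:
        return Route.IP_STACK       # SYN starts a flow; no payload for client
    if seq < expected_seq:
        return Route.IP_STACK       # retransmission the client already saw
    if seq > expected_seq:
        return Route.DROP           # out-of-order: force the sender to resend
    return Route.CLIENT_AND_IP      # in-order data goes to both
```

Dropping ahead-of-sequence packets (rather than buffering them) is what lets the design avoid large reassembly memories: the sender's own retransmission machinery re-delivers the data in order.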

16
Client Interface
Client Application
  • 1 bit Clock
  • 1 bit Reset
  • 32 bit Data Word
  • 2 bit Data Enable
  • 3 bit Start/End of Data Signals
  • 2 bit Valid Data Bytes
  • N bit Flow Identifier
  • 2 bit Start/End of Flow Signals
  • 1 bit TCA

17
RESULTS
18
Synthesis Results for Xilinx XCV1000E-7
                          TCP-Splitter    Full Wrappers (Cell+Frame+IP+TCP+Client)
  Space/LUTs              617 (2%)        4954 (20%)
  Register bits           503 (2%)        4933 (20%)
  Input processing delay  7 clock cycles  44-68 clock cycles
  • Plus length of packet in 32-bit words

19
Current State of Research
  • Developed and simulated design
  • Handles 256K simultaneous flows
  • Synthesizes at 101 MHz
  • Simple ByteCount test client
  • Executes in hardware with simulated frames

20
Current Limitations
  • Non-passive solution
  • Hash table collisions not handled
  • All packets considered start of new flow
  • No support for IP fragments

21
Future Directions
  • Process real TCP flows
  • Develop more elaborate client applications
  • Improve processing performance
  • Implement frame reassembly
  • Improve flow classifier
  • Add memory management utilities
  • Enhance frame generation utility (Eliot)
  • TCP based FPX programmer (Harvey)

22
SIMULATION
23
To Run Simulation
  • Unzip the ByteCount project files
  • H:\bytecount4ws.zip
  • Extract to D:\
  • Create a Cygwin command window
  • Start -> Engineering -> FPGA Tools -> Cygwin Bash Shell
  • Move to the ByteCount directory
  • cd d:/ByteCount
  • Compile VHDL files (approx. 1 min)
  • make compile
  • Run simulation (approx. 5 min)
  • make sim

24
Project Directory Structure
  ByteCount/
    sim/
      iptestbench/      - C routines to build input cells
      testbench/        - VHDL test harness for simulation
      work/             - compiled VHDL files
    vhdl/
      CCP_MODULE/       - Control Cell Processor
      SRAM_CONTROLLER/  - SRAM interface
      wrappers/         - cell, frame, IP, and TCP wrappers

25
ByteCount Application
26
Memory Usage
  • SRAM0 contains byte count info
  • First 20 memory locations are used
  • Bits 15-0 contain the byte count
  • Bits 33-16 contain the flow ID
  • SRAM1 contains flow classification table
  • Bits 31-0 contain last TCP sequence number
  • Bit 32 indicates active flow

27
Sample Run
[Waveform trace showing the TCP data enable, start of frame, IP payload, end of frame, byte count, SRAM write, and flow ID signals]
28
EXECUTION
29
Execute on FPX
  • Using the NCHARGE web site
  • http://fpx.arl.wustl.edu
  • SWITCH CONFIGURATION
  • Restart All
  • CONFIGURATION MEMORY UPDATES
  • Complete Configuration
  • Filename bytecount.bit
  • VCI UPDATES AND STATUS
  • Write VCI (control)
  • VPI 0, VCI 23, SW->RAD_SW, RAD_SW->RAD_LC
  • Write VCI (data)
  • VPI 0, VCI 32, SW->RAD_SW, RAD_SW->RAD_LC

30
Execute on FPX (continued)
  • CREATE CELLS
  • Custom Payloads Hello DATA
  • Select TCP
  • Fill in TCP parameters
  • Set VCI = 144
  • Create Cell
  • Send cell
  • Set Receive VCI = 154
  • Send Generated Cell(s)

31
Verify Execution
  • Using the NCHARGE web site
  • http://fpx.arl.wustl.edu
  • RAD MEMORY UPDATES
  • Read RAD memory
  • Memory Width: 36 bit, Number: 2
  • Memory Device: 0, Mod_number: 0, Address: 0
  • FlowID = bits 33-16, Count = bits 15-0
  • Convert the FlowID to decimal
  • Use the decimal FlowID as the address on Device 1
  • The read should return the TCP sequence number