Reconfigurable Computing: HPC Network Aspects
1
Reconfigurable Computing: HPC Network Aspects
Craig Ulmer (8963), cdulmer@sandia.gov
Pete Dean R&D Seminar, December 11, 2003
  • Mitch Sukalski (8961)
  • David Thompson (8963)

2
FPGAs are promising
  • But what's the catch?
  • Three main challenges must be addressed before
    FPGAs can be applied to practical scientific
    computing.

3
RC Challenge 1: Floating Point
  • Most FPGAs are fine-grained
  • Floating-point units are large
  • A 32b FP unit occupies roughly 1,000 CLBs
  • Commercial capacity is improving (see the
    capacity sketch below)
  • 2000: 6,000 CLBs
  • 2003: 40,000 CLBs (max 220,000)
  • Keith Underwood at Sandia/NM
  • LDRD: working on high-speed 64b floating-point
    cores

32b FP in Xilinx V2P7
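
To put these capacity figures in perspective, here is a minimal back-of-the-envelope sketch in C, using only the CLB counts quoted on this slide; real utilization would be lower once routing and control logic are accounted for.

    /* Back-of-the-envelope: how many 32b FP units fit per device?
     * Uses only the figures quoted on this slide. */
    #include <stdio.h>

    int main(void) {
        const int clbs_per_fp32 = 1000;  /* ~1,000 CLBs per 32b FP unit */
        const int device_clbs[] = { 6000, 40000, 220000 };  /* 2000, 2003, max */
        for (int i = 0; i < 3; i++)
            printf("%6d CLBs -> ~%d FP32 units\n",
                   device_clbs[i], device_clbs[i] / clbs_per_fp32);
        return 0;
    }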
4
RC Challenge 2: Design Tools
  • Hardware design is non-trivial
  • Computations are micromanaged clock by clock
  • Not appropriate for most scientists
  • Need languages and APIs that are easy to use
  • Maya Gokhale at LANL
  • Streams-C: a C-like language for HW design (see
    the sketch below)
  • Pipelines/unrolls loops
  • Schedules access to external memory
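
For flavor, here is a rough C sketch of the stream-oriented style such tools encourage. The pragma comments are hypothetical placeholders, not actual Streams-C syntax; the point is that the programmer writes a loop and the compiler pipelines it into hardware.

    /* Illustrative only -- hypothetical pragmas, not real Streams-C
     * syntax.  The computation is a loop over a data stream that the
     * compiler can pipeline/unroll into hardware. */
    void scale_stream(const float *in, float *out, int n, float k) {
        /* #pragma pipeline   -- one result per clock once filled */
        /* #pragma unroll 4   -- hypothetical: four parallel datapaths */
        for (int i = 0; i < n; i++)
            out[i] = k * in[i];  /* compiler schedules the memory accesses */
    }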

5
RC Challenge 3: High-speed I/O
  • FPGAs have large internal computational power
  • How do we get data into/out of FPGA?
  • How do we connect to our existing HPC machines?
  • Mitch Sukalski, David Thompson, Craig Ulmer
  • LDRD: connect FPGAs to high-performance SANs

[Figure: two FPGAs with an unresolved interconnect between them]
6
Outline
  • Where we have been
  • Networking FPGAs using external NI cards
  • Where we are going
  • Networking FPGAs using internal transceivers
  • Project status
  • Early details

7
Previous Work
  • Where we've been…

8
Networking Earlier FPGAs
  • Previous generations of FPGAs were like blank
    ASICs
  • Configurable logic and pins
  • Attach a network card to an FPGA card
  • Communication over PCI
  • Examples
  • Virginia Tech: Myrinet
  • Washington U. in St. Louis: ATM (inline)
  • Clemson University: Gigabit Ethernet
  • Georgia Tech: Myrinet

9
GRIM Project at Georgia Tech
  • Add multimedia devices to cluster
  • Message layer connects CPUs, memory, and
    peripherals (a descriptor sketch follows below)
  • Myrinet between hosts, PCI within hosts
  • Celoxica RC-1000 FPGA
  • Virtex FPGA (1M logic gates)
  • Four SRAM banks
  • PCI w/ PMC
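
As a rough illustration of the message-layer idea, here is a minimal C sketch of a descriptor that lets any endpoint address any other. The type and field names are hypothetical, not GRIM's actual wire format.

    /* Minimal sketch of a message-layer descriptor; names are
     * hypothetical, not GRIM's actual format. */
    #include <stdint.h>

    typedef enum { EP_CPU, EP_MEMORY, EP_FPGA, EP_PERIPHERAL } endpoint_t;

    typedef struct {
        endpoint_t src, dst;   /* any endpoint can talk to any other */
        uint32_t   dst_addr;   /* e.g., offset into an FPGA SRAM bank */
        uint32_t   length;     /* payload bytes */
        uint8_t    payload[];  /* carried over Myrinet or PCI */
    } grim_msg_t;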

10
FPGA Organization

[Diagram: the FPGA is split into a fixed Frame and a reconfigurable Circuit Canvas, connected to the FPGA card memory]
11
Lessons Learned
  • Frame provides simple OS
  • Isolates users from board
  • Portability
  • Dynamically manage resources (a paging sketch
    follows below)
  • Card memory
  • Computational circuits
  • PCI bottleneck
  • Distance between NI and FPGA
  • PCI difficult to work with

[Diagram: host CPU and NIC exchange pages (A, B, C) with the FPGA's SRAM banks over PCI]
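
A minimal sketch of Frame-style page management for card memory, assuming a hypothetical page granularity; the structure and names are invented for illustration, not the Frame's actual implementation.

    /* Hypothetical page allocator for FPGA card memory. */
    #include <stdint.h>
    #include <stddef.h>

    #define NUM_BANKS      4   /* the RC-1000 card has four SRAM banks */
    #define PAGES_PER_BANK 8   /* hypothetical page granularity */

    typedef struct {
        int      in_use;       /* 0 when the page is free */
        int      owner;        /* which circuit holds it */
        uint32_t bank, index;  /* physical location on the card */
    } page_t;

    static page_t pages[NUM_BANKS * PAGES_PER_BANK];

    /* Hand a free card-memory page to a user circuit, or return NULL. */
    page_t *alloc_page(int owner) {
        for (size_t i = 0; i < sizeof pages / sizeof pages[0]; i++) {
            if (!pages[i].in_use) {
                pages[i].in_use = 1;
                pages[i].owner  = owner;
                pages[i].bank   = (uint32_t)(i / PAGES_PER_BANK);
                pages[i].index  = (uint32_t)(i % PAGES_PER_BANK);
                return &pages[i];
            }
        }
        return NULL;           /* all pages in use */
    }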
12
Network Features of Recent FPGAs
  • Where we're going…

13
FPGA Network Improvements
  • Recent FPGAs have special, built-in cores
  • High-speed transceivers, dedicated processors
  • Idea: build our NI inside the FPGA
  • FPGA becomes a networked compute resource
  • Removes the PCI bottleneck

14
Xilinx Virtex-II Pro FPGA
  • Up to 4 PowerPC405 cores
  • Embedded version of PPC
  • 300-400MHz
  • Multiple gigabit transceivers
  • Run at 600Mbps to 3.125Gbps
  • Up to twenty-four transceivers
  • Additional cores
  • Distributed internal memory
  • Arrays of 18b multipliers
  • Digital clock managers (DCMs), PLLs

Xilinx V2P20
15
Multi-Gigabit Transceivers: RocketIO
  • Flexible, high-speed transceivers
  • Can be configured to connect with different
    physical layers
  • InfiniBand, GigE, FC, 10GigE, Aurora
  • Note: low-level interface (commas, disparity,
    clock mismatches); see the comma-detection
    sketch below
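
To illustrate one of these low-level details, here is a minimal C sketch of comma detection: in 8b/10b coding, the K28.5 control code contains a unique 7-bit comma pattern (0011111 or its complement 1100000) used to locate 10-bit symbol boundaries in the serial stream. The bit ordering is simplified and the function name is hypothetical.

    /* Scan a 20-bit window for the 8b/10b comma pattern.  Returns the
     * bit offset (0..9) of a symbol boundary, or -1 if none is found. */
    #include <stdint.h>

    int find_comma(uint32_t window) {
        const uint32_t mask = 0x7F;            /* 7-bit pattern */
        for (int off = 0; off < 10; off++) {
            uint32_t bits = (window >> off) & mask;
            if (bits == 0x1F || bits == 0x60)  /* 0011111 or 1100000 */
                return off;                    /* symbol boundary found */
        }
        return -1;
    }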

16
Why MGTs are Important
  • Direct connection to networks
  • Same chip, different network
  • Remove PCI from equation
  • Fast connections between FPGAs
  • Reduces analog design issues
  • Chain FPGAs together
  • Reduce pin count
  • Update: Virtex-II Pro X
  • Now 2.488 Gbps to 10.3125 Gbps
  • Chips have either 8 or 20 transceivers

3.125 Gbps over 44 inches of FR4
From Xilinx, http://www.xilinx.com/products/virtex2pro/mgtcharacter.htm
17
Hard PowerPC Core
  • PowerPC 405
  • 16KB Instruction / 16KB Data caches
  • Real and Virtual memory modes
  • GCC is available
  • Multiple memory ports for core
  • On-chip memory (OCM)
  • Processor Local Bus (PLB)
  • User-defined memory map (a minimal access
    sketch follows below)
  • Connect memory blocks or cores
  • External memory cores available
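
A minimal sketch, assuming a hypothetical core mapped at an arbitrary PLB address, of how C code on the PPC405 might drive a user-defined memory-mapped register block; the address and register layout are invented for illustration.

    /* Hypothetical memory-mapped core on the PLB; address and register
     * layout are invented for illustration. */
    #include <stdint.h>

    #define MY_CORE_BASE  0x80000000u      /* hypothetical PLB address */

    typedef struct {
        volatile uint32_t control;         /* start/stop the core */
        volatile uint32_t status;          /* poll for completion */
        volatile uint32_t data;            /* FIFO-style data port */
    } my_core_regs_t;

    static my_core_regs_t *const core = (my_core_regs_t *)MY_CORE_BASE;

    void push_word(uint32_t w) {
        while (core->status & 0x1)         /* wait while FIFO is full */
            ;
        core->data = w;                    /* write lands on the PLB */
    }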

[Diagram: PowerPC core with I-cache and D-cache, connected to the Processor Local Bus (PLB) and the On-Chip Memory (OCM) interface]
18
System on a Chip (SoC)
  • Commercial SoC
  • Designing with cores
  • Customize system
  • New tools
  • Rapidly connect cores
  • Library of cores and buses
  • Saves on wiring legwork

Xilinx Platform Studio
19
Current Status
  • Exploring V2P
  • New architecture, new tools
  • Two reference boards
  • ML300 (V2P7-6)
  • Avnet (V2P20-6)
  • Transceiver work
  • Raw transmission over fiber
  • Working towards IB

http://cdulmer.ran.sandia.gov