Reconfigurable Computing: FPGAs for Ultrascale Science (Sandia National Laboratories)


1
Reconfigurable Computing: FPGAs for Ultrascale Science
Sandia National Laboratories
Craig Ulmer SNL/CA cdulmer_at_sandia.gov
SOS-8 Workshop April 14, 2004
  • Keith Underwood SNL/NM

2
Motivation: CPU Efficiency Trend
While CPU performance has been increasing,
processing efficiency has been decreasing.
Efficiency = MFLOPS/MHz/Mtransistors
[Chart: efficiency by processor]
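As a rough illustration of the efficiency metric on this slide, the sketch below divides MFLOPS by clock rate in MHz and then by transistor count in millions. The function name and both sets of processor numbers are hypothetical and are not taken from the chart.

```python
def efficiency(mflops, clock_mhz, transistors_millions):
    """Return MFLOPS per MHz per million transistors."""
    return mflops / clock_mhz / transistors_millions

# Hypothetical processors (illustrative numbers only, not from the slide).
older = efficiency(mflops=100.0, clock_mhz=100.0, transistors_millions=2.0)
newer = efficiency(mflops=4000.0, clock_mhz=2000.0, transistors_millions=50.0)

# The newer part is 40x faster in raw MFLOPS but scores lower on this metric.
print(older, newer)
```

With these made-up inputs, raw performance rises while the metric falls, which is the trend the slide describes.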

3
Looking Ahead
  • For commodity clusters, should we be nervous?
  • Significant increases in technology effort
  • Diminishing returns
  • Should we depend on CPU manufacturers for HPC?
  • Sandia has many HPC interests
  • Investigate computing alternatives and
    accelerators
  • FPGAs: Modern Reconfigurable Computing

4
Outline
  • Reconfigurable computing
  • Use FPGAs to accelerate computations
  • Strategy and examples
  • Approaches to scientific computing
  • Challenges for ultrascale science
  • Double-precision floating-point performance
  • System integration and network aspects

5
Reconfigurable Computing: Background
  • Soft Hardware

6
Computing Spectrum

7
Reconfigurable Hardware Devices
Devices that can be programmed to emulate
hardware circuitry
  • Tile architecture
  • Logic blocks (LBs)
  • Routing elements
  • Field-Programmable Gate Arrays
  • Fine granularity
  • LBs are bit-level operators
  • Commercial trend
  • Coarse granularity
  • LBs are ALUs, FPUs
  • QuickSilver, Pact XPP, ClearSpeed
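To make "LBs are bit-level operators" concrete, here is a minimal software model of a fine-grained logic block as a lookup table (LUT). The 4-input width and the names make_lut and lut_eval are assumptions for illustration; the slide does not specify how the logic blocks are built.

```python
def make_lut(func, n_inputs=4):
    """Precompute the truth table of a boolean function of n_inputs bits."""
    table = []
    for index in range(2 ** n_inputs):
        # Bit i of the table index is the value of input i.
        bits = [(index >> i) & 1 for i in range(n_inputs)]
        table.append(func(*bits) & 1)
    return table

def lut_eval(table, *bits):
    """Look up the output for the given input bits."""
    index = sum(bit << i for i, bit in enumerate(bits))
    return table[index]

# "Program" two blocks as a full adder; the fourth input is unused.
sum_lut = make_lut(lambda a, b, cin, unused: a ^ b ^ cin)
carry_lut = make_lut(lambda a, b, cin, unused: (a & b) | (cin & (a ^ b)))
```

Reprogramming the device then amounts to loading different truth tables and routing, rather than changing the silicon.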

8
Common Acceleration Techniques
Key: Designing in Hardware
  • Processing concurrency
  • Hardware pipelines
  • Custom memory interactions
  • Partial evaluation
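Partial evaluation can be sketched in software as specializing a general operation for a known constant. The example below, with the hypothetical helper specialize_multiplier, turns a multiply by a fixed non-negative constant into shifts and adds, which is the kind of simplification a hardware design can bake into the circuit. It is an illustration of the idea, not the method used in the talk.

```python
def specialize_multiplier(constant):
    """Return (multiply, shifts) where multiply(x) == x * constant.

    The constant must be a non-negative integer. Only the set bits of
    the constant contribute a shifted copy of x.
    """
    shifts = [i for i in range(constant.bit_length()) if (constant >> i) & 1]

    def multiply(x):
        # One adder input per set bit; no general multiplier is needed.
        return sum(x << s for s in shifts)

    return multiply, shifts

times_10, shifts = specialize_multiplier(10)  # 10 = 0b1010
```

For the constant 10 only two shifted terms remain, so the specialized circuit would need a single adder instead of a full multiplier.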

9
Reconfigurable Computing for Ultrascale
Science: HPC Strategy and Examples
  • Enhancing HPC Performance

10
HPC Strategy at Sandia for RC
  • RC resources work best as accelerators in HPC
  • Clusters are inexpensive and work well for many
    applications
  • Add RC devices to enhance performance
  • Port key portions of algorithms to RC hardware
  • Focus on hotspots and inner loops
  • Move data to/from FPGAs in pipelined fashion
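The payoff from porting only hotspots can be estimated with Amdahl's law, which the slide does not name but which bounds any accelerator strategy. The function name and the example fractions below are assumptions, and the model ignores data-transfer cost.

```python
def overall_speedup(hotspot_fraction, hotspot_speedup):
    """Amdahl's law: whole-application speedup when only the hotspot is accelerated.

    hotspot_fraction: share of original runtime spent in the ported code (0..1).
    hotspot_speedup: how much faster that code runs on the accelerator.
    """
    remaining = 1.0 - hotspot_fraction
    return 1.0 / (remaining + hotspot_fraction / hotspot_speedup)

# Hypothetical case: inner loop is 90% of runtime and runs 10x faster in hardware.
print(overall_speedup(0.9, 10.0))
```

Even an arbitrarily fast accelerator cannot exceed 1 / (1 - hotspot_fraction), which is why the strategy focuses on inner loops that dominate runtime.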

11
Scientific Computing Examples
  • Pattern recognition
  • ATLAS project at CERN
  • Reduced 2500 CPUs to 120 nodes with FPGAs
  • Visualization
  • Vizard II project at University of Tübingen
  • Direct volume rendering for 512^3 datasets
  • Molecular dynamics (MD)
  • Preliminary work at Los Alamos National
    Laboratory
  • 20 cells in an FPGA yield 5.69 GFLOPS
  • Computational fluid dynamics (CFD) analysis for
    jet engines
  • Smith and Schnore at GE Global Research

12
Challenges
  • Hard to program
  • Hardware design
  • Must be significant parallelism
  • Limited chip capacity
  • Lack of HPC building blocks
  • Our users need DP-FP
  • System integration
  • How do we add to our clusters?

13
Reconfigurable Computing for Ultrascale
Science: Double-Precision Floating-Point Cores
  • Addressing the need for HPC building blocks

14
Double-Precision Floating-Point Cores
  • Floating point has been a historical weakness for
    FPGAs
  • FP cores consume significant amounts of hardware
  • Previous FPGAs lacked capacity
  • Significant improvements in recent commercial
    FPGAs
  • Increased capacity, faster clocks, and better
    building blocks
  • Keith Underwood at SNL/NM
  • Re-evaluating FP performance in FPGAs
  • Constructing high-speed DP-FP cores
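One way to see why DP-FP cores consume so much hardware is to look at the fields of an IEEE 754 double. The sketch below unpacks the sign, 11-bit exponent, and 52-bit fraction; the helper names are hypothetical and this is not a description of the Sandia cores.

```python
import struct

def decompose_double(value):
    """Split an IEEE 754 double into (sign, biased exponent, fraction)."""
    bits = struct.unpack(">Q", struct.pack(">d", value))[0]
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF          # 11 bits, bias 1023
    fraction = bits & ((1 << 52) - 1)        # 52 stored bits
    return sign, exponent, fraction

def significand(exponent, fraction):
    """Return the integer significand; normal numbers gain an implicit leading 1."""
    return fraction | (1 << 52) if exponent != 0 else fraction

sign, exponent, fraction = decompose_double(1.5)
product = significand(exponent, fraction) ** 2   # integer core of 1.5 * 1.5
print(product.bit_length())
```

A double-precision multiplier has to form a product of two 53-bit significands, up to 106 bits wide, before rounding, which is large compared with the bit-level logic blocks of a fine-grained FPGA.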

15
Peak Performance Results
From Underwood, "FPGAs vs. CPUs: Trends in Peak
Floating-Point Performance," in FPGA '04

16
Double-Precision Multiply Performance Trends

17
Reconfigurable Computing for Ultrascale
Science: Networking Aspects
  • Addressing capacity and system integration issues

18
Data Exchange: Multi-Gigabit Transceivers (MGTs)
  • How do we rapidly move data into/out of FPGA?
  • Xilinx Virtex-II/Pro FPGA has MGTs
  • Channel data rate: 3.125 Gbps
  • Up to 24 channels
  • V2/ProX: twenty 10 Gbps channels
  • Configured for different physical layers
  • InfiniBand, FC, GigE, 10GigE
  • S-ATA, PCI-Express, HT
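The channel counts and rates above imply the following aggregate figures. The raw numbers come from the slide; the 0.8 encoding efficiency assumes 8b/10b line coding, which the slide does not state, and the function name is hypothetical.

```python
def aggregate_gbps(channels, gbps_per_channel, encoding_efficiency=1.0):
    """Aggregate transceiver bandwidth in Gbps across all channels."""
    return channels * gbps_per_channel * encoding_efficiency

raw_v2pro = aggregate_gbps(24, 3.125)      # 24 channels at 3.125 Gbps
raw_v2prox = aggregate_gbps(20, 10.0)      # twenty 10 Gbps channels

# Assumption: 8b/10b coding carries 8 payload bits per 10 line bits.
payload_v2pro = aggregate_gbps(24, 3.125, encoding_efficiency=0.8)

print(raw_v2pro, raw_v2prox, payload_v2pro)
```

Under these assumptions the raw totals are 75 Gbps and 200 Gbps, with roughly 60 Gbps of payload for the 24-channel part.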

19
Importance of MGTs
  • Increase Raw Capacity
  • Connect FPGAs together
  • MGTs provide fat pipes
  • Cables, not PCB traces
  • System Integration
  • Connect FPGA to SAN
  • Implement NI in FPGA
  • FPGA is global resource

20
Recent Sandia Work: SNL OpenTOE
  • At Sandia we are interested in connecting FPGAs
    to SANs
  • Main target: InfiniBand
  • Must implement network protocols for reliable
    transfer
  • Initial work: GigE and TCP
  • Implemented GigE core and basic TCP offload engine
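The slides do not say which TCP functions the offload engine handles. As one example of per-packet work that offload engines commonly take over, the sketch below computes the 16-bit ones' complement Internet checksum (RFC 1071) used by TCP. It is an assumption for illustration, not a description of SNL OpenTOE.

```python
def internet_checksum(data: bytes) -> int:
    """Return the 16-bit ones' complement checksum of data (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"                      # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]   # big-endian 16-bit words
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF

payload = bytes.fromhex("0001f203f4f5f6f7")      # example words from RFC 1071
checksum = internet_checksum(payload)
# A receiver recomputes over payload plus checksum and expects zero.
verified = internet_checksum(payload + checksum.to_bytes(2, "big")) == 0
```

The loop is a simple running sum over 16-bit words, so in hardware it can be computed as the data streams through rather than in a separate pass.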

21
Concluding Remarks
  • Improvements in commercial FPGAs make RC
    attractive
  • FPGAs provide better sustained performance than
    CPUs
  • FPGA performance growing faster than Moore's Law
  • Near-term strategy: accelerator-based approach
  • Offload key operations into hardware
  • Sandia National Labs investigating RC for HPC
    acceleration
  • Enabling scientific computing through fast DP FP
    cores
  • Addressing system integration/capacity issues via
    network