Speedups of Inspiral Searches with GPUs

1
Speed-ups of Inspiral Searches with GPUs
Shin Kee Chung, Linqing Wen
2
In Collaboration with
  • Amitava Datta and David Blair (UWA)
  • The Caltech LIGO group, special thanks to Chad,
    Kipp, Drew, Alan, Phil, and Stuart
  • Shin Kee Chung is supported in part by the
    Caltech SURF program

3
Hardware Used
  • CPU timings for the inspiral searches were
    measured on a Dell Inspiron 530 computer with a
    2.5 GHz Intel Core 2 Quad Q9300 CPU and 4 GB of RAM.
  • Several graphics processing unit (GPU) cards were
    tested. Results from the Nvidia GeForce 8800 Ultra
    are presented.

4
Method
  • GPU programs are written in CUDA, a C-like
    language, and use the CUDA FFT library
  • lalapps_inspiral is run with fixed parameters
  • Performance comparisons
      • speed-up of CUDA-FFT vs. CPU-FFT (FFTW)
      • speed-up of lalapps_inspiral with FFTW replaced
        by CUDA-FFT
      • speed-up of the lalapps_inspiral chi-square test
          • use CUDA-FFT, bundling the CUDA-FFT calls
            for one loop
          • implement data parallelism for the two loops
            where chi-square is calculated
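
The slides do not show the chi-square code itself. As a minimal sketch, assuming the standard time-frequency chi-square form with p frequency bands and a unit-normalized template, the two nested loops could look like the following. The function name and the `band_outputs` layout are hypothetical, and plain Python stands in for a CUDA kernel. The point is that each pass of the outer loop reads and writes only its own time sample, which is what allows one GPU thread per sample.

```python
def chisq_per_sample(band_outputs):
    """Chi-square value for every time sample.

    band_outputs[l][j] is the complex filter output of frequency
    band l at time sample j (hypothetical layout, assumed here).
    """
    p = len(band_outputs)        # number of frequency bands
    n = len(band_outputs[0])     # number of time samples
    chisq = [0.0] * n
    for j in range(n):           # outer loop: independent per sample,
                                 # so it maps to one GPU thread each
        total = sum(band_outputs[l][j] for l in range(p))
        expected = total / p     # equal share expected from each band
        acc = 0.0
        for l in range(p):       # inner loop: accumulate band deviations
            acc += abs(band_outputs[l][j] - expected) ** 2
        chisq[j] = p * acc
    return chisq
```

For example, `chisq_per_sample([[1+0j, 2+0j], [1+0j, 0j]])` gives zero for the first sample, where both bands contribute equally, and a positive value for the second, where they do not.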

5
Comparison of CUDA-FFT and FFTW
  • 1 million data points: 5x speed-up
  • 4 million data points: 8x speed-up
  • Consistent with known benchmarks
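
The slides do not say how the timings were taken. One common approach, sketched here under that caveat, is to run the same transform several times on each device, keep the best wall-clock time, and report the ratio. Both helper names are hypothetical. For a fair GPU figure, the timed call should include the host-to-device transfers and wait for the GPU to finish.

```python
import time


def best_time(fn, repeats=5):
    """Smallest wall-clock time in seconds over several calls of fn()."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def speedup(t_cpu, t_gpu):
    """Speed-up factor: how many times faster the GPU run is."""
    return t_cpu / t_gpu
```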
6
Up to 4x Speedup is Achieved for Inspiral Search using CUDA FFT
  • lalapps_inspiral uses a fixed 1-million-point FFT
  • More speed-up is expected for more data points
7
16x Speedup is Achieved with Chi-square Implementations
  • 6 hrs of CPU time can be reduced to 20 mins with a GPU
8
Conclusion and Future Works
  • A 16x speedup can already be achieved with only
    small modules implemented on the GPU.
  • The chi-square calculations still occupy about
    80% of the total execution time of
    lalapps_inspiral for an analysis of about 700
    templates. We therefore aim to further optimize
    the chi-square calculations.
  • Suggestions are welcome
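
Why the remaining effort goes into the chi-square code can be seen from Amdahl's law. Taking the 80% figure quoted above at face value (the slides round it, so the numbers are illustrative only), the hypothetical helpers below give the whole-program gain from speeding up only that part of the run.

```python
def overall_speedup(fraction, factor):
    """Amdahl's law: whole-program speed-up when `fraction` of the
    run time is made `factor` times faster."""
    return 1.0 / ((1.0 - fraction) + fraction / factor)


def speedup_limit(fraction):
    """Upper bound on the speed-up as `factor` grows without limit."""
    return 1.0 / (1.0 - fraction)
```

With a fraction of 0.8, doubling the chi-square speed gives about 1.67x overall, and chi-square optimization alone cannot exceed about 5x.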