Title: General-Purpose Computation on Graphics Hardware
1General-Purpose Computation on Graphics Hardware
2Introduction
- David Luebke
- University of Virginia
3Course Introduction
- The GPU on commodity video cards has evolved into an extremely flexible and powerful processor
- Programmability
- Precision
- Power
- This course will address how to harness that power for general-purpose computation
4Motivation: Computational Power
- GPUs are fast
- 3.0 GHz dual-core Pentium4: 24.6 GFLOPS
- NVIDIA GeForceFX 7800: 165 GFLOPS
- 1066 MHz FSB Pentium Extreme Edition: 8.5 GB/s
- ATI Radeon X850 XT Platinum Edition: 37.8 GB/s
- GPUs are getting faster, faster
- CPUs: 1.4× annual growth
- GPUs: 1.7× (pixels) to 2.3× (vertices) annual growth
Courtesy Kurt Akeley, Ian Buck & Tim Purcell, GPU Gems (see course notes)
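The numbers on this slide can be put side by side with a little arithmetic. The sketch below computes the GPU-to-CPU ratios from the quoted figures and compounds the quoted annual growth rates. The five-year horizon and all names in the code are illustrative assumptions, not part of the course material.

```python
# Figures quoted on the slide (2005-era hardware).
CPU_GFLOPS = 24.6   # 3.0 GHz dual-core Pentium 4
GPU_GFLOPS = 165.0  # NVIDIA GPU quoted on the slide
CPU_BW_GBS = 8.5    # 1066 MHz FSB Pentium Extreme Edition
GPU_BW_GBS = 37.8   # ATI Radeon X850 XT Platinum Edition


def compound(rate, years):
    """Relative performance after `years` of growth at `rate` per year."""
    return rate ** years


flops_ratio = GPU_GFLOPS / CPU_GFLOPS
bandwidth_ratio = GPU_BW_GBS / CPU_BW_GBS

YEARS = 5  # hypothetical horizon, chosen only for illustration
cpu_growth = compound(1.4, YEARS)
gpu_pixel_growth = compound(1.7, YEARS)
gpu_vertex_growth = compound(2.3, YEARS)

print(f"Arithmetic throughput, GPU/CPU: {flops_ratio:.1f}x")
print(f"Memory bandwidth, GPU/CPU:      {bandwidth_ratio:.1f}x")
print(f"After {YEARS} years: CPU {cpu_growth:.1f}x, "
      f"GPU pixels {gpu_pixel_growth:.1f}x, "
      f"GPU vertices {gpu_vertex_growth:.1f}x")
```

Because the rates compound, a modest difference in annual growth turns into a large gap within a few years, which is the point of "getting faster, faster".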
5Motivation: Computational Power
Courtesy Ian Buck, John Owens
6An Aside: Computational Power
- Why are GPUs getting faster so fast?
- Arithmetic intensity: the specialized nature of GPUs makes it easier to use additional transistors for computation, not cache
- Economics: the multi-billion dollar video game market is a pressure cooker that drives innovation
7Motivation: Flexible and Precise
- Modern GPUs are deeply programmable
- Programmable pixel, vertex, video engines
- Solidifying high-level language support
- Modern GPUs support high precision
- 32-bit floating point throughout the pipeline
- High enough for many (not all) applications
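One way to see why 32-bit floating point is enough for many but not all applications is to look at where it runs out of integer precision. The sketch below emulates single precision on the CPU with Python's `struct` module. The helper name `to_float32` is an assumption made for illustration and says nothing about how any particular GPU rounds.

```python
import struct


def to_float32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]


# A 32-bit float has a 24-bit significand, so integers are exact only
# up to 2**24. Beyond that, adding 1 can be lost to rounding.
limit = 2.0 ** 24
below = to_float32(limit - 1.0)        # still exactly representable
stalled = to_float32(limit + 1.0)      # rounds back down to 2**24

print(below == limit - 1.0)   # True: exact below the limit
print(stalled == limit)       # True: the increment was lost
```

A counter, an array index, or a long running sum held in single precision stops being exact past this point, which is harmless for most shading work but matters for some numerical codes.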
8Motivation: The Potential of GPGPU
- In short:
- The power and flexibility of GPUs make them an attractive platform for general-purpose computation
- Example applications range from in-game physics simulation to conventional computational science
- Goal: make the inexpensive power of the GPU available to developers as a sort of computational coprocessor
9The Problem: Difficult To Use
- GPUs are designed for and driven by video games
- Programming model is unusual
- Programming idioms are tied to computer graphics
- Programming environment is tightly constrained
- Underlying architectures are:
- Inherently parallel
- Rapidly evolving (even in basic feature set!)
- Largely secret
- Can't simply port CPU code!
10Course Goals
- A detailed introduction to general-purpose computing on graphics hardware
- We emphasize:
- Core computational building blocks
- Strategies and tools for programming GPUs
- Tips & tricks, perils & pitfalls of GPU programming
- Case studies to bring it all together
11Why a SIGGRAPH Course?
- Why SIGGRAPH, not (say) Supercomputing?
- Many graphics applications can benefit from GPGPU
- Hot topic examples: shadows, level sets, fluids
- Keeping computation on-card!
- Many graphics applications strive for visual plausibility rather than rigorous scientific realism
- Better tolerate GPU limitations in precision, memory
- Well suited as GPGPU early adopters
- GPGPU programming still requires the expertise of the SIGGRAPH audience
12Course Prerequisites
- We assume:
- Familiarity with interactive graphics and computer graphics hardware
- Ideally, some experience programming vertex and pixel shaders
- Target audience:
- Researchers interested in GPGPU
- Graphics and games developers interested in incorporating these techniques into their work
- Attendees wishing a survey of this exciting field
13Course Topics
- GPU building blocks
- Languages and tools
- Effective GPU programming
- GPGPU case studies
14Course Topics: Details
- GPU building blocks
- Linear algebra
- Sorting and searching
- Geometric Computing
- Languages and tools
- High-level languages
- Debugging tools
15Course Topics: Details
- Effective GPU programming
- Efficient data-parallel programming
- GPU memory resources & data layout approaches
- GPU computation strategies & tricks
- Data structures
- Case studies in GPGPU programming
- Databases and data mining operations on GPUs
- Particles & grids on GPUs
- Adaptive shadow maps & octree textures on GPUs
16Speakers
- In order of appearance:
- David Luebke, University of Virginia
- Mark Harris, NVIDIA
- Jens Krüger, TU-Munich
- Tim Purcell, NVIDIA
- Naga Govindaraju, University of North Carolina
- Ian Buck, NVIDIA
- Cliff Woolley, University of Virginia
- Aaron Lefohn, University of California, Davis
17Schedule
- 8:30 Introduction (Luebke)
- Welcome, overview, the graphics pipeline
- GPU Building Blocks
- 8:50 Computational concepts: CPU→GPU (Harris)
- Streaming, resources, CPU-GPU analogies, branching
- 9:15 Linear algebra (Krüger)
- Representations, operations, example algorithms
- 9:50 Sorting & Searching (Purcell)
- Bitonic sort, Binary k-nearest neighbor search
- 10:15 Break
18Schedule
- 10:30 Geometric computation (Govindaraju)
- Visibility, collision & proximity, reliable computation
- Languages and Tools
- 11:00 High-level languages (Buck)
- Cg/HLSL/GLslang, Sh, Brook
- 11:30 Debugging tools (Purcell)
- imdebug, DirectX/OpenGL shader IDEs, ShadeSmith
19Schedule
- Effective GPGPU Programming
- 11:50 GPU program optimization (Woolley)
- Computational frequency, profiling, load balancing
- 12:15 Lunch break
- 1:45 GPU memory models (Lefohn)
- Memory objects, layout of data structures, FBOs
- 2:15 GPU computation strategies & tricks (Buck)
- Precision, performance, scatter, branching
- 2:55 GPU data structures (Lefohn)
- High-level data structures
- 3:30 Break
20Schedule
- Case Studies
- 3:45 Databases & data mining on GPUs (Govindaraju)
- Queries, aggregation, mining frequencies & quantiles
- 4:15 Geometry processing on GPUs (Krüger)
- Particles, grids, PBO/VBO vs. FBO vs. VTF/SM3.0
- 4:45 Applications of adaptive data structures (Lefohn)
- Adaptive shadow maps, octree textures
- Conclusion
- 5:15 Question-and-answer session (All)
- 5:30 Wrap!
21GPU Fundamentals: The Graphics Pipeline
[Diagram: Application (CPU) → Vertices (3D) → Transform & Light → Xformed, Lit Vertices (2D) → Assemble Primitives → Screenspace triangles (2D) → Rasterize → Fragments (pre-pixels) → Shade → Final Pixels (Color, Depth). All stages after Application run on the GPU. Also shown: Graphics State, Video Memory (Textures), Render-to-texture.]
- A simplified graphics pipeline
- Note that pipe widths vary
- Many caches, FIFOs, and so on not shown
22GPU Fundamentals: The Modern Graphics Pipeline
[Diagram: Application (CPU) → Vertices (3D) → Vertex Processor (Transform) → Xformed, Lit Vertices (2D) → Assemble Primitives → Screenspace triangles (2D) → Rasterize → Fragments (pre-pixels) → Fragment Processor (Shade) → Final Pixels (Color, Depth). All stages after Application run on the GPU. Also shown: Graphics State, Video Memory (Textures), Render-to-texture.]
- Programmable vertex processor!
- Programmable pixel processor!
23The "Coming Soon" Graphics Pipeline
[Diagram: Application (CPU) → Vertices (3D) → Vertex Processor → Xformed, Lit Vertices (2D) → Geometry Processor (Assemble Primitives) → Screenspace triangles (2D) → Rasterize → Fragments (pre-pixels) → Fragment Processor → Final Pixels (Color, Depth). All stages after Application run on the GPU. Also shown: Graphics State, Video Memory (Textures), Render-to-texture.]
- Programmable primitive assembly!
- More flexible memory access!
24GPU Pipeline: Transform
- Vertex processor (multiple in parallel)
- Transform from world space to image space
- Compute per-vertex lighting
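The transform stage can be sketched on the CPU as a small function applied independently to every vertex. The code below is a minimal illustration assuming an OpenGL-style perspective matrix, a simple viewport mapping, and Lambertian (diffuse) lighting with unit-length vectors. The function names and parameters are hypothetical and are not taken from the course.

```python
import math


def mat_vec(m, v):
    """Multiply a 4x4 matrix (list of rows) by a 4-component vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]


def perspective(fov_y_deg, aspect, near, far):
    """OpenGL-style perspective projection matrix (an assumed convention)."""
    f = 1.0 / math.tan(math.radians(fov_y_deg) / 2.0)
    return [
        [f / aspect, 0.0, 0.0, 0.0],
        [0.0, f, 0.0, 0.0],
        [0.0, 0.0, (far + near) / (near - far), 2.0 * far * near / (near - far)],
        [0.0, 0.0, -1.0, 0.0],
    ]


def transform_vertex(matrix, position, width, height):
    """Project a 3D position to pixel coordinates plus a depth value."""
    x, y, z, w = mat_vec(matrix, list(position) + [1.0])
    ndc_x, ndc_y, ndc_z = x / w, y / w, z / w      # perspective divide
    screen_x = (ndc_x + 1.0) * 0.5 * width         # viewport mapping
    screen_y = (1.0 - ndc_y) * 0.5 * height        # flip y: row 0 is the top
    return screen_x, screen_y, ndc_z


def diffuse_intensity(normal, light_dir):
    """Lambertian term; both vectors are assumed to be unit length."""
    dot = sum(n * l for n, l in zip(normal, light_dir))
    return max(0.0, dot)


projection = perspective(90.0, 1.0, 1.0, 10.0)
sx, sy, depth = transform_vertex(projection, (0.0, 0.0, -5.0), 640, 480)
lit = diffuse_intensity((0.0, 0.0, 1.0), (0.0, 0.0, 1.0))
print(sx, sy, lit)
```

Each vertex is processed without reference to any other, which is why the hardware can run multiple vertex processors in parallel.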
25GPU Pipeline: Rasterize
- Rasterizer
- Convert geometric rep. (vertex) to image rep. (fragment)
- Fragment = image fragment
- Pixel + associated data: color, depth, stencil, etc.
- Interpolate per-vertex quantities across pixels
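Interpolating per-vertex quantities across pixels can be illustrated with barycentric weights. The sketch below is a brute-force software rasterizer for one 2D triangle that tests every pixel centre and interpolates a single scalar attribute. Real rasterizers are far more elaborate, and the names used here are assumptions for illustration only.

```python
def edge(a, b, p):
    """Twice the signed area of the triangle (a, b, p)."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def rasterize(v0, v1, v2, attrs, width, height):
    """Return (x, y, value) fragments for pixel centres inside the triangle.

    v0, v1, v2 are 2D screen positions; attrs holds one scalar per vertex.
    """
    area = edge(v0, v1, v2)
    if area == 0:
        return []  # degenerate triangle covers no pixels
    fragments = []
    for y in range(height):
        for x in range(width):
            p = (x + 0.5, y + 0.5)  # sample at the pixel centre
            # Barycentric weights; dividing by the signed area makes the
            # inside test independent of the triangle's winding order.
            w0 = edge(v1, v2, p) / area
            w1 = edge(v2, v0, p) / area
            w2 = edge(v0, v1, p) / area
            if w0 >= 0 and w1 >= 0 and w2 >= 0:
                value = w0 * attrs[0] + w1 * attrs[1] + w2 * attrs[2]
                fragments.append((x, y, value))
    return fragments


frags = rasterize((0, 0), (4, 0), (0, 4), (0.0, 1.0, 0.0), 4, 4)
print(len(frags), frags[0])
```

Each fragment carries its pixel position plus the interpolated data, matching the "pixel + associated data" description above.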
26GPU Pipeline: Shade
- Fragment processors (multiple in parallel)
- Compute a color for each pixel
- Optionally read colors from textures (images)
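The shade stage can be pictured as one small function run once per fragment, optionally reading from a texture. The sketch below assumes nearest-neighbour texture lookup, texture coordinates in [0, 1], and a per-fragment lighting intensity. The function names and the fragment layout are hypothetical and chosen only for illustration.

```python
def sample_nearest(texture, u, v):
    """Nearest-neighbour lookup into a texture stored as rows of (r, g, b).

    u and v are assumed to lie in [0, 1].
    """
    height = len(texture)
    width = len(texture[0])
    x = min(int(u * width), width - 1)
    y = min(int(v * height), height - 1)
    return texture[y][x]


def fragment_program(uv, intensity, texture):
    """Compute one output color: texture color scaled by a lighting term."""
    r, g, b = sample_nearest(texture, uv[0], uv[1])
    return (r * intensity, g * intensity, b * intensity)


def shade(fragments, texture):
    """Apply the same program to every fragment independently."""
    return [fragment_program(uv, intensity, texture)
            for uv, intensity in fragments]


WHITE, BLACK = (1.0, 1.0, 1.0), (0.0, 0.0, 0.0)
checker = [[WHITE, BLACK],
           [BLACK, WHITE]]
colors = shade([((0.25, 0.25), 0.5), ((0.75, 0.25), 0.5)], checker)
print(colors)
```

No fragment depends on the result of another, so the same program can run on multiple fragment processors in parallel, which is the property general-purpose GPU computation relies on.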
27Coming Up
- Next: Mapping computational concepts to the GPU
- Also coming up:
- Core building blocks for GPGPU computing
- Memory layout, data structures, and algorithms
- Detailed advice on writing high performance GPGPU code
- Lots of examples
28Course Evaluation Form
- Please help us improve the GPGPU Course
- Fill out the SIGGRAPH evaluation form
- http://www.siggraph.org/cgi-bin/cgi/exCoursesEval.html
- Choose Course 39: GPGPU