Title: Real-Time Computer Graphics


1
Real-Time Computer Graphics
Kun Zhou
State Key Lab of CAD&CG, Zhejiang University
www.kunzhou.net
2
The Graphics Process
Lighting
3D Modeling
Image Storage and Display
Rendering
3D Animation
real-time rendering
3
The Graphics Process
Lighting
3D Modeling
Image Storage and Display
Rendering
3D Animation
real-time computer graphics
GPU
4
GPU: Data-Parallel Computing Device
  • Multiple cores, very high memory bandwidth

GeForce GTX 280: 933 GFLOPS, 141.7 GB/s
[Figure: floating-point operations per second for the CPU and GPU (NVIDIA, 2008)]
5
GPU: Stream Processors
GeForce GTX 280: 30 x 8 = 240 processors
6
Outline
  • Data structures and algorithms
  • Modeling: surface reconstruction
  • Animation: surface deformation
  • Rendering: ray tracing, refraction
  • Programming tools
  • BSGP: bulk-synchronous GPU programming

8
Modeling Surface Reconstruction
  • Parallel Surface Reconstruction
    Technical Report, 2008

A set of 3D points
Triangular mesh
9
Modeling Surface Reconstruction
  • Parallel Surface Reconstruction
    Technical Report, 2008
  • Octrees on GPUs
  • nodes, faces, edges, vertices
  • neighborhood info

[Kazhdan06]
10
Modeling Surface Reconstruction
  • Parallel Surface Reconstruction
    Technical Report, 2008
  • Octrees on GPUs
  • nodes, faces, edges, vertices
  • neighborhood info
  • Bottom-up, breadth-first order
  • Precompute look-up tables to compute neighbors
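
The bottom-up, level-by-level build can be sketched on the CPU as follows. This is a minimal Python sketch, not the report's GPU implementation: it assumes points in the unit cube and uses Morton (bit-interleaved) keys, which the slides do not mention, and the names morton_key and build_levels are hypothetical.

```python
def morton_key(x, y, z, depth):
    """Interleave the top `depth` bits of x, y, z (each in [0, 1)) into one key."""
    n = 1 << depth
    ix, iy, iz = (min(int(c * n), n - 1) for c in (x, y, z))
    key = 0
    for bit in range(depth - 1, -1, -1):
        octant = (((ix >> bit) & 1) << 2) | (((iy >> bit) & 1) << 1) | ((iz >> bit) & 1)
        key = (key << 3) | octant
    return key


def build_levels(points, depth):
    """Return sorted node-key lists, from the leaf level (index 0) up to the root."""
    level = sorted({morton_key(x, y, z, depth) for x, y, z in points})
    levels = [level]
    for _ in range(depth):
        # A parent's key is its child's key with the last 3 bits dropped,
        # so each coarser level is derived from the finer one below it.
        level = sorted({k >> 3 for k in level})
        levels.append(level)
    return levels
```

Because every node at one level is derived independently from the level below, each level is a natural unit of data-parallel work. With keys like these, a neighbor's key can be computed from a node's own key, which is one place where a precomputed look-up table could be used.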

11
Modeling Surface Reconstruction
  • Parallel Surface Reconstruction
    Technical Report, 2008

Our GPU algorithm: 5 FPS for 512K points
CPU algorithm [Kazhdan06]: 42 seconds
12
Modeling Surface Reconstruction
  • Parallel Surface Reconstruction
    Technical Report, 2008

User-guided surface reconstruction
14
Modeling Surface Reconstruction
  • Parallel Surface Reconstruction
    Technical Report, 2008

On-the-fly conversion of dynamic point clouds
15
Animation Surface Deformation
  • Direct Manipulation of Subdivision Surfaces on
    the GPU, ACM TOG (siggraph), 2007

16
Animation Surface Deformation
  • Direct Manipulation of Subdivision Surfaces on
    the GPU, ACM TOG (siggraph), 2007

vertex positions
Laplacian matrix
Laplacian coordinates
positional constraint matrix
constrained positions
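
One common way to combine the five quantities labeled above is a least-squares energy. The form below is a sketch under assumptions, not necessarily the paper's exact formulation: the symbols V (vertex positions), L (Laplacian matrix), \delta (Laplacian coordinates), C (positional constraint matrix) and U (constrained positions) are names chosen here for illustration.

```latex
\min_{V} \; \left\| L V - \delta(V) \right\|^{2} + \left\| C V - U \right\|^{2}
```

The first term asks the deformed surface to keep its Laplacian coordinates, and the second asks the constrained vertices to reach their targets. If \delta is held fixed the problem is linear in V; if \delta is allowed to depend on V, for example to follow local rotations, the problem becomes nonlinear.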
19
Animation Surface Deformation
  • Direct Manipulation of Subdivision Surfaces on
    the GPU, ACM TOG (siggraph), 2007

nonlinear least-squares optimization
20
Animation Surface Deformation
  • Direct Manipulation of Subdivision Surfaces on
    the GPU, ACM TOG (siggraph), 2007

Inexact Gauss-Newton iterative solver
21
Animation Surface Deformation
  • Direct Manipulation of Subdivision Surfaces on
    the GPU, ACM TOG (siggraph), 2007
  • Precompute on the CPU
  • Compute on the GPU
  • Subdivision [Shiue05], Laplacian coordinates
  • Compute on the GPU
  • Matrix-vector multiplication [Bolz03, Kruger03]

22
Animation Surface Deformation
  • Direct Manipulation of Subdivision Surfaces on
    the GPU, ACM TOG (siggraph), 2007

Real-time MOCAP animation
24
Rendering Ray Tracing
  • Real-time KD-Tree Construction on Graphics
    Hardware, ACM TOG (siggraph asia), 2008
  • Interactive frame rates
  • Shadows, textures
  • Multi-bounce reflection/refraction

25
Rendering Ray Tracing
  • Real-time KD-Tree Construction on Graphics
    Hardware, ACM TOG (siggraph asia), 2008

KD-tree
Generate Eye Rays
Traverse Acceleration Structure
Intersect Triangles
Shade Hits Generate Secondary Rays
Andrew Morres slides
26
Rendering Ray Tracing
  • Real-time KD-Tree Construction on Graphics
    Hardware, ACM TOG (siggraph asia), 2008

Constructing kd-tree on GPUs
  • Maximize parallelism
  • Build trees in BFS order
  • Parallelize computation over primitives at upper
    tree levels
  • Preserve high quality
  • New schemes for node splitting

Generate Eye Rays
Traverse Acceleration Structure
Intersect Triangles
Shade Hits, Generate Secondary Rays
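
Building the tree in BFS order can be sketched as follows. This is a minimal sequential Python sketch over points, with assumptions stated plainly: it uses a simple median split, not the paper's node-splitting schemes, and the queue stands in for processing all nodes of a level in parallel. The function name and node layout are hypothetical.

```python
from collections import deque


def build_kdtree_bfs(points, leaf_size=1):
    """Build a kd-tree over 3D points level by level (BFS order).

    Returns a list of node dicts; children are referenced by list index.
    """
    nodes = []
    # Each work item: (points, depth, parent index, which child slot to fill).
    queue = deque([(list(points), 0, None, None)])
    while queue:
        pts, depth, parent, side = queue.popleft()
        index = len(nodes)
        if parent is not None:
            nodes[parent][side] = index
        if len(pts) <= leaf_size:
            nodes.append({"leaf": True, "points": pts})
            continue
        axis = depth % 3  # cycle through x, y, z
        pts.sort(key=lambda p: p[axis])
        mid = len(pts) // 2  # median split (an assumption of this sketch)
        nodes.append({"leaf": False, "axis": axis, "split": pts[mid][axis],
                      "left": None, "right": None})
        queue.append((pts[:mid], depth + 1, index, "left"))
        queue.append((pts[mid:], depth + 1, index, "right"))
    return nodes
```

Nodes come out of the queue one whole level at a time, so all nodes of a level end up stored contiguously, which is the property a level-parallel builder relies on.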
27
Rendering Ray Tracing
  • Real-time KD-Tree Construction on Graphics
    Hardware, ACM TOG (siggraph asia), 2008

Scene     [Wald07], 1 core    [Shevtsov07], 4 cores    Our algorithm, GF8800 Ultra
(image)   10.5 FPS            23.5 FPS                 32.0 FPS
(image)   2.30 FPS            5.84 FPS                 6.40 FPS
28
Rendering Ray Tracing
  • Real-time KD-Tree Construction on Graphics
    Hardware, ACM TOG (siggraph asia), 2008

photon mapping for caustic rendering
29
Rendering Dynamic Refraction
  • Interactive Relighting of Dynamic Refractive
    Objects, ACM TOG (siggraph), 2008
  • Interactions
  • Geometry, lighting, materials, viewpoint
  • Rendering effects
  • Refraction, reflection, single scattering
  • Shadows, caustics

30
Rendering Dynamic Refraction
  • Interactive Relighting of Dynamic Refractive
    Objects, ACM TOG (siggraph), 2008

object voxelization
octree construction
photon generation
adaptive photon tracing
view pass
31
Rendering Dynamic Refraction
  • Interactive Relighting of Dynamic Refractive
    Objects, ACM TOG (siggraph), 2008

object voxelization
octree construction
photon generation
adaptive photon tracing
view pass
All performed on the GPU!
32
Rendering Dynamic Refraction
  • Interactive Relighting of Dynamic Refractive
    Objects, ACM TOG (siggraph), 2008

object voxelization
  • Dense 3D array instead of sparse tree
  • Accounts for refractive index and extinction
    coefficients
  • Construction is similar to mipmap

octree construction
photon generation
adaptive photon tracing
view pass
33
Rendering Dynamic Refraction
  • Interactive Relighting of Dynamic Refractive
    Objects, ACM TOG (siggraph), 2008

GPU Octree Construction
pyramid of min/max values
index of refraction values
octree
index of refraction values
pyramid of hierarchy levels
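
A mipmap-like min/max pyramid over a dense 3D array can be sketched as follows. This is a minimal Python sketch, not the paper's GPU code: it assumes a cubic grid whose side is a power of two, and the is_uniform test is only one hypothetical way such a pyramid might be used to decide where a coarse cell is good enough.

```python
def minmax_pyramid(grid):
    """grid: cubic nested list grid[z][y][x] of floats, side a power of two.

    Returns a list of levels, finest first. Each level is a cubic nested list
    of (min, max) pairs; a coarser cell covers a 2x2x2 block of the finer level.
    """
    level = [[[(v, v) for v in row] for row in plane] for plane in grid]
    levels = [level]
    while len(level) > 1:
        half = len(level) // 2
        coarser = []
        for z in range(half):
            plane = []
            for y in range(half):
                row = []
                for x in range(half):
                    block = [level[2 * z + dz][2 * y + dy][2 * x + dx]
                             for dz in (0, 1) for dy in (0, 1) for dx in (0, 1)]
                    row.append((min(lo for lo, _ in block),
                                max(hi for _, hi in block)))
                plane.append(row)
            coarser.append(plane)
        level = coarser
        levels.append(level)
    return levels


def is_uniform(cell, tolerance):
    """True when the values under a cell vary by at most `tolerance`."""
    lo, hi = cell
    return hi - lo <= tolerance
```

Each coarser level reads only the level directly below it, just as a mipmap chain does, so every cell of a level can be computed independently.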
34
Rendering Dynamic Refraction
  • Interactive Relighting of Dynamic Refractive
    Objects, ACM TOG (siggraph), 2008

Adaptive Photon Tracing
35
Rendering Dynamic Refraction
  • Interactive Relighting of Dynamic Refractive
    Objects, ACM TOG (siggraph), 2008

Surface manipulation
36
Rendering Dynamic Refraction
  • Interactive Relighting of Dynamic Refractive
    Objects, ACM TOG (siggraph), 2008

Volume painting
37
Rendering Dynamic Refraction
  • Interactive Relighting of Dynamic Refractive
    Objects, ACM TOG (siggraph), 2008

Simulation visualization
38
Outline
  • Data structures and algorithms
  • Modeling: surface reconstruction
  • Animation: surface deformation
  • Rendering: ray tracing, refraction
  • Programming tools
  • BSGP: bulk-synchronous GPU programming, ACM
    TOG (siggraph), 2008

39
Programming the GPU
  • Cg [Mark03], GLSL, HLSL: graphics oriented
  • Stream processing
  • Brook [Buck04]: streams and kernels
  • Sh [McCool04]: metaprogramming library
  • NVIDIA CUDA: scattering, local communication
  • AMD CAL, Brook
  • OpenCL
  • DirectX11 Compute Shader

40
Stream Processing Model
  • Data centric uniform streams
  • Applying individual kernels in parallel to all
    stream elements
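
The stream model can be illustrated with a small Python sketch. It is a CPU stand-in only: a thread pool plays the role of the GPU, and run_kernel, saxpy_kernel and square_kernel are hypothetical names, not part of any GPU API.

```python
from concurrent.futures import ThreadPoolExecutor


def run_kernel(kernel, *streams):
    """Apply `kernel` independently to corresponding elements of the input streams."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(kernel, *streams))


def saxpy_kernel(x, y, a=2.0):
    # Sees one element from each input stream and nothing else.
    return a * x + y


def square_kernel(t):
    return t * t


xs = [1.0, 2.0, 3.0]
ys = [10.0, 20.0, 30.0]
temp = run_kernel(saxpy_kernel, xs, ys)   # temporary stream between two kernels
result = run_kernel(square_kernel, temp)
```

Even in this tiny example the intermediate stream temp has to be allocated and passed along by hand, and each stage is a separate kernel launch.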

41
Stream Processing Model
  • Supplies high performance, but makes GPU
    programming hard
  • Program readability and maintenance
  • Bundle independent processes to reduce temporary
    streams and kernel launches
  • Manual dataflow management
  • Recycle temporary streams
  • Inefficient code reuse
  • Primitives with broken integrity

42
BSGP Model
  • Programmer specifies barriers, compiler deduces
    supersteps [Valiant 1990]

43
BSGP Model
  • Programmer specifies barriers, compiler deduces
    supersteps [Valiant 1990]
  • Implicit data dependencies through local variables

44
BSGP Model
  • Programmer specifies barriers, compiler deduces
    supersteps [Valiant 1990]
  • Implicit data dependencies through local
    variables
  • Allows collective operation
  • Parallel primitives are called as a whole in a
    single statement

45
BSGP Model
  • Easy to read, write and maintain
  • Similar or better performance than native
    languages
  • e.g., CUDA
  • Complex programs
  • e.g., X3D parser

46
Example: one-ring neighborhood
  • Compute the one-ring neighboring triangles of
    each vertex of a triangular mesh

v1: t1, t2, t3, t4, t5
v2: t4, t5, t6, t7, t8, t9
[Figure: mesh fragment showing vertices v1, v2, v3 and triangles t1 to t9]
47
One-ring neighborhood BSGP version
48
One-ring neighborhood BSGP version
  • Sorting the triplicated triangles

49
One-ring neighborhood BSGP version
  • Sorting the triplicated triangles
  • Compute each vertex's head pointer
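
The two steps above can be sketched in plain Python. This is not BSGP code and not the slide's listing: it is a sequential stand-in under the assumption that vertices are numbered 0 to num_vertices - 1, and the function name is hypothetical. On a GPU the sort would be a parallel primitive and the head pointers could be found by one thread per sorted element.

```python
def one_ring_triangles(triangles, num_vertices):
    """triangles: list of (v0, v1, v2) vertex-index triples.

    Returns (head, sorted_tris); the triangles around vertex v are
    sorted_tris[head[v]:head[v + 1]].
    """
    # Step 1: triplicate each triangle into (vertex, triangle) pairs, sort by vertex.
    pairs = sorted((v, t) for t, tri in enumerate(triangles) for v in tri)
    # Step 2: head[v] is the position of the first pair belonging to vertex v.
    head = [len(pairs)] * (num_vertices + 1)
    for i in range(len(pairs) - 1, -1, -1):
        head[pairs[i][0]] = i
    # Vertices without triangles get an empty range.
    for v in range(num_vertices - 1, -1, -1):
        if head[v] > head[v + 1]:
            head[v] = head[v + 1]
    sorted_tris = [t for _, t in pairs]
    return head, sorted_tris
```

For example, with triangles [(0, 1, 2), (0, 2, 3)] the sorted triangle list is [0, 1, 0, 0, 1, 1] and vertex 2 owns the slice from position 3 to 5, that is, triangles 0 and 1.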

50
One-ring neighborhood CUDA version
Dataflow management
Kernels
51
BSGP Language Constructs
  • spawn and barrier
  • Insert CPU code: require
  • Thread manipulation: fork and kill
  • Communication: thread.get and thread.put
  • Reducing barriers: par
  • Parallel primitive operations, including reduce,
    scan and sort
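
To make the scan primitive concrete, here is a generic textbook sketch in Python. It does not show BSGP syntax or its primitive library: inclusive_scan follows the Hillis-Steele scheme, and compact is one hypothetical use of scan (stream compaction).

```python
def inclusive_scan(values):
    """Hillis-Steele scan: about log2(n) rounds, each a data-parallel update."""
    data = list(values)
    offset = 1
    while offset < len(data):
        # Every element reads the previous round's array, as parallel threads
        # separated by a barrier would.
        data = [data[i] + data[i - offset] if i >= offset else data[i]
                for i in range(len(data))]
        offset *= 2
    return data


def compact(values, keep):
    """Keep values[i] where keep[i] is 1; output slots come from a scan of keep."""
    positions = inclusive_scan(keep)
    out = [None] * (positions[-1] if positions else 0)
    for i, flag in enumerate(keep):
        if flag:
            out[positions[i] - 1] = values[i]
    return out
```

Each round of the scan depends on the complete result of the previous round, which is the kind of step boundary a barrier expresses.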

59
Sample Applications
Recursive ray tracer
Particle simulation
X3D parser
Adaptive tessellation
60
Recursive Ray Tracer
  • Both the BSGP and CUDA versions were implemented
    and optimized by the same programmer

61
Recursive Ray Tracer
  • Both the BSGP and CUDA versions were implemented
    and optimized by the same programmer
  • Clear advantage in code complexity
  • Similar performance and memory usage

              CUDA    BSGP
Render fps    4.00    4.61
Mem usage     144M    150M
Code lines    815     475
GPU funcs     10      3
Coding days   23      1
Tuning days   45      23
62
Particle Simulation
  • CUDA SDK demo
  • Rewrote simulation module in BSGP, reused GUI
    code

63
Particle Simulation
               CUDA    BSGP
Render fps     187     290
Module lines   -       154
Total lines    2113    1579
Coding time    -       1 hour

  • CUDA SDK demo
  • Rewrote simulation module in BSGP, reused GUI
    code
  • Simpler and faster
  • Integration and sort preparation aren't bundled
  • Sort isn't bundled with sort preparation
  • Sort calls unbundled scan

64
X3D Parser
  • BSGP implementation
  • Incremental development
  • 16 GPU functions, compiled into 82 kernels, 19k
    lines of assembly
  • 15x faster than CPU parser
  • Extremely difficult in CUDA

A 7.03 MB X3D scene loaded in 183 ms
65
Adaptive Tessellation
  • A displacement map based terrain renderer

66
Adaptive Tessellation
  • Without thread manipulation
  • Parallelized over all input triangles
  • With thread manipulation
  • Parallelized over output vertices using
    thread.fork

View    No thread manipulation    With thread manipulation    Vertices output
        Ttess       FPS           Ttess       FPS
Side    43.9 ms     21.0          3.62 ms     142             1.14M
Top     5.0 ms      144           2.1 ms      249             322k
2x to 10x speedup
67
Try BSGP Now!
  • BSGP compiler, programming guide, primitive
    library, editor and all example code
  • http://www.kunzhou.net/BSGP

68
Summary
  • GPUs are fast and cheap, and are getting faster
    and cheaper
  • General-purpose computing
  • Re-think your algorithms to be massively parallel
  • Data structures: quadtree, octree, kd-tree
  • Algorithms: nonlinear/linear optimization,
    matrix-vector operations, parallel primitives
  • Programming the GPU
  • BSGP makes the programmer's life much easier

69
Questions?
Kun Zhou
kunzhou_at_acm.org
70
Other Real-Time Applications
Dynamic BRDF (sig2007)
Soft shadow (sig2006)
Smoke (sig2008)
Skinning (sig2008)