GPU Accelerated Image Aligned Splatting

Slides: 38

Provided by: meli149

Transcript and Presenter's Notes

1
GPU Accelerated Image Aligned Splatting

Neophytos Neophytou, Klaus Mueller
Center for Visual Computing,
Stony Brook University (SUNY)

2
Motivation

Until recently:
Gaming industry drove development of 3D graphics boards.
Graphics architecture didn't really address scientific visualization needs.
Much work needed to circumvent architectural limitations.
Now:
Games are still the driving force, but far more sophisticated.
Programmable GPUs can be used for many things by the visualization community.

3
Motivation

Direct Volume Rendering on GPU
Using previous-generation NVidia FX and ATI equivalents
3D texture-based volume rendering
Raycasting
Extensive use of fragment shaders for per-pixel programming

4
Motivation

Why not Splatting?
Scatter vs. gather
Vertex processor vs. fragment processor
Need visibility sorting
Pre-shaded splatting
X-ray splatting
Image Aligned Splatting → Challenging
Requires FP blending, auxiliary buffers, early culling
Now available on NVidia 6800 and equivalent ATI

5
Previous Work

Splatting [Westover 90]
Volume representation → overlapping basis functions
Projection of pre-calculated footprints for each point
Splat everything on the image plane and composite front-to-back
Main advantage: implicit space leaping
Initial problems → some color bleeding and sparkling
Solution: sheet-buffered splatting
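
The scatter step described on this slide can be illustrated with a minimal sketch. It assumes a small pre-computed footprint table and a plain list-of-lists image; the function name `splat` and the footprint layout are hypothetical and are not taken from the presentation.

```python
def splat(image, footprint, cx, cy, value):
    """Add value * footprint into image, centered on pixel (cx, cy).

    image     -- list of rows of floats, modified in place
    footprint -- square table of pre-computed kernel weights (odd size)
    """
    half = len(footprint) // 2
    for j, row in enumerate(footprint):
        for i, weight in enumerate(row):
            x = cx + i - half
            y = cy + j - half
            # Clip the parts of the footprint that fall outside the image.
            if 0 <= y < len(image) and 0 <= x < len(image[0]):
                image[y][x] += value * weight
```

Each point writes to every pixel its footprint covers, which is the "scatter" side of the scatter vs. gather distinction raised earlier.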

6
Previous Work

Westover's compositing → axis-aligned sheet buffers → causes popping in animated viewing

Solution → image-aligned sheet buffers [Mueller 98]
→ Slice, accumulate, and composite along viewing direction
→ Use pre-computed kernel sections instead of whole footprint
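
As a rough illustration of pre-computed kernel sections, the sketch below splits a 1D kernel profile into equal slabs and computes the fraction of the profile that falls in each slab. The Gaussian profile, the parameter names, and the normalization are assumptions made for the example; the slides do not specify the kernel.

```python
import math


def kernel_section_weights(radius, num_sections, sigma):
    """Fraction of a truncated 1D Gaussian lying in each slab of [-radius, radius]."""
    def cdf(z):
        # Cumulative distribution of a zero-mean Gaussian with the given sigma.
        return 0.5 * (1.0 + math.erf(z / (sigma * math.sqrt(2.0))))

    width = 2.0 * radius / num_sections
    edges = [-radius + i * width for i in range(num_sections + 1)]
    total = cdf(radius) - cdf(-radius)
    # Normalize so that the sections of one kernel sum to 1.
    return [(cdf(b) - cdf(a)) / total for a, b in zip(edges, edges[1:])]
```

With such a table, a point contributes one section weight to each sheet buffer it overlaps instead of its whole footprint at once.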

7
Previous Work

Post-classified Rendering [Mueller 99]
Sheet buffers accumulate opacities
Classification and shading per-pixel
Gradient calculation per-pixel at sheet buffers

[Figure: pre-shaded vs. post-shaded rendering]
8
Previous Work

Other optimizations (for software-based systems)
Fast footprint rasterization [Huang 00]
Post-convolved rendering [Neophytou 03]
Hierarchical splatting [Laur 91]
3D adjacency structures [Orchard 01]
Optimal sampling grids [Theussl 01, Neophytou 02]
Anti-aliasing issues addressed by [Swan 97], [Mueller 98], [Zwicker 01, 02]

9
Previous Work

GPU Accelerated X-Ray Splatting
Efficient point-convolution for X-ray [Xue 04]
High throughput with previous-generation hardware
Similar to our throughput, two generations earlier. Why?
Image-aligned splatting renders each point at least 4x
Post-shading incurs additional costs/overhead
Speedup gained from Moore's law is consumed by the cost of producing high-quality images
GPU Accelerated EWA Splatting [Chen 04]
High speedups
Retained-mode splatting
Axis-aligned buffer approach
May produce some popping
Cannot do post-shading

10
Image Aligned Splatting

(1) Sort front-to-back
(2) Create density slices
(3) Apply transfer function and shade each slice
(4) Composite front-to-back
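
The four steps can be sketched for a single pixel as follows. This is a simplified model under stated assumptions: contributions are given as (depth, density) pairs, the transfer function returns a scalar color and an opacity, and shading is omitted. The name `render_pixel` and its parameters are hypothetical.

```python
def render_pixel(points, slice_thickness, transfer):
    """Render one pixel from (depth, density) contributions.

    transfer -- maps an accumulated density to a (color, alpha) pair
    """
    # (1) Sort front to back by depth.
    ordered = sorted(points, key=lambda p: p[0])

    # (2) Accumulate density per image-aligned slice.
    slices = {}
    for depth, density in ordered:
        index = int(depth // slice_thickness)
        slices[index] = slices.get(index, 0.0) + density

    # (3) Classify each slice, then (4) composite front to back.
    color, alpha = 0.0, 0.0
    for index in sorted(slices):
        c, a = transfer(slices[index])
        color += (1.0 - alpha) * a * c
        alpha += (1.0 - alpha) * a
    return color, alpha
```

Because classification happens after the densities of a slice have been summed, this corresponds to the post-classified variant rather than pre-shaded splatting.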

14
Challenge 1: Increased Vertex Traffic

Splatting → need textured quad for each point
4x vertices
Image-aligned slicing approach
Multiple slices per point → 4x points
Obvious solutions
Use point sprites
Use vertex arrays
Improvement not significant (5%)
Image-aligned splatting is mainly rasterization-bound.
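
The vertex counts implied by this slide can be worked out with a small helper. The default of four slices per point follows the "4x points" figure above; the function name and the idea of counting a point sprite as a single vertex are assumptions made for illustration.

```python
def vertices_per_frame(num_points, slices_per_point=4, vertices_per_primitive=4):
    """Vertices submitted per frame when every point is sliced and drawn as a primitive."""
    return num_points * slices_per_point * vertices_per_primitive
```

For example, `vertices_per_frame(1000)` gives 16000 for textured quads, while `vertices_per_frame(1000, vertices_per_primitive=1)` gives 4000 for point sprites. The slide's observation is that this reduction matters little because rasterization, not vertex traffic, dominates.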

15
Challenge 2: Excessive Overdraw

Overdraw: one point is rasterized → multiple buffers.

Initial approach
Splat gradients (Nx, Ny, Nz, D)
Alternative approach
Splat different density buffers using all RGBA channels
Rasterize each point only once
Compute gradient on-the-fly
Use 2D texture for (x,y) and modulate with 1D kernel along z
Alternative is 3 times faster.
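
A minimal sketch of the alternative approach: one fragment writes four slice densities at once, each being the 2D footprint value modulated by a 1D weight along z. The gradient helper uses central differences between neighbouring channels, which is an assumption; the slides only say the gradient is computed on the fly. All names are hypothetical.

```python
def splat_fragment(rgba, footprint_xy, z_weights, density):
    """Add one fragment's contribution to four consecutive slice buffers.

    rgba         -- current (R, G, B, A) densities at this pixel
    footprint_xy -- 2D kernel texture value at this pixel
    z_weights    -- four 1D kernel weights, one per slice
    """
    return tuple(c + density * footprint_xy * w for c, w in zip(rgba, z_weights))


def z_gradient(rgba, channel, slice_spacing=1.0):
    """Central difference along z; valid only for the inner channels 1 and 2."""
    return (rgba[channel + 1] - rgba[channel - 1]) / (2.0 * slice_spacing)
```

Packing four slices into one render target is what allows each point to be rasterized once instead of once per slice.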

16
Challenge 2: Excessive Overdraw

Use color masks to cycle through channels/buffers

Arrange z-coefficients in the 4 color components
Assuming ordering of R,G,B,A,R,G,B,A,R,G,B,A
Possible orderings for (i, i+1, i+2, i+3) will be RGBA, GBAR, BARG, ARGB
Fragment program for each of these cases
Computes gradient/classification/shading per pixel.
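
The four channel orderings follow from rotating the RGBA sequence by the index of the first slice. A small sketch, with a hypothetical function name:

```python
CHANNELS = "RGBA"


def channel_order(first_slice):
    """Channels holding slices i, i+1, i+2, i+3 when slice i is the first one."""
    start = first_slice % 4
    return CHANNELS[start:] + CHANNELS[:start]
```

Since only four rotations exist, one fragment program per rotation covers every slice position.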

20
Challenge 3: Shading Empty Regions

Empty-space skipping: implicit for splatting.
But slice-based splatting on GPU?

All this area is useless, but it will be processed with expensive shading/compositing anyway.

Utilize the early-z culling GPU optimization to disable processing of empty space.
NOTE: All temp buffers share the SAME depth buffer. Use aux buffers as multiple surfaces of the same p-buffer.

21
Challenge 3: Shading Empty Regions

Early z-culling with GL_DEPTH_RANGE_TEST
Eliminates fragments whose depth is outside the allowed range
We use the depth buffer to tag newly splatted pixels

[Figure: frame buffer and depth buffer]
22
Challenge 3: Shading Empty Regions

[Figure: frame buffer and depth buffer. Different shades of grey represent different slices. We set DEPTH_RANGE to allow only the current slice.]
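
The depth-tagging idea can be modelled on the CPU as a mask: each pixel stores the index of the slice that last splatted into it, and shading runs only where the tag matches the current slice. This is a simulation of the behaviour, not GPU code; the equality check stands in for the depth-range test, and all names are hypothetical.

```python
def shade_slice(depth_tags, current_slice, shade):
    """Run shade(x, y) only on pixels tagged with current_slice.

    depth_tags -- rows of slice indices, playing the role of the depth buffer
    """
    shaded = []
    for y, row in enumerate(depth_tags):
        for x, tag in enumerate(row):
            if tag == current_slice:  # stands in for the depth-range test
                shaded.append((x, y, shade(x, y)))
    return shaded
```

Pixels that received no splats in the current slice keep an older tag and are skipped, which restores the empty-space skipping that splatting normally gets for free.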
23
Challenge 4: Shading of Opaque Regions

Early splat elimination: cull splats that project to opaque regions of the image

[Figure: useless processing vs. actual slice contribution]

Utilize the early-z culling GPU optimization to disable processing of opaque image regions

24
Challenge 4: Shading of Opaque Regions

Early z-culling with normal depth test
Eliminates fragments with 0 depth (use of hierarchical z-buffer also eliminates whole quads if small enough)
We set depth = 0 for all opaque fragments in the compositing program. Yes, we do have to read the image buffer as a texture.
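
A CPU-side model of this step, under the assumption that the "normal" depth test is a less-than comparison: once a pixel is marked with depth 0, no later fragment can pass the test there. The opacity threshold and both function names are hypothetical.

```python
def mark_opaque(alpha, depth, threshold=0.99):
    """Set depth to 0 wherever the composited image is already opaque."""
    for y, row in enumerate(alpha):
        for x, a in enumerate(row):
            if a >= threshold:
                depth[y][x] = 0.0


def passes_depth_test(fragment_depth, stored_depth):
    """Less-than comparison, assumed here as the normal depth test."""
    return fragment_depth < stored_depth
```

Fragments of later splats that land on marked pixels fail the test before any shading or compositing work is done.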

[Figure: frame buffer and depth buffer]
25
Overall Inefficiencies and Improvements

Bucket tossing → done on CPU, up to 0.1 sec
Overdraw from slicing points to image-aligned buffers
Treat RGBA channels as buffers → 300% faster
Processing of empty regions in shading/compositing
Early-z culling for empty-space skipping → 30% gains
Splatting and processing of opaque regions
Early-z culling for early splat elimination → 200% gains
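
Bucket tossing, the CPU step listed first above, can be sketched as assigning each point to every image-aligned slice its kernel overlaps. The sketch assumes spherical kernels of a fixed radius and a unit-length view direction; the function name and parameters are hypothetical.

```python
import math


def bucket_toss(points, view_dir, slice_thickness, kernel_radius):
    """Map slice index -> list of point indices whose kernels overlap that slice."""
    buckets = {}
    for index, point in enumerate(points):
        # Depth of the point along the (unit-length) viewing direction.
        depth = sum(p * v for p, v in zip(point, view_dir))
        first = math.floor((depth - kernel_radius) / slice_thickness)
        last = math.floor((depth + kernel_radius) / slice_thickness)
        for slice_index in range(first, last + 1):
            buckets.setdefault(slice_index, []).append(index)
    return buckets
```

Iterating over the buckets in increasing slice index then yields the front-to-back order needed for compositing.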

26
Overall Inefficiencies and Improvements

Early-z extension is temperamental on NVidia boards
Avoid clearing the z-buffer
Avoid changing z-test direction (GL_GREATER ↔ GL_LESS)
Cannot write to depth component in fragment program
BUT using GL_DEPTH_RANGE_TEST seems to allow this!

27
Results
Volume size: 128x128x128, image size: 400x400
Semi-transparent FPS: 7.2; Iso-surface FPS: 7.6
28
Results
Volume size: 128x128x128, image size: 400x400
Semi-transparent FPS: 4.9; Head, BCC FPS: 3.1
Volume size: 256x256x128, image size: 400x400
Iso-surface FPS: 5.1; BCC FPS: 5.3
Volume size: 128x128x128, image size: 400x400
FPS: 9
29
Conclusions

We have provided a GPU-accelerated implementation of the image-aligned splatting technique
Addressed main inefficiencies
Vertex traffic
Excessive overdraw
Empty-space leaping
Early splat elimination
High quality at acceptable frame rates
Not faster than 3D slicing-based approaches

30
Current work: Splatting irregular volumes on GPU

Main primitive generalized to ellipsoidal kernels
Ellipsoid expressed as rotated/scaled sphere
Slice ellipsoids with the same sheet-buffered approach
Use texture-mapping to rasterize kernel
Use similar fragment programs for shading
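
Treating an ellipsoid as a rotated and scaled sphere means a world-space point can be mapped back into the unit sphere's frame, where the usual radially symmetric kernel applies. The sketch below assumes the rotation is a 3x3 matrix whose columns are the ellipsoid axes in world space; the function name and argument layout are hypothetical.

```python
def to_unit_sphere(point, center, rotation, scales):
    """Map a world-space point into the frame of the unit sphere.

    rotation -- 3x3 matrix (list of rows) whose columns are the ellipsoid axes
    scales   -- the three semi-axis lengths
    """
    d = [p - c for p, c in zip(point, center)]
    # Multiply by the transpose of the rotation to get local coordinates.
    local = [sum(rotation[r][k] * d[r] for r in range(3)) for k in range(3)]
    # Undo the per-axis scaling.
    return [l / s for l, s in zip(local, scales)]
```

A point lies inside the kernel when the returned vector has length at most 1, so the same spherical kernel texture can be reused for every ellipsoid.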

31
Thanks to

NSF CAREER grant
DOE
More to come on this project at http://fytos.net/splatting

35
Splatting irregular volumes on GPU

Splatting algorithm inefficiencies are the same as with regular splatting
Cannot use the RGBA-as-separate-slices approach
In irregular splatting we need to keep a separate weight buffer during the splatting phase and then use it for normalization.
We use the depth buffer and early-z culling
All temp buffers are surfaces of the same p-buffer, in order to share the same depth buffer.
Empty-space skipping and early splat elimination
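
The weight-buffer normalization mentioned above can be sketched per pixel as a division of the accumulated value by the accumulated kernel weight. The small epsilon guarding empty pixels and the function name are assumptions added for the example.

```python
def normalize(value_buffer, weight_buffer, eps=1e-8):
    """Divide accumulated values by accumulated weights, leaving empty pixels at 0."""
    return [v / w if w > eps else 0.0 for v, w in zip(value_buffer, weight_buffer)]
```

For instance, `normalize([1.0, 0.0, 3.0], [0.5, 0.0, 1.5])` yields `[2.0, 0.0, 2.0]`. Needing this extra buffer is one reason the four RGBA channels cannot all be used as density slices in the irregular case.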

36
Splatting irregular volumes on GPU

Volume data is first organized into a flat cubic structure
Provides a list of intersecting splats for every cell.
Cells are accessed front to back in a pre-defined order, the same way as rendering a fixed octree.
Render the associated vertex array of each cell
At the beginning of the frame, all splat slicing parameters are computed. Result is:
Initial slicing polygon, and advancing vectors
When a splat is processed, all vertices are transformed according to current step and advancing vectors.
The polygon is then texture-mapped to the right kernel.
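
One plausible reading of the vertex transformation is that each vertex of the slicing polygon moves linearly along its own advancing vector as the step increases. The sketch below assumes exactly that; the linear form and the function name are assumptions, since the slides do not give the formula.

```python
def slice_polygon(initial_vertices, advancing_vectors, step):
    """Vertices of the slicing polygon at a given step.

    Each vertex is its initial position plus step times its advancing vector.
    """
    return [
        tuple(v + step * a for v, a in zip(vertex, advance))
        for vertex, advance in zip(initial_vertices, advancing_vectors)
    ]
```

Under this reading the per-frame setup is done once on the CPU, and the per-slice work reduces to one multiply-add per vertex component, which suits a vertex program.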

37
Splatting irregular volumes on GPU

Problem: several splats will appear in different cells.
How do you keep track of the ones that are already being sliced?
CPU would just hold a status array
GPU is notorious for not having global variables
Solution: have the current cell be a uniform variable, and every vertex will know whether it is its turn to draw by comparing to its pre-computed starting-cell variable.
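
The comparison can be simulated on the CPU as follows. This sketch assumes one particular reading of the slide: a splat listed in several cells is drawn only in the cell that equals its pre-computed starting cell. The data layout and the function name are hypothetical.

```python
def splats_to_draw(cell_lists, traversal_order, starting_cell):
    """Return (cell, splat) pairs that would actually be drawn.

    cell_lists    -- cell id -> list of splats intersecting that cell
    starting_cell -- splat -> id of the first cell in which it should be drawn
    """
    drawn = []
    for current_cell in traversal_order:  # plays the role of the uniform variable
        for splat in cell_lists[current_cell]:
            # Per-vertex comparison against the pre-computed starting cell.
            if starting_cell[splat] == current_cell:
                drawn.append((current_cell, splat))
    return drawn
```

No shared status array is needed: the decision depends only on a per-splat constant and a value that is constant for the whole draw call.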
