GPU Accelerated Image Aligned Splatting

1
GPU Accelerated Image Aligned Splatting
  • Neophytos Neophytou, Klaus Mueller
  • Center for Visual Computing,
    Stony Brook University (SUNY)

2
Motivation
  • Until recently:
  • The gaming industry drove the development of 3D
    graphics boards.
  • Graphics architecture didn't really address
    scientific visualization needs.
  • Much work was needed to circumvent architectural
    limitations.
  • Now:
  • Games are still the driving force, but far more
    sophisticated
  • Programmable GPUs can be used for many things by
    the visualization community

3
Motivation
  • Direct volume rendering on the GPU
  • Using previous-generation NVIDIA FX and ATI
    equivalents
  • 3D texture-based volume rendering
  • Raycasting
  • Extensive use of fragment shaders for per-pixel
    programming

4
Motivation
  • Why not splatting?
  • Scatter vs. gather
  • Vertex processor vs. fragment processor
  • Needs visibility sorting
  • Pre-shaded splatting
  • X-ray splatting
  • Image-aligned splatting → challenging
  • Requires floating-point (FP) blending, auxiliary
    buffers, early culling
  • Now available on NVIDIA 6800 and equivalent ATI
    boards

5
Previous Work
  • Splatting [Westover 90]
  • Volume representation → overlapping basis
    functions
  • Projection of pre-calculated footprints for each
    point
  • Splat everything on the image plane and composite
    front-to-back
  • Main advantage: implicit space leaping
  • Initial problems → some color bleeding and
    sparkling
  • Solution: sheet-buffered splatting

6
Previous Work
  • Westover's compositing → axis-aligned sheet
    buffers → causes popping in animated viewing
  • Solution → image-aligned sheet buffers [Mueller
    98]
  • → Slice, accumulate, and composite along the
    viewing direction
  • → Use pre-computed kernel sections instead of the
    whole footprint

7
Previous Work
  • Post-classified rendering [Mueller 99]
  • Sheet buffers accumulate opacities
  • Classification and shading per pixel
  • Gradient calculation per pixel at sheet buffers

[Figure: pre-shaded vs. post-shaded rendering]
8
Previous Work
  • Other optimizations (for software-based systems)
  • Fast footprint rasterization [Huang 00]
  • Post-convolved rendering [Neophytou 03]
  • Hierarchical splatting [Laur 91]
  • 3D adjacency structures [Orchard 01]
  • Optimal sampling grids [Theussl 01, Neophytou
    02]
  • Anti-aliasing issues addressed by
    [Swan 97], [Mueller 98], [Zwicker 01, 02]

9
Previous Work
  • GPU-accelerated X-ray splatting
  • Efficient point convolution for X-ray [Xue 04]
  • High throughput with previous-generation hardware
  • Similar to our throughput, two generations
    earlier. Why?
  • Image-aligned splatting renders each point at
    least 4x
  • Post-shading incurs additional cost and overhead
  • The speedup gained from Moore's law is consumed by
    the cost of producing high-quality images
  • GPU-accelerated EWA splatting [Chen 04]
  • High speedups
  • Retained-mode splatting
  • Axis-aligned buffer approach
  • May produce some popping
  • Cannot do post-shading

10
Image Aligned Splatting
  • (1) Sort front-to-back
  • (2) Create density slices
  • (3) Apply transfer function and shade each slice
  • (4) Composite front-to-back
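
The four steps above can be sketched on the CPU for a single pixel. This is only a minimal illustration, not the authors' GPU implementation: the function names, the toy transfer function in `classify`, and the fixed `slice_thickness` are all assumptions made for the example.

```python
def bucket_by_slice(samples, slice_thickness):
    """Steps 1-2: sum (depth, density) samples into slices, nearest slice first."""
    buckets = {}
    for depth, density in samples:
        index = int(depth // slice_thickness)
        buckets[index] = buckets.get(index, 0.0) + density
    return [buckets[i] for i in sorted(buckets)]


def classify(density):
    """Step 3: hypothetical transfer function, white color with opacity = density."""
    return 1.0, min(1.0, density)


def composite_front_to_back(slices):
    """Step 4: accumulate color and opacity, attenuating by what is already in front."""
    color, alpha = 0.0, 0.0
    for density in slices:
        c, a = classify(density)
        color += (1.0 - alpha) * a * c
        alpha += (1.0 - alpha) * a
    return color, alpha


slices = bucket_by_slice([(2.5, 0.5), (0.2, 0.25), (0.4, 0.25)], 1.0)
result = composite_front_to_back(slices)
```

Because classification happens after the densities are summed per slice, this corresponds to the post-shaded variant rather than the pre-shaded one.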

14
Challenge 1: Increased Vertex Traffic
  • Splatting → need a textured quad for each point
  • 4x vertices
  • Image-aligned slicing approach
  • Multiple slices per point → 4x points
  • Obvious solutions:
  • Use point sprites
  • Use vertex arrays
  • Improvement not significant (5%)
  • Image-aligned splatting is mainly
    rasterization-bound.
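
To get a feel for the two 4x factors, the worst-case vertex count can be worked out directly. The sketch below assumes every voxel of the volume produces a splat, which real data sets with empty space would not; the function name and defaults are made up for the illustration.

```python
def vertices_per_frame(num_points, slices_per_point=4, vertices_per_primitive=4):
    """Upper bound on vertices sent per frame: points x slices x vertices per quad."""
    return num_points * slices_per_point * vertices_per_primitive


quads = vertices_per_frame(128 ** 3)                               # textured quads
sprites = vertices_per_frame(128 ** 3, vertices_per_primitive=1)   # point sprites
```

Point sprites remove one of the two factors, yet the slide reports little gain, which is consistent with the pipeline being limited by rasterization rather than by vertex traffic.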

15
Challenge 2: Excessive Overdraw
  • Overdraw: one point is rasterized into multiple
    buffers.
  • Initial approach:
  • Splat gradients (Nx, Ny, Nz, D)
  • Alternative approach:
  • Splat different density buffers using all
    RGBA channels
  • Rasterize each point only once
  • Compute the gradient on the fly
  • Use a 2D texture for (x, y) and modulate with a
    1D kernel along z
  • The alternative is 3 times faster.
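
The 2D-texture-times-1D-kernel idea works when the reconstruction kernel is separable. The sketch below assumes a Gaussian kernel and a central-difference gradient between neighboring slices; both are assumptions for illustration, since the slides do not state which kernel or gradient estimator was used.

```python
import math


def kernel_1d(t, sigma=1.0):
    """1D Gaussian profile, used here for the z-modulation weight."""
    return math.exp(-t * t / (2.0 * sigma * sigma))


def footprint_2d(x, y, sigma=1.0):
    """Stands in for the 2D footprint texture lookup at (x, y)."""
    return kernel_1d(x, sigma) * kernel_1d(y, sigma)


def slice_contribution(x, y, z_offset, sigma=1.0):
    """Footprint modulated by the kernel weight of the slice at distance z_offset."""
    return footprint_2d(x, y, sigma) * kernel_1d(z_offset, sigma)


def gradient_z(density_prev, density_next, slice_spacing=1.0):
    """Central difference across the two neighboring density slices."""
    return (density_next - density_prev) / (2.0 * slice_spacing)


full_3d = math.exp(-(0.3 ** 2 + 0.4 ** 2 + 0.5 ** 2) / 2.0)
separable = slice_contribution(0.3, -0.4, 0.5)
```

For a separable kernel the product equals the full 3D kernel value, so one rasterization of the footprint can feed several slices by changing only the per-channel z weight.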

16
Challenge 2: Excessive Overdraw
  • Use color masks to cycle through channels/buffers
  • Arrange z-coefficients in the 4 color components
  • Assuming an ordering of R,G,B,A,R,G,B,A,R,G,B,A
  • Possible orderings for (i, i+1, i+2, i+3) will be
    RGBA, GBAR, BARG, ARGB
  • A fragment program for each of these cases
  • Computes gradient/classification/shading per
    pixel.
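
The channel cycling can be written down as simple modular arithmetic. This is a minimal sketch with hypothetical helper names; the tuple returned by `color_mask` is meant to mirror the four boolean arguments that OpenGL's `glColorMask` takes, but no GL calls are made here.

```python
CHANNELS = "RGBA"


def channel_for_slice(slice_index):
    """Slice i is stored in channel i mod 4, following R,G,B,A,R,G,B,A,..."""
    return CHANNELS[slice_index % 4]


def color_mask(slice_index):
    """(R, G, B, A) write mask that enables only the channel of this slice."""
    return tuple(c == channel_for_slice(slice_index) for c in CHANNELS)


def channel_ordering(first_slice):
    """Channels holding slices i, i+1, i+2, i+3 when i = first_slice."""
    return "".join(channel_for_slice(first_slice + k) for k in range(4))


orderings = [channel_ordering(i) for i in range(4)]
```

Only four distinct orderings exist, which is why four fragment programs are enough to cover every slice position.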

20
Challenge 3: Shading Empty Regions
  • Empty-space skipping: implicit for splatting.
  • But what about slice-based splatting on the GPU?

All this area is useless, but it will be
processed with expensive shading/compositing
anyway.
  • Utilize the early-z culling GPU optimization to
    disable processing of empty space
  • NOTE: All temp buffers share the SAME depth
    buffer. Use aux buffers as multiple surfaces of
    the same p-buffer

22
Challenge 3: Shading Empty Regions
  • Early z-culling with GL_DEPTH_RANGE_TEST
  • Eliminates fragments whose depth is outside the
    allowed range
  • We use the depth buffer to tag newly splatted
    pixels

[Figure: frame buffer and depth buffer.] Different
shades of grey represent different slices. We set
DEPTH_RANGE to allow only the current slice.
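
The tagging scheme can be emulated on the CPU to show which pixels survive. This is a sketch under assumptions: the depth buffer is modeled as a flat list of integer slice tags, and the range test is reduced to an equality check against the current slice; on the GPU the rejection happens in hardware before the fragment program runs.

```python
def splat_slice(depth_buffer, touched_pixels, slice_tag):
    """Splatting pass: tag every pixel covered by a splat of the current slice."""
    for pixel in touched_pixels:
        depth_buffer[pixel] = slice_tag


def pixels_to_shade(depth_buffer, slice_tag):
    """Shading pass: only pixels whose tag lies in the allowed range are processed."""
    return [p for p, tag in enumerate(depth_buffer) if tag == slice_tag]


depth_buffer = [0] * 8
splat_slice(depth_buffer, [2, 3], 1)
first = pixels_to_shade(depth_buffer, 1)
splat_slice(depth_buffer, [3, 5], 2)
second = pixels_to_shade(depth_buffer, 2)
```

Because the tags change from slice to slice, the depth buffer never has to be cleared between slices.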
23
Challenge 4: Shading of Opaque Regions
  • Early splat elimination is applied by culling
    splats that project to opaque regions of the image

[Figure labels: useless processing; actual slice contribution]
  • Utilize the early-z culling GPU optimization to
    disable processing of opaque image regions

24
Challenge 4: Shading of Opaque Regions
  • Early z-culling with the normal depth test
  • Eliminates fragments at pixels with depth 0 (use
    of the hierarchical z-buffer also eliminates whole
    quads if small enough)
  • We set depth = 0 for all opaque fragments in
    the compositing program. Yes, we do have to read
    the image buffer as a texture.

[Figure: frame buffer and depth buffer]
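
Early splat elimination can be emulated in the same spirit. The sketch assumes a hypothetical opacity threshold of 0.99 and a "less than" depth comparison, so that a stored depth of 0 rejects every incoming fragment; the slides do not give the actual threshold used.

```python
OPACITY_THRESHOLD = 0.99  # hypothetical cutoff for "opaque"


def tag_opaque_pixels(alpha_buffer, depth_buffer, threshold=OPACITY_THRESHOLD):
    """Compositing pass: write depth 0 wherever the image is already opaque."""
    for i, alpha in enumerate(alpha_buffer):
        if alpha >= threshold:
            depth_buffer[i] = 0.0


def surviving_fragments(fragments, depth_buffer):
    """Depth test: a (pixel, depth) fragment passes only if depth < stored depth."""
    return [(p, d) for p, d in fragments if d < depth_buffer[p]]


depth_buffer = [1.0] * 4
tag_opaque_pixels([0.2, 1.0, 0.995, 0.5], depth_buffer)
kept = surviving_fragments([(0, 0.5), (1, 0.5), (2, 0.3), (3, 0.9)], depth_buffer)
```

Fragments landing on pixels 1 and 2 are discarded before any splatting or shading work is spent on them.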
25
Overall Inefficiencies and Improvements
  • Bucket tossing → done on the CPU, up to 0.1 sec
  • Overdraw from slicing points to image-aligned
    buffers
  • Treat RGBA channels as buffers → 300% faster
  • Processing of empty regions in
    shading/compositing
  • Early-z culling for empty-space skipping → 30%
    gains
  • Splatting and processing of opaque regions
  • Early-z culling for early splat elimination →
    200% gains

26
Overall Inefficiencies and Improvements
  • The early-z extension is temperamental on NVIDIA
    boards
  • Avoid clearing the z-buffer
  • Avoid changing the z-test direction (GL_GREATER
    ↔ GL_LESS)
  • Cannot write to the depth component in the
    fragment program
  • BUT using GL_DEPTH_RANGE_TEST seems to allow
    this!

27
Results
Volume size: 128x128x128, image size: 400x400
Semi-transparent FPS: 7.2; iso-surface FPS: 7.6
28
Results
Volume size: 128x128x128, image size: 400x400
Semi-transparent FPS: 4.9; Head, BCC FPS: 3.1
Volume size: 256x256x128, image size: 400x400
Iso-surface FPS: 5.1; BCC FPS: 5.3
Volume size: 128x128x128, image size: 400x400
FPS: 9
29
Conclusions
  • We have provided a GPU-accelerated implementation
    of the image-aligned splatting technique
  • Addressed the main inefficiencies:
  • Vertex traffic
  • Excessive overdraw
  • Empty-space leaping
  • Early splat elimination
  • High quality at acceptable frame rates
  • Not faster than 3D slicing-based approaches

30
Current Work: Splatting Irregular Volumes on the GPU
  • Main primitive generalized to ellipsoidal kernels
  • Ellipsoid expressed as a rotated/scaled sphere
  • Slice ellipsoids with the same sheet-buffered
    approach
  • Use texture mapping to rasterize the kernel
  • Use similar fragment programs for shading
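
Treating an ellipsoid as a rotated and scaled sphere means a point can be mapped back into the unit sphere and evaluated with the ordinary radial kernel. The sketch below restricts the rotation to the z axis to stay short; the function names and that restriction are assumptions for illustration only.

```python
import math


def to_unit_sphere(point, center, radii, angle_z):
    """Undo translation, then rotation about z, then per-axis scaling."""
    dx, dy, dz = (p - c for p, c in zip(point, center))
    cos_a, sin_a = math.cos(angle_z), math.sin(angle_z)
    x = cos_a * dx + sin_a * dy    # inverse rotation (transpose of R)
    y = -sin_a * dx + cos_a * dy
    return (x / radii[0], y / radii[1], dz / radii[2])


def radius_squared(point, center, radii, angle_z):
    """Squared distance in sphere space; values <= 1 lie inside the ellipsoid."""
    return sum(v * v for v in to_unit_sphere(point, center, radii, angle_z))


# Ellipsoid with radii (2, 1, 1) rotated 90 degrees: the long axis now points along y.
on_surface = radius_squared((0.0, 2.0, 0.0), (0.0, 0.0, 0.0), (2.0, 1.0, 1.0), math.pi / 2)
outside = radius_squared((2.0, 0.0, 0.0), (0.0, 0.0, 0.0), (2.0, 1.0, 1.0), math.pi / 2)
```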

31
Thanks to
  • NSF CAREER grant
  • DOE
  • More to come on this project at
    http://fytos.net/splatting

35
Splatting irregular volumes on GPU
  • Splatting algorithm inefficiencies are the same
    as with regular splatting
  • Cannot use the RGBA-channels-as-separate-slices
    approach
  • In irregular splatting we need to keep a separate
    weight buffer during the splatting phase and then
    use it for normalization.
  • We use the depth buffer and early-z culling
  • All temp buffers are surfaces of the same
    p-buffer, in order to share the same depth buffer.
  • Empty-space skipping and
  • Early splat elimination
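
The weight-buffer normalization mentioned above can be illustrated for a single slice. This sketch assumes each splat contributes a (pixel, kernel weight, density) triple and that normalization is a per-pixel division; the names and the small `eps` guard are hypothetical.

```python
def splat_with_weights(splats, num_pixels):
    """Accumulate weighted density and the kernel weights in two separate buffers."""
    value = [0.0] * num_pixels
    weight = [0.0] * num_pixels
    for pixel, kernel_weight, density in splats:
        value[pixel] += kernel_weight * density
        weight[pixel] += kernel_weight
    return value, weight


def normalize(value, weight, eps=1e-6):
    """Divide by the accumulated weight; untouched pixels stay at zero."""
    return [v / w if w > eps else 0.0 for v, w in zip(value, weight)]


value, weight = splat_with_weights([(0, 0.5, 2.0), (0, 0.25, 4.0), (1, 0.5, 1.0)], 3)
normalized = normalize(value, weight)
```

The extra weight buffer occupies storage that would otherwise hold another slice, which helps explain why the RGBA packing does not carry over to the irregular case.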

36
Splatting irregular volumes on GPU
  • Volume data is first organized into a flat cubic
    structure
  • It provides a list of intersecting splats for
    every cell.
  • Cells are accessed front to back in a pre-defined
    order, the same way as rendering a fixed octree.
  • Render the associated vertex array of each cell
  • At the beginning of the frame, all splat slicing
    parameters are computed. The result is:
  • An initial slicing polygon and advancing vectors
  • When a splat is processed, all vertices are
    transformed according to the current step and
    advancing vectors.
  • The polygon is then texture-mapped to the right
    kernel.
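
The per-vertex transform described above amounts to moving each polygon vertex along its own advancing vector. The sketch assumes a linear update, vertex = initial + step × advance, which is one plausible reading of the slide rather than a confirmed detail of the implementation.

```python
def slice_polygon(initial_vertices, advancing_vectors, step):
    """Move every vertex of the initial slicing polygon by step * its advancing vector."""
    return [
        tuple(v + step * a for v, a in zip(vertex, advance))
        for vertex, advance in zip(initial_vertices, advancing_vectors)
    ]


initial = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
advance = [(0.0, 0.0, 0.5), (0.5, 0.0, 0.5)]
moved = slice_polygon(initial, advance, 2)
```

Since only `step` changes between slices, the per-splat data can be computed once per frame and reused for every slice.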

37
Splatting irregular volumes on GPU
  • Problem: several splats will appear in different
    cells.
  • How do you keep track of the ones that are
    already being sliced?
  • A CPU would just hold a status array
  • The GPU is notorious for not having global
    variables
  • Solution: make the current cell a uniform
    variable, and every vertex will know whether it
    is its turn to draw by comparing against its
    pre-computed starting-cell variable.
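
The starting-cell comparison can be emulated as follows. This is one interpretation of the slide, with hypothetical names: each splat carries the index of the first cell that lists it, and the per-draw "uniform" is the index of the cell currently being rendered.

```python
def should_draw(current_cell, starting_cell):
    """Vertex-side test: draw only when the uniform matches the splat's own start cell."""
    return current_cell == starting_cell


def render_cells(cell_splat_lists, starting_cell):
    """Visit cells front to back; a splat listed in several cells is started only once."""
    started = []
    for current_cell, splat_ids in enumerate(cell_splat_lists):
        for splat in splat_ids:
            if should_draw(current_cell, starting_cell[splat]):
                started.append(splat)
    return started


cells = [["a", "b"], ["b", "c"], ["c"]]
starts = {"a": 0, "b": 0, "c": 1}
order = render_cells(cells, starts)
```

No shared mutable state is needed, which is the point of the approach: the decision depends only on a uniform and a per-vertex constant.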