Title: A Hierarchical Shadow Volume Algorithm
1. A Hierarchical Shadow Volume Algorithm
- Timo Aila (1, 2)
- Tomas Akenine-Möller (3)
(1) Helsinki University of Technology, (2) Hybrid Graphics, (3) Lund University
2. Outline
- Brief intro to shadow volumes
  - fillrate problem, existing solutions
- Our solution
  - idea
  - implementation
- Results
- Q&A
3. Shadow volumes [Crow77]
- Shadow volumes define closed volumes of space that are in shadow
[Figure labels: infinitesimal light source, shadow caster, light cap, dark cap, extruded side quads]
4. Is point inside shadow volume?
- Pick reference point R outside shadow volume
  - any such point is OK
- Span line from R to point to be classified
- Compute sum of enter (+1) and exit (-1) events
  - a nonzero sum means the point is inside a shadow volume
[Figure: 2D illustration of a shadow volume with reference point R and sample points P1, P2, P3]
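The counting rule above can be sketched in 2D. This is a minimal illustration, not the authors' code: the shadow volume is assumed to be a closed polygon with counter-clockwise vertices, and the names `signed_crossings` and `in_shadow` are hypothetical. Crossings that touch a vertex exactly or run parallel to an edge are not handled robustly.

```python
def signed_crossings(r, p, polygon):
    """Sum enter (+1) and exit (-1) events along the segment r -> p.

    polygon: list of (x, y) vertices in counter-clockwise order,
    standing in for a closed 2D shadow volume (an assumption).
    """
    total = 0
    dx, dy = p[0] - r[0], p[1] - r[1]
    n = len(polygon)
    for i in range(n):
        a, b = polygon[i], polygon[(i + 1) % n]
        ex, ey = b[0] - a[0], b[1] - a[1]
        denom = dx * ey - dy * ex  # 2D cross product of segment and edge
        if denom == 0:
            continue  # segment parallel to this edge: no crossing counted
        # Solve r + t * d = a + u * e for t (along segment) and u (along edge).
        t = ((a[0] - r[0]) * ey - (a[1] - r[1]) * ex) / denom
        u = ((a[0] - r[0]) * dy - (a[1] - r[1]) * dx) / denom
        if 0 < t < 1 and 0 <= u < 1:
            # For counter-clockwise winding, denom < 0 means the segment
            # moves against the outward normal, i.e. it enters the volume.
            total += 1 if denom < 0 else -1
    return total


def in_shadow(r, p, polygon):
    """A point is in shadow when the signed sum is nonzero."""
    return signed_crossings(r, p, polygon) != 0
```

With a square volume and R to its left, a point inside the square gives a sum of +1, while points in front of or beyond it give 0.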
5. Using graphics hardware
- R at ∞ behind pixel (z-fail) [Bilodeau and Songy, Carmack]
  - infinity is always outside SVs, so the choice is robust
  - must not clip to far plane of view frustum
- Sum hidden events to stencil buffer, sign from backface culling
[Figure: 2D illustration showing the camera, view frustum, shadow volume, reference points R, and visible samples (or pixels)]
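A per-pixel sketch of the z-fail counting described above, under stated assumptions: larger depth means farther from the camera, and each shadow volume fragment arrives as a `(depth, is_front_facing)` pair. The function name `zfail_stencil` is hypothetical. Only fragments hidden behind the visible surface are counted, with the sign taken from the facing.

```python
def zfail_stencil(visible_depth, sv_fragments):
    """Return the stencil value for one pixel using z-fail counting.

    sv_fragments: iterable of (depth, is_front_facing) pairs for the
    shadow volume polygons covering this pixel (assumed layout).
    """
    stencil = 0
    for depth, front_facing in sv_fragments:
        if depth > visible_depth:  # depth test fails: fragment is hidden
            # Walking from infinity toward the pixel, a back face is an
            # enter event (+1) and a front face is an exit event (-1).
            stencil += -1 if front_facing else 1
    return stencil
```

A nonzero result marks the pixel as shadowed. For a volume spanning depths 3 to 8, a surface at depth 5 yields 1, while surfaces at depth 2 or 10 yield 0.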
6. Amount of pixel processing
[Figure adapted from Chan and Durand 2004]
7. Fillrate problem
- 50 fps without shadows on ATI Radeon 9800XT at 1280x1024, 1 sample/pixel
- 1 fps when shadow volumes rasterized
  - 2.2 billion pixels per frame
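To put the figures above in perspective, the following back-of-the-envelope calculation divides the quoted shadow volume pixel count by the screen resolution. The variable names are illustrative, and the result is only an average over the screen.

```python
# Figures quoted on the slide.
pixels_per_screen = 1280 * 1024   # 1 sample per pixel
sv_pixels_per_frame = 2.2e9       # shadow volume pixels rasterized per frame

# Average number of shadow volume fragments processed per screen pixel.
average_overdraw = sv_pixels_per_frame / pixels_per_screen
print(f"{pixels_per_screen} screen pixels, "
      f"about {average_overdraw:.0f} shadow volume fragments per pixel")
```

On average, each screen pixel is touched by well over a thousand shadow volume fragments per frame.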
8. Existing solutions (1/2)
- CC shadow volumes [Lloyd et al. 2004]
  - draw SVs only where receivers exist
  - good when lots of empty space
- Hybrid shadow maps and volumes [Chan and Durand 2004]
  - use SVs only at shadow boundaries
  - boundary pixels determined using shadow map
  - artifacts due to limited shadow map resolution
9. Existing solutions (2/2)
- Depth bounds [Nvidia 2003]
  - application supplies min and max depth values separately for each shadow volume
  - rasterize shadow volume only when visible geometry between [min, max]
  - optimal bounds hard to compute
[Figure: 2D illustration showing the camera, shadow volume, and visible pixels]
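The depth bounds idea can be sketched as a filter on the depth already stored for each covered pixel. This is a simplified software illustration, not the hardware path: `depth_buffer` is assumed to be a dict from pixel coordinates to visible depth, and `pixels_to_rasterize` is a hypothetical name.

```python
def pixels_to_rasterize(covered_pixels, depth_buffer, zmin, zmax):
    """Keep only pixels whose visible depth lies inside [zmin, zmax].

    covered_pixels: pixel coordinates covered by the shadow volume.
    depth_buffer: dict mapping pixel coordinates to the visible depth.
    zmin, zmax: application-supplied bounds for this shadow volume.
    """
    return [p for p in covered_pixels if zmin <= depth_buffer[p] <= zmax]
```

Pixels whose visible geometry lies outside the bounds cannot be affected by this shadow volume, so they are skipped. The savings depend on how tight the supplied bounds are.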
10. Outline
- Brief intro to shadow volumes
  - fillrate problem, existing solutions
- Our solution
  - idea
  - implementation
- Results
- Q&A
11. Reference image
12. Shadow volume algorithm executed once per 8x8 pixel tile
13. Green tiles may contain shadow boundary; other tiles were correct
14. Low-res (gray) and per-pixel computed boundaries (dark)
15. How to detect shadow boundaries?
- Two facts about shadow volumes
  - always closed
  - SV triangles mark potential shadow boundaries
- If a 3D volume in the scene is not intersected by shadow volume triangles, it is fully lit or fully in shadow
  - single sample classifies entire volume
16. Outline
- Brief intro to shadow volumes
  - fillrate problem, existing solutions
- Our solution
  - idea
  - implementation
- Results
- Q&A
17. Detecting boundary tiles
- Bound tile with axis-aligned bounding box
  - 8x8 pixel region
  - Zmin, Zmax
- Triangle vs. AA Box intersection test
  - low-resolution rasterization
  - Zmin and Zmax tests
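The tile test above can be approximated conservatively in software. The sketch below is an assumption-laden stand-in, not the slide's low-resolution rasterizer: it compares the triangle's screen-space bounding rectangle and depth range against the tile's box, so it may flag extra tiles but never misses one that the triangle actually intersects. The name `may_be_boundary_tile` and the parameter layout are hypothetical.

```python
TILE_SIZE = 8  # 8x8 pixel tiles, as on the slide


def may_be_boundary_tile(tile_x, tile_y, zmin, zmax, triangle):
    """Conservative shadow volume triangle vs. tile box overlap test.

    tile_x, tile_y: tile indices; zmin, zmax: depth range of the visible
    geometry in the tile; triangle: three (x, y, z) screen-space vertices.
    """
    xs, ys, zs = zip(*triangle)
    x0, y0 = tile_x * TILE_SIZE, tile_y * TILE_SIZE
    # Stand-in for low-resolution rasterization: bounding rectangle overlap.
    overlaps_xy = (min(xs) <= x0 + TILE_SIZE and max(xs) >= x0 and
                   min(ys) <= y0 + TILE_SIZE and max(ys) >= y0)
    # Zmin and Zmax tests: the triangle's depth range must reach the box.
    overlaps_z = min(zs) <= zmax and max(zs) >= zmin
    return overlaps_xy and overlaps_z
```

A triangle that lies entirely in front of or behind a tile's [Zmin, Zmax] range fails the depth check, so that tile can be treated as non-boundary for this triangle.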
18. Fast update of non-boundary tiles
- Copy low-res shadows to stencil buffer
  - writing 64 per-pixel values would be slow
- Two-level stencil buffer saves the day
  - maintain Smin, Smax per tile
  - always test the higher level first
  - often no need to validate per-pixel values
  - stencil values of non-boundary tiles are constant
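One way to picture the two-level stencil buffer is the software model below. It is a sketch under assumptions, not the proposed hardware: the class name `TwoLevelStencil`, its methods, and the choice to keep per-pixel values only for non-constant tiles are all hypothetical.

```python
class TwoLevelStencil:
    """Per-tile (Smin, Smax) plus per-pixel values kept only where needed."""

    def __init__(self, tiles_x, tiles_y, pixels_per_tile=64):
        self.pixels_per_tile = pixels_per_tile
        self.smin = [[0] * tiles_x for _ in range(tiles_y)]
        self.smax = [[0] * tiles_x for _ in range(tiles_y)]
        self.per_pixel = {}  # (tx, ty) -> values, for non-constant tiles only

    def add_to_tile(self, tx, ty, delta):
        """Non-boundary tile: every pixel changes by the same amount."""
        self.smin[ty][tx] += delta
        self.smax[ty][tx] += delta
        if (tx, ty) in self.per_pixel:
            # Sketch only: keep existing per-pixel data consistent.
            self.per_pixel[(tx, ty)] = [
                v + delta for v in self.per_pixel[(tx, ty)]]

    def add_per_pixel(self, tx, ty, deltas):
        """Boundary tile: apply one delta per pixel, then refresh Smin/Smax."""
        base = self.per_pixel.get(
            (tx, ty), [self.smin[ty][tx]] * self.pixels_per_tile)
        values = [b + d for b, d in zip(base, deltas)]
        self.smin[ty][tx], self.smax[ty][tx] = min(values), max(values)
        if self.smin[ty][tx] == self.smax[ty][tx]:
            self.per_pixel.pop((tx, ty), None)  # tile is constant again
        else:
            self.per_pixel[(tx, ty)] = values

    def tile_in_shadow(self, tx, ty):
        """True/False when the tile level decides; None if pixels are needed."""
        lo, hi = self.smin[ty][tx], self.smax[ty][tx]
        if lo > 0 or hi < 0:
            return True   # every pixel has a nonzero stencil value
        if lo == 0 and hi == 0:
            return False  # every pixel is lit
        return None       # mixed tile: fall back to per-pixel values
```

Updating a non-boundary tile touches two numbers instead of 64, and a shadow query reads per-pixel values only when `tile_in_shadow` returns `None`.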
19. Implementation Stage 1
[Diagram: SV triangles -> Low-resolution rasterizer -> Per-tile operations -> "Low-res shadows" and "Boundary?" buffers]
- Buffers built separately for each shadow volume
- Classifications ready when entire SV processed
  - application marks begin/end of shadow volumes
20. Implementation Stage 2
[Diagram: SV triangles -> Low-resolution rasterizer -> boundary tile? (looked up in the "Boundary?" buffer)]
- No: copy to 2-level stencil
- Yes: per-pixel rasterizer -> stencil ops -> update 2-level stencil
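The two stages can be summarized as a control-flow sketch for a single shadow volume. Everything here is illustrative: `render_shadow_volume` and the three callables passed to it (`is_boundary`, `low_res_delta`, `per_pixel_deltas`) are hypothetical placeholders for the per-tile operations, the low-resolution shadow sample, and the per-pixel rasterizer.

```python
def render_shadow_volume(tiles, is_boundary, low_res_delta, per_pixel_deltas):
    """Return the stencil update chosen for each tile of one shadow volume.

    tiles: hashable tile identifiers.
    is_boundary(tile): True if SV triangles may intersect the tile's box.
    low_res_delta(tile): single stencil delta valid for the whole tile.
    per_pixel_deltas(tile): list of per-pixel stencil deltas.
    """
    # Stage 1: classify every tile; results are final only once the
    # entire shadow volume has been processed.
    boundary = {t: is_boundary(t) for t in tiles}
    low_res = {t: low_res_delta(t) for t in tiles if not boundary[t]}

    # Stage 2: choose the cheap or the expensive path per tile.
    updates = {}
    for t in tiles:
        if boundary[t]:
            # Yes: per-pixel rasterization and stencil operations.
            updates[t] = ("per_pixel", per_pixel_deltas(t))
        else:
            # No: copy the low-res result to the tile-level stencil.
            updates[t] = ("tile", low_res[t])
    return updates
```

Only boundary tiles pay for per-pixel work; all other tiles receive a single tile-level update.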
21. Alternative implementations
- Two pass
  - Pass 1: Stage 1
  - Pass 2: Stage 2
  - How to keep pixel units busy during Stage 1?
    - maybe assign per-tile operations to pixel shaders?
- Single pass
  - Separate stages using delay stream [Aila et al. 2003]
  - Stage 2 of current SV executes simultaneously with next SV's Stage 1
22. Hardware resources
- Two-level stencil buffer
- Per-tile operations
- Optionally
  - delay stream
  - duplicate low-res rasterizer and Zmin/Zmax units
  - cache for per shadow volume buffers
    - multiple buffers for pipelined operation
    - allocate from external memory
- If not already there for occlusion culling purposes
23. Outline
- Brief intro to shadow volumes
  - fillrate problem, existing solutions
- Our solution
  - idea
  - implementation
- Results
- Q&A
24. Results: Simple scene (1280x1024)
25. Results: Knights (1280x1024)
26. Results: Powerplant (1280x1024)
27. Summary
- Hierarchical rendering method for shadow volumes
  - significant fillrate savings compared to other hardware methods
  - also works for soft shadow volumes
- Future work
  - would it make sense to extend programmability to per-tile operations?
  - how many pipeline bubbles are created?
    - requires chip-level simulations
28. Thank you!
- Questions?
- Acknowledgements
  - Ville Miettinen, Jacob Ström, Eric Haines, Ulf Assarsson, Lauri Savioja, Jonas Svensson, Ulf Borgenstam, Karl Schultz, 3DR group at Helsinki University of Technology
  - The National Technology Agency of Finland, Hybrid Graphics, Bitboys, Nokia and Remedy Entertainment
  - ATI for granting fellowship to Timo (2004-2005)