Title: Hierarchical Penumbra Casting
1 Hierarchical Penumbra Casting
- Samuli Laine
- Timo Aila
- Helsinki University of Technology
- Hybrid Graphics, Ltd.
2 Introduction
- Rendering soft shadows
- As usual, area light sources are sampled with a number of light samples
- Multiple receiver points to be shaded
- The main problem is solving the visibility
- Which light samples are visible to which receiver points?
3 What's Happening?
[Figure: light source, shadow caster, visible surface]
4 On the Scale of the Problem
- With R receiver points and L light samples there are R×L visibility relations to solve
- For example, 1024×768 resolution and 256 light samples gives over 200 million relations
- Ray casting is the usual solution for solving the visibility relations
- With T triangles, the cost of casting one shadow ray is O(log T)
- Total cost becomes O(R×L log T)
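The relation count in the example can be checked directly. This is a small sketch that assumes one receiver point per pixel; the variable names are illustrative only.

```python
# Receiver points: one per pixel at 1024x768 (assumption: one receiver per pixel).
R = 1024 * 768
# Light samples taken on the area light source.
L = 256
# Every (receiver point, light sample) pair is one visibility relation.
relations = R * L
print(f"{R} receiver points x {L} light samples = {relations} relations")
```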
5 About Ray Casting
- The standard ray casting approach considers only one ray at a time
- This inevitably leads to linear performance with respect to R×L
- However, this is highly flexible
- We need to generate only one ray at a time
- Sub-linear complexity with respect to T is achieved by placing the triangles into an acceleration hierarchy
6 Transposing the Algorithm
- Goal: sub-linear complexity w.r.t. R×L
- Requires rearranging the rendering loop

Ray Tracing
  for each receiver point r
    for each light sample l
      find triangle that blocks ray l → r
  Linear in R, linear in L, sub-linear in T

Our Approach
  for each triangle T
    find all l → r rays blocked by T
  Linear in T, sub-linear in R×L
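A minimal sketch of the two loop orders, written in Python for a 2D "flatland" simplification that is not from the slides: light samples sit at height H, receiver points lie on the ground line y = 0, and each blocker is a horizontal segment (x0, x1, h) standing in for a triangle. Both functions are brute force, so neither shows the sub-linear behavior; the only point is that swapping the loops yields the same visibility relations. All names are hypothetical.

```python
H = 10.0  # height of the light samples above the ground (assumed setup)

def blocks(blocker, lx, rx):
    """True if the ray from light sample (lx, H) to receiver (rx, 0) hits the blocker."""
    x0, x1, h = blocker
    x_at_h = rx + (lx - rx) * h / H  # x coordinate where the ray crosses height h
    return x0 <= x_at_h <= x1

def ray_tracing_order(receivers, lights, blockers):
    occluded = set()
    for r, rx in enumerate(receivers):        # linear in R
        for l, lx in enumerate(lights):       # linear in L
            # A real ray tracer would query an acceleration hierarchy here.
            if any(blocks(b, lx, rx) for b in blockers):
                occluded.add((r, l))
    return occluded

def transposed_order(receivers, lights, blockers):
    occluded = set()
    for b in blockers:                        # linear in T
        # The method in the slides finds these rays hierarchically;
        # this sketch simply enumerates them.
        for r, rx in enumerate(receivers):
            for l, lx in enumerate(lights):
                if blocks(b, lx, rx):
                    occluded.add((r, l))
    return occluded

receivers = [x * 0.5 for x in range(-10, 11)]
lights = [-1.0, -0.5, 0.0, 0.5, 1.0]
blockers = [(-1.0, 0.0, 5.0), (2.0, 2.5, 3.0)]
by_ray = ray_tracing_order(receivers, lights, blockers)
by_triangle = transposed_order(receivers, lights, blockers)
print(by_ray == by_triangle)
```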
7 About Our Algorithm
- Sub-linear complexity with respect to R×L is achieved by placing the receiver points and light samples into acceleration hierarchies
- Therefore, all receiver points must be gathered before computing the shadows
- We process one triangle at a time
- Good: no need for triangle BSP
- Bad: linear complexity with respect to T
8 About Our Algorithm, part 2
- The full rendering process goes as follows:
- 1. Ray-trace or rasterize the image without shadows to get the receiver points
- 2. Build the acceleration structures for receiver points and light samples
- 3. Process all triangles to solve the visibility relations between light samples and receiver points
- 4. Perform shading
9 The Acceleration Structures
- Fixed three-level bounding volume hierarchy is used for the light samples
- Assuming a polygonal light source, bounding volumes are actually bounding polygons
- Standard bounding volume hierarchy is used for the receiver points
- Axis-aligned boxes as bounding volumes
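One common way to build a bounding volume hierarchy with axis-aligned boxes is a median split along the longest box axis. The sketch below assumes that strategy and a leaf size of 4; the slides do not say how the receiver hierarchy is constructed, and the class and function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Point = Tuple[float, float, float]

@dataclass
class Node:
    lo: Point                                        # minimum corner of the box
    hi: Point                                        # maximum corner of the box
    points: List[int] = field(default_factory=list)  # receiver indices (leaves only)
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def build(points: List[Point], ids: List[int], leaf_size: int = 4) -> Node:
    """Build a hierarchy over the receiver points whose indices are in ids."""
    lo = tuple(min(points[i][a] for i in ids) for a in range(3))
    hi = tuple(max(points[i][a] for i in ids) for a in range(3))
    if len(ids) <= leaf_size:
        return Node(lo, hi, points=list(ids))
    axis = max(range(3), key=lambda a: hi[a] - lo[a])   # longest box axis
    ordered = sorted(ids, key=lambda i: points[i][axis])
    mid = len(ordered) // 2                             # median split
    return Node(lo, hi,
                left=build(points, ordered[:mid], leaf_size),
                right=build(points, ordered[mid:], leaf_size))

# Example: receiver points on a small planar grid.
receivers = [(float(x), float(y), 0.0) for x in range(8) for y in range(4)]
root = build(receivers, list(range(len(receivers))))
```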
10 Light Sample Hierarchy
- Three levels
- All nodes have a bounding volume
- Root node: entire light source
- Middle nodes: light sample groups
- Leaf nodes: light samples
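A minimal sketch of a fixed three-level hierarchy, assuming a rectangular light described in its own 2D coordinates and a regular 4×4 grid grouping. The grouping rule and the group count are assumptions, not taken from the slides, and the bounding polygons are simplified here to axis-aligned rectangles in the light's plane.

```python
import random

def bounds(samples):
    """Bounding rectangle (umin, vmin, umax, vmax) of 2D sample positions."""
    us = [u for u, v in samples]
    vs = [v for u, v in samples]
    return (min(us), min(vs), max(us), max(vs))

def build_light_hierarchy(samples, grid=4):
    """Root -> light sample groups -> light samples; samples lie in [0, 1) x [0, 1)."""
    cells = {}
    for i, (u, v) in enumerate(samples):
        key = (min(int(u * grid), grid - 1), min(int(v * grid), grid - 1))
        cells.setdefault(key, []).append(i)
    groups = [{"samples": ids, "bounds": bounds([samples[i] for i in ids])}
              for ids in cells.values()]
    return {"bounds": bounds(samples), "groups": groups}

rng = random.Random(1)
samples = [(rng.random(), rng.random()) for _ in range(256)]
light = build_light_hierarchy(samples)
print(len(light["groups"]), "groups under one root")
```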
11 Storing the Visibility Information
- A bit mask with L bits is assigned for every receiver point
- Bit value 0: light sample is visible
- Bit value 1: light sample is occluded
- Initially, all bits are zero
- When a triangle is found to occlude a light sample from a receiver point, the corresponding bit is set to one
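The bit mask can be sketched with a Python integer, as below. The class and method names are hypothetical, and the visible-fraction helper assumes equally weighted light samples, which the slides do not state.

```python
class VisibilityMask:
    """One bit per light sample: 0 = visible, 1 = occluded."""

    def __init__(self, num_light_samples):
        self.num_light_samples = num_light_samples
        self.bits = 0                     # initially every light sample is visible

    def mark_occluded(self, sample):
        self.bits |= 1 << sample          # setting an already-set bit is harmless

    def is_occluded(self, sample):
        return (self.bits >> sample) & 1 == 1

    def visible_fraction(self):
        occluded = bin(self.bits).count("1")
        return 1.0 - occluded / self.num_light_samples

mask = VisibilityMask(8)
for sample in (0, 3, 3):                  # two triangles may block the same sample
    mask.mark_occluded(sample)
print(mask.visible_fraction())            # 0.75
```

Because bits are only ever set, triangles can be processed in any order and the final mask is the same.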
12 Penumbra Volumes
- All points where a triangle may block a ray from a bounding volume are inside the corresponding penumbra volume
[Figure: triangle, penumbra volume, bounding volume in light hierarchy]
13 Processing a Triangle
- First build penumbra volumes for all nodes in the light sample hierarchy
- For individual light samples (leaf nodes) these become hard shadow volumes
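For a single light sample, the hard shadow volume of a triangle can be described by four planes: three through the light sample and one triangle edge each, plus the triangle's own plane. The sketch below assumes that plane representation and a light sample that does not lie in the triangle's plane; the function names are hypothetical and the slides do not specify the representation.

```python
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def neg(a):
    return (-a[0], -a[1], -a[2])

def hard_shadow_volume(light, tri):
    """Planes as (normal, point); p is inside if dot(normal, p - point) >= 0 for all."""
    a, b, c = tri
    planes = []
    # Side planes: each contains the light sample and one triangle edge,
    # oriented so that the opposite vertex is on the inside.
    for v0, v1, opposite in ((a, b, c), (b, c, a), (c, a, b)):
        n = cross(sub(v0, light), sub(v1, light))
        if dot(n, sub(opposite, light)) < 0:
            n = neg(n)
        planes.append((n, light))
    # Cap plane: the triangle's plane, oriented away from the light sample.
    n = cross(sub(b, a), sub(c, a))
    if dot(n, sub(light, a)) > 0:
        n = neg(n)
    planes.append((n, a))
    return planes

def in_volume(planes, p):
    return all(dot(n, sub(p, q)) >= 0 for n, q in planes)

light = (0.0, 0.0, 10.0)
tri = ((-1.0, -1.0, 5.0), (1.0, -1.0, 5.0), (0.0, 1.0, 5.0))
volume = hard_shadow_volume(light, tri)
print(in_volume(volume, (0.0, 0.0, 0.0)), in_volume(volume, (5.0, 0.0, 0.0)))
```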
14 Processing a Triangle
- Traverse down the receiver point hierarchy
- Step 1: Test intersection between main penumbra volume and bounding volume of receiver node
[Figure: triangle, main penumbra volume, bounding volume of entire light source, receiver node]
15 Processing a Triangle
- Step 2: Update the list of active light sample groups
- At beginning of traversal, all groups are active
[Figure: triangle, bounding volumes of light sample groups, receiver node]
16 Processing a Triangle
- Step 3: Recurse into child nodes in receiver hierarchy
- With pruned list of active light sample groups
[Figure: triangle, main penumbra volume, bounding volume of entire light source, child nodes]
17 Processing a Triangle
- Step 4: In leaf node, test receiver points vs. hard shadow volumes of light samples
- Update the visibility relation bits
[Figure: triangle, light samples in active groups, receiver points]
18 Summary of Recursion
- Traverse down the receiver point hierarchy
- Maintain list of active light sample groups
- Initially all groups are active
- First ensure that receiver node intersects the main penumbra volume, terminate otherwise
- Then prune the active light sample group list by intersecting receiver node vs. penumbra volumes of active light sample groups
- In leaf node, test receiver points against hard shadow volumes of remaining light samples
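The recursion can be sketched in a 2D "flatland" simplification that is not from the slides: the light is a horizontal segment at height H, receiver points lie on the ground line y = 0, and each blocker is a horizontal segment (x0, x1, h) with 0 < h < H. In this setting every penumbra volume reduces to an interval on the ground, computed analytically from the bounds of the light node. All names are hypothetical, and none of the optimizations are included.

```python
H = 10.0  # height of the light segment above the ground (assumed setup)

def shadow_interval(lx_lo, lx_hi, blocker):
    """Ground interval a blocker can shadow for light positions in [lx_lo, lx_hi].
    With lx_lo == lx_hi this is the hard shadow of a single light sample."""
    x0, x1, h = blocker
    k = H / (H - h)
    return (x0 * k - lx_hi * (k - 1), x1 * k - lx_lo * (k - 1))

def overlaps(a, b):
    return a[0] <= b[1] and b[0] <= a[1]

def build(xs, ids, leaf_size=2):
    """Receiver hierarchy over sorted positions: leaf = (bounds, ids), inner = (bounds, left, right)."""
    node_bounds = (xs[ids[0]], xs[ids[-1]])
    if len(ids) <= leaf_size:
        return (node_bounds, ids)
    mid = len(ids) // 2
    return (node_bounds, build(xs, ids[:mid], leaf_size), build(xs, ids[mid:], leaf_size))

def process_blocker(root, xs, lights, groups, blocker, masks):
    # Build the volumes once per blocker: main, per group, per light sample.
    main = shadow_interval(lights[0], lights[-1], blocker)
    group_vol = [shadow_interval(lights[g[0]], lights[g[-1]], blocker) for g in groups]
    hard = [shadow_interval(lx, lx, blocker) for lx in lights]

    def recurse(node, active):
        node_bounds = node[0]
        if not overlaps(node_bounds, main):              # step 1: main penumbra test
            return
        active = [g for g in active                      # step 2: prune active groups
                  if overlaps(node_bounds, group_vol[g])]
        if not active:
            return
        if len(node) == 3:                               # step 3: recurse into children
            recurse(node[1], active)
            recurse(node[2], active)
            return
        for r in node[1]:                                # step 4: leaf, per-sample tests
            for g in active:
                for s in groups[g]:
                    lo, hi = hard[s]
                    if lo <= xs[r] <= hi:
                        masks[r] |= 1 << s               # mark sample s occluded for r

    recurse(root, list(range(len(groups))))

xs = [i * 0.25 - 8.0 for i in range(65)]                 # sorted receiver positions
lights = [i * 0.125 - 1.0 for i in range(16)]            # sorted light sample positions
groups = [list(range(g, g + 4)) for g in range(0, 16, 4)]
blockers = [(-1.0, 0.5, 5.0), (2.0, 2.5, 7.5)]
root = build(xs, list(range(len(xs))))
masks = [0] * len(xs)
for b in blockers:
    process_blocker(root, xs, lights, groups, b, masks)
print(sum(bin(m).count("1") for m in masks), "occluded relations")
```

The pruning is conservative because each group interval is computed from the group's bounds and therefore contains the hard shadow interval of every sample in the group.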
19 Optimizations
- Umbra bits for early traversal termination
- With receiver hierarchy rebuilding to ensure balance
- Active plane sets
- Lazy penumbra volume and hard shadow volume construction
- On-demand bit mask allocation
- Coarse blocker sorting
20 Extensions
- Multiple light sample sets
- To remove banding artifacts
- Alpha matte textures
- Often used in e.g. vegetation textures
- Adaptive antialiasing
- Volumetric light sources
21 Results
- Compared against Mental Ray 3.2
- Benchmarked only the solving of the visibility relations
- For Mental Ray, computed both with and without shadows and took the difference
- More detailed results in the paper
22 Grids
32K triangles, 256 light samples
Resolution   Peak mem usage   Speedup factor
1280×960     58M              13.5
2560×1920    228M             16.7
23 Flowers
903K triangles, 256 light samples
Resolution   Peak mem usage   Speedup factor
1280×960     39M              3.5
2560×1920    154M             7.8
24 Sponza
1.27M triangles, 256 light samples
Resolution   Peak mem usage   Speedup factor
1280×960     62M              8.2
2560×1920    244M             11.4
25 Results: Analysis
- Sub-linearity with respect to R
- Increasing output resolution gives better relative performance
- Due to hierarchical processing of receiver points
- Sub-linearity with respect to L
- Using more light samples gives better relative performance (results in the paper)
- Due to using analytic penumbra volumes that represent many light samples at once
26 Results: More Analysis
- Somewhat high memory usage
- Depends on the output resolution
- Depends on the complexity of the shadows
- Does not depend on the number of triangles in the scene
- New problem: dependence on the spatial size of light source
- Penumbra volumes become larger
- Leads to lower performance
27 Conclusions
- Nice properties
- Exactly the same result as with ray casting
- No need to store all triangles at any point
- Sub-linear dependence on output resolution and number of light samples
- Not so nice properties
- Linear dependence on triangle count
- Memory usage can be high
- Dependence on the spatial size of light source
28 Future Work
- Process multiple triangles at a time?
- Could experiment with full light sample hierarchy, which should (in theory) have better performance
29 Thank You
Funding: National Technology Agency of Finland, Bitboys, Hybrid Graphics, Remedy Entertainment, Nokia, ATI