Title: Shadow Volumes on Programmable Graphics Hardware
1Shadow Volumes on Programmable Graphics Hardware
- Stefan Brabec
- Hans-Peter Seidel
- MPI Informatik
2Overview
- Shadow Volumes
- Silhouette Generation in Hardware
- Motivation
- Our method
- Results
- Conclusion & Future Work
3Shadow Volumes
- [Crow77] "Shadow algorithms for computer graphics"
- Compute regions of shadow in 3D
- Object-space algorithm
- Per-pixel correct shadow information
- Cast shadows onto arbitrary receiver geometry
4Shadow Volumes
- Extend occluder polygons to form semi-infinite volumes
- Light source is center-of-projection
- Everything behind occluder is in shadow
[Figure: light, occluder, lit region, and shadow region behind the occluder]
5Shadow Volume Generation
- Trivial way
- One volume for each polygon
- Better
- Silhouette-based approach
- Need to check each edge with each light !
6Shadow Volume Generation
- An edge is a silhouette edge if it is an edge
shared by a front-facing and back-facing
triangle/polygon
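[Figure: two edge configurations lit by a light source, with triangle normals n1 and n2. Where a front-facing (FF) and a back-facing (BF) triangle meet: silhouette edge (v0,v1). Where two front-facing (FF) triangles meet: no silhouette edge.]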
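The front-facing/back-facing test above can be sketched on the CPU as follows. This is a minimal illustration, assuming a point light and precomputed face normals; the function names are hypothetical and not taken from the slides.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def faces_light(normal, point_on_face, light_pos):
    """True if a face with this normal is front-facing (FF) to a point light."""
    to_light = tuple(l - p for l, p in zip(light_pos, point_on_face))
    return dot(normal, to_light) > 0.0

def is_silhouette_edge(n1, n2, edge_point, light_pos):
    # Silhouette edge: exactly one of the two adjacent faces is front-facing.
    return (faces_light(n1, edge_point, light_pos)
            != faces_light(n2, edge_point, light_pos))
```

Any point on the shared edge can serve as `edge_point`, since it lies on both faces.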
7Shadow Volume Rendering
- Stencil-based shadow volumes [Heidmann 91] "Real shadows real time"
- Count in-out events using the stencil buffer
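The in-out counting can be illustrated for a single view ray. The sketch below assumes the depth-pass variant (count only shadow volume faces in front of the visible surface) and an eye position outside all shadow volumes; the data layout is a hypothetical stand-in for what the stencil buffer does per pixel.

```python
def stencil_count(crossings, surface_depth):
    """crossings: (depth, facing) pairs for shadow volume faces along one ray,
    where facing is "front" (ray enters a volume) or "back" (ray leaves it)."""
    count = 0
    for depth, facing in crossings:
        if depth < surface_depth:  # depth test passes: face lies before the surface
            count += 1 if facing == "front" else -1
    return count

def in_shadow(crossings, surface_depth):
    # Non-zero count: the ray entered more volumes than it left.
    return stencil_count(crossings, surface_depth) != 0
```

For example, with one volume entered at depth 2 and left at depth 6, a surface at depth 4 is in shadow while surfaces at depth 1 or 8 are lit.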
8Shadow Volume Rendering
- Number of problems
- near/far clipping plane
- fill rate
- non-closed volumes
- See work from Everitt/Kilgard !
9Motivation
- From some OpenGL discussion forum
10Motivation
- Silhouette detection is not trivial for complex
vertex shaders
11Our method
- Use graphics hardware to compute silhouettes
- State-of-the-art hardware allows computation in object space
- Floating point calculations
- Floating point textures
- Powerful, programmable vertex and fragment processing units
12Input Data
- Supported meshes
- Closed objects (no open edges)
- Two triangles meet at one edge
Example: This is not a closed mesh, but since we are only focusing on one edge of this mesh, it's OK.
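The input requirement (closed mesh, two triangles per edge) can be checked in a pre-processing step. A minimal sketch, assuming triangles are given as triples of vertex indices; the helper names are hypothetical.

```python
from collections import Counter

def edge_use_counts(triangles):
    """Count how many triangles use each undirected edge."""
    counts = Counter()
    for a, b, c in triangles:
        for u, v in ((a, b), (b, c), (c, a)):
            counts[(min(u, v), max(u, v))] += 1
    return counts

def is_closed(triangles):
    # Supported meshes: every edge is shared by exactly two triangles.
    return all(n == 2 for n in edge_use_counts(triangles).values())
```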
13Coordinate Transformation
- First step
- Need to transform all geometry to a global coordinate system
- Also need all light sources in this system
- Common choices
- World space
- View-independent
- Eye space
- View-dependent; OK for fully dynamic scenes with moving viewer
- Object space
- Need to transform light sources to this space (done by most CPU approaches)
14Transform to world/eye space
- Unique identifier per vertex
- Transform each vertex to world/eye space
- render mesh as points
- Store position at index slot
[Figure: mesh vertices P0 to P4, each stored at its index slot]
15Transform to world/eye space
- Vertex program for transform / index store
- Simple for standard modelview transformation
- More complex programs modified by simple source-to-source code transformation
- Eliminate code relevant for additional attributes (color, texcoords, etc.)
- Replace output position register by attribute register: result.position -> result.texcoord[0]
- Add code to move vertex index to output position: MOV result.position, vertex.texcoord[..]
!!ARBvp1.0
ATTRIB iVertexPos = vertex.position;
ATTRIB iVertexIdx = vertex.texcoord[0];
PARAM mv[4] = { state.matrix.modelview };
OUTPUT oPos = result.position;
OUTPUT oDumpPos = result.texcoord[0];
# Transform the vertex to eye coordinates.
DP4 oDumpPos.x, mv[0], iVertexPos;
DP4 oDumpPos.y, mv[1], iVertexPos;
DP4 oDumpPos.z, mv[2], iVertexPos;
DP4 oDumpPos.w, mv[3], iVertexPos;
# Vertex position is its index (x,y,0,1)
MOV oPos, iVertexIdx;
END
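For reference, the effect of this pass can be emulated on the CPU. The sketch below assumes a row-major 4x4 modelview matrix and a plain list standing in for the position texture; all names are hypothetical.

```python
def transform(m, v):
    """Apply a row-major 4x4 matrix to a homogeneous vertex (four dot products)."""
    return tuple(sum(m[r][c] * v[c] for c in range(4)) for r in range(4))

def position_pass(vertices, indices, modelview, slots):
    """Store each transformed vertex in the slot selected by its vertex index."""
    for v, idx in zip(vertices, indices):
        slots[idx] = transform(modelview, v)
    return slots
```

The vertex index, not the transformed position, decides where the result is written, which mirrors moving the index to the output position.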
16Transform to world/eye space
- Storing the result
- Need full floating point precision
- RGBA float textures (RGBA = x,y,z,w)
- One global texture to store all positions
- Index numbering is pre-processing step
- Multiple instances of the same object need separate index slots (use index offset)
- Use per-object culling (in light view) to reduce number of vertices
- Mapping indices to 2D (idx % width, idx / width) overcomes texture size limitation
- 1D: 2048 vertices
- 2D: over 4M vertices! (max 2D texture size 2048x2048)
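The index-to-texel mapping and the per-instance index offset can be written out as a small sketch. The function names are hypothetical; the width of 2048 is the maximum texture size quoted above.

```python
def index_to_texel(idx, width):
    """Map a linear vertex index to 2D texel coordinates (x, y)."""
    return idx % width, idx // width

def texel_to_index(x, y, width):
    """Inverse mapping, from texel coordinates back to the linear index."""
    return y * width + x

def instance_slot(instance_offset, local_idx):
    # Each instance of an object gets its own range of index slots.
    return instance_offset + local_idx
```

With a 2048x2048 texture this addresses 2048 * 2048 = 4,194,304 slots, compared with 2048 for a 1D layout.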
17Process edges
- Mesh connectivity is known
- Computed in pre-processing step
- Each edge has unique identifier (index)
- Two vertices for the edge
- Two vertices for adjacent triangles
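One possible way to derive this connectivity in a pre-processing step is sketched below. It assumes a closed mesh given as index triples and ignores the vertex-ordering flag mentioned later in the slides; the record layout is a hypothetical choice.

```python
def build_edge_records(triangles):
    """Return one record per edge: (edge_v0, edge_v1, opposite_a, opposite_b).
    The position of a record in the returned list serves as the edge index."""
    opposite = {}
    for a, b, c in triangles:
        # For each edge (u, v) of the triangle, w is the remaining vertex.
        for u, v, w in ((a, b, c), (b, c, a), (c, a, b)):
            opposite.setdefault((min(u, v), max(u, v)), []).append(w)
    records = []
    for (u, v), opp in sorted(opposite.items()):
        if len(opp) != 2:
            raise ValueError("edge (%d, %d) is not shared by two triangles" % (u, v))
        records.append((u, v, opp[0], opp[1]))
    return records
```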
18Process edges
- Render point for each edge
- Edge index defines output position
- Assign four vertex indices as attributes
glBegin(GL_POINTS)
glAttrib(P0_idx, P3_idx, P1_idx, P4_idx)
glVertex1f(E0_idx)
glEnd()
[Figure: edge E0_idx with the vertex indices P0_idx, P1_idx, P3_idx, P4_idx]
19Process Edges
- Vertex shader
- Not needed during this stage (pass-through all attributes)
- Fragment shader
- Used to compute silhouette flag
- Position of light sources as global parameters
- Use position texture of previous pass
20Process edges
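[Figure: fragment processing for edge E0_idx. The input data P0_idx, P3_idx, P1_idx, P4_idx drive 4 texture lookups for world space positions, loading P0, P3, P1, P4 into registers; the two normals below decide front or back facing.]
N1 = (P3 - P0) x (P1 - P0)
N2 = (P4 - P0) x (P3 - P0)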
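The per-edge computation can be sketched on the CPU using the two cross products from this slide. This is an illustration only, assuming a point light and positions already fetched from the position texture; the function names are hypothetical.

```python
def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def silhouette_flag(p0, p3, p1, p4, light_pos):
    """Edge (P0, P3) with opposite vertices P1 and P4."""
    n1 = cross(sub(p3, p0), sub(p1, p0))  # N1 = (P3 - P0) x (P1 - P0)
    n2 = cross(sub(p4, p0), sub(p3, p0))  # N2 = (P4 - P0) x (P3 - P0)
    to_light = sub(light_pos, p0)
    # Silhouette: one adjacent triangle faces the light, the other does not.
    return (dot(n1, to_light) > 0) != (dot(n2, to_light) > 0)
```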
21Silhouette Detection
- What we've got so far
- One texture holding all world/eye space positions (w*h vertices)
- One texture holding all silhouette flags for the edges (w*h edges)
- There's also an additional flag (bit) for the vertex ordering (please see paper for details)
- Next step
- Use this information to extrude and render shadow volumes
22Rendering Shadow Volumes
for all lights
  for all edges
    if (is silhouette edge for light i)
      get edge's vertices
      render extruded quad
23Rendering shadow volumes
- Vertex shader tex lookup
- Best solution, but not (yet) supported
- Read back textures
- Trivial solution, but not very fast
24Rendering shadow volumes
- Better keep all data on the card !
- Instead of storing edge flags, store all quad information
[Figure: quad texture entries for edge E0, labeled "silhouette" and "silhouette, extrude"]
25Rendering Shadow Volumes
- Quad texture
- Instead of rendering one point per edge, render 4 points (line, 2x2 point)
- XYZ-components used for world space position (edge vertices)
- W-component for silhouette/extrude flag
- Can be used as bitmask (number of lights)
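One way the W-component could carry flags for several lights is sketched below. The bit layout (bit 0 for extrude, one silhouette bit per light above it) is an assumption made for illustration; the slides do not specify the encoding.

```python
EXTRUDE_BIT = 1  # hypothetical layout: bit 0 = extrude, bit i + 1 = silhouette for light i

def pack_w(silhouette_per_light, extrude):
    """Pack the flags into one float, as stored in the W-component."""
    w = EXTRUDE_BIT if extrude else 0
    for i, flag in enumerate(silhouette_per_light):
        if flag:
            w |= 1 << (i + 1)
    return float(w)

def is_silhouette(w, light):
    return bool((int(w) >> (light + 1)) & 1)

def is_extruded(w):
    return bool(int(w) & EXTRUDE_BIT)
```

A 32-bit float represents integers exactly only up to 2^24, which bounds the number of lights such a mask can hold.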
26Rendering Shadow Volumes
- Quad texture used as vertex array
glBegin(GL_QUADS)
glArrayElement(0)
glArrayElement(1)
glArrayElement(2)
glArrayElement(3)
[Figure: vertex array feeding the vertex shader]
P_xyz = IN_xyz
E = IN_w
if (E & silhouette)
  if (E & extrude)
    out = P - L
  else
    out = P
else
  out = somewhere outside view
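The per-vertex decision above can be sketched as a plain function. The bit layout, the `OUTSIDE_VIEW` placeholder, and the use of w = 0 to mark a point at infinity are assumptions made for this illustration, not details confirmed by the slides.

```python
EXTRUDE_BIT = 1                            # hypothetical: bit 0 = extrude
OUTSIDE_VIEW = (1.0e6, 1.0e6, 1.0e6, 1.0)  # stand-in for "somewhere outside view"

def shadow_vertex(in_xyzw, light_pos, light_index=0):
    """Return the homogeneous position emitted for one quad vertex."""
    x, y, z, w = in_xyzw
    flags = int(w)
    silhouette_bit = 1 << (light_index + 1)  # hypothetical: one bit per light
    if not (flags & silhouette_bit):
        return OUTSIDE_VIEW                  # quad is discarded by clipping
    if flags & EXTRUDE_BIT:
        # Direction P - L with w = 0: the vertex is pushed to infinity.
        return (x - light_pos[0], y - light_pos[1], z - light_pos[2], 0.0)
    return (x, y, z, 1.0)                    # edge vertex stays in place
```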
27Rendering Shadow Volumes
- Problem
- Information stored as texture image
- Need information in vertex path
- Solution
- New OpenGL extension ARB_superbuffers allows more generic memory objects
28 ARB_superbuffers
- General purpose memory objects
- Use the same memory object in different parts of the pipeline
- as a texture
- as a vertex array
- as a render target
glAllocMem2D(fmt, w, h, )
[Figure: memory block]
glVertexArrayMem(GL_VERTEX_ARRAY, 4, mem, 0)
glAttachMem(GL_DRAW_FRAMEBUFFER, GL_AUX0, mem)
glAttachMem(GL_TEXTURE_2D, GL_IMAGES, mem)
29 ARB_superbuffers
- In our application
- Create memory object for quad texture (width * height = 4 * edges)
- During edge processing, use memory object as render target
- During shadow volume rendering, use memory object as vertex array
- Very fast, since objects are used by reference (no data is copied)
30Summary
- Silhouette Algorithm
- Pre-process meshes
- Number all vertices (V)
- Number all edges (E)
- Several instances of the same object need edge/vertex offsets
- Compute position texture
- Render one point per vertex
- 4-component float texture for V vertices
- Compute quad texture
- 4 pixels for each edge
- Check front/back facing condition for a number of lights (fragment shader)
- 4-component float texture for E * 4 entries
31Summary
- Silhouette Algorithm (cont.)
- Assign quad texture as vertex array
- ARB_superbuffers
- Render shadow volumes (for each light)
- Send down E quads (E * 4 array indices)
- Vertex shader checks silhouette/extrude case: if the silhouette flag is false, move the quad's vertices way outside of the view frustum (early clip). Otherwise, pass through vertices if the extrude flag is false, or extrude vertices to infinity.
32Summary
- Execution of different stages
- Position texture needs to be computed when objects change
- Quad texture needs to be re-computed when light position or objects change
- Selective update
- E.g. only recompute position for those objects that changed, no need to redo the complete texture
33Results
34Conclusions
- Silhouette detection in hardware
- No frame-to-frame work for CPU
- All dynamic data remains on the graphics card !
- Works with custom vertex shaders
- Deformation is no longer a problem
- Processes a number of light sources in parallel (flag bitmask)
- Full hardware shadow volumes implementation
- More CPU resources for non-graphics work
- No graphics / CPU sync required
35Future Work
- Shadow volume fill-rate problem
- Optimizations during edge processing
- Intersect with large occluder polygons (walls, floor, etc.)
- Reduce geometry work
- Work with connected primitives (quad strips)
- More general input geometry
- Meshes with open edges ?
- Other applications
- non-photorealistic rendering
36Thank you !