Title: Visibility Culling
1Visibility Culling
- Roger A. Crawfis
- CIS 781
- The Ohio State University
2Interactive Frame Rates Are Difficult To Achieve
3The Problem
- Two keys for an interactive system
- Interactive rendering speed: too many polygons make this difficult!!
- Uniform frame rate: varied scene complexity makes this difficult!!
4Possible Solutions
- Visibility Culling: back-face culling, frustum culling, occlusion culling (might not be sufficient)
- Levels of Detail (LOD): hierarchical structures; choose one to satisfy the frame rate requirement
5LOD Selections
How to pick the Optimal ones??!!
6Occlusion Culling
- Hidden Surface Removal methods are not fast enough for massive models on current hardware
- Occlusion Culling avoids rendering primitives that are occluded by another part of the scene
- Occlusion Culling techniques are ideally output sensitive: runtime is proportional to the size of the exact visibility set
7Related Work
- Hierarchical Z-Buffer
- Image-space occlusion culling method [Greene93]
- Build a layered Z-pyramid with a different resolution of the Z-buffer at each level
- Allows quick accept/reject
- Hierarchical LODs
- Simplification Culling: approximate an entire branch of the scene graph by an HLOD
- Can we use HLODs as occluders/occludees?
8Visibility in Games
- What do we need it for?
- Increase of rendering speed by removing unseen scene data from the rendering pipeline as early as possible
- Reduction of data transfers to the graphics hardware
- Current games would not be possible without visibility calculations
9Visibility methods
- 2 very different categories
- Visibility from a region (Portals, PVS)
- (Quake, Unreal, Severance and co.)
- Visibility from a point (Z-Buffer, BFC,...)
- Racing games, outdoor scenes, sports games etc.
10Point-Visibility Occlusion
- Traditionally used
- Back-Face culling
- Z-Buffering
- View frustum culling
- Octree
- Quadtree
11A PSX Example
- Iron Soldier 3 on PSX
- View frustum culling based on a quad-tree
- Back-face culling
- Painter's algorithm
Only culling to the left and right sides of the viewing frustum.
12New Occlusion Methods
- Image-space occlusion culling
- Hierarchical Z-Buffering
- Hierarchical Occlusion Maps
- Object-space occlusion culling
- Hierarchical View Frustum culling
- Hierarchical Back-Face culling
13Visibility Culling
- We will look at these
- Hierarchical Back-face culling
- View-frustum culling
- Occlusion culling
- Detail culling
14Hierarchical Back-Face Culling
- Partitions each model into clusters
- Primitives in one cluster are
- Facing into similar directions
- Lie close to each other
- If the cluster fails the visibility test, all
primitives in this cluster are culled
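The cluster visibility test can be sketched as follows. This is a minimal illustration, not the method from the slides: it assumes each cluster is summarized by a normal cone (an axis plus a half-angle bounding all primitive normals) and that one view direction is valid for the whole cluster, e.g. a distant viewer. The names `cluster_is_back_facing`, `cone_axis`, and `cone_half_angle` are hypothetical.

```python
import math


def normalize(v):
    """Return v scaled to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cluster_is_back_facing(cone_axis, cone_half_angle, view_dir):
    """True if every normal inside the cone faces away from the viewer.

    view_dir points from the eye toward the cluster (assumed constant over
    the cluster). A polygon with normal n is back-facing when
    dot(n, view_dir) > 0. All normals lie within cone_half_angle of the
    axis, so they are all back-facing when
    angle(axis, view_dir) + cone_half_angle < 90 degrees.
    """
    axis = normalize(cone_axis)
    d = normalize(view_dir)
    angle = math.acos(max(-1.0, min(1.0, dot(axis, d))))
    return angle + cone_half_angle < math.pi / 2


# Usage: a cluster whose normals point roughly away from the viewer is culled
# as a whole; otherwise its primitives are tested individually.
print(cluster_is_back_facing((0, 0, 1), math.radians(30), (0, 0, 1)))
```

A tight cone (primitives facing similar directions) makes the single test succeed more often, which is why the clustering criteria above matter.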
15Hierarchical Back-Face Culling
16Normal Maps
- Create a data structure that places each polygon in the space according to its normal direction.
- Partition this space and then simply look at those partitions that might have visible polygons.
[Figure axes: phi, theta]
17View-Frustum Culling
- Remove objects that are outside the viewing
frustum
Mostly done in Application Stage
18View-Frustum Culling
- Culling against bounding volumes to save time
- Bounding volumes: AABB, OBB, spheres, etc.; easy to compute, as tight as possible
Sphere
OBB
AABB
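A bounding-sphere test against the frustum might look like the sketch below. It assumes the frustum is given as six inward-facing unit planes `(normal, d)`, with a point inside when `dot(normal, p) + d >= 0`. The helper `make_frustum` builds a hypothetical symmetric 90-degree frustum looking down the negative z axis, purely for illustration.

```python
import math


def make_frustum(near, far):
    """Six inward-facing unit planes of a 90-degree frustum looking down -z."""
    s = 1.0 / math.sqrt(2.0)
    return [
        ((s, 0.0, -s), 0.0),        # left
        ((-s, 0.0, -s), 0.0),       # right
        ((0.0, s, -s), 0.0),        # bottom
        ((0.0, -s, -s), 0.0),       # top
        ((0.0, 0.0, -1.0), -near),  # near
        ((0.0, 0.0, 1.0), far),     # far
    ]


def classify_sphere(planes, center, radius):
    """Return 'outside', 'inside', or 'intersecting' for a bounding sphere."""
    intersecting = False
    for normal, d in planes:
        dist = sum(n * c for n, c in zip(normal, center)) + d
        if dist < -radius:
            return "outside"      # completely behind one plane: cull
        if dist < radius:
            intersecting = True   # sphere straddles this plane
    return "intersecting" if intersecting else "inside"


planes = make_frustum(near=1.0, far=100.0)
print(classify_sphere(planes, (0.0, 0.0, -10.0), 1.0))
print(classify_sphere(planes, (50.0, 0.0, -10.0), 1.0))
```

The test is conservative: a sphere near a frustum corner can be reported as intersecting even though it lies outside, which costs some rendering time but never removes a visible object.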
19View-Frustum Culling
- Often done hierarchically to save time
In-order, top-down traversal and test
20View-Frustum Culling
- Two popular hierarchical data structures: BSP Tree and Octree
Axis-Aligned BSP
Polygon-Aligned BSP
Intersecting?
21View-Frustum Culling
- A parent has 8 children
- Subdivide the space until the number of primitives within each leaf node is less than a threshold
- In-order, top-down traversal
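The subdivision rule can be sketched as below. As a simplification, primitives are represented by single points (real primitives can straddle octant boundaries), nodes are plain dictionaries, and a `max_depth` guard is added so coincident points cannot recurse forever. All names are hypothetical.

```python
def build_octree(center, half_size, points, threshold=4, max_depth=8):
    """Subdivide until each leaf holds fewer than `threshold` points."""
    node = {"center": center, "half_size": half_size,
            "points": points, "children": []}
    if len(points) < threshold or max_depth == 0:
        return node  # leaf

    # Bit `axis` of the octant index is set when the point lies on the
    # positive side of the node center along that axis.
    buckets = [[] for _ in range(8)]
    for p in points:
        index = sum(1 << axis for axis in range(3) if p[axis] >= center[axis])
        buckets[index].append(p)

    quarter = half_size / 2.0
    for index, bucket in enumerate(buckets):
        child_center = tuple(
            center[axis] + (quarter if index & (1 << axis) else -quarter)
            for axis in range(3)
        )
        node["children"].append(
            build_octree(child_center, quarter, bucket, threshold, max_depth - 1)
        )
    node["points"] = []  # interior nodes keep no primitives in this sketch
    return node


def leaves(node):
    """Top-down traversal yielding the leaf nodes."""
    if not node["children"]:
        yield node
    else:
        for child in node["children"]:
            yield from leaves(child)


corners = [(x, y, z) for x in (-1, 1) for y in (-1, 1) for z in (-1, 1)]
root = build_octree((0.0, 0.0, 0.0), 2.0, corners, threshold=2)
print(len(list(leaves(root))))
```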
22Hierarchical Z-Buffer
- Z-Buffer is arranged in an image pyramid.
- Scene is partitioned in an octree.
- Octree nodes are tested against the Z-pyramid at the level where pixels have the same size.
- Visible nodes serve as input for the next frame.
- Relies on HW visibility query.
23HZB/Hierarchical occlusion maps
24Hierarchical occlusion maps
- Potential occluders are pre-selected
- These occluders are rendered to the occlusion map. The hierarchy can be built with MIP-mapping HW
- Depth test after occlusion test
- Separate depth estimation buffer
25Hierarchical View Frustum Culling
- Speeds up VFC by testing only 2 box corners of a bounding box first.
- Plane coherency during frame advancing
- Test against VF-octants.
- BB-Child masking
26Detail Culling
- A technique that sacrifices quality for speed
- Based on the size of the projected BV: if it is too small, discard it.
- Also often done hierarchically.
Always helps to create a hierarchical structure, or scene graph.
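A possible form of the size test, assuming the bounding volume is a sphere and the camera uses a perspective projection with vertical field of view `fov_y`. The projected radius uses a common small-angle approximation. The function names and the `min_pixels` threshold are assumptions made for this example.

```python
import math


def projected_radius_pixels(radius, distance, fov_y, viewport_height):
    """Approximate on-screen radius (in pixels) of a bounding sphere."""
    return radius / (distance * math.tan(fov_y / 2.0)) * (viewport_height / 2.0)


def detail_cull(radius, distance, fov_y, viewport_height, min_pixels=1.0):
    """True if the object is too small on screen to be worth drawing."""
    if distance <= radius:
        return False  # viewer is inside or touching the sphere: never cull
    size = projected_radius_pixels(radius, distance, fov_y, viewport_height)
    return size < min_pixels


fov = math.radians(90.0)
print(detail_cull(1.0, 10.0, fov, 1000))     # large on screen: keep
print(detail_cull(0.01, 100.0, fov, 1000))   # well under one pixel: discard
```

Unlike the other culling techniques in this deck, this one is not conservative: it changes the image, trading quality for speed.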
27Occlusion Culling
- Discard objects that are occluded
- Z-buffer is not the smartest algorithm in the world (particularly for high depth-complexity scenes)
- We want to avoid the processing of invisible
objects
28Occlusion Culling
- G: input graphics data
- Or: occlusion representation
- The problem
- Algorithms for isOccluded()
- Fast update of Or
OcclusionCulling(G)
  Or = empty
  for each object g in G
    if isOccluded(g, Or)
      skip g
    else
      render(g)
      update(Or)
    end
  end
End
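The loop above can be made concrete with a toy occlusion representation. The sketch assumes a coarse grid of screen tiles that stores, per tile, the nearest "far depth" of any object covering it; objects are treated as solid rectangles in tile coordinates and are processed front to back. The `Obj` and `TileDepthOR` classes are hypothetical stand-ins, not the representations discussed in these slides.

```python
import math
from dataclasses import dataclass


@dataclass
class Obj:
    name: str
    rect: tuple     # (x0, y0, x1, y1) in tile indices, half-open
    z_near: float
    z_far: float


class TileDepthOR:
    """Toy occlusion representation: one depth value per screen tile."""

    def __init__(self, width, height):
        self.depth = [[math.inf] * width for _ in range(height)]

    def _tiles(self, rect):
        x0, y0, x1, y1 = rect
        return ((x, y) for y in range(y0, y1) for x in range(x0, x1))

    def is_occluded(self, obj):
        # Hidden only if every tile it touches is covered by something nearer.
        return all(self.depth[y][x] < obj.z_near for x, y in self._tiles(obj.rect))

    def update(self, obj):
        for x, y in self._tiles(obj.rect):
            self.depth[y][x] = min(self.depth[y][x], obj.z_far)


def occlusion_culling(objects, occ_rep):
    """Return the names of the objects that get rendered."""
    rendered = []
    for g in sorted(objects, key=lambda o: o.z_near):  # front to back
        if occ_rep.is_occluded(g):
            continue                  # skip g
        rendered.append(g.name)       # stands in for render(g)
        occ_rep.update(g)
    return rendered


scene = [
    Obj("crate", (1, 1, 2, 2), 5.0, 6.0),
    Obj("wall", (0, 0, 4, 4), 1.0, 2.0),
    Obj("tower", (3, 3, 6, 6), 5.0, 6.0),
]
print(occlusion_culling(scene, TileDepthOR(8, 8)))
```

The crate lies entirely behind the wall and is skipped; the tower sticks out past the wall and is rendered.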
29Hierarchical Visibility
- Object-space octree
- Primitives in an octree node are hidden if the octree node (cube) is hidden
- An octree cube is hidden if its 6 faces are hidden polygons
- Hierarchical visibility test
30Hierarchical Visibility (obj-sp.)
- From the root of octree
- View-frustum culling
- Scan convert each of the 6 faces and perform z-buffering
- If all 6 faces are hidden, discard the entire node and sub-branches
- Otherwise, render the primitives here and traverse the children recursively in front-to-back order
- A conservative algorithm: why?
31Hierarchical Visibility (obj-sp.)
- Scan converting the octree faces can be expensive: they cover a large number of pixels (overhead)
- How can we reduce the overhead?
- Goal: quickly conclude that a large polygon is hidden
- Method: use a hierarchical z-buffer!
32Hierarchical Z-buffer
- An image-space approach
- Create a Z-pyramid
1 value
¼ resolution
½ resolution
Original Z-buffer
34Hierarchical Z-buffer (2)
Keep the maximum value
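A minimal sketch of the pyramid construction and a hierarchical query, assuming a square depth buffer whose side is a power of two and depth values that grow with distance. The query takes a screen rectangle and its nearest depth. The names `build_z_pyramid` and `is_occluded` are made up for this example.

```python
def build_z_pyramid(z_buffer):
    """Level 0 is the full-resolution buffer; each coarser pixel keeps the
    maximum (farthest) depth of the 2x2 block beneath it."""
    levels = [z_buffer]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        size = len(prev) // 2
        levels.append([
            [max(prev[2 * y][2 * x], prev[2 * y][2 * x + 1],
                 prev[2 * y + 1][2 * x], prev[2 * y + 1][2 * x + 1])
             for x in range(size)]
            for y in range(size)
        ])
    return levels


def is_occluded(levels, rect, z_near):
    """rect = (x0, y0, x1, y1), inclusive level-0 pixel coordinates."""
    x0, y0, x1, y1 = rect

    def hidden(level, x, y):
        span = 1 << level                      # level-0 pixels per side
        px0, py0 = x * span, y * span
        if px0 > x1 or px0 + span - 1 < x0 or py0 > y1 or py0 + span - 1 < y0:
            return True                        # region not touched by rect
        if z_near > levels[level][y][x]:
            return True                        # farther than everything here
        if level == 0:
            return False                       # a pixel could be visible
        return all(hidden(level - 1, 2 * x + dx, 2 * y + dy)
                   for dy in (0, 1) for dx in (0, 1))

    return hidden(len(levels) - 1, 0, 0)


# Left half covered by a near occluder (depth 0.2), right half empty (1.0).
z = [[0.2, 0.2, 1.0, 1.0] for _ in range(4)]
pyramid = build_z_pyramid(z)
print(is_occluded(pyramid, (0, 0, 1, 3), 0.5))
print(is_occluded(pyramid, (1, 0, 2, 0), 0.5))
```

Keeping the maximum makes the coarse test conservative: a region is declared hidden only when the query is farther than every depth value beneath it.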
35Hierarchical Z-buffer
update
Visibility(OctreeNode N)
  if isOccluded(N, Zp) then return
  for each primitive p in N
    render p and update Zp
  end
  for each child node C of N in front-to-back order
    Visibility(C)
  end
36Some Practical Issues
- A fast software algorithm
- Lack of hardware support
- Scan conversion
- Efficient query of whether a polygon is visible (without rendering it)
- Z feedback
37Combining with hardware
- Utilizing frame-to-frame coherence
- First frame regular HZ algorithm (software)
- Remember the visible octree nodes
- Second frame (view changes slightly)
- Render the previous visible nodes using OpenGL
- Read back the Z-buffer and construct Z-pyramid
- Perform regular HZ (software)
- What about the third frame?
- Utilizing hardware to perform rendering and Z-buffering is considerably faster
38Hierarchical Occlusion Map
Zhang et al SIGGRAPH 98
39Basic Ideas
- Choose a set of graphics objects from the scene as occluders
- Use the occluders to define an occlusion map (hierarchically)
- Compare the rest of the scene against the occlusion map
40Example
Blue: occluders. Red: occludees.
41Algorithm Pipeline
422-Step Occlusion Test
- Overlap Test
- Depth Test
Overlap + Depth = Occlusion
43Why decomposition?
- The occlusion test is done approximately (conservatively)
- We can afford to be more conservative in the depth test than in the overlap test
44Why Decomposition?
45Overlap Test Occlusion Map
- Representation of projection for the overlap test: the occlusion map
- A gray-scale image: each pixel represents one block of the screen region
- Generated by rendering occluders
46Occlusion Map (OM)
- Each pixel of the occlusion map has an opacity, which represents the ratio of the sum of the opaque areas in the block to the total area.
- If fully covered, p = 1; if an anti-aliased pixel, p < 1
- Occlusion map: the alpha channel of an image
47Overlap Test using OM
For each potential occludee, we can scan-convert it and compare against the opacity of the pixels it overlaps: expensive!!
- Conservative approximation: use the screen-space bounding box of the occludee (a superset of the actual covered pixels)
- If all the pixels inside the bounding box are opaque, the object is occluded.
48Hierarchical Occlusion Map
Like the hierarchical Z-buffer, we can create a hierarchy to speed up the comparison (for large objects)
The low resolution pixel is an average of the
high resolution pixels
49Overlap Test using HOM
Basic Algorithm
- Start from the lowest resolution
- If the pixel covering the bounding rectangle has a value of 1, the object is occluded
- Otherwise traverse down the hierarchy
- If all children are 1: occluded
- If all children are 0: not occluded
- Otherwise, traverse down further
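The averaging hierarchy and the top-down overlap test can be sketched together. The sketch assumes a square opacity map with a power-of-two side and uses a single `threshold` for all levels. That is a simplification, and the function names are hypothetical. With `threshold=1.0` the test is exact for the bounding rectangle; a lower value accepts almost-opaque regions.

```python
def build_hom(opacity):
    """Level 0 is the full-resolution occlusion map; each coarser pixel is
    the average of the 2x2 block beneath it."""
    levels = [opacity]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        size = len(prev) // 2
        levels.append([
            [(prev[2 * y][2 * x] + prev[2 * y][2 * x + 1]
              + prev[2 * y + 1][2 * x] + prev[2 * y + 1][2 * x + 1]) / 4.0
             for x in range(size)]
            for y in range(size)
        ])
    return levels


def overlap_test(levels, rect, threshold=1.0):
    """True if the bounding rectangle (inclusive level-0 pixels) lies
    entirely within regions whose opacity reaches the threshold."""
    x0, y0, x1, y1 = rect

    def covered(level, x, y):
        span = 1 << level
        px0, py0 = x * span, y * span
        if px0 > x1 or px0 + span - 1 < x0 or py0 > y1 or py0 + span - 1 < y0:
            return True                   # region not touched by rect
        value = levels[level][y][x]
        if value >= threshold:
            return True                   # opaque enough: stop descending
        if level == 0 or value == 0.0:
            return False                  # a hole, or nothing opaque below
        return all(covered(level - 1, 2 * x + dx, 2 * y + dy)
                   for dy in (0, 1) for dx in (0, 1))

    return covered(len(levels) - 1, 0, 0)


# Fully opaque map with a single one-pixel hole in the corner.
om = [[1.0] * 4 for _ in range(4)]
om[0][0] = 0.0
hom = build_hom(om)
print(overlap_test(hom, (0, 0, 3, 3), threshold=1.0))
print(overlap_test(hom, (0, 0, 3, 3), threshold=0.9))
```

With a threshold of 1.0 the hole is found and the test fails; with 0.9 the test terminates at the coarsest level before the hole is reached.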
50Approximate Overlap Test
- Instead of concluding an object is occluded only when the bounding box is within pixels with opacity 1, we can use a threshold in [0, 1]
- Early termination at the high levels of the hierarchy
- What does it mean when a block has high opacity but not one?
This is the unique feature of HOM!!
51Depth Test
- Approximate Z (depth) test
- A single Z Plane
A single Z plane to separate the occluders
from occludees.
52Depth Test
- Break the screen into small regions
- Build at each frame
- Instead of using the Z-buffer, use the farthest Z of the occluders' bounding volumes
- Compare with each potential occludee's nearest Z (conservative test)
53Occluder Selection
Ideal occluders: the visible objects (it's a joke)
View-dependent occluders: too expensive
Solution
Estimate and build an occluder database
Discard objects that do not serve as good occluders
54Occluder Selection
- Size: not too small
- Redundant detail polygons (clock on the wall)
- Complexity: complex polygons are not preferred (why?)
- Done at run time: sort the occluders in depth, add them in order until reaching the polygon count.
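The run-time step could be sketched like this. It assumes each candidate in the occluder database carries a representative center point and a polygon count. The tuple layout, the function name, and the rule of stopping at the first candidate that would exceed the budget are assumptions made for illustration.

```python
import math


def select_occluders(candidates, eye, polygon_budget):
    """candidates: list of (name, center, polygon_count) tuples.

    Sort by distance from the eye and add occluders in that order until
    the next one would exceed the polygon budget.
    """
    by_distance = sorted(candidates, key=lambda c: math.dist(eye, c[1]))
    selected, used = [], 0
    for name, _center, polygons in by_distance:
        if used + polygons > polygon_budget:
            break
        selected.append(name)
        used += polygons
    return selected


database = [
    ("tower", (0.0, 0.0, -20.0), 700),
    ("wall", (0.0, 0.0, -5.0), 100),
    ("house", (0.0, 0.0, -10.0), 300),
]
print(select_occluders(database, (0.0, 0.0, 0.0), polygon_budget=500))
```

Nearby objects tend to cover more of the screen per polygon, which is one reason to fill the budget front to back.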
55OPS
- View-independent Occluders
X
Z
56OPS
57Occluders
- In practice, use traditional, static LODs
- More restrictive view-independent OPS
- Well-studied and available
- Low run-time overhead
- Shared with final rendering, no extra memory
- Area-preserving Erikson 98
58Occluder selection
- At run time
- Distance-based selection with a polygon budget
- Temporal coherence
- Visibility sampling
- Pre-compute visible objects on a 3-D grid
- Facilitates run-time selection
59Implementation
[Figure: system pipeline. Stages named in the diagram include Scene Database, View Frustum Culling, Occluder Selection, Build Occlusion Representation, Occlusion Culling, LOD, and Rendering]
60Results
61Results
- The city model
- 312,524 polygons
- Single CPU
- 5,000 occluder polygons
- Depth estimation buffer
- Opacity thresholds: 1.0
- Lighting, display lists, no triangle strips
62Results
63Results
64Results
- Auxiliary Machine Room (AMR)
65Results
- AMR
- 632,252 polygons
- 3 CPUs
- 25,000 occluder polygons
- No-background z-buffer
- Approximate culling (0.85 for level 64x64)
- LOD
- Lighting display lists no triangle strips
66Results
67Results
68Results
69Results
70Results
- The power plant model
- 15 million triangles
- 3 CPUs
- Visibility pre-processing on a 20x20 grid (15 min)
- No-background z-buffer
- 18,000 occluder polygons
- opacity thresholds from 0.85 and up
- LOD
71Results
73Conclusion
- Goals achieved
- Generality
- Any model, any occluder
- Occluder fusion
- Speed-up
- Accelerate interactive graphics
- Ease of implementation
- Configurability
- Robustness
74HP hardware occlusion
- Extend OpenGL: add an OCCLUSION_MODE
- The bounding box of an object is scan converted
- A flag is set if any pixel of the BB faces is visible
- Only need to read back one flag, instead of the entire frame buffer
- Tradeoff: valuable rendering time is used to render useless BB faces (needs to be used wisely)
- Reportedly, speedups of 25-100% were observed
75The Real World
- Scientific approaches are often too complicated
- Science often uses models with hundreds of thousands of vertices; games don't. (LOD)
- Game developers pick ideas from different algorithms
- Research has an impact on hardware design!
76Gaming Industry
- Parts of the Hierarchical Z-Buffer (HZB) are used sometimes
- Runtime LOD is used as input for a simple HZB
- View Frustum Culling (VFC) is almost always used.
- Hierarchical Occlusion Maps introduce too much overhead for games, and the z-buffer is there anyway
77The Real World (3)
- PSX-One doesn't even have a z-buffer
- ATI's Radeon has parts of a HZB (called Hyper-Z)
- GeForce2 only has a z-buffer
- GeForce3 is similar to Radeon, but supports HZB visibility query
- Dreamcast's PowerVR2 works quite differently (infinite planes)
78Conclusions
- Visibility algorithms are used in many different applications
- Occlusion culling
- Shadow calculations
- Radiosity
- Volumetric lights
- All these fields benefit from advances in visibility techniques
80Recap
- Visibility culling: don't render what can't be seen
- Off-screen: view-frustum culling
- Z-buffered away: occlusion culling
- Cells and portals
- Works well for architectural models
- Teller: accurate, complex, a bit slow
- pfPortals: fast, cheap, easy
81Hierarchical Z-Buffer
- Q: What do you think this is?
- Replace Z-buffer with a Z-pyramid
- Lowest level: full-resolution Z-buffer
- Higher levels: each pixel represents what?
- A: Maximum distance of geometry visible to the four pixels underneath it
- Q: How is this going to help?
82Hierarchical Z-Buffer
- Idea: test polygon against highest level first
- If polygon is farther than the distance recorded in the pixel, stop: it's occluded
- If polygon is closer, recursively check against next lower level
- Amounts to hierarchical rasterization of the polygon, with early termination
- Must update higher levels as we go
83Hierarchical Z-Buffer
- Z-pyramid exploits image-space coherence: a polygon occluded in one pixel is probably occluded nearby
- HZB also exploits object-space coherence: polygons near an occluded polygon are probably occluded
- Q: How might you use object-space coherence?
84Hierarchical Z-Buffer
- Subdivide scene with an octree
- All geometry in an octree node is contained by a cube
- Before rendering the contents of a node, render the faces of its cube
- If cube faces are occluded, ignore the entire node
- Query Z-pyramid to render cubes
85Hierarchical Z-Buffer
- Exploit temporal coherence (What?)
- HZB operates at max efficiency when Z-pyramid is already built
- Idea: most polygons affecting Z-buffer (nearest polygons) are the same from frame to frame
- So start by rendering the polygons (octree nodes) visible last frame
86Hierarchical Occlusion Maps: stolen by Dave Luebke from the Ph.D. defense presentation of
- Hansong Zhang
- Department of Computer Science
- UNC-Chapel Hill
87Visibility Culling
- Discard objects not visible to the viewer
View-frustum culling
Back-face culling
View
View Frustum
Point
Occlusion culling
88Hierarchical Occlusion Maps Overview
Blue parts: occluders. Red parts: occludees.
89Effective Algorithms
- Generality
- Arbitrary models
- Speed-up
- Significant, fast culling for interactive graphics
- Portability
- Few hardware assumptions
- Robustness
90Thesis Statement
- By properly decomposing the occlusion-culling
problem and efficiently representing occlusion,
we can obtain effective algorithms and systems
for occlusion culling.
91Observations
- Want to handle cumulative occlusion
A
B
View Point
92Observations
- Want an occlusion representation (OR)
- Fast to compute
- Fast to use
A
B
View Point
93Observations
- Progressive occlusion culling
- Initialize OR to null
- for each object
- Occlusion test against OR
- if culled
- Discard object
- else
- Render object
- Update OR
94Observations
- Multi-pass occlusion culling
- Initialize OR to null; initialize PO to empty
- for each object
- Occlusion test against OR
- If culled
- Discard object
- else
- Render object
- Add object to PO
- if PO is large enough
- Update OR with objects in PO
PO: the set of potential occluders
95Observations
- Special case one-pass occlusion culling
- Select occluders until PO is large enough
- Update (build) occlusion representation
- Occlusion culling final rendering
96Problem Decomposition
97Problem Decomposition
- Verifying occlusion
- Overlap tests
- Based on representations for projection
- Depth tests
- Based on representations for depth
98Occlusion Maps
Rendered Image
Occlusion Map
99Occlusion Maps
- An occlusion map
- Corresponds to a screen subdivision
- Records average opacity for each partition
- Can be generated by rendering occluders
- Record pixel opacities (pixel coverage)
- Merge projections of occluders
- Represent occlusion in image-space
100Occlusion Map Pyramid
64 x 64
32 x 32
16 x 16
101Occlusion Map Pyramid
102Occlusion Map Pyramid
- Analyzing cumulative projection
- A hierarchy of occlusion maps (HOM)
- Made by recursive averaging (low-pass filtering)
- Record average opacities for blocks of pixels
- Represent occlusion at multiple resolutions
- Construction accelerated by hardware
103Overlap Tests
- Problem: is the projection of the tested object inside the cumulative projection of the occluders?
- Cumulative projection of occluders: the pyramid
- Projection of the tested object
- Conservative overestimation
- Bounding boxes (BB)
- Bounding rectangles (BR) of BBs
104Overlap Tests
- Given: HOM pyramid; the object to be tested
- Compute BR and the initial level in the pyramid
- for each pixel touched by the BR
  if pixel is fully opaque
    continue
  else
    if level = 0
      return FALSE
    else
      descend...
105Overlap Tests
- Evaluating opacity early termination
- Conservative rejection
- Aggressive approximate culling
- Predictive rejection
106Conservative Rejection
- A low-opacity pixel does not correspond to many high-opacity pixels at finer levels
- The transparency threshold
[Figure: example opacity values from the occlusion map pyramid: 1, 1, 1, 1, 1, 0.8, 1, 1, 0.9, 0.9, 0.1, 0, 0.2, 0.3, 0, 0]
107Aggressive Approximate Culling
- Ignoring barely-visible objects
- Small holes in or among objects
- To ignore the small holes
- LPF suppresses noise: holes dissolve
- Thresholding: regard very high opacity as fully opaque
- The opacity threshold: the opacity above which a pixel is considered to be fully opaque
108Aggressive Approximate Culling
109Aggressive Approximate culling
- Further descent not necessary when fully opaque
- Tests terminated before holes are reached
- Need different opacity thresholds for each level
110Predictive Rejection
- Terminate the test knowing it must fail later...
111Summary Levels of Visibility
- The continuum between being visible and
non-visible
Occlusion Maps
Potential Occludees
Almost visible Almost non-visible
Almost transparent (low opacity) Almost
opaque (high opacity)
112Resolving Depth
- What's left of the occlusion test?
A occludes B: A's projection contains B's?
B
A
Another interpretation...
113Resolving Depth
- Depth representations
- Define a boundary beyond which an object overlapping occluders is definitely occluded
- Conservative estimates
- A single plane
- Depth estimation buffer
- No-background z-buffer
114A single plane
- at the farthest vertex of the occluders
Image plane
The plane
Occluders
The point with nearest depth
Viewing direction
This object passes the depth test
A
115Depth Estimation Buffer
- Like a low-res depth buffer
- Uniform subdivision of the screen
- A plane for each partition
- Defines the far boundary
- Updates (i.e. computing depth representation)
- Occluder bounding rectangle at farthest depth
- Depth tests
- Occludee bounding rectangle at nearest depth
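The update and test rules can be sketched as follows, assuming bounding rectangles are given in tile indices of the uniform screen subdivision and depth grows with distance. Partitions that no occluder has touched are marked `None` and fail the depth test. That is a choice made for this example, as are the class and method names.

```python
class DepthEstimationBuffer:
    """One far-boundary depth per screen partition."""

    def __init__(self, width, height):
        self.far = [[None] * width for _ in range(height)]

    @staticmethod
    def _tiles(rect):
        x0, y0, x1, y1 = rect  # half-open tile range
        return ((x, y) for y in range(y0, y1) for x in range(x0, x1))

    def update(self, occluder_rect, z_far):
        """Write the occluder's bounding rectangle at its farthest depth."""
        for x, y in self._tiles(occluder_rect):
            current = self.far[y][x]
            self.far[y][x] = z_far if current is None else max(current, z_far)

    def passes_depth_test(self, occludee_rect, z_near):
        """True if the occludee's nearest depth is beyond the far boundary
        in every partition its bounding rectangle touches."""
        for x, y in self._tiles(occludee_rect):
            boundary = self.far[y][x]
            if boundary is None or z_near <= boundary:
                return False
        return True


deb = DepthEstimationBuffer(4, 4)
deb.update((0, 0, 2, 2), z_far=3.0)
deb.update((1, 1, 3, 3), z_far=5.0)
print(deb.passes_depth_test((0, 0, 1, 1), z_near=4.0))
print(deb.passes_depth_test((0, 0, 2, 2), z_near=4.0))
```

Taking the maximum over overlapping occluders keeps the boundary conservative: an occludee passes only if it is behind every occluder recorded in that partition.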
116Depth Estimation Buffer
Transformed view-frustum
D. E. B.
Image plane
Bounding rectangle at farthest depth
Bounding rectangle at nearest depth
Viewing direction
B
Occluders
A
117Depth Estimation Buffer
- Trade-off
- Advantages
- Removes need for strict depth sorting
- Speed
- Portability
- Disadvantages
- Conservative far boundary
- Requires good bounding volumes
118No-Background Z-Buffer
- The z-buffer from occluder rendering...
- is by itself a full occlusion representation
- has to be modified to support our depth tests
- Removing background depth values
- Replace them with the foreground depth values
- Captures the near boundary
119No-Background Z-Buffer
Transformed view-frustum
Image plane
D. E. B
Occluders
N. B. Z
Viewing direction
A
Objects passing the depth tests
120No-Background Z-Buffer
- Trade-off
- Advantages
- Captures the near boundary
- Less sensitive to bounding boxes
- Disadvantages
- Assumes quickly accessible z-buffer
- Resolution same as occlusion maps (however)
121Occluder Selection
- Occlusion-preserving simplification (OPS)
- Run-time selection
- Visibility pre-processing
122OPS
X
Z
123OPS
124OPS
- In practice, use traditional, static LODs
- More restrictive view-independent OPS
- Well-studied and available
- Low run-time overhead
- Shared with final rendering, no extra memory
- Area-preserving Erikson 98
- Conservative OPS (COPS)...
125Occluder selection
- At run time
- Distance-based selection with a polygon budget
- Temporal coherence
- Visibility sampling
- Pre-compute visible objects on a 3-D grid
- Facilitates run-time selection
126Implementation
[Figure: system pipeline. Stages named in the diagram include Scene Database, View Frustum Culling, Occluder Selection, Build Occlusion Representation, Occlusion Culling, LOD, and Rendering]
127Implementation
OccSelN1
OccSelN2
OccSelN3
FinalDrawN
OccDrawN1
OccDrawN
FinalDrawN1
OccDrawN2
CullN1
128Implementation
- Uses bounding volume hierarchy
- Active layers of the pyramid: 4x4 to 64x64
- Resolutions
- Occluder rendering: 256x256
- D. E. B.: 64x64
- Test platforms
- SGI Onyx II, 4 x 195 MHz R10000, InfiniteReality
- SGI Onyx I, 4 x 250 MHz R4400, InfiniteReality
129Results
130Results
- The city model
- 312,524 polygons
- Single CPU
- 5,000 occluder polygons
- Depth estimation buffer
- Opacity thresholds: 1.0
- Lighting, display lists, no triangle strips
131Results
132Results
133Results
- Auxiliary Machine Room (AMR)
134Results
- AMR
- 632,252 polygons
- 3 CPUs
- 25,000 occluder polygons
- No-background z-buffer
- Approximate culling (0.85 for level 64x64)
- LOD
- Lighting display lists no triangle strips
135Results
136Results
137Results
138Results
139Results
- The power plant model
- 15 million triangles
- 3 CPUs
- Visibility pre-processing on a 20x20 grid (15 min)
- No-background z-buffer
- 18,000 occluder polygons
- opacity thresholds from 0.85 and up
- LOD
140Results
142Conclusion
- Goals achieved
- Generality
- Any model, any occluder
- Occluder fusion
- Speed-up
- Accelerate interactive graphics
- Ease of implementation
- Configurability
- Robustness
143Conclusion
- Main contributions
- Problem decomposition
- Overlap tests and depth tests
- Occlusion representations
- Occlusion maps
- Depth Estimation Buffer
- No-Background Z-Buffer
144Conclusion
- Main contributions
- Hierarchical occlusion maps
- Analysis of occlusion at multiple resolutions
- High-level opacity estimation
- Aggressive approximate culling
- Levels of visibility
- The first occlusion culling algorithm for general
models and interactive 3-D graphics
145Future Work
- Other implementations...
- PCs and games
- How much can be done in software?
- Integration into hardware
- More progressive updates to occlusion representation
- Less conservative culling
- Wide-spread use of occlusion culling
146Early Splat Elimination
- Need splat visibility test
- a voxel is only visible if the volume material in
front is not opaque
screen
occluded voxel does not pass visibility test
wall of occluding voxels
occlusion map opacity image
147Visibility Test - Naive
- Check opacity of every pixel within footprint
- number of pixels to be checked is large
voxel footprint
opaque area
voxel kernel
opacity buffer
148Visibility Test - Efficient
IEEE Trans. Vis. and Comp. Graph. 99
- Compute occlusion map after each sheet-buffer
compositing
project
do not project
opacity ≥ threshold
opacity < threshold
occlusion map
opacity 0
149Early Splat Elimination - Results