Title: A Dynamic Adaptive Multiresolution GPU Data Structure Adaptive Shadow Maps, Octree 3D Paint, Adaptiv
1 A Dynamic Adaptive Multi-resolution GPU Data Structure: Adaptive Shadow Maps, Octree 3D Paint, Adaptive PDE Solver
- Aaron Lefohn
- University of California, Davis
2 Problem Statement
- Goal
- Dynamic, adaptive, multi-resolution GPU data structure
- Efficient read, write, structure change
- Adaptive shadow maps, octree 3D paint, adaptive PDE solver
- Challenges
- All operations must be data-parallel
- Trees are difficult to update and cause incoherent accesses
- Solution
- Leverage virtual memory research from architecture
- Page-table based structure
- Decouple levels of indirection from resolution levels
- Easy implementation with the Glift template library
3 Collaborators
- Joe Kniss, University of Utah
- Robert Strzodka, CAESAR Research Institute
- Shubhabrata Sengupta, University of California, Davis
- John Owens, University of California, Davis
4 Assumptions
- This talk relies heavily on the contents of the Glift generic data structure talk
5 Is This GPGPU Programming?
- Yes
- Inseparable mix of GPGPU stream programming and traditional graphics
- High-quality interactive rendering
- Updating complex GPU data structures
6 Previous Work
- Binotto et al.
- Carr et al.
- Coombe et al.
- Ertl et al.
- Lefebvre et al.
- Purcell et al.
7 Why A New Structure?
- What's Missing?
- Fully GPU-based adaptive multi-resolution structure
- GPU-based address translator
- GPU-based updates of address translator
- Trilinear/quadrilinear mipmap filtering support
- Uniform, coherent memory accesses
8 Applications
- Adaptive shadow maps
- Octree
- 3D paint
- Adaptive partial differential equation solver
- ...
9 Adaptive Shadow Maps
Application
- Fernando et al., ACM SIGGRAPH 2001
- Elegant solution to shadow map aliasing
- Quadtree of small shadow maps
- Many recent (2004) shadow papers cite ASMs as a high-quality solution but not possible on graphics hardware
10 ASM Data Structure Requirements
Application
- Adaptive
- Multiresolution
- Fast, parallel random-access read
- 2x2 native Percentage Closer Filtering (PCF)
- Trilinear interpolated mipmapped PCF
- Fast, parallel write
- Fast, parallel insert and erase
11 Octree 3D Paint
- Problem
- Apply paint to non-parameterized surface
- Complex topology
- Implicit surface
- Solution
- Octree textures, brick maps, etc.
- Benson and Davis, and DeBry et al., SIGGRAPH 2002
12 Octree 3D Paint Requirements
Application
- Adaptive
- Multiresolution
- Fast, parallel random-access read
- 3x3 native trilinear filtering
- Quadrilinear interpolated mipmapping
- Fast, parallel write
- Fast, parallel insert and erase
13 Adaptive PDE Solver
- WARNING: Work in progress
- Problem
- Large 3D partial differential equation solvers are slow
- Solution
- Adaptive solver that focuses computation on regions of interest
- Octree simulation domain
- Losasso et al., SIGGRAPH 2004
14 Adaptive PDE Solver Requirements
Application
- Adaptive
- Multiresolution?
- Fast, parallel neighborhood read
- Fast, parallel write
- Efficient stream processing of octree nodes
- Fast, parallel insert and erase
15 GPU Dynamic, Adaptive Data Structure
- Three applications have nearly identical requirements
- Describe structure in 2D for ASM
16 ASM Virtual Domain
[Figure: the virtual domain is the unit square with corners (0,0), (1,0), (0,1), (1,1)]
17 ASM Physical Domain
- Paged 2D texture memory
- All physical pages identical size (very important!)
[Figure: virtual domain and physical domain, with the mapping between them still an open question (?)]
18 ASM Address Translator
[Figure: the address translator maps the virtual domain to the physical domain]
19 ASM Address Translator
Application
- Start with page table
- Coarse, uniform discretization of virtual domain
- Very common in GPU structures
- LOTS of architecture literature
- O(N) memory, O(1) insert, O(1) computation, O(1) erase
- Uniform consistency, partial mapping (sparse)
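The page-table properties listed above can be illustrated with a small CPU-side sketch. This is a minimal model under stated assumptions, not Glift's API: the names `Vec2i`, `PageTableEntry`, and `PageTable2D` are hypothetical, addresses are non-negative integer texel coordinates, and the table is a flat array with one entry per virtual page, so insert, erase, and translation each touch a single entry.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical types for illustration only; this is not Glift's API.
struct Vec2i { int x, y; };

struct PageTableEntry {
    bool  mapped;      // false: this virtual page has no physical storage (sparse)
    Vec2i physOrigin;  // texel origin of the physical page when mapped
};

class PageTable2D {
public:
    // The virtual domain is split into pagesPerSide x pagesPerSide pages,
    // each pageSize x pageSize texels (all pages have identical size).
    PageTable2D(int pagesPerSide, int pageSize)
        : pagesPerSide_(pagesPerSide), pageSize_(pageSize),
          entries_(static_cast<std::size_t>(pagesPerSide) * pagesPerSide,
                   PageTableEntry{false, Vec2i{0, 0}}) {}

    // O(1) insert and erase: each touches exactly one entry.
    void insert(Vec2i vpage, Vec2i physOrigin) {
        entries_[index(vpage)] = PageTableEntry{true, physOrigin};
    }
    void erase(Vec2i vpage) { entries_[index(vpage)].mapped = false; }

    // O(1) translation of a virtual texel address to a physical one.
    // Returns false when the containing virtual page is unmapped.
    bool translate(Vec2i vaddr, Vec2i& paddr) const {
        Vec2i vpage{vaddr.x / pageSize_, vaddr.y / pageSize_};
        const PageTableEntry& e = entries_[index(vpage)];
        if (!e.mapped) return false;
        paddr = Vec2i{e.physOrigin.x + vaddr.x % pageSize_,
                      e.physOrigin.y + vaddr.y % pageSize_};
        return true;
    }

private:
    std::size_t index(Vec2i vpage) const {
        assert(vpage.x >= 0 && vpage.x < pagesPerSide_);
        assert(vpage.y >= 0 && vpage.y < pagesPerSide_);
        return static_cast<std::size_t>(vpage.y) * pagesPerSide_ + vpage.x;
    }

    int pagesPerSide_;
    int pageSize_;
    std::vector<PageTableEntry> entries_;
};
```

With 8 x 8 pages, for example, virtual texel (11, 19) lies in virtual page (1, 2) at offset (3, 3), so its physical address is the mapped page origin plus (3, 3).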
20 ASM Address Translator
Application
[Figure: virtual domain mapped through the page table to physical memory]
21 ASM Data Structure Requirements
Application
- Adaptive
- Multiresolution
- Fast, parallel random-access read
- 2x2 native Percentage Closer Filtering (PCF)
- Trilinear interpolated mipmapped PCF
- Fast, parallel write
- Fast, parallel insert and erase
22 ASM Address Translator
Application
- Adaptive Page Table
- Map multiple virtual pages to single physical page
[Figure: virtual domain, page table, and physical memory]
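One way to picture "multiple virtual pages to a single physical page" is the sketch below. It is an illustration under assumptions, not the talk's implementation: each hypothetical `AdaptiveEntry` stores a `scale`, meaning the shared physical page covers a `scale` x `scale` block of page-table cells, so the region is stored `scale` times coarser than the finest resolution.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical types for illustration only; this is not Glift's API.
struct Vec2i { int x, y; };

struct AdaptiveEntry {
    bool  mapped;
    Vec2i physOrigin;  // texel origin of the shared physical page
    int   scale;       // the page covers scale x scale page-table cells
};

class AdaptivePageTable {
public:
    AdaptivePageTable(int cellsPerSide, int pageSize)
        : cellsPerSide_(cellsPerSide), pageSize_(pageSize),
          entries_(static_cast<std::size_t>(cellsPerSide) * cellsPerSide,
                   AdaptiveEntry{false, Vec2i{0, 0}, 1}) {}

    // Point every cell of a scale x scale block at the same physical page.
    // blockCell must be aligned to a multiple of scale.
    void mapBlock(Vec2i blockCell, int scale, Vec2i physOrigin) {
        assert(scale >= 1);
        assert(blockCell.x % scale == 0 && blockCell.y % scale == 0);
        for (int y = 0; y < scale; ++y)
            for (int x = 0; x < scale; ++x)
                entries_[index(blockCell.x + x, blockCell.y + y)] =
                    AdaptiveEntry{true, physOrigin, scale};
    }

    // Translate a finest-resolution virtual texel address (non-negative).
    bool translate(Vec2i vaddr, Vec2i& paddr) const {
        int cx = vaddr.x / pageSize_;
        int cy = vaddr.y / pageSize_;
        const AdaptiveEntry& e = entries_[index(cx, cy)];
        if (!e.mapped) return false;
        // Texel origin of the block of cells that shares this physical page.
        int bx = (cx / e.scale) * e.scale * pageSize_;
        int by = (cy / e.scale) * e.scale * pageSize_;
        // Offsets shrink by scale because the page is stored coarser.
        paddr = Vec2i{e.physOrigin.x + (vaddr.x - bx) / e.scale,
                      e.physOrigin.y + (vaddr.y - by) / e.scale};
        return true;
    }

private:
    std::size_t index(int cx, int cy) const {
        assert(cx >= 0 && cx < cellsPerSide_ && cy >= 0 && cy < cellsPerSide_);
        return static_cast<std::size_t>(cy) * cellsPerSide_ + cx;
    }

    int cellsPerSide_;
    int pageSize_;
    std::vector<AdaptiveEntry> entries_;
};
```

In this model, four neighbouring cells that share one physical page still resolve in a single table read, which keeps translation O(1) regardless of how coarse a region is.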
23 ASM Data Structure Requirements
Application
- Adaptive
- Multiresolution
- Fast, parallel random-access read
- 2x2 native Percentage Closer Filtering (PCF)
- Trilinear interpolated mipmapped PCF
- Fast, parallel write
- Fast, parallel insert and erase
24 ASM Address Translator
Application
- Multiresolution Page Table
[Figure: virtual domain, mipmap page table, and physical memory]
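A multiresolution page table can be modelled as one page table per mipmap level, as in the sketch below. The class `MipmapPageTable`, its methods, and in particular the fallback to coarser levels when a cell is unmapped are assumptions made for illustration; the slides do not specify how unmapped levels are handled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical entry type for illustration only; not Glift's API.
struct Entry { bool mapped; int physX, physY; };

class MipmapPageTable {
public:
    // Level 0 has finestCellsPerSide^2 cells; each coarser level halves that.
    MipmapPageTable(int finestCellsPerSide, int pageSize)
        : finest_(finestCellsPerSide), pageSize_(pageSize) {
        assert(finest_ > 0 && (finest_ & (finest_ - 1)) == 0);  // power of two
        for (int n = finest_; n >= 1; n /= 2)
            levels_.push_back(std::vector<Entry>(
                static_cast<std::size_t>(n) * n, Entry{false, 0, 0}));
    }

    int numLevels() const { return static_cast<int>(levels_.size()); }
    int cellsPerSide(int level) const { return finest_ >> level; }

    void map(int level, int cx, int cy, int physX, int physY) {
        int n = cellsPerSide(level);
        assert(cx >= 0 && cx < n && cy >= 0 && cy < n);
        levels_[level][static_cast<std::size_t>(cy) * n + cx] =
            Entry{true, physX, physY};
    }

    // Look up normalized coordinates (u, v) in [0, 1) starting at `level`.
    // Assumption: if the cell is unmapped, fall back to the next coarser level.
    bool lookup(double u, double v, int level,
                int& foundLevel, int& px, int& py) const {
        assert(u >= 0.0 && u < 1.0 && v >= 0.0 && v < 1.0);
        assert(level >= 0 && level < numLevels());
        for (int l = level; l < numLevels(); ++l) {
            int n = cellsPerSide(l);
            double su = u * n, sv = v * n;
            int cx = static_cast<int>(su), cy = static_cast<int>(sv);
            const Entry& e = levels_[l][static_cast<std::size_t>(cy) * n + cx];
            if (!e.mapped) continue;
            foundLevel = l;
            // Fractional position inside the cell selects the texel in the page.
            px = e.physX + static_cast<int>((su - cx) * pageSize_);
            py = e.physY + static_cast<int>((sv - cy) * pageSize_);
            return true;
        }
        return false;
    }

private:
    int finest_;
    int pageSize_;
    std::vector<std::vector<Entry>> levels_;
};
```

Because every level maps to physical pages of the same size, a coarser level simply spreads one page over a larger part of the virtual domain.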
25 ASM Data Structure Requirements
Application
- Adaptive
- Multiresolution
- Fast, parallel random-access read
- 2x2 native Percentage Closer Filtering (PCF)
- Trilinear interpolated mipmapped PCF
- Fast, parallel write
- Fast, parallel insert and erase
26 ASM Data Structure Requirements
Application
- How to support bilinear filtering?
- Duplicate 1 column and 1 row of texels in each page
- Mipmapped trilinear?
- By-hand interpolation between mipmap levels
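The two filtering ideas above can be sketched on the CPU as follows. This is a plain floating-point illustration, not the hardware PCF path: `PaddedPage`, `mix`, `bilinear`, and `trilinear` are hypothetical names, and the page is assumed to store one extra column and row so that a 2 x 2 footprint never leaves the page.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A physical page with one duplicated column and row: an n x n interior is
// stored as (n + 1) x (n + 1) texels. Hypothetical type for illustration.
struct PaddedPage {
    int n;                      // interior page size in texels
    std::vector<float> texels;  // (n + 1) * (n + 1) values, row-major
    float at(int x, int y) const {
        return texels[static_cast<std::size_t>(y) * (n + 1) + x];
    }
};

// Linear interpolation between a and b.
inline float mix(float a, float b, float t) { return a + t * (b - a); }

// Bilinear filtering inside one page; (x, y) are page-local texel coordinates
// with 0 <= x, y < n. The duplicated border makes x0 + 1 and y0 + 1 valid.
float bilinear(const PaddedPage& p, float x, float y) {
    assert(x >= 0.0f && x < p.n && y >= 0.0f && y < p.n);
    int x0 = static_cast<int>(x);
    int y0 = static_cast<int>(y);
    float fx = x - x0;
    float fy = y - y0;
    float top    = mix(p.at(x0, y0),     p.at(x0 + 1, y0),     fx);
    float bottom = mix(p.at(x0, y0 + 1), p.at(x0 + 1, y0 + 1), fx);
    return mix(top, bottom, fy);
}

// "By-hand" trilinear: blend two bilinear samples taken from adjacent
// mipmap levels; lodFraction = 0 returns the fine sample, 1 the coarse one.
float trilinear(float fineSample, float coarseSample, float lodFraction) {
    assert(lodFraction >= 0.0f && lodFraction <= 1.0f);
    return mix(fineSample, coarseSample, lodFraction);
}
```

The cost of the duplicated border is a little extra memory per page; the benefit is that each bilinear lookup needs only one address translation.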
27 ASM Data Structure Requirements
Application
- Adaptive
- Multiresolution
- Fast, parallel random-access read
- 2x2 native Percentage Closer Filtering (PCF)
- Trilinear interpolated mipmapped PCF
- Fast, parallel write
- Fast, parallel insert and erase
28 How to Define the ASM Structure in Glift?
Application
- Start with generic page table AddrTrans
- Use mipmapped PhysMem for page table
- Change template parameter to add adaptivity
- Write page allocator
- alloc_pages, free_pages
- Finally
- typedef PageTableAddrTrans<> PageTable;
- typedef PhysMemGPU<vec2f, vec1s> PMem2D;
- typedef VirtMemGPU<PageTable, PMem2D> VPageTable;
- typedef AdaptiveMem<VPageTable, PageAllocator> ASM;
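The page allocator step can be sketched as a simple free list. Only the names `alloc_pages` and `free_pages` come from the slide; the signatures, the `Vec2i` type, and the choice of a stack-like free list are assumptions for illustration and may differ from the actual Glift allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical type for illustration only.
struct Vec2i { int x, y; };

// Minimal free-list allocator over a physical texture divided into
// pagesX x pagesY pages of identical size.
class PageAllocator {
public:
    PageAllocator(int pagesX, int pagesY, int pageSize) {
        assert(pagesX > 0 && pagesY > 0 && pageSize > 0);
        for (int y = 0; y < pagesY; ++y)
            for (int x = 0; x < pagesX; ++x)
                free_.push_back(Vec2i{x * pageSize, y * pageSize});
    }

    // Hand out up to `count` page origins; returns fewer if memory runs out.
    std::vector<Vec2i> alloc_pages(std::size_t count) {
        std::vector<Vec2i> out;
        while (out.size() < count && !free_.empty()) {
            out.push_back(free_.back());
            free_.pop_back();
        }
        return out;
    }

    // Return pages to the free list so later allocations can reuse them.
    void free_pages(const std::vector<Vec2i>& pages) {
        free_.insert(free_.end(), pages.begin(), pages.end());
    }

    std::size_t num_free() const { return free_.size(); }

private:
    std::vector<Vec2i> free_;  // origins of currently unused physical pages
};
```

Because all physical pages have the same size, any free page can satisfy any request, so the allocator needs no fitting or compaction logic.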
29 ASM Data Structure Usage
Application
- float4 main( uniform VMem2D asm,
-              float3 shadowCoord,
-              float4 litColor ) : COLOR
- {
-     float isInLight = asm.vTex2Ds( shadowCoord );
-     return lerp( black, litColor, isInLight );
- }
- asm.bind_for_read( )
- asm.bind_for_write( )
- asm.alloc_pages( )
- asm.free_page( )
30 Adaptive Shadow Map Algorithm
Application
- Faithful to Fernando et al. 2001
- Refinement algorithm
- Identify shadow pixels w/ resolution mismatch (GPU)
- Compact pixels into small stream (GPU)
- CPU reads back compacted stream (GPU→CPU)
- Allocate pages
- Draw new page table entries (PTEs) into mipmap page tables (CPU→GPU)
- Draw depth into ASM for each new page (GPU)
31 Stream Compaction
- Daniel Horn, GPU Gems II, ch. 36
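For readers unfamiliar with stream compaction, the sketch below shows the general scan-and-scatter idea as a sequential CPU reference. It is not the GPU implementation from the cited chapter; the function name `compact` and the predicate interface are assumptions for illustration. On a GPU, both loops would run as data-parallel passes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Keep only the elements for which keep(element) is true, preserving order.
template <typename T, typename Pred>
std::vector<T> compact(const std::vector<T>& in, Pred keep) {
    // Pass 1: exclusive prefix sum of the keep flags gives each surviving
    // element its output position.
    std::vector<std::size_t> offset(in.size());
    std::size_t running = 0;
    for (std::size_t i = 0; i < in.size(); ++i) {
        offset[i] = running;
        if (keep(in[i])) ++running;
    }

    // Pass 2: scatter survivors to their computed positions.
    std::vector<T> out(running);
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (keep(in[i])) {
            assert(offset[i] < out.size());
            out[offset[i]] = in[i];
        }
    }
    return out;
}
```

In the refinement algorithm, compaction shrinks the set of flagged pixels to a small stream, which keeps the GPU-to-CPU readback cheap.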
32 ASM: effective resolution 131,072^2 (37 MB); SM: 2048^2
Thanks to Yong Kil for the tree model
33 Octree 3D Paint
Application
- 3D version of ASM data structure
- Differs from previous work
- Quadrilinear filtering
- O(1), uniform access
- Interactive with effective resolutions between 64^3 and 2048^3
34 Adaptive PDE Solver
- Work in progress
- Key feature is defining GPU iterators
- Iterator
- Vertex buffer object of quads (one per page)
- Create iterators with render-to-vertex-array (RTVA)
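The "one quad per page" iterator can be pictured as a vertex list with four corners for every allocated physical page; drawing those quads then touches exactly the texels that hold data. The sketch below builds such a list on the CPU. The names `Vertex` and `buildPageQuads` are hypothetical, and the talk creates the buffer on the GPU rather than in host code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical types for illustration only.
struct Vec2i  { int x, y; };
struct Vertex { float x, y; };

// Emit four corner vertices per page, in counter-clockwise order,
// covering the page's texels in physical memory.
std::vector<Vertex> buildPageQuads(const std::vector<Vec2i>& pageOrigins,
                                   int pageSize) {
    assert(pageSize > 0);
    std::vector<Vertex> verts;
    verts.reserve(pageOrigins.size() * 4);
    for (std::size_t i = 0; i < pageOrigins.size(); ++i) {
        float x0 = static_cast<float>(pageOrigins[i].x);
        float y0 = static_cast<float>(pageOrigins[i].y);
        float x1 = x0 + static_cast<float>(pageSize);
        float y1 = y0 + static_cast<float>(pageSize);
        verts.push_back(Vertex{x0, y0});
        verts.push_back(Vertex{x1, y0});
        verts.push_back(Vertex{x1, y1});
        verts.push_back(Vertex{x0, y1});
    }
    return verts;
}
```

Iterating this way skips unallocated regions entirely, which is what makes stream processing of a sparse octree domain efficient.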
35 Demo
36 ASM Performance Results
- Fernando et al. results
- 5 fps (asynchronous, incremental refinement)
- Fixed light
- 31K polys, 512^2 image, 65K^2 to 524K^2 ASMs
- Our results
- 15-20 fps while moving camera, including refinement
- 7-12 fps while moving light
- 45K polys, 512^2 image, 131K^2 ASM
- Lookup time compared to 2048^2 shadow map
- Bilinear filtered 90
- Trilinear filtered mipmapped 73
37 Page Table Memory Coherency
- 1- and 2-level page tables are bandwidth bound below 8 x 8 page size
- RGBA8 textures, NVIDIA GeForce 6800 GT, NVIDIA driver 75.22, Cg 1.4a
38 Data Structure Limitations
- Assume page-level coherency
- Page table memory consumption
- Trade more levels of indirection for memory
- Depth-limited tree
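The memory trade-off above can be made concrete with a little arithmetic. The sketch below counts page-table entries for a 2D virtual domain under assumed parameters; the function names, the two-level layout (a top-level table whose mapped entries each point to a `blockSize` x `blockSize` second-level table), and the sizes in the example are all hypothetical, not figures from the talk.

```cpp
#include <cassert>
#include <cstdint>

// Entries in a 1-level page table: one per virtual page, mapped or not.
std::uint64_t oneLevelEntries(std::uint64_t virtualRes, std::uint64_t pageSize) {
    assert(pageSize > 0 && virtualRes % pageSize == 0);
    std::uint64_t cells = virtualRes / pageSize;
    return cells * cells;
}

// Entries in a 2-level page table: a coarse top-level table plus one
// blockSize x blockSize second-level table per mapped top-level entry.
std::uint64_t twoLevelEntries(std::uint64_t virtualRes, std::uint64_t pageSize,
                              std::uint64_t blockSize,
                              std::uint64_t mappedBlocks) {
    assert(pageSize > 0 && blockSize > 0);
    assert(virtualRes % (pageSize * blockSize) == 0);
    std::uint64_t top = virtualRes / (pageSize * blockSize);
    return top * top + mappedBlocks * blockSize * blockSize;
}
```

As an example with assumed sizes, a 131,072^2 virtual domain with 32 x 32 pages needs 4096^2 = 16,777,216 entries in one level. With 64 x 64 second-level tables and 10 mapped blocks, the two-level version needs 64^2 + 10 * 64^2 = 45,056 entries, at the cost of one extra dependent read per lookup.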
39 Conclusions
- Dynamic adaptive multires data structure
- Coherent accesses if pages are larger than 8 x 8
- Decouple levels of indirection from levels of resolution
- Page table literature
- Continuum all the way from 1-level to full tree
- Based on assumption that accesses are coherent within page
40 Conclusions
- Adaptive Shadow Maps
- Interactive adaptive refinement
- Effective shadow map resolution up to 131,072^2
- Octree 3D paint
- Interactive GPU-based octree 3D painting
- Effective paint resolution up to 2048^3
- Adaptive PDE solver
- Work in progress
41 Acknowledgements
- Craig Kolb, Nick Triantos (NVIDIA)
- Fabio Pellacini (Cornell/Pixar)
- Adam Moerschell, Yong Kil (UC Davis)
- Serban Porumbescu, Chris Co, .
- National Science Foundation Graduate Fellowship
- Department of Energy
- Pixar Animation Studios
42 More Information
- ACM SIGGRAPH Sketches 2005
- Dynamic Adaptive Shadow Maps
- Octree Textures on Graphics Hardware
- GPU Programming, Thursday, 1:45pm
- Upcoming ACM Transactions on Graphics paper
- Glift: An Abstraction for Generic, Efficient GPU Data Structures
- Google "Lefohn GPU"