Title: The Ray Engine
1The Ray Engine
- University of Illinois Urbana-Champaign
- Nathan A. Carr, Jesse D. Hall, John C. Hart
2The Programmable GPU
- Programmability is the future of graphics hardware
- Many problems can be solved using graphics hardware
- Emerging research question
- What role does the CPU play?
- What role does the GPU play?
3Why use the GPU?
- Is it faster? Does it scale?
Deep pipeline: process vertex, rasterize, process fragment
Widen pipeline
4The Ray Engine
5Ray Engine Models
- Existing Hardware Approach
- DirectX pixel shader 1.4
- Low precision 16 bit fixed point
- 5D ray-space partitioning data structure
- Future Hardware Approach via Simulation
- DirectX 9.0, OpenGL 2.0, Radeon 9700
- High precision 32 bit floating point
- Octree acceleration data structure
6Partitioning the Problem Domain
- Assume a SIMD model of computation for the GPU
- Let the CPU do what it does best
- Complex Branching
- Sorting ( traversal, batching of coherent items )
- Synchronous execution paths
- Let the GPU do what it does best
- SIMD computation ( intersection, shading )
- Parallel execution paths
- Keep both the GPU and CPU busy
- Avoid one stalling the other
7The Programmable Shading Crossbar
- Ray Intersection as a Crossbar
- Programmable Pixel Shading as a Crossbar
8SIMD GPU model
Single Instruction Multiple Data
- Texture Maps are arrays of data elements
- Fragment Program is an instruction
- Frame buffer holds the result of the SIMD operation
[Figure: texture maps (multiple data, m x n) feed a fragment program (single instruction) run over a screen-filling quadrilateral; the frame buffer (m x n) holds the result]
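The SIMD model above can be mimicked on the CPU as a rough sketch. Everything here is illustrative and not from the slides: `run_fragment_program` and `scale_and_add` are hypothetical names, nested lists stand in for texture maps, and the returned grid stands in for the frame buffer.

```python
def run_fragment_program(program, *textures):
    """Apply one program to every texel position, like drawing a screen-filling quad."""
    rows, cols = len(textures[0]), len(textures[0][0])
    for tex in textures:
        # All input textures must share the same m x n layout.
        assert len(tex) == rows and all(len(row) == cols for row in tex)
    frame_buffer = [[None] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            # Same instruction, different data element at each position.
            frame_buffer[y][x] = program(*(tex[y][x] for tex in textures))
    return frame_buffer


def scale_and_add(a, b):
    """Hypothetical 'fragment program': one instruction sequence per texel."""
    return 2.0 * a + b


tex_a = [[1.0, 2.0], [3.0, 4.0]]      # texture map 0 (multiple data)
tex_b = [[10.0, 20.0], [30.0, 40.0]]  # texture map 1 (multiple data)
frame = run_fragment_program(scale_and_add, tex_a, tex_b)
print(frame)
```

The loop order is irrelevant to the result, which is what lets real hardware evaluate the positions in parallel.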
9Ray Engine Core
- Performs all pairs (N x M) ray-triangle intersections
N rays to be queried
M triangles to be queried
RAY ENGINE CORE
hitRecords ← gpuRayEngine(rays, triangles)
- N hit results
- Which triangle hit first (if any)
- Hit location ( barycentric coords )
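A CPU-side sketch of what the core computes: every ray is tested against every triangle and the nearest hit is kept, which is the job the z-buffer does on the GPU. This is an assumption-laden illustration, not the authors' implementation: `ray_engine`, `intersect`, the tuple layout of a hit record, and the `EPSILON` value are all hypothetical, and the test is the Möller-Trumbore style used in the fragment shader listing later in the deck.

```python
EPSILON = 1e-9  # assumed tolerance; the slides do not give a value


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def intersect(origin, direction, tri):
    """Return (t, u, v) for a hit in front of the origin, else None."""
    vert0, vert1, vert2 = tri
    edge1, edge2 = sub(vert1, vert0), sub(vert2, vert0)
    pvec = cross(direction, edge2)
    det = dot(edge1, pvec)
    if -EPSILON < det < EPSILON:      # ray parallel to the triangle plane
        return None
    inv_det = 1.0 / det
    tvec = sub(origin, vert0)
    u = dot(tvec, pvec) * inv_det
    if u < 0.0 or u > 1.0:
        return None
    qvec = cross(tvec, edge1)
    v = dot(direction, qvec) * inv_det
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(edge2, qvec) * inv_det    # distance along the ray
    return (t, u, v) if t > 0.0 else None


def ray_engine(rays, triangles):
    """All pairs (N x M): one record per ray, (t, triangle_id, u, v) or None."""
    hit_records = []
    for origin, direction in rays:
        best = None
        for tri_id, tri in enumerate(triangles):
            hit = intersect(origin, direction, tri)
            # Keep the closest hit, as a depth test would.
            if hit is not None and (best is None or hit[0] < best[0]):
                best = (hit[0], tri_id, hit[1], hit[2])
        hit_records.append(best)
    return hit_records


triangles = [((0, 0, 5), (1, 0, 5), (0, 1, 5)),   # ID 0, farther away
             ((0, 0, 2), (1, 0, 2), (0, 1, 2))]   # ID 1, nearer
rays = [((0.25, 0.25, 0.0), (0.0, 0.0, 1.0)),     # passes through both triangles
        ((2.0, 2.0, 0.0), (0.0, 0.0, 1.0))]       # misses both
hits = ray_engine(rays, triangles)
print(hits)
```

The first ray reports triangle 1 because it is closer, even though triangle 0 is also hit; the second ray reports no hit.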
10Ray Engine Core
[Figure: a ray with origin O and direction D, and a triangle with its normal, edge0, and edge1]
Textures: ray origins, ray directions
Geometry: normal, edge0, edge1, vertex, ID (color)
Fragment shader program
Frame buffer: hit location, triangle ID
Z-buffer: distance along the ray at which the triangle was hit
11What's the Pay-Off?
- Same computation can be done on the CPU
- Use partitioning scheme (BSP tree, octree, bounding volumes)
hitRecords ← cpuRayEngine(rays, triangles)
12Finding N and M
- Answer depends on the scene, speed of GPU, and speed of CPU.
- Solution
- Use GPU processing only when we can collect N rays that need to be intersected with M triangles, for some N and M where it is faster to do GPU processing.
- Use CPU processing in all other cases.
- Find ideal N and M experimentally.
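The rule above amounts to a threshold test per batch of work. A minimal sketch, assuming each batch is described by how many rays and triangles it contains; `choose_processor`, `split_batches`, and the threshold values used in the example are hypothetical placeholders, since the slides say the real values must be found experimentally.

```python
def choose_processor(num_rays, num_triangles, min_rays, min_triangles):
    """Send a batch to the GPU only when it is large enough to pay off."""
    if num_rays >= min_rays and num_triangles >= min_triangles:
        return "gpu"
    return "cpu"


def split_batches(batches, min_rays, min_triangles):
    """Partition (num_rays, num_triangles) batches into GPU work and CPU work."""
    gpu_work, cpu_work = [], []
    for num_rays, num_triangles in batches:
        target = choose_processor(num_rays, num_triangles, min_rays, min_triangles)
        (gpu_work if target == "gpu" else cpu_work).append((num_rays, num_triangles))
    return gpu_work, cpu_work


# Placeholder thresholds for illustration only, not measured values.
MIN_RAYS, MIN_TRIANGLES = 64, 10
batches = [(100, 20), (3, 20), (100, 4), (64, 10)]
gpu_work, cpu_work = split_batches(batches, MIN_RAYS, MIN_TRIANGLES)
print(gpu_work, cpu_work)
```

Tuning then reduces to timing both paths over a range of batch sizes and picking the smallest sizes at which the GPU path wins.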
13Octree Granularity
Example: choose M = 10, N = 2
[Figure: octree cells labeled with the counts 2, 4, 2, 5, 3, 0, 3, 5, 5, 2, 2, 0, 3, 1, 4, 4]
14CPU/GPU parallelism
- Both the CPU and GPU can be performing triangle intersection tests at the same time.
- Note: A similar model may be taken to handle shading, where the GPU and CPU work together, but at different levels of granularity.
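One way to picture the overlap: hand a large batch to the GPU, keep testing a small batch on the CPU, and only then wait for the GPU result. The sketch below uses a worker thread as a stand-in for the GPU; `gpu_intersect`, `cpu_intersect`, and `intersect_overlapped` are hypothetical names, and the stand-in functions only tag their inputs rather than doing real intersection work.

```python
from concurrent.futures import ThreadPoolExecutor


def gpu_intersect(batch):
    """Stand-in for submitting a batch to the GPU and reading back the hits."""
    return [("gpu", ray) for ray in batch]


def cpu_intersect(batch):
    """Stand-in for the CPU ray-triangle tests."""
    return [("cpu", ray) for ray in batch]


def intersect_overlapped(gpu_batch, cpu_batch):
    """Start the GPU batch, do CPU work meanwhile, then collect both results."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(gpu_intersect, gpu_batch)  # GPU work in flight
        cpu_results = cpu_intersect(cpu_batch)           # CPU stays busy
        gpu_results = pending.result()                   # wait only at the end
    return gpu_results + cpu_results


results = intersect_overlapped(["r0", "r1"], ["r2"])
print(results)
```

The point of the structure is that neither side sits idle waiting for the other until the final readback.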
15Results
16Conclusions
- The GPU can be used as a co-processor to accelerate ray-casting and visibility queries.
- 8 - 52 speedup.
- Performance improvements can be achieved by interleaving and overlapping CPU and GPU computation.
- Asymmetric AGP bus transfer speeds greatly inhibit the ability of the GPU and CPU to work in cooperation.
17Future Work
- More careful tuning of the CPU portion of the Ray Engine
- Run the Ray Engine on real hardware
- Test alternate schemes to speed up CPU ray-traversal process
- Implement SIMD GPU shading engine to work in cooperation with the Ray Engine
- Explore better methods for interleaving CPU and GPU computation
- Asynchronous readback semantics
18Comments/Questions?
- Thanks to
- NSF for project funding
- Nvidia, Matthew Papakipos
- Michael McCool
19Fragment Shader Example in OGL 2.0
void main(void)
{
    uniform vec3 vert0, vert1, vert2;
    vec3 origin, direction, edge1, edge2, tvec, pvec, qvec, bary;
    float det;

    origin    = texture(0, coords);
    direction = texture(1, coords);
    edge1 = vert1 - vert0;
    edge2 = vert2 - vert0;
    pvec = cross(direction, edge2);
    det = dot(edge1, pvec);
    if (det > -EPSILON && det < EPSILON)
        kill(0);
    float invDet = 1.0 / det;
    tvec = origin - vert0;
    bary.x = dot(tvec, pvec) * invDet;
    if (bary.x < 0.0 || bary.x > 1.0)
        kill(0);
    qvec = cross(tvec, edge1);
    bary.y = dot(direction, qvec) * invDet;
    // The remaining steps (final hit test, output writes) are not shown.
}
Look up ray origin and direction from textures
Kill the pixel if the ray does not hit the triangle
Write the triangle ID and where the triangle was hit to the fragment color
Write the distance along the ray at which the triangle was hit to the z-buffer